From: Stefan Hajnoczi <stefanha@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	Alasdair Kergon <agk@redhat.com>,
	Mikulas Patocka <mpatocka@redhat.com>,
	dm-devel@lists.linux.dev, David Teigland <teigland@redhat.com>,
	Mike Snitzer <snitzer@kernel.org>, Jens Axboe <axboe@kernel.dk>,
	Christoph Hellwig <hch@lst.de>, Joe Thornber <ejt@redhat.com>
Subject: Re: [RFC 8/9] dm thin: add llseek(SEEK_HOLE/SEEK_DATA) support
Date: Wed, 3 Apr 2024 11:03:46 -0400
Message-ID: <20240403150346.GH2524049@fedora>
In-Reply-To: <c4pit5qf3sgiynx3jcnngdj7d3m62c5fdsgmla7twxynh6wfai@7jvhgxya4xo6>

On Thu, Mar 28, 2024 at 08:31:21PM -0500, Eric Blake wrote:
> On Thu, Mar 28, 2024 at 04:39:09PM -0400, Stefan Hajnoczi wrote:
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> > Open issues:
> > - Locking?
> > - thin_seek_hole_data() does not run as a bio or request. This patch
> >   assumes dm_thin_find_mapped_range() synchronously performs I/O if
> >   metadata needs to be loaded from disk. Is that a valid assumption?
> > ---
> >  drivers/md/dm-thin.c | 77 ++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 77 insertions(+)
> > 
> > diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
> > index 4793ad2aa1f7e..3c5dc4f0fe8a3 100644
> > --- a/drivers/md/dm-thin.c
> > +++ b/drivers/md/dm-thin.c
> > @@ -4501,6 +4501,82 @@ static void thin_io_hints(struct dm_target *ti, struct queue_limits *limits)
> >  	}
> >  }
> >  
> > +static dm_block_t loff_to_block(struct pool *pool, loff_t offset)
> > +{
> > +	sector_t offset_sectors = offset >> SECTOR_SHIFT;
> > +	dm_block_t ret;
> > +
> > +	if (block_size_is_power_of_two(pool))
> > +		ret = offset_sectors >> pool->sectors_per_block_shift;
> > +	else {
> > +		ret = offset_sectors;
> > +		(void) sector_div(ret, pool->sectors_per_block);
> > +	}
> > +
> > +	return ret;
> > +}
> > +
> > +static loff_t block_to_loff(struct pool *pool, dm_block_t block)
> > +{
> > +	return block_to_sectors(pool, block) << SECTOR_SHIFT;
> > +}
> > +
> > +static loff_t thin_seek_hole_data(struct dm_target *ti, loff_t offset,
> > +		int whence)
> > +{
> > +	struct thin_c *tc = ti->private;
> > +	struct dm_thin_device *td = tc->td;
> > +	struct pool *pool = tc->pool;
> > +	dm_block_t begin;
> > +	dm_block_t end;
> > +	dm_block_t mapped_begin;
> > +	dm_block_t mapped_end;
> > +	dm_block_t pool_begin;
> > +	bool maybe_shared;
> > +	int ret;
> > +
> > +	/* TODO locking? */
> > +
> > +	if (block_size_is_power_of_two(pool))
> > +		end = ti->len >> pool->sectors_per_block_shift;
> > +	else {
> > +		end = ti->len;
> > +		(void) sector_div(end, pool->sectors_per_block);
> > +	}
> > +
> > +	offset -= ti->begin << SECTOR_SHIFT;
> > +
> > +	while (true) {
> > +		begin = loff_to_block(pool, offset);
> > +		ret = dm_thin_find_mapped_range(td, begin, end,
> > +						&mapped_begin, &mapped_end,
> > +						&pool_begin, &maybe_shared);
> > +		if (ret == -ENODATA) {
> > +			if (whence == SEEK_DATA)
> > +				return -ENXIO;
> > +			break;
> > +		} else if (ret < 0) {
> > +			/* TODO handle EWOULDBLOCK? */
> > +			return -ENXIO;
> 
> This should probably be -EIO, not -ENXIO.

Yes. XFS also returns -EIO, so I guess it's okay to do so.

I still need to get to the bottom of whether calling
dm_thin_find_mapped_range() is sane here and what to do when/if it
returns EWOULDBLOCK.

> > +		}
> > +
> > +		/* SEEK_DATA finishes here... */
> > +		if (whence == SEEK_DATA) {
> > +			if (mapped_begin != begin)
> > +				offset = block_to_loff(pool, mapped_begin);
> > +			break;
> > +		}
> > +
> > +		/* ...while SEEK_HOLE may need to look further */
> > +		if (mapped_begin != begin)
> > +			break; /* offset is in a hole */
> > +
> > +		offset = block_to_loff(pool, mapped_end);
> > +	}
> > +
> > +	return offset + (ti->begin << SECTOR_SHIFT);
> 
> It's hard to follow, but I'm fairly certain that if whence ==
> SEEK_HOLE, you end up returning ti->begin + ti->len instead of -ENXIO
> if the range from begin to end is fully mapped; which is inconsistent
> with the semantics you have in 4/9 (although in 6/9 I argue that
> having all of the dm callbacks return ti->begin + ti->len instead of
> -ENXIO might make logic easier for iterating through consecutive ti,
> and then convert to -ENXIO only in the caller).

Returning (ti->begin + ti->len) << SECTOR_SHIFT for SEEK_HOLE when there
is data at the end of the target is intentional. This matches the
semantics of lseek().
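
For illustration, the lseek() behavior referred to above can be checked
from userspace. This is a minimal sketch, not part of the patch: the
helper name probe_eof() and the temporary file path are made up for the
example. It writes a file that is data all the way to EOF, then asks for
the next hole from offset 0 and for the next data from EOF. On Linux the
first call is expected to return the file size (EOF acts as an implicit
hole) and the second is expected to fail with ENXIO.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Fallback for builds where _GNU_SOURCE did not expose these (Linux values). */
#ifndef SEEK_DATA
#define SEEK_DATA 3
#define SEEK_HOLE 4
#endif

/*
 * Hypothetical helper: create a temporary file holding `len` bytes of
 * data and no holes, then record
 *   *hole       = lseek(fd, 0, SEEK_HOLE)
 *   *data_errno = errno from lseek(fd, len, SEEK_DATA), or 0 on success
 * Returns 0 if the file could be set up, -1 otherwise.
 */
static int probe_eof(size_t len, off_t *hole, int *data_errno)
{
	char path[] = "/tmp/seek_hole_XXXXXX";
	char *buf;
	int fd;
	int ret = -1;

	assert(len > 0);

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path); /* the file lives on until fd is closed */

	buf = malloc(len);
	if (!buf)
		goto out_close;
	memset(buf, 'x', len);
	if (write(fd, buf, len) != (ssize_t)len)
		goto out_free;

	/* Data runs to EOF, so the only hole is the implicit one at EOF. */
	*hole = lseek(fd, 0, SEEK_HOLE);

	/* There is no data at or after EOF. */
	errno = 0;
	*data_errno = lseek(fd, (off_t)len, SEEK_DATA) < 0 ? errno : 0;
	ret = 0;

out_free:
	free(buf);
out_close:
	close(fd);
	return ret;
}
```

Calling probe_eof(4096, &hole, &err) and printing hole and err is enough
to see the asymmetry: SEEK_HOLE succeeds at the end of the data, while
SEEK_DATA reports an error.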

I agree that an adjustment is necessary in dm.c, but I want to keep the
semantics of all lseek() functions identical to avoid confusion.
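
Since the loop in the quoted patch is hard to follow, here is a
userspace model of it, offered only as a sketch. The thin device is
replaced by a plain array of "mapped" flags, find_mapped_range() is a
hypothetical stand-in for dm_thin_find_mapped_range(), and the ti->begin
adjustment is left out. The range check at the top is an assumption of
the model; the quoted patch does not contain it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>

/* Fallback for builds where _GNU_SOURCE did not expose these (Linux values). */
#ifndef SEEK_DATA
#define SEEK_DATA 3
#define SEEK_HOLE 4
#endif

/*
 * Hypothetical stand-in for dm_thin_find_mapped_range(): find the first
 * run of mapped blocks in [begin, end). Returns 0 and fills *mb / *me,
 * or -ENODATA if no block in the range is mapped.
 */
static int find_mapped_range(const bool *mapped, uint64_t begin,
			     uint64_t end, uint64_t *mb, uint64_t *me)
{
	uint64_t b = begin;
	uint64_t e;

	while (b < end && !mapped[b])
		b++;
	if (b >= end)
		return -ENODATA;

	e = b;
	while (e < end && mapped[e])
		e++;

	*mb = b;
	*me = e;
	return 0;
}

/* Model of the patch's loop; offsets are in bytes, bs is the block size. */
static int64_t model_seek_hole_data(const bool *mapped, uint64_t nblocks,
				    uint64_t bs, int64_t offset, int whence)
{
	uint64_t begin, mb, me;

	assert(bs > 0);

	/* Assumption of this model, not taken from the patch. */
	if (offset < 0 || (uint64_t)offset >= nblocks * bs)
		return -ENXIO;

	for (;;) {
		begin = (uint64_t)offset / bs;
		if (find_mapped_range(mapped, begin, nblocks, &mb, &me)) {
			if (whence == SEEK_DATA)
				return -ENXIO; /* no data up to the end */
			break; /* SEEK_HOLE: offset is in a trailing hole */
		}

		/* SEEK_DATA finishes here... */
		if (whence == SEEK_DATA) {
			if (mb != begin)
				offset = (int64_t)(mb * bs);
			break;
		}

		/* ...while SEEK_HOLE may need to look further. */
		if (mb != begin)
			break; /* offset is in a hole */
		offset = (int64_t)(me * bs);
	}

	return offset;
}
```

With blocks {mapped, mapped, unmapped, mapped}, SEEK_HOLE from the last
block walks past the final mapped run and returns the end of the device
rather than -ENXIO, which is the behavior under discussion.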

Stefan


