From: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
To: Ryusuke Konishi
<konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH 1/2] nilfs2: add nilfs_sufile_trim_fs to trim clean segs
Date: Wed, 19 Feb 2014 07:36:58 +0100 [thread overview]
Message-ID: <5304510A.8080006@gmx.net> (raw)
In-Reply-To: <20140219.033733.321498615.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
On 2014-02-18 19:37, Ryusuke Konishi wrote:
> On Mon, 17 Feb 2014 23:39:51 +0100, Andreas Rohner wrote:
>> This patch adds the nilfs_sufile_trim_fs function, which takes a
>> fstrim_range structure and calls blkdev_issue_discard for every
>> clean segment in the specified range. The range is truncated to sector
>> boundaries.
>>
>> Signed-off-by: Andreas Rohner <andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
>> ---
>> fs/nilfs2/sufile.c | 144 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>> fs/nilfs2/sufile.h | 1 +
>> 2 files changed, 145 insertions(+)
>>
>> diff --git a/fs/nilfs2/sufile.c b/fs/nilfs2/sufile.c
>> index 3127e9f..3605cc9 100644
>> --- a/fs/nilfs2/sufile.c
>> +++ b/fs/nilfs2/sufile.c
>> @@ -870,6 +870,150 @@ ssize_t nilfs_sufile_get_suinfo(struct inode *sufile, __u64 segnum, void *buf,
>> }
>>
>> /**
>> + * nilfs_sufile_trim_fs() - trim ioctl handle function
>> + * @sufile: inode of segment usage file
>> + * @range: fstrim_range structure
>> + *
>> + * start: First Byte to trim
>> + * len: number of Bytes to trim from start
>> + * minlen: minimum extent length in Bytes
>> + *
>> + * Decription: nilfs_sufile_trim_fs goes through all segments containing bytes
>> + * from start to start+len. start is rounded up to the next sector boundary
>> + * and start+len is rounded down. For each clean segment blkdev_issue_discard
>> + * function is invoked to trim it.
>> + *
>> + * Return Value: On success, 0 is returned or negative error code, otherwise.
>> + */
>> +int nilfs_sufile_trim_fs(struct inode *sufile, struct fstrim_range *range)
>> +{
>> + struct the_nilfs *nilfs = sufile->i_sb->s_fs_info;
>> + struct buffer_head *su_bh;
>> + struct nilfs_segment_usage *su;
>> + void *kaddr;
>> + size_t n, i, susz = NILFS_MDT(sufile)->mi_entry_size;
>> + sector_t seg_start, seg_end, real_start, real_end,
>> + start = 0, nblocks = 0;
>> + u64 segnum, end, minlen, trimmed = 0;
>> + int ret = 0;
>> + unsigned int sect_size, sects_per_block;
>> +
>> + sect_size = bdev_logical_block_size(nilfs->ns_bdev);
>> + sects_per_block = (1 << nilfs->ns_blocksize_bits) / sect_size;
>
>> + real_start = (range->start + sect_size - 1) / sect_size;
>> + real_end = (range->start + range->len) / sect_size;
>
> Why not use start_sect, end_sect instead of real_start, real_end?
> real_{start,end} are not intuitive to me.
Yes that looks better.
> We need to use do_div() for these divisions, and DIV_ROUND_UP_ULL()
> macro should be applied to round up the start sector.
>
> Moreover, we have to avoid overflow in "range->start + range->len".
> Actually, range->len is usually set to UULONG_MAX.
Ah yes I forgot to test that case.
> So, these will be as follows:
>
> u64 len = range->len;
>
> ...
>
> do_div(len, sect_size);
> if (!len)
> goto out;
>
> ...
> start_sect = DIV_ROUND_UP_ULL(range->start, sect_size);
> end_sect = start_sect + len - 1; /* this end_sect is inclusive */
I don't get why this has to be inclusive. To me this seems to be a
matter of taste. I think it is much easier to reason about this stuff
and more "natural", if start_sect is inclusive and end_sect is
exclusive. Then segnum is inclusive and end is exclusive. It is just
like with pointer arithmetic.
> Note that, we also should care about large range->start to avoid
> overflow in substitution to start_sect (sector_t) since sector_t may
> be 32-bit. We should check it before the division.
>
> Here, I recant my earlier comment. We should do the following check
> in this function to clarify that the overflow issue is handled
> properly.
Ok.
> u64 max_byte =
> ((u64)nilfs->ns_nsegments * nilfs->ns_blocks_per_segments)
> << nilfs->ns_blocksize_bits;
>
> ...
> if (range.len < nilfs->ns_blocksize || range.start >= max_byte)
> return -EINVAL;
> ...
> (divisions)
>
>> + segnum = nilfs_get_segnum_of_block(nilfs, real_start / sects_per_block);
>> + end = nilfs_get_segnum_of_block(nilfs, ((real_end + sects_per_block - 1)
>> + / sects_per_block) + nilfs->ns_blocks_per_segment - 1);
>
> It would be better to use the following intermediate variables to
> improve readability of these calculations.
Ok.
> And, these calculations need sector_div() and DIV_ROUND_UP_SECTOR_T()
> macro:
>
> start_block = start_sect;
> sector_div(start_block, sects_per_block);
>
> end_block = DIV_ROUND_UP_SECTOR_T(end_sect, sects_per_block);
>
> segnum = nilfs_get_segnum_of_block(nilfs, start_block);
> end = nilfs_get_segnum_of_block(
> nilfs, end_block + nilfs->ns_blocks_per_segment - 1);
>
>> + minlen = range->minlen / sect_size;
>
> And, this one needs do_div():
>
> minlen = range->minlen;
> do_div(minlen, sect_size);
>
>> +
>> +
>> + if (end > nilfs->ns_nsegments)
>> + end = nilfs->ns_nsegments;
>> + if (segnum >= nilfs->ns_nsegments || end <= segnum)
>> + goto out;
>> +
>> + down_read(&NILFS_MDT(sufile)->mi_sem);
>> +
>> + while (segnum < end) {
>> + n = nilfs_sufile_segment_usages_in_block(sufile, segnum,
>> + end - 1);
>> +
>> + ret = nilfs_sufile_get_segment_usage_block(sufile, segnum, 0,
>> + &su_bh);
>> + if (ret < 0) {
>> + if (ret != -ENOENT)
>> + goto out_sem;
>> + /* hole */
>> + segnum += n;
>> + continue;
>> + }
>> +
>> + kaddr = kmap_atomic(su_bh->b_page);
>> + su = nilfs_sufile_block_get_segment_usage(sufile, segnum,
>> + su_bh, kaddr);
>> + for (i = 0; i < n; ++i, ++segnum, su = (void *)su + susz) {
>> + if (!nilfs_segment_usage_clean(su))
>> + continue;
>> +
>> + nilfs_get_segment_range(nilfs, segnum, &seg_start,
>> + &seg_end);
>> +
>> + if (!nblocks) {
>> + /* start new extent */
>> + start = seg_start;
>> + nblocks = seg_end - seg_start + 1;
>> + continue;
>> + }
>> +
>> + if (start + nblocks == seg_start) {
>> + /* add to previous extent */
>> + nblocks += seg_end - seg_start + 1;
>> + continue;
>> + }
>> +
>> + /* discard previous extent */
>> + start *= sects_per_block;
>> + nblocks *= sects_per_block;
>> + if (start < real_start) {
>> + nblocks -= real_start - start;
>> + start = real_start;
>> + }
>
>> + if (start + nblocks > real_end)
>> + nblocks = real_end - start;
>
> Why do you need this adjustment during discarding "previous" extent ?
You are right I don't need it.
>> + if (nblocks >= minlen) {
>> + kunmap_atomic(kaddr);
>> +
>> + ret = blkdev_issue_discard(nilfs->ns_bdev,
>> + start, nblocks, GFP_NOFS, 0);
>> + if (ret < 0) {
>> + put_bh(su_bh);
>> + goto out_sem;
>> + }
>> +
>> + trimmed += nblocks;
>> + kaddr = kmap_atomic(su_bh->b_page);
>> + su = nilfs_sufile_block_get_segment_usage(
>> + sufile, segnum, su_bh, kaddr);
>> + }
>> +
>> + /* start new extent */
>> + start = seg_start;
>> + nblocks = seg_end - seg_start + 1;
>> + }
>> + kunmap_atomic(kaddr);
>> + put_bh(su_bh);
>> + }
>> +
>> +
>> + if (nblocks) {
>> + /* discard last extent */
>> + start *= sects_per_block;
>> + nblocks *= sects_per_block;
>> + if (start < real_start) {
>> + nblocks -= real_start - start;
>> + start = real_start;
>> + }
>> + if (start + nblocks > real_end)
>> + nblocks = real_end - start;
>> +
>> + if (nblocks >= minlen) {
>> + ret = blkdev_issue_discard(nilfs->ns_bdev, start,
>> + nblocks, GFP_NOFS, 0);
>> + if (!ret)
>> + trimmed += nblocks;
>> + }
>> + }
>> +
>> +out_sem:
>> + up_read(&NILFS_MDT(sufile)->mi_sem);
>> +out:
>> + range->len = trimmed * sect_size;
>> + return ret;
>> +}
>> +
>> +/**
>> * nilfs_sufile_read - read or get sufile inode
>> * @sb: super block instance
>> * @susize: size of a segment usage entry
>> diff --git a/fs/nilfs2/sufile.h b/fs/nilfs2/sufile.h
>> index e84bc5b..2434abd 100644
>> --- a/fs/nilfs2/sufile.h
>> +++ b/fs/nilfs2/sufile.h
>> @@ -65,6 +65,7 @@ void nilfs_sufile_do_set_error(struct inode *, __u64, struct buffer_head *,
>> int nilfs_sufile_resize(struct inode *sufile, __u64 newnsegs);
>> int nilfs_sufile_read(struct super_block *sb, size_t susize,
>> struct nilfs_inode *raw_inode, struct inode **inodep);
>> +int nilfs_sufile_trim_fs(struct inode *sufile, struct fstrim_range *range);
>>
>> /**
>> * nilfs_sufile_scrap - make a segment garbage
>> --
>> 1.9.0
>
> Please try to compile this patch both for 32-bit kernel and 64-bit
> kernel to test if the patch is architecture independent.
Ok.
With all the proper division macros it gets very complicated. I think it
would simplify things if we just truncate to block size instead of
sector size. Then we could use simple bit shifts. It would look
something like this:
if (range->len < nilfs->ns_blocksize ||
range->start >= max_byte)
return -EINVAL;
/* sector_t could be 32 bit */
if (range->len > max_byte)
range->len = max_byte;
sects_per_block = (1 << nilfs->ns_blocksize_bits) /
bdev_logical_block_size(nilfs->ns_bdev);
start_block = (range->start + nilfs->ns_blocksize - 1) >>
nilfs->ns_blocksize_bits;
end_block = start_block + (range->len >>
nilfs->ns_blocksize_bits);
segnum = nilfs_get_segnum_of_block(nilfs, start_block);
end = nilfs_get_segnum_of_block(nilfs, end_block +
nilfs->ns_blocks_per_segment - 1);
minlen = range->minlen >> nilfs->ns_blocksize_bits;
What do you think?
Regards,
Andreas Rohner
--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2014-02-19 6:36 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-02-17 22:39 [PATCH 0/2] nilfs2: add support for FITRIM ioctl Andreas Rohner
[not found] ` <cover.1392676472.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2014-02-17 22:39 ` [PATCH 1/2] nilfs2: add nilfs_sufile_trim_fs to trim clean segs Andreas Rohner
[not found] ` <0ba40f9a0504b2a228e0714032f82556e03b927e.1392676472.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2014-02-18 18:37 ` Ryusuke Konishi
[not found] ` <20140219.033733.321498615.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2014-02-19 6:36 ` Andreas Rohner [this message]
[not found] ` <5304510A.8080006-hi6Y0CQ0nG0@public.gmane.org>
2014-02-20 8:12 ` Ryusuke Konishi
2014-02-17 22:39 ` [PATCH 2/2] nilfs2: add FITRIM ioctl support for nilfs2 Andreas Rohner
-- strict thread matches above, loose matches on Subject: below --
2014-02-15 13:34 [PATCH 0/2] nilfs2: add support for FITRIM ioctl Andreas Rohner
[not found] ` <cover.1392470644.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2014-02-15 13:34 ` [PATCH 1/2] nilfs2: add nilfs_sufile_trim_fs to trim clean segs Andreas Rohner
[not found] ` <02d463460388a9bdc1413aa4d6fdfa5a402b9452.1392470644.git.andreas.rohner-hi6Y0CQ0nG0@public.gmane.org>
2014-02-17 3:00 ` Ryusuke Konishi
[not found] ` <20140217.120000.397306175.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2014-02-17 4:27 ` Ryusuke Konishi
2014-02-17 9:06 ` Andreas Rohner
[not found] ` <5301D130.9050000-hi6Y0CQ0nG0@public.gmane.org>
2014-02-17 10:42 ` Ryusuke Konishi
[not found] ` <20140217.194200.431478798.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2014-02-17 11:04 ` Andreas Rohner
[not found] ` <5301ECD4.5070200-hi6Y0CQ0nG0@public.gmane.org>
2014-02-17 11:39 ` Ryusuke Konishi
2014-02-17 9:17 ` Andreas Rohner
[not found] ` <5301D3C5.6040704-hi6Y0CQ0nG0@public.gmane.org>
2014-02-17 10:06 ` Ryusuke Konishi
[not found] ` <20140217.190610.135800072.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2014-02-17 10:34 ` Andreas Rohner
[not found] ` <5301E5A8.3080701-hi6Y0CQ0nG0@public.gmane.org>
2014-02-17 11:21 ` Ryusuke Konishi
[not found] ` <20140217.202143.05344630.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2014-02-17 11:53 ` Andreas Rohner
[not found] ` <5301F852.8010105-hi6Y0CQ0nG0@public.gmane.org>
2014-02-17 13:28 ` Ryusuke Konishi
[not found] ` <20140217.222811.27809410.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2014-02-17 13:51 ` Andreas Rohner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5304510A.8080006@gmx.net \
--to=andreas.rohner-hi6y0cq0ng0@public.gmane.org \
--cc=konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org \
--cc=linux-nilfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox