From: Liu Bo <liubo2009@cn.fujitsu.com>
To: Lutz Euler <lutz.euler@freenet.de>
Cc: linux-btrfs@vger.kernel.org, Lukas Czerner <lczerner@redhat.com>
Subject: Re: Bulk discard doesn't work after add/delete of devices
Date: Mon, 13 Feb 2012 13:57:57 +0800 [thread overview]
Message-ID: <4F38A665.4030506@cn.fujitsu.com> (raw)
In-Reply-To: <20279.61547.888033.662423@localhost.localdomain>
On 02/13/2012 01:01 AM, Lutz Euler wrote:
> Hi,
>
> tl;dr: btrfs_trim_fs, IMHO, needs more thorough surgery.
>
> Thanks for providing the new patch. I think it will work in the case
> that "fstrim" is called without specifying a range to trim (that is,
> defaulting to the whole filesystem), but I didn't test that yet, sorry.
>
> Instead, I have been thinking about what happens if a range smaller than
> the whole filesystem is specified. After all, the observation that in my
> filesystem the smallest "objectid" is already larger than the filesystem
> size still holds even in that case, and wanting to trim only part of the
> filesystem seems to be a valid wish, too.
>
> So I dug into the code myself and came to the conclusion that the
> way "btrfs_trim_fs" interprets the range passed from the ioctl is
> fundamentally broken.
>
> Instead of papering over that I'd very much prefer a more thorough fix
> here, which in addition might make it unnecessary to treat the "trim the
> complete filesystem" case specially. Please read on for the details:
>
> The current implementation of "btrfs_trim_fs" simply compares the
> objectid's it finds in the chunk tree against the range passed from
> the ioctl, seemingly assuming that both kinds of numbers span the same
> range. But this is clearly not true: Even without adding and deleting
> of devices, in a simple mirror the objectids will span only about half
> the size of the filesystem. With suitable add/delete of devices I can
> construct a filesystem with a smallest objectid of 0 and a largest
> objectid much larger than the size of the filesystem (so, obviously,
> with large holes in the set of used objectid's), or, as in my filesystem
> mentioned above, with a smallest objectid larger than the size of the
> filesystem.
>
[...]
> So, to make bulk discard of ranges useful, it seems the incoming range
> should be interpreted relative to the size of the filesystem and not to
> the allocated chunks. As AFAIK the size of the filesystem is just the
> sum of the sizes of its devices it might be possible to map the range
> onto a virtual concatenation of the devices, these perhaps ordered by
> devid, and then to find the free space by searching for the resulting
> devid(s) and device-relative offsets in the device tree?
>
Actually I have no idea how to deal with this properly :(
Because btrfs supports multi-devices so that we have to set the filesystem
logical range to [0, (u64)-1] to get things to work well, while other
filesystems's logical range is [0, device's total_bytes].
What's more, in btrfs, devices can be mirrored to RAID, and the free space is
also ordered by logical addr [0, (u64)-1], so IMO it is impossible to interpreted the range.
I'd better pick up "treat the "trim the complete filesystem" case specially",
and drop the following commit:
commit f4c697e6406da5dd445eda8d923c53e1138793dd
Author: Lukas Czerner <lczerner@redhat.com>
Date: Mon Sep 5 16:34:54 2011 +0200
btrfs: return EINVAL if start > total_bytes in fitrim ioctl
We should retirn EINVAL if the start is beyond the end of the file
system in the btrfs_ioctl_fitrim(). Fix that by adding the appropriate
check for it.
Also in the btrfs_trim_fs() it is possible that len+start might overflow
if big values are passed. Fix it by decrementing the len so that start+len
is equal to the file system size in the worst case.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
thanks,
liubo
> I understand that currently unallocated space is never trimmed?
> To nevertheless do that might be useful after a balance or at file
> system creation time? If there is a way to find the unallocated space
> for a device, this could be improved at the same time, too.
>
> Kind regards,
>
> Lutz
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
next prev parent reply other threads:[~2012-02-13 5:57 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-02-05 20:37 Bulk discard doesn't work after add/delete of devices Lutz Euler
2012-02-09 8:42 ` Liu Bo
2012-02-09 15:50 ` Lutz Euler
2012-02-10 1:56 ` Liu Bo
2012-02-12 17:01 ` Lutz Euler
2012-02-13 5:57 ` Liu Bo [this message]
2012-02-14 17:32 ` Lutz Euler
2012-02-29 0:17 ` Lutz Euler
2012-04-10 17:34 ` Lutz Euler
2012-11-14 21:10 ` Lutz Euler
2012-11-14 21:17 ` [PATCH] Btrfs: really fix trim 0 bytes after a device delete Lutz Euler
2015-01-03 15:30 ` [PATCH V2] " Lutz Euler
2015-01-05 16:59 ` Martin Steigerwald
2015-01-05 19:29 ` Lutz Euler
2015-05-01 10:43 ` Martin Steigerwald
-- strict thread matches above, loose matches on Subject: below --
2014-12-28 16:58 fstrim not working on one of three BTRFS filesystems Martin Steigerwald
2014-12-29 1:53 ` Robert White
2014-12-29 2:08 ` Duncan
2014-12-29 9:06 ` Martin Steigerwald
2014-12-29 13:23 ` Martin Steigerwald
2015-01-03 16:16 ` Lutz Euler
2015-05-19 15:18 ` Rich Freeman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F38A665.4030506@cn.fujitsu.com \
--to=liubo2009@cn.fujitsu.com \
--cc=lczerner@redhat.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=lutz.euler@freenet.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.