* [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes
@ 2023-09-27 17:46 Josef Bacik
2023-09-27 17:46 ` [PATCH v2 1/3] btrfs: fix ->free_chunk_space math in btrfs_shrink_device Josef Bacik
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Josef Bacik @ 2023-09-27 17:46 UTC (permalink / raw)
To: linux-btrfs, kernel-team
Hello,
This is an update to my original patch
https://lore.kernel.org/linux-btrfs/b97e47ce0ce1d41d221878de7d6090b90aa7a597.1695065233.git.josef@toxicpanda.com/
This was causing failures with btrfs/156 and btrfs/177. The btrfs/156 problem
was caused by the fact that I was just subtracting 1g from the available space.
I missed a spot where we also limit the chunk size to 10% of the file system, so
I've updated the calculation to be exactly what we choose from the chunk
allocator, and this resovled the btrfs/156 failure.
The btrfs/177 failure was actually due to a underflow in ->free_chunk_space
where we weren't adding the new space we got from btrfs_grow_device. I also
noticed a slight issue with how we remove space from ->free_chunk_space in
btrfs_srhink_device so I fixed that as well.
With these 3 patches we're no longer seeing the above failures and the original
test case is also fixed. Thanks,
Josef
Josef Bacik (3):
btrfs: fix ->free_chunk_space math in btrfs_shrink_device
btrfs: increase ->free_chunk_space in btrfs_grow_device
btrfs: adjust overcommit logic when very close to full
fs/btrfs/space-info.c | 32 ++++++++++++++++++++++++++++++++
fs/btrfs/volumes.c | 21 ++++++++++++++++++---
2 files changed, 50 insertions(+), 3 deletions(-)
--
2.41.0
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v2 1/3] btrfs: fix ->free_chunk_space math in btrfs_shrink_device
2023-09-27 17:46 [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes Josef Bacik
@ 2023-09-27 17:46 ` Josef Bacik
2023-09-27 17:47 ` [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device Josef Bacik
` (2 subsequent siblings)
3 siblings, 0 replies; 8+ messages in thread
From: Josef Bacik @ 2023-09-27 17:46 UTC (permalink / raw)
To: linux-btrfs, kernel-team
There's two bugs in how we adjust ->free_chunk_space in
btrfs_shrink_device. First we're removing the entire diff between
new_size and old_size from ->free_chunk_space. This only works if we're
reducing the free area, which we could potentially not be. So adjust
the math to only subtract the diff in the free space from
->free_chunk_space.
Additionally in the error case we're unconditionally adding the diff
back into ->free_chunk_space, which we need to only do if this device is
writeable.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/volumes.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 733842136163..907ea775f4e4 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4841,6 +4841,7 @@ int btrfs_shrink_device(struct btrfs_device *device, u64 new_size)
u64 old_size = btrfs_device_get_total_bytes(device);
u64 diff;
u64 start;
+ u64 free_diff = 0;
new_size = round_down(new_size, fs_info->sectorsize);
start = new_size;
@@ -4866,7 +4867,19 @@ int btrfs_shrink_device(struct btrfs_device *device, u64 new_size)
btrfs_device_set_total_bytes(device, new_size);
if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
device->fs_devices->total_rw_bytes -= diff;
- atomic64_sub(diff, &fs_info->free_chunk_space);
+
+ /*
+ * The new free_chunk_space is new_size - used, so we have to
+ * subtract the delta of the old free_chunk_space which included
+ * old_size - used. If used > new_size then just subtract this
+ * entire device's free space.
+ */
+ if (device->bytes_used < new_size)
+ free_diff = (old_size - device->bytes_used) -
+ (new_size - device->bytes_used);
+ else
+ free_diff = old_size - device->bytes_used;
+ atomic64_sub(free_diff, &fs_info->free_chunk_space);
}
/*
@@ -5001,9 +5014,10 @@ int btrfs_shrink_device(struct btrfs_device *device, u64 new_size)
if (ret) {
mutex_lock(&fs_info->chunk_mutex);
btrfs_device_set_total_bytes(device, old_size);
- if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state))
+ if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
device->fs_devices->total_rw_bytes += diff;
- atomic64_add(diff, &fs_info->free_chunk_space);
+ atomic64_add(free_diff, &fs_info->free_chunk_space);
+ }
mutex_unlock(&fs_info->chunk_mutex);
}
return ret;
--
2.41.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device
2023-09-27 17:46 [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes Josef Bacik
2023-09-27 17:46 ` [PATCH v2 1/3] btrfs: fix ->free_chunk_space math in btrfs_shrink_device Josef Bacik
@ 2023-09-27 17:47 ` Josef Bacik
2023-09-29 16:30 ` David Sterba
2023-09-27 17:47 ` [PATCH v2 3/3] btrfs: adjust overcommit logic when very close to full Josef Bacik
2023-10-02 13:09 ` [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes David Sterba
3 siblings, 1 reply; 8+ messages in thread
From: Josef Bacik @ 2023-09-27 17:47 UTC (permalink / raw)
To: linux-btrfs, kernel-team
My overcommit patch exposed a bug with btrfs/177. The problem here is
that when we grow the device we're not adding to ->free_chunk_space, so
subsequent allocations can cause ->free_chunk_space to wrap, which
causes problems in can_overcommit because we add this to ->total_bytes,
which causes the counter to wrap and gives us an unexpected ENOSPC.
Fix this by properly updating ->free_chunk_space with the new available
space in btrfs_grow_device.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/volumes.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 907ea775f4e4..1aedd15d1db7 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2932,6 +2932,7 @@ int btrfs_grow_device(struct btrfs_trans_handle *trans,
btrfs_set_super_total_bytes(super_copy,
round_down(old_total + diff, fs_info->sectorsize));
device->fs_devices->total_rw_bytes += diff;
+ atomic64_add(diff, &fs_info->free_chunk_space);
btrfs_device_set_total_bytes(device, new_size);
btrfs_device_set_disk_total_bytes(device, new_size);
--
2.41.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v2 3/3] btrfs: adjust overcommit logic when very close to full
2023-09-27 17:46 [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes Josef Bacik
2023-09-27 17:46 ` [PATCH v2 1/3] btrfs: fix ->free_chunk_space math in btrfs_shrink_device Josef Bacik
2023-09-27 17:47 ` [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device Josef Bacik
@ 2023-09-27 17:47 ` Josef Bacik
2023-10-02 13:09 ` [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes David Sterba
3 siblings, 0 replies; 8+ messages in thread
From: Josef Bacik @ 2023-09-27 17:47 UTC (permalink / raw)
To: linux-btrfs, kernel-team
A user reported some unpleasant behavior with very small file systems.
The reproducer is this
mkfs.btrfs -f -m single -b 8g /dev/vdb
mount /dev/vdb /mnt/test
dd if=/dev/zero of=/mnt/test/testfile bs=512M count=20
This will result in usage that looks like this
Overall:
Device size: 8.00GiB
Device allocated: 8.00GiB
Device unallocated: 1.00MiB
Device missing: 0.00B
Device slack: 2.00GiB
Used: 5.47GiB
Free (estimated): 2.52GiB (min: 2.52GiB)
Free (statfs, df): 0.00B
Data ratio: 1.00
Metadata ratio: 1.00
Global reserve: 5.50MiB (used: 0.00B)
Multiple profiles: no
Data,single: Size:7.99GiB, Used:5.46GiB (68.41%)
/dev/vdb 7.99GiB
Metadata,single: Size:8.00MiB, Used:5.77MiB (72.07%)
/dev/vdb 8.00MiB
System,single: Size:4.00MiB, Used:16.00KiB (0.39%)
/dev/vdb 4.00MiB
Unallocated:
/dev/vdb 1.00MiB
As you can see we've gotten ourselves quite full with metadata, with all
of the disk being allocated for data.
On smaller file systems there's not a lot of time before we get full, so
our overcommit behavior bites us here. Generally speaking data
reservations result in chunk allocations as we assume reservation ==
actual use for data. This means at any point we could end up with a
chunk allocation for data, and if we're very close to full we could do
this before we have a chance to figure out that we need another metadata
chunk.
Address this by adjusting the overcommit logic. Simply put we need to
take away 1 chunk from the available chunk space in case of a data
reservation. This will allow us to stop overcommitting before we
potentially lose this space to a data allocation. With this fix in
place we properly allocate a metadata chunk before we're completely
full, allowing for enough slack space in metadata.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/space-info.c | 32 ++++++++++++++++++++++++++++++++
1 file changed, 32 insertions(+)
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index d7e8cd4f140c..f1cc3ea4553c 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -345,8 +345,10 @@ static u64 calc_available_free_space(struct btrfs_fs_info *fs_info,
struct btrfs_space_info *space_info,
enum btrfs_reserve_flush_enum flush)
{
+ struct btrfs_space_info *data_sinfo;
u64 profile;
u64 avail;
+ u64 data_chunk_size;
int factor;
if (space_info->flags & BTRFS_BLOCK_GROUP_SYSTEM)
@@ -364,6 +366,36 @@ static u64 calc_available_free_space(struct btrfs_fs_info *fs_info,
*/
factor = btrfs_bg_type_to_factor(profile);
avail = div_u64(avail, factor);
+ if (avail == 0)
+ return 0;
+
+ /*
+ * Calculate the data_chunk_size, space_info->chunk_size is the
+ * "optimal" chunk size based on the fs size. However when we actually
+ * allocate the chunk we will strip this down further, making it no more
+ * than 10% of the disk or 1G, whichever is smaller.
+ */
+ data_sinfo = btrfs_find_space_info(fs_info, BTRFS_BLOCK_GROUP_DATA);
+ data_chunk_size = min(data_sinfo->chunk_size,
+ mult_perc(fs_info->fs_devices->total_rw_bytes, 10));
+ data_chunk_size = min_t(u64, data_chunk_size, SZ_1G);
+
+ /*
+ * Since data allocations immediately use block groups as part of the
+ * reservation, because we assume that data reservations will == actual
+ * usage, we could potentially overcommit and then immediately have that
+ * available space used by a data allocation, which could put us in a
+ * bind when we get close to filling the file system.
+ *
+ * To handle this simply remove the data_chunk_size from the available
+ * space. If we are relatively empty this won't affect our ability to
+ * overcommit much, and if we're very close to full it'll keep us from
+ * getting into a position where we've given ourselves very little
+ * metadata wiggle room.
+ */
+ if (avail <= data_chunk_size)
+ return 0;
+ avail -= data_chunk_size;
/*
* If we aren't flushing all things, let us overcommit up to
--
2.41.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device
2023-09-27 17:47 ` [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device Josef Bacik
@ 2023-09-29 16:30 ` David Sterba
2023-10-02 20:44 ` Josef Bacik
0 siblings, 1 reply; 8+ messages in thread
From: David Sterba @ 2023-09-29 16:30 UTC (permalink / raw)
To: Josef Bacik; +Cc: linux-btrfs, kernel-team
On Wed, Sep 27, 2023 at 01:47:00PM -0400, Josef Bacik wrote:
> My overcommit patch exposed a bug with btrfs/177. The problem here is
Which patch is that?
> that when we grow the device we're not adding to ->free_chunk_space, so
> subsequent allocations can cause ->free_chunk_space to wrap, which
> causes problems in can_overcommit because we add this to ->total_bytes,
> which causes the counter to wrap and gives us an unexpected ENOSPC.
>
> Fix this by properly updating ->free_chunk_space with the new available
> space in btrfs_grow_device.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
> fs/btrfs/volumes.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 907ea775f4e4..1aedd15d1db7 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -2932,6 +2932,7 @@ int btrfs_grow_device(struct btrfs_trans_handle *trans,
> btrfs_set_super_total_bytes(super_copy,
> round_down(old_total + diff, fs_info->sectorsize));
> device->fs_devices->total_rw_bytes += diff;
> + atomic64_add(diff, &fs_info->free_chunk_space);
>
> btrfs_device_set_total_bytes(device, new_size);
> btrfs_device_set_disk_total_bytes(device, new_size);
> --
> 2.41.0
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes
2023-09-27 17:46 [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes Josef Bacik
` (2 preceding siblings ...)
2023-09-27 17:47 ` [PATCH v2 3/3] btrfs: adjust overcommit logic when very close to full Josef Bacik
@ 2023-10-02 13:09 ` David Sterba
3 siblings, 0 replies; 8+ messages in thread
From: David Sterba @ 2023-10-02 13:09 UTC (permalink / raw)
To: Josef Bacik; +Cc: linux-btrfs, kernel-team
On Wed, Sep 27, 2023 at 01:46:58PM -0400, Josef Bacik wrote:
> Hello,
>
> This is an update to my original patch
>
> https://lore.kernel.org/linux-btrfs/b97e47ce0ce1d41d221878de7d6090b90aa7a597.1695065233.git.josef@toxicpanda.com/
>
> This was causing failures with btrfs/156 and btrfs/177. The btrfs/156 problem
> was caused by the fact that I was just subtracting 1g from the available space.
> I missed a spot where we also limit the chunk size to 10% of the file system, so
> I've updated the calculation to be exactly what we choose from the chunk
> allocator, and this resovled the btrfs/156 failure.
>
> The btrfs/177 failure was actually due to a underflow in ->free_chunk_space
> where we weren't adding the new space we got from btrfs_grow_device. I also
> noticed a slight issue with how we remove space from ->free_chunk_space in
> btrfs_srhink_device so I fixed that as well.
>
> With these 3 patches we're no longer seeing the above failures and the original
> test case is also fixed. Thanks,
>
> Josef
>
> Josef Bacik (3):
> btrfs: fix ->free_chunk_space math in btrfs_shrink_device
> btrfs: increase ->free_chunk_space in btrfs_grow_device
> btrfs: adjust overcommit logic when very close to full
Added to misc-next, thanks.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device
2023-09-29 16:30 ` David Sterba
@ 2023-10-02 20:44 ` Josef Bacik
2023-10-03 18:47 ` David Sterba
0 siblings, 1 reply; 8+ messages in thread
From: Josef Bacik @ 2023-10-02 20:44 UTC (permalink / raw)
To: David Sterba; +Cc: linux-btrfs, kernel-team
On Fri, Sep 29, 2023 at 06:30:35PM +0200, David Sterba wrote:
> On Wed, Sep 27, 2023 at 01:47:00PM -0400, Josef Bacik wrote:
> > My overcommit patch exposed a bug with btrfs/177. The problem here is
>
> Which patch is that?
>
Sorry the V1 version of this series. Thanks,
Josef
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device
2023-10-02 20:44 ` Josef Bacik
@ 2023-10-03 18:47 ` David Sterba
0 siblings, 0 replies; 8+ messages in thread
From: David Sterba @ 2023-10-03 18:47 UTC (permalink / raw)
To: Josef Bacik; +Cc: David Sterba, linux-btrfs, kernel-team
On Mon, Oct 02, 2023 at 04:44:18PM -0400, Josef Bacik wrote:
> On Fri, Sep 29, 2023 at 06:30:35PM +0200, David Sterba wrote:
> > On Wed, Sep 27, 2023 at 01:47:00PM -0400, Josef Bacik wrote:
> > > My overcommit patch exposed a bug with btrfs/177. The problem here is
> >
> > Which patch is that?
> >
>
> Sorry the V1 version of this series. Thanks,
Ok thanks, I've added a link to the changelog.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2023-10-03 18:53 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-27 17:46 [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes Josef Bacik
2023-09-27 17:46 ` [PATCH v2 1/3] btrfs: fix ->free_chunk_space math in btrfs_shrink_device Josef Bacik
2023-09-27 17:47 ` [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device Josef Bacik
2023-09-29 16:30 ` David Sterba
2023-10-02 20:44 ` Josef Bacik
2023-10-03 18:47 ` David Sterba
2023-09-27 17:47 ` [PATCH v2 3/3] btrfs: adjust overcommit logic when very close to full Josef Bacik
2023-10-02 13:09 ` [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes David Sterba
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox