* [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes
@ 2023-09-27 17:46 Josef Bacik
2023-09-27 17:46 ` [PATCH v2 1/3] btrfs: fix ->free_chunk_space math in btrfs_shrink_device Josef Bacik
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Josef Bacik @ 2023-09-27 17:46 UTC (permalink / raw)
To: linux-btrfs, kernel-team
Hello,
This is an update to my original patch
https://lore.kernel.org/linux-btrfs/b97e47ce0ce1d41d221878de7d6090b90aa7a597.1695065233.git.josef@toxicpanda.com/
This was causing failures with btrfs/156 and btrfs/177. The btrfs/156 problem
was caused by the fact that I was just subtracting 1g from the available space.
I missed a spot where we also limit the chunk size to 10% of the file system, so
I've updated the calculation to be exactly what we choose from the chunk
allocator, and this resovled the btrfs/156 failure.
The btrfs/177 failure was actually due to a underflow in ->free_chunk_space
where we weren't adding the new space we got from btrfs_grow_device. I also
noticed a slight issue with how we remove space from ->free_chunk_space in
btrfs_srhink_device so I fixed that as well.
With these 3 patches we're no longer seeing the above failures and the original
test case is also fixed. Thanks,
Josef
Josef Bacik (3):
btrfs: fix ->free_chunk_space math in btrfs_shrink_device
btrfs: increase ->free_chunk_space in btrfs_grow_device
btrfs: adjust overcommit logic when very close to full
fs/btrfs/space-info.c | 32 ++++++++++++++++++++++++++++++++
fs/btrfs/volumes.c | 21 ++++++++++++++++++---
2 files changed, 50 insertions(+), 3 deletions(-)
--
2.41.0
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v2 1/3] btrfs: fix ->free_chunk_space math in btrfs_shrink_device
2023-09-27 17:46 [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes Josef Bacik
@ 2023-09-27 17:46 ` Josef Bacik
2023-09-27 17:47 ` [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device Josef Bacik
` (2 subsequent siblings)
3 siblings, 0 replies; 8+ messages in thread
From: Josef Bacik @ 2023-09-27 17:46 UTC (permalink / raw)
To: linux-btrfs, kernel-team
There's two bugs in how we adjust ->free_chunk_space in
btrfs_shrink_device. First we're removing the entire diff between
new_size and old_size from ->free_chunk_space. This only works if we're
reducing the free area, which we could potentially not be. So adjust
the math to only subtract the diff in the free space from
->free_chunk_space.
Additionally in the error case we're unconditionally adding the diff
back into ->free_chunk_space, which we need to only do if this device is
writeable.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/volumes.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 733842136163..907ea775f4e4 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4841,6 +4841,7 @@ int btrfs_shrink_device(struct btrfs_device *device, u64 new_size)
u64 old_size = btrfs_device_get_total_bytes(device);
u64 diff;
u64 start;
+ u64 free_diff = 0;
new_size = round_down(new_size, fs_info->sectorsize);
start = new_size;
@@ -4866,7 +4867,19 @@ int btrfs_shrink_device(struct btrfs_device *device, u64 new_size)
btrfs_device_set_total_bytes(device, new_size);
if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
device->fs_devices->total_rw_bytes -= diff;
- atomic64_sub(diff, &fs_info->free_chunk_space);
+
+ /*
+ * The new free_chunk_space is new_size - used, so we have to
+ * subtract the delta of the old free_chunk_space which included
+ * old_size - used. If used > new_size then just subtract this
+ * entire device's free space.
+ */
+ if (device->bytes_used < new_size)
+ free_diff = (old_size - device->bytes_used) -
+ (new_size - device->bytes_used);
+ else
+ free_diff = old_size - device->bytes_used;
+ atomic64_sub(free_diff, &fs_info->free_chunk_space);
}
/*
@@ -5001,9 +5014,10 @@ int btrfs_shrink_device(struct btrfs_device *device, u64 new_size)
if (ret) {
mutex_lock(&fs_info->chunk_mutex);
btrfs_device_set_total_bytes(device, old_size);
- if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state))
+ if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
device->fs_devices->total_rw_bytes += diff;
- atomic64_add(diff, &fs_info->free_chunk_space);
+ atomic64_add(free_diff, &fs_info->free_chunk_space);
+ }
mutex_unlock(&fs_info->chunk_mutex);
}
return ret;
--
2.41.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device
2023-09-27 17:46 [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes Josef Bacik
2023-09-27 17:46 ` [PATCH v2 1/3] btrfs: fix ->free_chunk_space math in btrfs_shrink_device Josef Bacik
@ 2023-09-27 17:47 ` Josef Bacik
2023-09-29 16:30 ` David Sterba
2023-09-27 17:47 ` [PATCH v2 3/3] btrfs: adjust overcommit logic when very close to full Josef Bacik
2023-10-02 13:09 ` [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes David Sterba
3 siblings, 1 reply; 8+ messages in thread
From: Josef Bacik @ 2023-09-27 17:47 UTC (permalink / raw)
To: linux-btrfs, kernel-team
My overcommit patch exposed a bug with btrfs/177. The problem here is
that when we grow the device we're not adding to ->free_chunk_space, so
subsequent allocations can cause ->free_chunk_space to wrap, which
causes problems in can_overcommit because we add this to ->total_bytes,
which causes the counter to wrap and gives us an unexpected ENOSPC.
Fix this by properly updating ->free_chunk_space with the new available
space in btrfs_grow_device.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/volumes.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 907ea775f4e4..1aedd15d1db7 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2932,6 +2932,7 @@ int btrfs_grow_device(struct btrfs_trans_handle *trans,
btrfs_set_super_total_bytes(super_copy,
round_down(old_total + diff, fs_info->sectorsize));
device->fs_devices->total_rw_bytes += diff;
+ atomic64_add(diff, &fs_info->free_chunk_space);
btrfs_device_set_total_bytes(device, new_size);
btrfs_device_set_disk_total_bytes(device, new_size);
--
2.41.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v2 3/3] btrfs: adjust overcommit logic when very close to full
2023-09-27 17:46 [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes Josef Bacik
2023-09-27 17:46 ` [PATCH v2 1/3] btrfs: fix ->free_chunk_space math in btrfs_shrink_device Josef Bacik
2023-09-27 17:47 ` [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device Josef Bacik
@ 2023-09-27 17:47 ` Josef Bacik
2023-10-02 13:09 ` [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes David Sterba
3 siblings, 0 replies; 8+ messages in thread
From: Josef Bacik @ 2023-09-27 17:47 UTC (permalink / raw)
To: linux-btrfs, kernel-team
A user reported some unpleasant behavior with very small file systems.
The reproducer is this
mkfs.btrfs -f -m single -b 8g /dev/vdb
mount /dev/vdb /mnt/test
dd if=/dev/zero of=/mnt/test/testfile bs=512M count=20
This will result in usage that looks like this
Overall:
Device size: 8.00GiB
Device allocated: 8.00GiB
Device unallocated: 1.00MiB
Device missing: 0.00B
Device slack: 2.00GiB
Used: 5.47GiB
Free (estimated): 2.52GiB (min: 2.52GiB)
Free (statfs, df): 0.00B
Data ratio: 1.00
Metadata ratio: 1.00
Global reserve: 5.50MiB (used: 0.00B)
Multiple profiles: no
Data,single: Size:7.99GiB, Used:5.46GiB (68.41%)
/dev/vdb 7.99GiB
Metadata,single: Size:8.00MiB, Used:5.77MiB (72.07%)
/dev/vdb 8.00MiB
System,single: Size:4.00MiB, Used:16.00KiB (0.39%)
/dev/vdb 4.00MiB
Unallocated:
/dev/vdb 1.00MiB
As you can see we've gotten ourselves quite full with metadata, with all
of the disk being allocated for data.
On smaller file systems there's not a lot of time before we get full, so
our overcommit behavior bites us here. Generally speaking data
reservations result in chunk allocations as we assume reservation ==
actual use for data. This means at any point we could end up with a
chunk allocation for data, and if we're very close to full we could do
this before we have a chance to figure out that we need another metadata
chunk.
Address this by adjusting the overcommit logic. Simply put we need to
take away 1 chunk from the available chunk space in case of a data
reservation. This will allow us to stop overcommitting before we
potentially lose this space to a data allocation. With this fix in
place we properly allocate a metadata chunk before we're completely
full, allowing for enough slack space in metadata.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
fs/btrfs/space-info.c | 32 ++++++++++++++++++++++++++++++++
1 file changed, 32 insertions(+)
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index d7e8cd4f140c..f1cc3ea4553c 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -345,8 +345,10 @@ static u64 calc_available_free_space(struct btrfs_fs_info *fs_info,
struct btrfs_space_info *space_info,
enum btrfs_reserve_flush_enum flush)
{
+ struct btrfs_space_info *data_sinfo;
u64 profile;
u64 avail;
+ u64 data_chunk_size;
int factor;
if (space_info->flags & BTRFS_BLOCK_GROUP_SYSTEM)
@@ -364,6 +366,36 @@ static u64 calc_available_free_space(struct btrfs_fs_info *fs_info,
*/
factor = btrfs_bg_type_to_factor(profile);
avail = div_u64(avail, factor);
+ if (avail == 0)
+ return 0;
+
+ /*
+ * Calculate the data_chunk_size, space_info->chunk_size is the
+ * "optimal" chunk size based on the fs size. However when we actually
+ * allocate the chunk we will strip this down further, making it no more
+ * than 10% of the disk or 1G, whichever is smaller.
+ */
+ data_sinfo = btrfs_find_space_info(fs_info, BTRFS_BLOCK_GROUP_DATA);
+ data_chunk_size = min(data_sinfo->chunk_size,
+ mult_perc(fs_info->fs_devices->total_rw_bytes, 10));
+ data_chunk_size = min_t(u64, data_chunk_size, SZ_1G);
+
+ /*
+ * Since data allocations immediately use block groups as part of the
+ * reservation, because we assume that data reservations will == actual
+ * usage, we could potentially overcommit and then immediately have that
+ * available space used by a data allocation, which could put us in a
+ * bind when we get close to filling the file system.
+ *
+ * To handle this simply remove the data_chunk_size from the available
+ * space. If we are relatively empty this won't affect our ability to
+ * overcommit much, and if we're very close to full it'll keep us from
+ * getting into a position where we've given ourselves very little
+ * metadata wiggle room.
+ */
+ if (avail <= data_chunk_size)
+ return 0;
+ avail -= data_chunk_size;
/*
* If we aren't flushing all things, let us overcommit up to
--
2.41.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device
2023-09-27 17:47 ` [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device Josef Bacik
@ 2023-09-29 16:30 ` David Sterba
2023-10-02 20:44 ` Josef Bacik
0 siblings, 1 reply; 8+ messages in thread
From: David Sterba @ 2023-09-29 16:30 UTC (permalink / raw)
To: Josef Bacik; +Cc: linux-btrfs, kernel-team
On Wed, Sep 27, 2023 at 01:47:00PM -0400, Josef Bacik wrote:
> My overcommit patch exposed a bug with btrfs/177. The problem here is
Which patch is that?
> that when we grow the device we're not adding to ->free_chunk_space, so
> subsequent allocations can cause ->free_chunk_space to wrap, which
> causes problems in can_overcommit because we add this to ->total_bytes,
> which causes the counter to wrap and gives us an unexpected ENOSPC.
>
> Fix this by properly updating ->free_chunk_space with the new available
> space in btrfs_grow_device.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
> fs/btrfs/volumes.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 907ea775f4e4..1aedd15d1db7 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -2932,6 +2932,7 @@ int btrfs_grow_device(struct btrfs_trans_handle *trans,
> btrfs_set_super_total_bytes(super_copy,
> round_down(old_total + diff, fs_info->sectorsize));
> device->fs_devices->total_rw_bytes += diff;
> + atomic64_add(diff, &fs_info->free_chunk_space);
>
> btrfs_device_set_total_bytes(device, new_size);
> btrfs_device_set_disk_total_bytes(device, new_size);
> --
> 2.41.0
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes
2023-09-27 17:46 [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes Josef Bacik
` (2 preceding siblings ...)
2023-09-27 17:47 ` [PATCH v2 3/3] btrfs: adjust overcommit logic when very close to full Josef Bacik
@ 2023-10-02 13:09 ` David Sterba
3 siblings, 0 replies; 8+ messages in thread
From: David Sterba @ 2023-10-02 13:09 UTC (permalink / raw)
To: Josef Bacik; +Cc: linux-btrfs, kernel-team
On Wed, Sep 27, 2023 at 01:46:58PM -0400, Josef Bacik wrote:
> Hello,
>
> This is an update to my original patch
>
> https://lore.kernel.org/linux-btrfs/b97e47ce0ce1d41d221878de7d6090b90aa7a597.1695065233.git.josef@toxicpanda.com/
>
> This was causing failures with btrfs/156 and btrfs/177. The btrfs/156 problem
> was caused by the fact that I was just subtracting 1g from the available space.
> I missed a spot where we also limit the chunk size to 10% of the file system, so
> I've updated the calculation to be exactly what we choose from the chunk
> allocator, and this resovled the btrfs/156 failure.
>
> The btrfs/177 failure was actually due to a underflow in ->free_chunk_space
> where we weren't adding the new space we got from btrfs_grow_device. I also
> noticed a slight issue with how we remove space from ->free_chunk_space in
> btrfs_srhink_device so I fixed that as well.
>
> With these 3 patches we're no longer seeing the above failures and the original
> test case is also fixed. Thanks,
>
> Josef
>
> Josef Bacik (3):
> btrfs: fix ->free_chunk_space math in btrfs_shrink_device
> btrfs: increase ->free_chunk_space in btrfs_grow_device
> btrfs: adjust overcommit logic when very close to full
Added to misc-next, thanks.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device
2023-09-29 16:30 ` David Sterba
@ 2023-10-02 20:44 ` Josef Bacik
2023-10-03 18:47 ` David Sterba
0 siblings, 1 reply; 8+ messages in thread
From: Josef Bacik @ 2023-10-02 20:44 UTC (permalink / raw)
To: David Sterba; +Cc: linux-btrfs, kernel-team
On Fri, Sep 29, 2023 at 06:30:35PM +0200, David Sterba wrote:
> On Wed, Sep 27, 2023 at 01:47:00PM -0400, Josef Bacik wrote:
> > My overcommit patch exposed a bug with btrfs/177. The problem here is
>
> Which patch is that?
>
Sorry the V1 version of this series. Thanks,
Josef
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device
2023-10-02 20:44 ` Josef Bacik
@ 2023-10-03 18:47 ` David Sterba
0 siblings, 0 replies; 8+ messages in thread
From: David Sterba @ 2023-10-03 18:47 UTC (permalink / raw)
To: Josef Bacik; +Cc: David Sterba, linux-btrfs, kernel-team
On Mon, Oct 02, 2023 at 04:44:18PM -0400, Josef Bacik wrote:
> On Fri, Sep 29, 2023 at 06:30:35PM +0200, David Sterba wrote:
> > On Wed, Sep 27, 2023 at 01:47:00PM -0400, Josef Bacik wrote:
> > > My overcommit patch exposed a bug with btrfs/177. The problem here is
> >
> > Which patch is that?
> >
>
> Sorry the V1 version of this series. Thanks,
Ok thanks, I've added a link to the changelog.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2023-10-03 18:53 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-27 17:46 [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes Josef Bacik
2023-09-27 17:46 ` [PATCH v2 1/3] btrfs: fix ->free_chunk_space math in btrfs_shrink_device Josef Bacik
2023-09-27 17:47 ` [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device Josef Bacik
2023-09-29 16:30 ` David Sterba
2023-10-02 20:44 ` Josef Bacik
2023-10-03 18:47 ` David Sterba
2023-09-27 17:47 ` [PATCH v2 3/3] btrfs: adjust overcommit logic when very close to full Josef Bacik
2023-10-02 13:09 ` [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes David Sterba
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.