Linux Btrfs filesystem development
 help / color / mirror / Atom feed
* [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes
@ 2023-09-27 17:46 Josef Bacik
  2023-09-27 17:46 ` [PATCH v2 1/3] btrfs: fix ->free_chunk_space math in btrfs_shrink_device Josef Bacik
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Josef Bacik @ 2023-09-27 17:46 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

Hello,

This is an update to my original patch

https://lore.kernel.org/linux-btrfs/b97e47ce0ce1d41d221878de7d6090b90aa7a597.1695065233.git.josef@toxicpanda.com/

This was causing failures with btrfs/156 and btrfs/177.  The btrfs/156 problem
was caused by the fact that I was just subtracting 1g from the available space.
I missed a spot where we also limit the chunk size to 10% of the file system, so
I've updated the calculation to be exactly what we choose from the chunk
allocator, and this resovled the btrfs/156 failure.

The btrfs/177 failure was actually due to a underflow in ->free_chunk_space
where we weren't adding the new space we got from btrfs_grow_device.  I also
noticed a slight issue with how we remove space from ->free_chunk_space in
btrfs_srhink_device so I fixed that as well.

With these 3 patches we're no longer seeing the above failures and the original
test case is also fixed.  Thanks,

Josef

Josef Bacik (3):
  btrfs: fix ->free_chunk_space math in btrfs_shrink_device
  btrfs: increase ->free_chunk_space in btrfs_grow_device
  btrfs: adjust overcommit logic when very close to full

 fs/btrfs/space-info.c | 32 ++++++++++++++++++++++++++++++++
 fs/btrfs/volumes.c    | 21 ++++++++++++++++++---
 2 files changed, 50 insertions(+), 3 deletions(-)

-- 
2.41.0


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/3] btrfs: fix ->free_chunk_space math in btrfs_shrink_device
  2023-09-27 17:46 [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes Josef Bacik
@ 2023-09-27 17:46 ` Josef Bacik
  2023-09-27 17:47 ` [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device Josef Bacik
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Josef Bacik @ 2023-09-27 17:46 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

There's two bugs in how we adjust ->free_chunk_space in
btrfs_shrink_device.  First we're removing the entire diff between
new_size and old_size from ->free_chunk_space.  This only works if we're
reducing the free area, which we could potentially not be.  So adjust
the math to only subtract the diff in the free space from
->free_chunk_space.

Additionally in the error case we're unconditionally adding the diff
back into ->free_chunk_space, which we need to only do if this device is
writeable.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/volumes.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 733842136163..907ea775f4e4 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4841,6 +4841,7 @@ int btrfs_shrink_device(struct btrfs_device *device, u64 new_size)
 	u64 old_size = btrfs_device_get_total_bytes(device);
 	u64 diff;
 	u64 start;
+	u64 free_diff = 0;
 
 	new_size = round_down(new_size, fs_info->sectorsize);
 	start = new_size;
@@ -4866,7 +4867,19 @@ int btrfs_shrink_device(struct btrfs_device *device, u64 new_size)
 	btrfs_device_set_total_bytes(device, new_size);
 	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
 		device->fs_devices->total_rw_bytes -= diff;
-		atomic64_sub(diff, &fs_info->free_chunk_space);
+
+		/*
+		 * The new free_chunk_space is new_size - used, so we have to
+		 * subtract the delta of the old free_chunk_space which included
+		 * old_size - used.  If used > new_size then just subtract this
+		 * entire device's free space.
+		 */
+		if (device->bytes_used < new_size)
+			free_diff = (old_size - device->bytes_used) -
+				(new_size - device->bytes_used);
+		else
+			free_diff = old_size - device->bytes_used;
+		atomic64_sub(free_diff, &fs_info->free_chunk_space);
 	}
 
 	/*
@@ -5001,9 +5014,10 @@ int btrfs_shrink_device(struct btrfs_device *device, u64 new_size)
 	if (ret) {
 		mutex_lock(&fs_info->chunk_mutex);
 		btrfs_device_set_total_bytes(device, old_size);
-		if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state))
+		if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
 			device->fs_devices->total_rw_bytes += diff;
-		atomic64_add(diff, &fs_info->free_chunk_space);
+			atomic64_add(free_diff, &fs_info->free_chunk_space);
+		}
 		mutex_unlock(&fs_info->chunk_mutex);
 	}
 	return ret;
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device
  2023-09-27 17:46 [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes Josef Bacik
  2023-09-27 17:46 ` [PATCH v2 1/3] btrfs: fix ->free_chunk_space math in btrfs_shrink_device Josef Bacik
@ 2023-09-27 17:47 ` Josef Bacik
  2023-09-29 16:30   ` David Sterba
  2023-09-27 17:47 ` [PATCH v2 3/3] btrfs: adjust overcommit logic when very close to full Josef Bacik
  2023-10-02 13:09 ` [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes David Sterba
  3 siblings, 1 reply; 8+ messages in thread
From: Josef Bacik @ 2023-09-27 17:47 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

My overcommit patch exposed a bug with btrfs/177.  The problem here is
that when we grow the device we're not adding to ->free_chunk_space, so
subsequent allocations can cause ->free_chunk_space to wrap, which
causes problems in can_overcommit because we add this to ->total_bytes,
which causes the counter to wrap and gives us an unexpected ENOSPC.

Fix this by properly updating ->free_chunk_space with the new available
space in btrfs_grow_device.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/volumes.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 907ea775f4e4..1aedd15d1db7 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2932,6 +2932,7 @@ int btrfs_grow_device(struct btrfs_trans_handle *trans,
 	btrfs_set_super_total_bytes(super_copy,
 			round_down(old_total + diff, fs_info->sectorsize));
 	device->fs_devices->total_rw_bytes += diff;
+	atomic64_add(diff, &fs_info->free_chunk_space);
 
 	btrfs_device_set_total_bytes(device, new_size);
 	btrfs_device_set_disk_total_bytes(device, new_size);
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 3/3] btrfs: adjust overcommit logic when very close to full
  2023-09-27 17:46 [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes Josef Bacik
  2023-09-27 17:46 ` [PATCH v2 1/3] btrfs: fix ->free_chunk_space math in btrfs_shrink_device Josef Bacik
  2023-09-27 17:47 ` [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device Josef Bacik
@ 2023-09-27 17:47 ` Josef Bacik
  2023-10-02 13:09 ` [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes David Sterba
  3 siblings, 0 replies; 8+ messages in thread
From: Josef Bacik @ 2023-09-27 17:47 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

A user reported some unpleasant behavior with very small file systems.
The reproducer is this

mkfs.btrfs -f -m single -b 8g /dev/vdb
mount /dev/vdb /mnt/test
dd if=/dev/zero of=/mnt/test/testfile bs=512M count=20

This will result in usage that looks like this

Overall:
    Device size:                   8.00GiB
    Device allocated:              8.00GiB
    Device unallocated:            1.00MiB
    Device missing:                  0.00B
    Device slack:                  2.00GiB
    Used:                          5.47GiB
    Free (estimated):              2.52GiB      (min: 2.52GiB)
    Free (statfs, df):               0.00B
    Data ratio:                       1.00
    Metadata ratio:                   1.00
    Global reserve:                5.50MiB      (used: 0.00B)
    Multiple profiles:                  no

Data,single: Size:7.99GiB, Used:5.46GiB (68.41%)
   /dev/vdb        7.99GiB

Metadata,single: Size:8.00MiB, Used:5.77MiB (72.07%)
   /dev/vdb        8.00MiB

System,single: Size:4.00MiB, Used:16.00KiB (0.39%)
   /dev/vdb        4.00MiB

Unallocated:
   /dev/vdb        1.00MiB

As you can see we've gotten ourselves quite full with metadata, with all
of the disk being allocated for data.

On smaller file systems there's not a lot of time before we get full, so
our overcommit behavior bites us here.  Generally speaking data
reservations result in chunk allocations as we assume reservation ==
actual use for data.  This means at any point we could end up with a
chunk allocation for data, and if we're very close to full we could do
this before we have a chance to figure out that we need another metadata
chunk.

Address this by adjusting the overcommit logic.  Simply put we need to
take away 1 chunk from the available chunk space in case of a data
reservation.  This will allow us to stop overcommitting before we
potentially lose this space to a data allocation.  With this fix in
place we properly allocate a metadata chunk before we're completely
full, allowing for enough slack space in metadata.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/space-info.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index d7e8cd4f140c..f1cc3ea4553c 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -345,8 +345,10 @@ static u64 calc_available_free_space(struct btrfs_fs_info *fs_info,
 			  struct btrfs_space_info *space_info,
 			  enum btrfs_reserve_flush_enum flush)
 {
+	struct btrfs_space_info *data_sinfo;
 	u64 profile;
 	u64 avail;
+	u64 data_chunk_size;
 	int factor;
 
 	if (space_info->flags & BTRFS_BLOCK_GROUP_SYSTEM)
@@ -364,6 +366,36 @@ static u64 calc_available_free_space(struct btrfs_fs_info *fs_info,
 	 */
 	factor = btrfs_bg_type_to_factor(profile);
 	avail = div_u64(avail, factor);
+	if (avail == 0)
+		return 0;
+
+	/*
+	 * Calculate the data_chunk_size, space_info->chunk_size is the
+	 * "optimal" chunk size based on the fs size.  However when we actually
+	 * allocate the chunk we will strip this down further, making it no more
+	 * than 10% of the disk or 1G, whichever is smaller.
+	 */
+	data_sinfo = btrfs_find_space_info(fs_info, BTRFS_BLOCK_GROUP_DATA);
+	data_chunk_size = min(data_sinfo->chunk_size,
+			      mult_perc(fs_info->fs_devices->total_rw_bytes, 10));
+	data_chunk_size = min_t(u64, data_chunk_size, SZ_1G);
+
+	/*
+	 * Since data allocations immediately use block groups as part of the
+	 * reservation, because we assume that data reservations will == actual
+	 * usage, we could potentially overcommit and then immediately have that
+	 * available space used by a data allocation, which could put us in a
+	 * bind when we get close to filling the file system.
+	 *
+	 * To handle this simply remove the data_chunk_size from the available
+	 * space.  If we are relatively empty this won't affect our ability to
+	 * overcommit much, and if we're very close to full it'll keep us from
+	 * getting into a position where we've given ourselves very little
+	 * metadata wiggle room.
+	 */
+	if (avail <= data_chunk_size)
+		return 0;
+	avail -= data_chunk_size;
 
 	/*
 	 * If we aren't flushing all things, let us overcommit up to
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device
  2023-09-27 17:47 ` [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device Josef Bacik
@ 2023-09-29 16:30   ` David Sterba
  2023-10-02 20:44     ` Josef Bacik
  0 siblings, 1 reply; 8+ messages in thread
From: David Sterba @ 2023-09-29 16:30 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Wed, Sep 27, 2023 at 01:47:00PM -0400, Josef Bacik wrote:
> My overcommit patch exposed a bug with btrfs/177.  The problem here is

Which patch is that?

> that when we grow the device we're not adding to ->free_chunk_space, so
> subsequent allocations can cause ->free_chunk_space to wrap, which
> causes problems in can_overcommit because we add this to ->total_bytes,
> which causes the counter to wrap and gives us an unexpected ENOSPC.
> 
> Fix this by properly updating ->free_chunk_space with the new available
> space in btrfs_grow_device.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/volumes.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 907ea775f4e4..1aedd15d1db7 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -2932,6 +2932,7 @@ int btrfs_grow_device(struct btrfs_trans_handle *trans,
>  	btrfs_set_super_total_bytes(super_copy,
>  			round_down(old_total + diff, fs_info->sectorsize));
>  	device->fs_devices->total_rw_bytes += diff;
> +	atomic64_add(diff, &fs_info->free_chunk_space);
>  
>  	btrfs_device_set_total_bytes(device, new_size);
>  	btrfs_device_set_disk_total_bytes(device, new_size);
> -- 
> 2.41.0

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes
  2023-09-27 17:46 [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes Josef Bacik
                   ` (2 preceding siblings ...)
  2023-09-27 17:47 ` [PATCH v2 3/3] btrfs: adjust overcommit logic when very close to full Josef Bacik
@ 2023-10-02 13:09 ` David Sterba
  3 siblings, 0 replies; 8+ messages in thread
From: David Sterba @ 2023-10-02 13:09 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Wed, Sep 27, 2023 at 01:46:58PM -0400, Josef Bacik wrote:
> Hello,
> 
> This is an update to my original patch
> 
> https://lore.kernel.org/linux-btrfs/b97e47ce0ce1d41d221878de7d6090b90aa7a597.1695065233.git.josef@toxicpanda.com/
> 
> This was causing failures with btrfs/156 and btrfs/177.  The btrfs/156 problem
> was caused by the fact that I was just subtracting 1g from the available space.
> I missed a spot where we also limit the chunk size to 10% of the file system, so
> I've updated the calculation to be exactly what we choose from the chunk
> allocator, and this resovled the btrfs/156 failure.
> 
> The btrfs/177 failure was actually due to a underflow in ->free_chunk_space
> where we weren't adding the new space we got from btrfs_grow_device.  I also
> noticed a slight issue with how we remove space from ->free_chunk_space in
> btrfs_srhink_device so I fixed that as well.
> 
> With these 3 patches we're no longer seeing the above failures and the original
> test case is also fixed.  Thanks,
> 
> Josef
> 
> Josef Bacik (3):
>   btrfs: fix ->free_chunk_space math in btrfs_shrink_device
>   btrfs: increase ->free_chunk_space in btrfs_grow_device
>   btrfs: adjust overcommit logic when very close to full

Added to misc-next, thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device
  2023-09-29 16:30   ` David Sterba
@ 2023-10-02 20:44     ` Josef Bacik
  2023-10-03 18:47       ` David Sterba
  0 siblings, 1 reply; 8+ messages in thread
From: Josef Bacik @ 2023-10-02 20:44 UTC (permalink / raw)
  To: David Sterba; +Cc: linux-btrfs, kernel-team

On Fri, Sep 29, 2023 at 06:30:35PM +0200, David Sterba wrote:
> On Wed, Sep 27, 2023 at 01:47:00PM -0400, Josef Bacik wrote:
> > My overcommit patch exposed a bug with btrfs/177.  The problem here is
> 
> Which patch is that?
> 

Sorry the V1 version of this series.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device
  2023-10-02 20:44     ` Josef Bacik
@ 2023-10-03 18:47       ` David Sterba
  0 siblings, 0 replies; 8+ messages in thread
From: David Sterba @ 2023-10-03 18:47 UTC (permalink / raw)
  To: Josef Bacik; +Cc: David Sterba, linux-btrfs, kernel-team

On Mon, Oct 02, 2023 at 04:44:18PM -0400, Josef Bacik wrote:
> On Fri, Sep 29, 2023 at 06:30:35PM +0200, David Sterba wrote:
> > On Wed, Sep 27, 2023 at 01:47:00PM -0400, Josef Bacik wrote:
> > > My overcommit patch exposed a bug with btrfs/177.  The problem here is
> > 
> > Which patch is that?
> > 
> 
> Sorry the V1 version of this series.  Thanks,

Ok thanks, I've added a link to the changelog.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-10-03 18:53 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-27 17:46 [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes Josef Bacik
2023-09-27 17:46 ` [PATCH v2 1/3] btrfs: fix ->free_chunk_space math in btrfs_shrink_device Josef Bacik
2023-09-27 17:47 ` [PATCH v2 2/3] btrfs: increase ->free_chunk_space in btrfs_grow_device Josef Bacik
2023-09-29 16:30   ` David Sterba
2023-10-02 20:44     ` Josef Bacik
2023-10-03 18:47       ` David Sterba
2023-09-27 17:47 ` [PATCH v2 3/3] btrfs: adjust overcommit logic when very close to full Josef Bacik
2023-10-02 13:09 ` [PATCH v2 0/3] btrfs: adjust overcommit logic and fixes David Sterba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox