Linux kernel -stable discussions
From: Joseph Qi <joseph.qi@linux.alibaba.com>
To: Heming Zhao <heming.zhao@suse.com>, ocfs2-devel@lists.linux.dev
Cc: linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
	gregkh@linuxfoundation.org, stable@vger.kernel.org
Subject: Re: [PATCH 1/2] ocfs2: Revert "ocfs2: fix the la space leak when unmounting an ocfs2 volume"
Date: Wed, 4 Dec 2024 17:28:20 +0800
Message-ID: <58618b80-f1ab-4b22-a3fc-9d29969615a0@linux.alibaba.com>
In-Reply-To: <79b86b7b-8a65-49b8-aa33-bb73de47ad37@suse.com>



On 12/4/24 2:46 PM, Heming Zhao wrote:
> On 12/4/24 11:47, Joseph Qi wrote:
>>
>>
>> On 12/4/24 11:32 AM, Heming Zhao wrote:
>>> This reverts commit dfe6c5692fb5 ("ocfs2: fix the la space leak when
>>> unmounting an ocfs2 volume").
>>>
>>> In commit dfe6c5692fb5, the commit log stating "This bug has existed
>>> since the initial OCFS2 code." is incorrect. The correct introduction
>>> commit is 30dd3478c3cd ("ocfs2: correctly use ocfs2_find_next_zero_bit()").
>>>
>>
>> Could you please elaborate on how it happens?
>> Also, this seems no different from the new version, so we may submit a
>> standalone revert patch for the backported stable kernels (< 6.10).
> 
> The commit log of patch [2/2] should be revised.
> change: This bug has existed since the initial OCFS2 code.
> to    : This bug was introduced by commit 30dd3478c3cd ("ocfs2: correctly use ocfs2_find_next_zero_bit()")
> 
> ----
> See below for the details of patch [1/2].
> 
> The following is the code as it was before commit 30dd3478c3cd7, with commit dfe6c5692fb525e applied on top.
> 
>    static int ocfs2_sync_local_to_main()
>    {
>        ... ...
>  1     while ((bit_off = ocfs2_find_next_zero_bit(bitmap, left, start))
>  2            != -1) {
>  3         if ((bit_off < left) && (bit_off == start)) {
>  4             count++;
>  5             start++;
>  6             continue;
>  7         }
>  8         if (count) {
>  9             blkno = la_start_blk +
> 10                 ocfs2_clusters_to_blocks(osb->sb,
> 11                                          start - count);
> 12
> 13             trace_ocfs2_sync_local_to_main_free();
> 14
> 15             status = ocfs2_release_clusters(handle,
> 16                                             main_bm_inode,
> 17                                             main_bm_bh, blkno,
> 18                                             count);
> 19             if (status < 0) {
> 20                 mlog_errno(status);
> 21                 goto bail;
> 22             }
> 23         }
> 24         if (bit_off >= left)
> 25             break;
> 26         count = 1;
> 27         start = bit_off + 1;
> 28     }
> 29
> 30     /* clear the contiguous bits until the end boundary */
> 31     if (count) {
> 32         blkno = la_start_blk +
> 33             ocfs2_clusters_to_blocks(osb->sb,
> 34                                      start - count);
> 35
> 36         trace_ocfs2_sync_local_to_main_free();
> 37
> 38         status = ocfs2_release_clusters(handle,
> 39                                         main_bm_inode,
> 40                                         main_bm_bh, blkno,
> 41                                         count);
> 42         if (status < 0)
> 43             mlog_errno(status);
> 44     }
>        ... ...
>    }
> 
> bug flow:
> 1. Initially left is 10000, start is 0, and bit_off is 9000; all bits from 9000 to the end of the bitmap are zero.
> 2. When 'start' reaches 10000, bit_off is also 10000 (the 'left' value), so the condition at line 3 is not met.
> 3. The code runs to line 8 (where 'count' is 1000, covering bits 9000 through 9999) and releases those clusters to main_bm.
> 4. The code runs to line 24, where bit_off == left satisfies "bit_off >= left" and breaks out of the loop. At this point 'count' still holds its old value.
> 5. The code runs to line 31, which releases the same space to main_bm again, for the same gd.
> 
> The kernel is then likely to report an error such as:
> OCFS2: ERROR (device dm-0): ocfs2_block_group_clear_bits: Group descriptor # 349184 has bit count 15872 but claims 19871 are freed. num_bits 7878
> 
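
The bug flow above can be reproduced in user space. What follows is a minimal sketch, not kernel code. The names find_next_zero_bit(), record_release(), struct release_log and demo_trailing_zeros() are hypothetical stand-ins (the first for ocfs2_find_next_zero_bit(), the second for ocfs2_release_clusters()), and the bitmap is shrunk to 16 bits with bits 9 through 15 zero. It assumes the zero-bit search returns the bitmap size when no zero bit is left.

```c
#include <stdio.h>

/* Hypothetical bookkeeping, used only to observe what gets released. */
struct release_log {
    int calls;  /* number of release requests */
    int bits;   /* total number of bits requested for release */
};

/* Stand-in for ocfs2_find_next_zero_bit(): index of the first zero bit
 * at or after 'start', or 'size' when there is none (assumption). */
static int find_next_zero_bit(const unsigned char *bits, int size, int start)
{
    for (int i = start; i < size; i++)
        if (!bits[i])
            return i;
    return size;
}

/* Stand-in for ocfs2_release_clusters(): only records the request. */
static void record_release(struct release_log *log, int first, int count)
{
    log->calls++;
    log->bits += count;
    printf("release bits [%d, %d)\n", first, first + count);
}

/* Loop as it was before 30dd3478c3cd, plus the tail release that
 * dfe6c5692fb5 added after the loop. */
static struct release_log sync_old_loop_with_tail(const unsigned char *bits,
                                                  int left)
{
    struct release_log log = { 0, 0 };
    int start = 0, count = 0, bit_off;

    while ((bit_off = find_next_zero_bit(bits, left, start)) != -1) {
        if (bit_off < left && bit_off == start) {
            count++;
            start++;
            continue;
        }
        if (count)
            record_release(&log, start - count, count);
        if (bit_off >= left)
            break;              /* count keeps its old value here */
        count = 1;
        start = bit_off + 1;
    }

    /* tail release: runs a second time for the trailing zero run */
    if (count)
        record_release(&log, start - count, count);

    return log;
}

/* Bits 0..8 are set, bits 9..15 are zero (7 zero bits in total). */
static struct release_log demo_trailing_zeros(void)
{
    unsigned char bits[16];

    for (int i = 0; i < 16; i++)
        bits[i] = (i < 9);
    return sync_old_loop_with_tail(bits, 16);
}
```

Calling demo_trailing_zeros() from a main() reports the range [9, 16) twice: two release requests covering 14 bits, although only 7 bits are zero.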

Okay, IIUC, we have to:
1. Revert commit dfe6c5692fb5 (in the stable kernels as well).
2. Fix 30dd3478c3cd in the following way:

diff --git a/fs/ocfs2/localalloc.c b/fs/ocfs2/localalloc.c
index 5df34561c551..f0feadac2ef1 100644
--- a/fs/ocfs2/localalloc.c
+++ b/fs/ocfs2/localalloc.c
@@ -971,9 +971,9 @@ static int ocfs2_sync_local_to_main(struct ocfs2_super *osb,
 	start = count = 0;
 	left = le32_to_cpu(alloc->id1.bitmap1.i_total);
 
-	while ((bit_off = ocfs2_find_next_zero_bit(bitmap, left, start)) <
+	while ((bit_off = ocfs2_find_next_zero_bit(bitmap, left, start)) <=
 	       left) {
-		if (bit_off == start) {
+		if ((bit_off < left) && (bit_off == start)) {
 			count++;
 			start++;
 			continue;
@@ -997,7 +997,8 @@ static int ocfs2_sync_local_to_main(struct ocfs2_super *osb,
 				goto bail;
 			}
 		}
-
+		if (bit_off >= left)
+			break;
 		count = 1;
 		start = bit_off + 1;
 	}
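
As a rough check of the loop shape proposed in the diff above, here is a user-space sketch. It is not kernel code and assumes the tail release from dfe6c5692fb5 has been reverted. The names next_zero_bit(), log_release(), struct run_log and demo_two_runs() are hypothetical stand-ins (the first for ocfs2_find_next_zero_bit(), the second for ocfs2_release_clusters()). It also assumes the zero-bit search returns the bitmap size when no zero bit is left. The 16-bit test bitmap has one zero run in the middle (bits 2 and 3) and one at the end (bits 9 through 15).

```c
#include <stdio.h>

/* Hypothetical bookkeeping, used only to observe what gets released. */
struct run_log {
    int calls;  /* number of release requests */
    int bits;   /* total number of bits requested for release */
};

/* Stand-in for ocfs2_find_next_zero_bit(): index of the first zero bit
 * at or after 'start', or 'size' when there is none (assumption). */
static int next_zero_bit(const unsigned char *bits, int size, int start)
{
    for (int i = start; i < size; i++)
        if (!bits[i])
            return i;
    return size;
}

/* Stand-in for ocfs2_release_clusters(): only records the request. */
static void log_release(struct run_log *log, int first, int count)
{
    log->calls++;
    log->bits += count;
    printf("release bits [%d, %d)\n", first, first + count);
}

/* Loop shape from the proposed diff, with no release after the loop. */
static struct run_log sync_proposed_loop(const unsigned char *bits, int left)
{
    struct run_log log = { 0, 0 };
    int start = 0, count = 0, bit_off;

    while ((bit_off = next_zero_bit(bits, left, start)) <= left) {
        if (bit_off < left && bit_off == start) {
            count++;            /* still inside a run of zero bits */
            start++;
            continue;
        }
        if (count)              /* a run just ended: release it once */
            log_release(&log, start - count, count);
        if (bit_off >= left)    /* no more zero bits: leave the loop */
            break;
        count = 1;              /* bit_off starts a new run */
        start = bit_off + 1;
    }
    return log;
}

/* Zero runs at bits 2..3 and 9..15 (9 zero bits in total). */
static struct run_log demo_two_runs(void)
{
    unsigned char bits[16];

    for (int i = 0; i < 16; i++)
        bits[i] = 1;
    bits[2] = bits[3] = 0;
    for (int i = 9; i < 16; i++)
        bits[i] = 0;
    return sync_proposed_loop(bits, 16);
}
```

In this sketch each run is requested exactly once: [2, 4) when the next run is found, and the trailing run [9, 16) on the final pass, when the search returns 'left' and the loop breaks.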

Thanks,
Joseph



Thread overview: 8+ messages
     [not found] <20241204033243.8273-1-heming.zhao@suse.com>
2024-12-04  3:32 ` [PATCH 1/2] ocfs2: Revert "ocfs2: fix the la space leak when unmounting an ocfs2 volume" Heming Zhao
2024-12-04  3:47   ` Joseph Qi
2024-12-04  6:46     ` Heming Zhao
2024-12-04  9:28       ` Joseph Qi [this message]
2024-12-04 11:11         ` Heming Zhao
2024-12-04 11:34         ` Heming Zhao
2024-12-04 12:09           ` Joseph Qi
2024-12-12  8:18       ` Greg KH
