[PATCH] ext4: fix race condition when loading block or inode bitmaps

public inbox for linux-ext4@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH] ext4: fix race condition when loading block or inode bitmaps
@ 2011-11-25  1:08 Theodore Ts'o
  2011-11-25  2:34 ` Tao Ma
  2011-11-25  3:41 ` Yongqiang Yang
  0 siblings, 2 replies; 4+ messages in thread
From: Theodore Ts'o @ 2011-11-25  1:08 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: Theodore Ts'o, stable

We use an separate flag in buffer head to determine whether the bitmap
has been valid.  This is distinct from it being uptodate, due to the
uninit_bg feature.  More details about the rationale for this flag can
be found in commit 2ccb5fb9f1.  We set this bitmap_uptodate bit before
issuing the read request, so if another CPU attempts to load the same
block or inode bitmap, since ext4_read_{block,inode}_bitmap() checks
the bitmap_uptodate flag without locking the buffer head, hilarity
ensues.

This result of this bug is that occasionally a block or inode gets
allocated twice, which gets noticed when the second user of the block
gets deleted, or when an directory suddenly becomes a regular file or
a symlink.  I'm *really* surprised this doesn't happen more often; but
in actual practice the fact that we tend to search for a zero bit in
the bitmap without taking a lock, and then taking the block group lock
and double checking to see if we actually got the allocation tends to
protect us.

This bug was introduced in commit 2ccb5fb9f1, which dates back to
January 2009 and 2.6.29.  So this bug has been around for a *long*
time.  (We've seen it for over a year, but rarely enough that it we
could never find a repro case so we could study it in controlled
circumstances.)

Google-Bug-Id: 2828254
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
---
 fs/ext4/balloc.c |   12 ++++++------
 fs/ext4/ialloc.c |   12 ++++++------
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
index 12ccacd..4501aab 100644
--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -372,7 +372,7 @@ ext4_read_block_bitmap(struct super_block *sb, ext4_group_t block_group)
 	ext4_unlock_group(sb, block_group);
 	if (buffer_uptodate(bh)) {
 		/*
-		 * if not uninit if bh is uptodate,
+		 * if not uninit && bh is uptodate,
 		 * bitmap is also uptodate
 		 */
 		set_bitmap_uptodate(bh);
@@ -380,13 +380,12 @@ ext4_read_block_bitmap(struct super_block *sb, ext4_group_t block_group)
 		return bh;
 	}
 	/*
-	 * submit the buffer_head for read. We can
-	 * safely mark the bitmap as uptodate now.
-	 * We do it here so the bitmap uptodate bit
-	 * get set with buffer lock held.
+	 * submit the buffer_head for read.  It's important that we
+	 * *not* mark the bitmap up to date until the read is
+	 * completed, since we check bitmap_update() above without
+	 * locking the buffer for speed reasons.
 	 */
 	trace_ext4_read_block_bitmap_load(sb, block_group);
-	set_bitmap_uptodate(bh);
 	if (bh_submit_read(bh) < 0) {
 		put_bh(bh);
 		ext4_error(sb, "Cannot read block bitmap - "
@@ -394,6 +393,7 @@ ext4_read_block_bitmap(struct super_block *sb, ext4_group_t block_group)
 			    block_group, bitmap_blk);
 		return NULL;
 	}
+	set_bitmap_uptodate(bh);
 	ext4_valid_block_bitmap(sb, desc, block_group, bh);
 	/*
 	 * file system mounted not to panic on error,
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index 00beb4f..6fbae6d 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -139,7 +139,7 @@ ext4_read_inode_bitmap(struct super_block *sb, ext4_group_t block_group)
 
 	if (buffer_uptodate(bh)) {
 		/*
-		 * if not uninit if bh is uptodate,
+		 * if not uninit && bh is uptodate,
 		 * bitmap is also uptodate
 		 */
 		set_bitmap_uptodate(bh);
@@ -147,13 +147,12 @@ ext4_read_inode_bitmap(struct super_block *sb, ext4_group_t block_group)
 		return bh;
 	}
 	/*
-	 * submit the buffer_head for read. We can
-	 * safely mark the bitmap as uptodate now.
-	 * We do it here so the bitmap uptodate bit
-	 * get set with buffer lock held.
+	 * submit the buffer_head for read.  It's important that we
+	 * *not* mark the bitmap up to date until the read is
+	 * completed, since we check bitmap_update() above without
+	 * locking the buffer for speed reasons.
 	 */
 	trace_ext4_load_inode_bitmap(sb, block_group);
-	set_bitmap_uptodate(bh);
 	if (bh_submit_read(bh) < 0) {
 		put_bh(bh);
 		ext4_error(sb, "Cannot read inode bitmap - "
@@ -161,6 +160,7 @@ ext4_read_inode_bitmap(struct super_block *sb, ext4_group_t block_group)
 			    block_group, bitmap_blk);
 		return NULL;
 	}
+	set_bitmap_uptodate(bh);
 	return bh;
 }
 
-- 
1.7.4.1.22.gec8e1.dirty


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] ext4: fix race condition when loading block or inode bitmaps
  2011-11-25  1:08 [PATCH] ext4: fix race condition when loading block or inode bitmaps Theodore Ts'o
@ 2011-11-25  2:34 ` Tao Ma
  2011-11-25 16:19   ` Ted Ts'o
  2011-11-25  3:41 ` Yongqiang Yang
  1 sibling, 1 reply; 4+ messages in thread
From: Tao Ma @ 2011-11-25  2:34 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Ext4 Developers List, stable

Hi Ted,
On 11/25/2011 09:08 AM, Theodore Ts'o wrote:
> We use an separate flag in buffer head to determine whether the bitmap
> has been valid.  This is distinct from it being uptodate, due to the
> uninit_bg feature.  More details about the rationale for this flag can
> be found in commit 2ccb5fb9f1.  We set this bitmap_uptodate bit before
> issuing the read request, so if another CPU attempts to load the same
> block or inode bitmap, since ext4_read_{block,inode}_bitmap() checks
> the bitmap_uptodate flag without locking the buffer head, hilarity
> ensues.
> 
> This result of this bug is that occasionally a block or inode gets
> allocated twice, which gets noticed when the second user of the block
> gets deleted, or when an directory suddenly becomes a regular file or
> a symlink.  I'm *really* surprised this doesn't happen more often; but
> in actual practice the fact that we tend to search for a zero bit in
> the bitmap without taking a lock, and then taking the block group lock
> and double checking to see if we actually got the allocation tends to
> protect us.
Sorry, but I don't get your meaning here.
In bitmap_uptodate, we check both the flag of BH_uptodate and
BH_BITMAP_UPTODATE. And in your patch below, we just move the set of
bitmap_uptodate after bh_uptodate. So I don't think the above scenario
would ever happen. Could you please explain it in more detail?

Thanks
Tao
> 
> This bug was introduced in commit 2ccb5fb9f1, which dates back to
> January 2009 and 2.6.29.  So this bug has been around for a *long*
> time.  (We've seen it for over a year, but rarely enough that it we
> could never find a repro case so we could study it in controlled
> circumstances.)
> 
> Google-Bug-Id: 2828254
> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> Cc: stable@kernel.org
> ---
>  fs/ext4/balloc.c |   12 ++++++------
>  fs/ext4/ialloc.c |   12 ++++++------
>  2 files changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
> index 12ccacd..4501aab 100644
> --- a/fs/ext4/balloc.c
> +++ b/fs/ext4/balloc.c
> @@ -372,7 +372,7 @@ ext4_read_block_bitmap(struct super_block *sb, ext4_group_t block_group)
>  	ext4_unlock_group(sb, block_group);
>  	if (buffer_uptodate(bh)) {
>  		/*
> -		 * if not uninit if bh is uptodate,
> +		 * if not uninit && bh is uptodate,
>  		 * bitmap is also uptodate
>  		 */
>  		set_bitmap_uptodate(bh);
> @@ -380,13 +380,12 @@ ext4_read_block_bitmap(struct super_block *sb, ext4_group_t block_group)
>  		return bh;
>  	}
>  	/*
> -	 * submit the buffer_head for read. We can
> -	 * safely mark the bitmap as uptodate now.
> -	 * We do it here so the bitmap uptodate bit
> -	 * get set with buffer lock held.
> +	 * submit the buffer_head for read.  It's important that we
> +	 * *not* mark the bitmap up to date until the read is
> +	 * completed, since we check bitmap_update() above without
> +	 * locking the buffer for speed reasons.
>  	 */
>  	trace_ext4_read_block_bitmap_load(sb, block_group);
> -	set_bitmap_uptodate(bh);
>  	if (bh_submit_read(bh) < 0) {
>  		put_bh(bh);
>  		ext4_error(sb, "Cannot read block bitmap - "
> @@ -394,6 +393,7 @@ ext4_read_block_bitmap(struct super_block *sb, ext4_group_t block_group)
>  			    block_group, bitmap_blk);
>  		return NULL;
>  	}
> +	set_bitmap_uptodate(bh);
>  	ext4_valid_block_bitmap(sb, desc, block_group, bh);
>  	/*
>  	 * file system mounted not to panic on error,
> diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
> index 00beb4f..6fbae6d 100644
> --- a/fs/ext4/ialloc.c
> +++ b/fs/ext4/ialloc.c
> @@ -139,7 +139,7 @@ ext4_read_inode_bitmap(struct super_block *sb, ext4_group_t block_group)
>  
>  	if (buffer_uptodate(bh)) {
>  		/*
> -		 * if not uninit if bh is uptodate,
> +		 * if not uninit && bh is uptodate,
>  		 * bitmap is also uptodate
>  		 */
>  		set_bitmap_uptodate(bh);
> @@ -147,13 +147,12 @@ ext4_read_inode_bitmap(struct super_block *sb, ext4_group_t block_group)
>  		return bh;
>  	}
>  	/*
> -	 * submit the buffer_head for read. We can
> -	 * safely mark the bitmap as uptodate now.
> -	 * We do it here so the bitmap uptodate bit
> -	 * get set with buffer lock held.
> +	 * submit the buffer_head for read.  It's important that we
> +	 * *not* mark the bitmap up to date until the read is
> +	 * completed, since we check bitmap_update() above without
> +	 * locking the buffer for speed reasons.
>  	 */
>  	trace_ext4_load_inode_bitmap(sb, block_group);
> -	set_bitmap_uptodate(bh);
>  	if (bh_submit_read(bh) < 0) {
>  		put_bh(bh);
>  		ext4_error(sb, "Cannot read inode bitmap - "
> @@ -161,6 +160,7 @@ ext4_read_inode_bitmap(struct super_block *sb, ext4_group_t block_group)
>  			    block_group, bitmap_blk);
>  		return NULL;
>  	}
> +	set_bitmap_uptodate(bh);
>  	return bh;
>  }
>  


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] ext4: fix race condition when loading block or inode bitmaps
  2011-11-25  1:08 [PATCH] ext4: fix race condition when loading block or inode bitmaps Theodore Ts'o
  2011-11-25  2:34 ` Tao Ma
@ 2011-11-25  3:41 ` Yongqiang Yang
  1 sibling, 0 replies; 4+ messages in thread
From: Yongqiang Yang @ 2011-11-25  3:41 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Ext4 Developers List, stable

On Fri, Nov 25, 2011 at 9:08 AM, Theodore Ts'o <tytso@mit.edu> wrote:
> We use an separate flag in buffer head to determine whether the bitmap
> has been valid.  This is distinct from it being uptodate, due to the
> uninit_bg feature.  More details about the rationale for this flag can
> be found in commit 2ccb5fb9f1.  We set this bitmap_uptodate bit before
> issuing the read request, so if another CPU attempts to load the same
> block or inode bitmap, since ext4_read_{block,inode}_bitmap() checks
> the bitmap_uptodate flag without locking the buffer head, hilarity
> ensues.
>
> This result of this bug is that occasionally a block or inode gets
> allocated twice, which gets noticed when the second user of the block
> gets deleted, or when an directory suddenly becomes a regular file or
> a symlink.  I'm *really* surprised this doesn't happen more often; but
> in actual practice the fact that we tend to search for a zero bit in
> the bitmap without taking a lock, and then taking the block group lock
> and double checking to see if we actually got the allocation tends to
> protect us.
It is true for inode bitmap,  but block bitmap is another story,
blocks are allocated from buddy allocator and mb_load_buddy does not
call block_read_bitmap at all.   block_read_bitmap is called in
mark_space_used, free_blocks and free_inode_pa.   So I am guessing if
mb_load_buddy calls block_read_bitmap, the bug will reproduced easily.


BTW:  It seems that we should factor code reading bitmaps.   Now there
is one function for each every bitmap.  and we can pass
read_block/inode_bitmap a flag indicates sync or async read and a flag
indicates which kind of bitmap.   Thus mb_load_buddy can use
read_bimap as well and the code will be much more maintainable.

ext4-snapshot has exclude bitmap, so if we read all bitmaps via only
one function, it would be much better.

Any opinion?

Yongqiang.


>
> This bug was introduced in commit 2ccb5fb9f1, which dates back to
> January 2009 and 2.6.29.  So this bug has been around for a *long*
> time.  (We've seen it for over a year, but rarely enough that it we
> could never find a repro case so we could study it in controlled
> circumstances.)
>
> Google-Bug-Id: 2828254
> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> Cc: stable@kernel.org
> ---
>  fs/ext4/balloc.c |   12 ++++++------
>  fs/ext4/ialloc.c |   12 ++++++------
>  2 files changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
> index 12ccacd..4501aab 100644
> --- a/fs/ext4/balloc.c
> +++ b/fs/ext4/balloc.c
> @@ -372,7 +372,7 @@ ext4_read_block_bitmap(struct super_block *sb, ext4_group_t block_group)
>        ext4_unlock_group(sb, block_group);
>        if (buffer_uptodate(bh)) {
>                /*
> -                * if not uninit if bh is uptodate,
> +                * if not uninit && bh is uptodate,
>                 * bitmap is also uptodate
>                 */
>                set_bitmap_uptodate(bh);
> @@ -380,13 +380,12 @@ ext4_read_block_bitmap(struct super_block *sb, ext4_group_t block_group)
>                return bh;
>        }
>        /*
> -        * submit the buffer_head for read. We can
> -        * safely mark the bitmap as uptodate now.
> -        * We do it here so the bitmap uptodate bit
> -        * get set with buffer lock held.
> +        * submit the buffer_head for read.  It's important that we
> +        * *not* mark the bitmap up to date until the read is
> +        * completed, since we check bitmap_update() above without
> +        * locking the buffer for speed reasons.
>         */
>        trace_ext4_read_block_bitmap_load(sb, block_group);
> -       set_bitmap_uptodate(bh);
>        if (bh_submit_read(bh) < 0) {
>                put_bh(bh);
>                ext4_error(sb, "Cannot read block bitmap - "
> @@ -394,6 +393,7 @@ ext4_read_block_bitmap(struct super_block *sb, ext4_group_t block_group)
>                            block_group, bitmap_blk);
>                return NULL;
>        }
> +       set_bitmap_uptodate(bh);
>        ext4_valid_block_bitmap(sb, desc, block_group, bh);
>        /*
>         * file system mounted not to panic on error,
> diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
> index 00beb4f..6fbae6d 100644
> --- a/fs/ext4/ialloc.c
> +++ b/fs/ext4/ialloc.c
> @@ -139,7 +139,7 @@ ext4_read_inode_bitmap(struct super_block *sb, ext4_group_t block_group)
>
>        if (buffer_uptodate(bh)) {
>                /*
> -                * if not uninit if bh is uptodate,
> +                * if not uninit && bh is uptodate,
>                 * bitmap is also uptodate
>                 */
>                set_bitmap_uptodate(bh);
> @@ -147,13 +147,12 @@ ext4_read_inode_bitmap(struct super_block *sb, ext4_group_t block_group)
>                return bh;
>        }
>        /*
> -        * submit the buffer_head for read. We can
> -        * safely mark the bitmap as uptodate now.
> -        * We do it here so the bitmap uptodate bit
> -        * get set with buffer lock held.
> +        * submit the buffer_head for read.  It's important that we
> +        * *not* mark the bitmap up to date until the read is
> +        * completed, since we check bitmap_update() above without
> +        * locking the buffer for speed reasons.
>         */
>        trace_ext4_load_inode_bitmap(sb, block_group);
> -       set_bitmap_uptodate(bh);
>        if (bh_submit_read(bh) < 0) {
>                put_bh(bh);
>                ext4_error(sb, "Cannot read inode bitmap - "
> @@ -161,6 +160,7 @@ ext4_read_inode_bitmap(struct super_block *sb, ext4_group_t block_group)
>                            block_group, bitmap_blk);
>                return NULL;
>        }
> +       set_bitmap_uptodate(bh);
>        return bh;
>  }
>
> --
> 1.7.4.1.22.gec8e1.dirty
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>



-- 
Best Wishes
Yongqiang Yang
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] ext4: fix race condition when loading block or inode bitmaps
  2011-11-25  2:34 ` Tao Ma
@ 2011-11-25 16:19   ` Ted Ts'o
  0 siblings, 0 replies; 4+ messages in thread
From: Ted Ts'o @ 2011-11-25 16:19 UTC (permalink / raw)
  To: Tao Ma; +Cc: Ext4 Developers List, stable

On Fri, Nov 25, 2011 at 10:34:54AM +0800, Tao Ma wrote:
> Sorry, but I don't get your meaning here.
> In bitmap_uptodate, we check both the flag of BH_uptodate and
> BH_BITMAP_UPTODATE. And in your patch below, we just move the set of
> bitmap_uptodate after bh_uptodate. So I don't think the above scenario
> would ever happen. Could you please explain it in more detail?

Yes, you're right.  I didn't realize bitmap_uptodate() checked both
bits.  I had assumed it used the same convention as the other
functions that set/git bits in bh_state.

Rats, and I thought I had finally nailed the bug...

						- Ted

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2011-11-25 16:19 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-11-25  1:08 [PATCH] ext4: fix race condition when loading block or inode bitmaps Theodore Ts'o
2011-11-25  2:34 ` Tao Ma
2011-11-25 16:19   ` Ted Ts'o
2011-11-25  3:41 ` Yongqiang Yang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox