Linux bcache driver list
 help / color / mirror / Atom feed
From: colyli <colyli@suse.de>
To: 邹明哲 <mingzhe.zou@easystack.cn>
Cc: linux-bcache <linux-bcache@vger.kernel.org>,
	zoumingzhe <zoumingzhe@qq.com>
Subject: Re: [PATCH v2 2/3] bcache: fix io error during cache read race
Date: Sun, 22 Dec 2024 00:17:06 +0800	[thread overview]
Message-ID: <4da838e985ec3f7263a4ad4a9b307d15@suse.de> (raw)
In-Reply-To: <AMcAQwBHLpIksKhhq8fkdqro.3.1734672038253.Hmail.mingzhe.zou@easystack.cn>

在 2024-12-20 13:20,邹明哲 写道:
> Hi, Coly:
> 
> Our users have reported this issue to us in their generation 
> environment!
> 
> Please review these patches and provide feedback.
> 
> Thank you very much.

Hi Mingzhe,

Yes it is planed for next week, I will start to look at this series.

BTW, I don't see change log from the v1 to v2 series. Can I assume the 
v2 series fix warning reported by kernel test robot?

Thanks.

Coly Li

> 
> mingzhe
> 
> Original:
> From:mingzhe.zou<mingzhe.zou@easystack.cn>
> Date:2024-11-19 15:40:30(中国 (GMT+08:00))
> To:colyli<colyli@suse.de>
> Cc:linux-bcache<linux-bcache@vger.kernel.org> ,
> dongsheng.yang<dongsheng.yang@easystack.cn> ,
> zoumingzhe<zoumingzhe@qq.com>
> Subject:[PATCH v2 2/3] bcache: fix io error during cache read race
> From: Mingzhe Zou <mingzhe.zou@easystack.cn>
> 
> In our production environment, bcache returned IO_ERROR(errno=-5).
> These errors always happen during 1M read IO under high pressure
> and without any message log. When the error occurred, we stopped
> all reading and writing and used 1M read IO to read the entire disk
> without any errors. Later we found that cache_read_races of cache_set
> is non-zero.
> 
> If a large (1M) read bio is split into two or more bios, when one bio
> reads dirty data, s-&gt;read_dirty_data will be set to true and remain.
> If the bucket was reused while our subsequent read bio was in flight,
> the read will be unrecoverable(cannot read data from backing).
> 
> This patch increases the count for bucket-&gt;pin to prevent the bucket
> from being reclaimed and reused.
> 
> Signed-off-by: Mingzhe Zou <mingzhe.zou@easystack.cn>
> ---
>  drivers/md/bcache/request.c | 39 ++++++++++++++++++++++++-------------
>  1 file changed, 26 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
> index af345dc6fde1..6c41957138e5 100644
> --- a/drivers/md/bcache/request.c
> +++ b/drivers/md/bcache/request.c
> @@ -502,12 +502,8 @@ static void bch_cache_read_endio(struct bio *bio)
>  	struct closure *cl = bio-&gt;bi_private;
>  	struct search *s = container_of(cl, struct search, cl);
> 
> -	/*
> -	 * If the bucket was reused while our bio was in flight, we might 
> have
> -	 * read the wrong data. Set s-&gt;error but not error so it doesn't 
> get
> -	 * counted against the cache device, but we'll still reread the data
> -	 * from the backing device.
> -	 */
> +	BUG_ON(ptr_stale(s-&gt;iop.c, &amp;b-&gt;key, 0)); // bucket should
> not be reused
> +	atomic_dec(&amp;PTR_BUCKET(s-&gt;iop.c, &amp;b-&gt;key, 0)-&gt;pin);
> 
>  	if (bio-&gt;bi_status)
>  		s-&gt;iop.status = bio-&gt;bi_status;
> @@ -520,6 +516,8 @@ static void bch_cache_read_endio(struct bio *bio)
>  	bch_bbio_endio(s-&gt;iop.c, bio, bio-&gt;bi_status, "reading from 
> cache");
>  }
> 
> +static void backing_request_endio(struct bio *bio);
> +
>  /*
>   * Read from a single key, handling the initial cache miss if the key 
> starts in
>   * the middle of the bio
> @@ -529,7 +527,6 @@ static int cache_lookup_fn(struct btree_op *op,
> struct btree *b, struct bkey *k)
>  	struct search *s = container_of(op, struct search, op);
>  	struct bio *n, *bio = &amp;s-&gt;bio.bio;
>  	struct bkey *bio_key;
> -	unsigned int ptr;
> 
>  	if (bkey_cmp(k, &amp;KEY(s-&gt;iop.inode, bio-&gt;bi_iter.bi_sector,
> 0)) &lt;= 0)
>  		return MAP_CONTINUE;
> @@ -553,20 +550,36 @@ static int cache_lookup_fn(struct btree_op *op,
> struct btree *b, struct bkey *k)
>  	if (!KEY_SIZE(k))
>  		return MAP_CONTINUE;
> 
> -	/* XXX: figure out best pointer - for multiple cache devices */
> -	ptr = 0;
> +	/*
> +	 * If the bucket was reused while our bio was in flight, we might 
> have
> +	 * read the wrong data. Set s-&gt;cache_read_races and reread the 
> data
> +	 * from the backing device.
> +	 */
> +	spin_lock(&amp;PTR_BUCKET(b-&gt;c, k, 0)-&gt;lock);
> +	if (ptr_stale(s-&gt;iop.c, k, 0)) {
> +		spin_unlock(&amp;PTR_BUCKET(b-&gt;c, k, 0)-&gt;lock);
> +		atomic_long_inc(&amp;s-&gt;iop.c-&gt;cache_read_races);
> +		pr_warn("%pU cache read race count: %lu", 
> s-&gt;iop.c-&gt;sb.set_uuid,
> +			atomic_long_read(&amp;s-&gt;iop.c-&gt;cache_read_races));
> 
> -	PTR_BUCKET(b-&gt;c, k, ptr)-&gt;prio = INITIAL_PRIO;
> +		n-&gt;bi_end_io	= backing_request_endio;
> +		n-&gt;bi_private	= &amp;s-&gt;cl;
> +
> +		/* I/O request sent to backing device */
> +		closure_bio_submit(s-&gt;iop.c, n, &amp;s-&gt;cl);
> +		return n == bio ? MAP_DONE : MAP_CONTINUE;
> +	}
> +	atomic_inc(&amp;PTR_BUCKET(s-&gt;iop.c, k, 0)-&gt;pin);
> +	spin_unlock(&amp;PTR_BUCKET(b-&gt;c, k, 0)-&gt;lock);
> 
> -	if (KEY_DIRTY(k))
> -		s-&gt;read_dirty_data = true;
> +	PTR_BUCKET(b-&gt;c, k, 0)-&gt;prio = INITIAL_PRIO;
> 
>  	n = bio_next_split(bio, min_t(uint64_t, INT_MAX,
>  				      KEY_OFFSET(k) - bio-&gt;bi_iter.bi_sector),
>  			   GFP_NOIO, &amp;s-&gt;d-&gt;bio_split);
> 
>  	bio_key = &amp;container_of(n, struct bbio, bio)-&gt;key;
> -	bch_bkey_copy_single_ptr(bio_key, k, ptr);
> +	bch_bkey_copy_single_ptr(bio_key, k, 0);
> 
>  	bch_cut_front(&amp;KEY(s-&gt;iop.inode, n-&gt;bi_iter.bi_sector, 0), 
> bio_key);
>  	bch_cut_back(&amp;KEY(s-&gt;iop.inode, bio_end_sector(n), 0), 
> bio_key);

  reply	other threads:[~2024-12-21 16:17 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-19  7:40 [PATCH v2 1/3] bcache: avoid invalidating buckets in use mingzhe.zou
2024-11-19  7:40 ` [PATCH v2 2/3] bcache: fix io error during cache read race mingzhe.zou
2024-12-20  5:20   ` 邹明哲
2024-12-21 16:17     ` colyli [this message]
2024-12-24  2:38       ` [PATCH " 邹明哲
2024-11-19  7:40 ` [PATCH v2 3/3] bcache: remove unused parameters mingzhe.zou
2025-01-22  4:42 ` [PATCH v2 1/3] bcache: avoid invalidating buckets in use Coly Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4da838e985ec3f7263a4ad4a9b307d15@suse.de \
    --to=colyli@suse.de \
    --cc=linux-bcache@vger.kernel.org \
    --cc=mingzhe.zou@easystack.cn \
    --cc=zoumingzhe@qq.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox