Linux bcache driver list
From: Ankit Kapoor <ankitkap@google.com>
To: colyli@fygo.io
Cc: ankitkap@google.com, kent.overstreet@linux.dev,
	 linux-bcache@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/1] bcache: fix stale data race between read cache miss and bypass write
Date: Wed, 27 May 2026 13:41:08 +0000
Message-ID: <20260527134109.2659134-1-ankitkap@google.com>
In-Reply-To: <ahRNUlU60mx9ESd1@studio.local>

Hi Coly,

Thank you for the feedback, for confirming the issue, and for the guidance.

> Hi Ankit,
> 
> Yes, I confirm this is an issue that must be solved. Nice catch!
> 
> On Thu, May 21, 2026 at 04:39:25PM +0800, Ankit Kapoor wrote:
>> A race condition exists between a read cache miss and a bypass write
>> due to either congestion or sequential bypass, that causes stale data
>> to be cached when the read cache miss runs concurrently with a bypass
>> write targeting the same sectors.
> 
> This patch fixes the stale data issue in run time, but if power failure
> happens inside the race window, after boot up again, the stale data
> still exists in cache for following read hits.
> 
> And your fix invalidate the key after on-disk bio completed, which makes
> such stale data window by power failure longer.

I had initially hoped that serializing the operations would suffice, but
I agree that the power-failure window you describe must also be
addressed.

> To solve all the stale data race both for run time and power failure
> condition, could you please consider the following proposal.
> 
> Maintain a data structure to hold all invalidate range from by-pass
> write, record/insert the invalidation range before bch_data_insert(),
> and after cached_dev_write_complete(), clear/remove the invalidation
> range.
> 
> For a cache-miss read, if there is any invalidation range refcount
> exists, check all non-zero refcount ranges, if any range overlaps with
> the cache-miss read range, do NOT update the missing bkey back to btree
> and only read data from backing device.

I am now working on a new implementation that tracks the sector ranges
of in-flight bypass writes, as you suggested.
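
To make sure I have understood the proposal, here is a minimal userspace
sketch of the tracking protocol, not bcache code. All names
(inval_tracker, inval_track(), inval_untrack(), inval_overlaps()) and
the fixed slot count are hypothetical, and locking is omitted. The idea
is that a bypass write records its range before bch_data_insert() and
removes it after cached_dev_write_complete(), while a cache-miss read
skips the btree update if its range overlaps any recorded range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_INFLIGHT 64	/* hypothetical cap, for this sketch only */

struct inval_range {
	uint64_t start;	/* first sector of the bypass write */
	uint64_t len;	/* length in sectors */
	bool in_use;
};

struct inval_tracker {
	struct inval_range slots[MAX_INFLIGHT];
};

/*
 * Record a bypass write's range; meant to be called before the
 * invalidation key is inserted. Returns a slot index, or -1 if full.
 */
static int inval_track(struct inval_tracker *t, uint64_t start, uint64_t len)
{
	for (int i = 0; i < MAX_INFLIGHT; i++) {
		if (!t->slots[i].in_use) {
			t->slots[i].start = start;
			t->slots[i].len = len;
			t->slots[i].in_use = true;
			return i;
		}
	}
	return -1;
}

/* Drop a recorded range; meant to be called after the write completes. */
static void inval_untrack(struct inval_tracker *t, int slot)
{
	assert(slot >= 0 && slot < MAX_INFLIGHT && t->slots[slot].in_use);
	t->slots[slot].in_use = false;
}

/*
 * Cache-miss read check: true if [start, start + len) overlaps any
 * recorded range, in which case the read should not populate the cache.
 */
static bool inval_overlaps(const struct inval_tracker *t,
			   uint64_t start, uint64_t len)
{
	for (int i = 0; i < MAX_INFLIGHT; i++) {
		const struct inval_range *r = &t->slots[i];

		if (r->in_use && r->start < start + len &&
		    start < r->start + r->len)
			return true;
	}
	return false;
}
```

A linear scan over exact ranges is only meant to illustrate the
ordering of the three steps; it says nothing about the data structure
the real patch should use.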

> Here you need to design an efficient data structure both for performance
> and memory consumption. I would suggest to maintain chunk refcounts
> which mapping multiple 32MB ranges on cache device (current max key size
> if I remember correctly) range. You may look at how md raid maintains
> the legacy bitmap refcount, hope that code can give you any hint.

Thanks, I will look at how md raid maintains the legacy bitmap refcount
for hints. In the meantime, could you recommend any specific fio
configurations or workloads that you prefer for evaluating the memory
overhead and performance impact of this change?
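
For reference, my current reading of the chunk refcount idea is roughly
the following userspace sketch. It is not bcache or md code: the names
(chunk_refs, chunk_refs_get(), chunk_refs_put(), chunk_refs_busy()),
the 16-bit counter width, the 512-byte sector size and the fixed 32 MiB
chunk size are all assumptions, and locking is omitted. The check is
deliberately coarse: a cache-miss read that merely shares a chunk with
an in-flight bypass write is also treated as busy, which costs a missed
cache fill but never caches stale data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* 2^16 sectors * 512 bytes = 32 MiB per chunk (assumed sizes) */
#define CHUNK_SHIFT 16

struct chunk_refs {
	uint16_t *cnt;		/* one in-flight counter per chunk */
	uint64_t nr_chunks;
};

/* Allocate zeroed counters covering dev_sectors; returns 0 or -1. */
static int chunk_refs_init(struct chunk_refs *r, uint64_t dev_sectors)
{
	if (!dev_sectors)
		return -1;
	r->nr_chunks = (dev_sectors + (1ULL << CHUNK_SHIFT) - 1) >> CHUNK_SHIFT;
	r->cnt = calloc(r->nr_chunks, sizeof(*r->cnt));
	return r->cnt ? 0 : -1;
}

/* Take a reference on every chunk touched by [start, start + len). */
static void chunk_refs_get(struct chunk_refs *r, uint64_t start, uint64_t len)
{
	if (!len)
		return;
	for (uint64_t c = start >> CHUNK_SHIFT;
	     c <= (start + len - 1) >> CHUNK_SHIFT; c++) {
		assert(c < r->nr_chunks && r->cnt[c] < UINT16_MAX);
		r->cnt[c]++;
	}
}

/* Drop the references taken by a matching chunk_refs_get() call. */
static void chunk_refs_put(struct chunk_refs *r, uint64_t start, uint64_t len)
{
	if (!len)
		return;
	for (uint64_t c = start >> CHUNK_SHIFT;
	     c <= (start + len - 1) >> CHUNK_SHIFT; c++) {
		assert(c < r->nr_chunks && r->cnt[c] > 0);
		r->cnt[c]--;
	}
}

/* True if any chunk touched by [start, start + len) has a reference. */
static bool chunk_refs_busy(const struct chunk_refs *r,
			    uint64_t start, uint64_t len)
{
	if (!len)
		return false;
	for (uint64_t c = start >> CHUNK_SHIFT;
	     c <= (start + len - 1) >> CHUNK_SHIFT && c < r->nr_chunks; c++)
		if (r->cnt[c])
			return true;
	return false;
}
```

Under these assumptions a 1 TiB device needs 2^40 / 2^25 = 32768
counters, i.e. 64 KiB of memory with 16-bit counters.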

I will send a v2 patch series as soon as the tracking mechanism is ready
and thoroughly tested.

Best regards,
Ankit Kapoor

Thread overview: 6+ messages
2026-05-21 16:39 [PATCH 0/1] bcache: fix stale data race between read cache miss and bypass write Ankit Kapoor
2026-05-21 16:39 ` [PATCH 1/1] " Ankit Kapoor
2026-05-25 13:41   ` Coly Li
2026-05-27 13:41     ` Ankit Kapoor [this message]
2026-05-27 15:27       ` Coly Li
2026-05-24 16:12 ` [PATCH 0/1] " Coly Li
