From: "Eddie Chapman" <eddie@ehuk.net>
To: "Coly Li" <colyli@suse.de>, "Mingzhe Zou" <mingzhe.zou@easystack.cn>
Cc: "Eric Wheeler" <bcache@lists.ewheeler.net>,
linux-bcache@vger.kernel.org, zoumingzhe@qq.com
Subject: Re: [PATCH] bcache: fixup init dirty data errors
Date: Thu, 7 Sep 2023 12:14:10 +0100 [thread overview]
Message-ID: <de1e6ddd2e8534eeead33273ff3d4026.squirrel@ukinbox.ecrypt.net> (raw)
In-Reply-To: <81A90714-59BC-47F6-BFD5-26A8B90A7326@suse.de>
Coly Li wrote:
>
>> 2023年8月24日 20:49,邹明哲 <mingzhe.zou@easystack.cn> 写道:
>>
>> From: Coly Li <colyli@suse.de>
>> Date: 2023-08-23 01:49:32
>> To: Mingzhe Zou <mingzhe.zou@easystack.cn>
>> Cc:
>> bcache@lists.ewheeler.net,linux-bcache@vger.kernel.org,zoumingzhe@qq.co
>> m Subject: Re: [PATCH] bcache: fixup init dirty data errors>Hi Mingzhe,
>>
>>> On Tue, Aug 22, 2023 at 06:19:58PM +0800, Mingzhe Zou wrote:
>>>
>>>> We found that after long run, the dirty_data of the bcache device
>>>> will have errors. This error cannot be eliminated unless
>>>> re-register.
>>>
>>> Could you explain what the error was?
>>>
>> Hi, Coly
>>
>> We discovered dirty_data was inaccurate a long time ago.
>> When writeback thread flushes all dirty data, dirty_data via sysfs shows
>> that there are still several K to tens of M of dirty data.
>>
>> At that time, I thought it might be a calculation error at runtime, but
>> after reviewing the relevant code, no error was found.
>>
>> Last month, our online environment found that a certain device had more
>> than 200G of dirty_data. This brings us back to the question.
>>
>>>> We also found that reattach after detach, this error can
>>>> accumulate.
>>>
>>> Could you elaborate how the error can accumulate?
>>>
>> I found that when dirty_data, error, detach and then re-attach can not
>> eliminate the error, the error will continue.
>>
>> In bch_cached_dev_attach(), after bch_sectors_dirty_init(), attach may
>> still fail, but dirty_data is not cleared when it fails ```
>> bch_sectors_dirty_init(&dc->disk);
>>
>> ret = bch_cached_dev_run(dc); if (ret && (ret != -EBUSY)) {
>> up_write(&dc->writeback_lock); /*
>> * bch_register_lock is held, bcache_device_stop() is not
>> * able to be directly called. The kthread and kworker
>> * created previously in bch_cached_dev_writeback_start()
>> * have to be stopped manually here.
>> */
>> kthread_stop(dc->writeback_thread); dc->writeback_thread = NULL;
>> cancel_writeback_rate_update_dwork(dc); pr_err("Couldn't run cached
>> device %s", dc->backing_dev_name); return ret; }
>> ```
>>>
>>>> In bch_sectors_dirty_init(), all inode <= d->id keys will be
>>>> recounted again. This is wrong, we only need to count the keys of
>>>> the current device.
>>>>
>>>> Fixes: b144e45fc576 ("bcache: make bch_sectors_dirty_init() to be
>>>> multithreaded") Signed-off-by: Mingzhe Zou
>>>> <mingzhe.zou@easystack.cn>
>>>> ---
>>>> drivers/md/bcache/writeback.c | 7 ++++++- 1 file changed, 6
>>>> insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/md/bcache/writeback.c
>>>> b/drivers/md/bcache/writeback.c index 24c049067f61..71d0dabcbf9d
>>>> 100644
>>>> --- a/drivers/md/bcache/writeback.c
>>>> +++ b/drivers/md/bcache/writeback.c
>>>> @@ -983,6 +983,8 @@ void bch_sectors_dirty_init(struct bcache_device
>>>> *d)
>>>> struct cache_set *c = d->c; struct bch_dirty_init_state state;
>>>>
>>>> + atomic_long_set(&d->dirty_sectors, 0);
>>>> +
>>>
>>> The above change is not upstreamed yet, if you are fixing upstream
>>> code please avoid to use d->dirty_sectors here.
>>
>> Yes, dirty_sectors is a set of resize patches submitted to the
>> community before, these patches have not been merged into upstream, I
>> will remove this line in v2.
>>
>> In fact, the change about dirty_sectors is only a prerequisite for
>> resize, and it can be promoted first. It will greatly reduce the memory
>> requirements of high-capacity devices.
>>
>>>> /* Just count root keys if no leaf node */
>>>> rw_lock(0, c->root, c->root->level); if (c->root->level == 0) { @@
>>>> -991,8 +993,11 @@ void bch_sectors_dirty_init(struct bcache_device
>>>> *d)
>>>> op.count = 0;
>>>>
>>>> for_each_key_filter(&c->root->keys, - k, &iter, bch_ptr_invalid)
>>>> + k, &iter, bch_ptr_invalid) {
>>>> + if (KEY_INODE(k) != op.inode)
>>>> + continue;
>>>> sectors_dirty_init_fn(&op.op, c->root, k); + }
>>>>
>>> Nice catch! IMHO if I take the above change, setting d->dirty_sectors
>>> by 0 might be unncessary in ideal condition, am I right?
>>
>> In bch_cached_dev_attach () may still fail and exit, I think it is
>> necessary to clear 0.
>
> Copied. Thanks for the information, I will take the v2 patch.
>
> Coly Li
>
Hi Coly, Mingzhe,
Can I ask, how far back would this fix be needed, in terms of stable
versions?
Thanks,
Eddie
next prev parent reply other threads:[~2023-09-07 19:37 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-22 10:19 [PATCH] bcache: fixup init dirty data errors Mingzhe Zou
2023-08-22 17:49 ` Coly Li
2023-08-24 12:49 ` 邹明哲
2023-08-24 16:36 ` Coly Li
2023-09-07 11:14 ` Eddie Chapman [this message]
2023-09-08 3:33 ` 邹明哲
2023-08-22 19:02 ` kernel test robot
2023-08-23 2:22 ` kernel test robot
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=de1e6ddd2e8534eeead33273ff3d4026.squirrel@ukinbox.ecrypt.net \
--to=eddie@ehuk.net \
--cc=bcache@lists.ewheeler.net \
--cc=colyli@suse.de \
--cc=linux-bcache@vger.kernel.org \
--cc=mingzhe.zou@easystack.cn \
--cc=zoumingzhe@qq.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox