public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Coly Li <colyli@suse.de>
Cc: linux-block@vger.kernel.org, linux-bcache@vger.kernel.org
Subject: Re: [PATCH 1/1] bcache: avoid unnecessary soft lockup in kworker update_writeback_rate()
Date: Sat, 28 May 2022 06:23:08 -0600	[thread overview]
Message-ID: <e0002f36-fb39-5d27-ce10-2278fd4a77bb@kernel.dk> (raw)
In-Reply-To: <1BD307CE-9390-4A4B-B917-676104DA77F1@suse.de>

On 5/28/22 6:22 AM, Coly Li wrote:
> 
> 
>> 2022?5?28? 20:20?Jens Axboe <axboe@kernel.dk> ???
>>
>> On 5/28/22 12:19 AM, Coly Li wrote:
>>> The kworker routine update_writeback_rate() is schedued to update the
>>> writeback rate in every 5 seconds by default. Before calling
>>> __update_writeback_rate() to do real job, semaphore dc->writeback_lock
>>> should be held by the kworker routine.
>>>
>>> At the same time, bcache writeback thread routine bch_writeback_thread()
>>> also needs to hold dc->writeback_lock before flushing dirty data back
>>> into the backing device. If the dirty data set is large, it might be
>>> very long time for bch_writeback_thread() to scan all dirty buckets and
>>> releases dc->writeback_lock. In such case update_writeback_rate() can be
>>> starved for long enough time so that kernel reports a soft lockup warn-
>>> ing started like:
>>>  watchdog: BUG: soft lockup - CPU#246 stuck for 23s! [kworker/246:31:179713]
>>>
>>> Such soft lockup condition is unnecessary, because after the writeback
>>> thread finishes its job and releases dc->writeback_lock, the kworker
>>> update_writeback_rate() may continue to work and everything is fine
>>> indeed.
>>>
>>> This patch avoids the unnecessary soft lockup by the following method,
>>> - Add new member to struct cached_dev
>>>  - dc->rate_update_retry (0 by default)
>>> - In update_writeback_rate() call down_read_trylock(&dc->writeback_lock)
>>>  firstly, if it fails then lock contention happens.
>>> - If dc->rate_update_retry <= BCH_WBRATE_UPDATE_RETRY_MAX (15), doesn't
>>>  acquire the lock and reschedules the kworker for next try.
>>> - If dc->rate_update_retry > BCH_WBRATE_UPDATE_RETRY_MAX, no retry
>>>  anymore and call down_read(&dc->writeback_lock) to wait for the lock.
>>>
>>> By the above method, at worst case update_writeback_rate() may retry for
>>> 1+ minutes before blocking on dc->writeback_lock by calling down_read().
>>> For a 4TB cache device with 1TB dirty data, 90%+ of the unnecessary soft
>>> lockup warning message can be avoided.
>>>
>>> When retrying to acquire dc->writeback_lock in update_writeback_rate(),
>>> of course the writeback rate cannot be updated. It is fair, because when
>>> the kworker is blocked on the lock contention of dc->writeback_lock, the
>>> writeback rate cannot be updated neither.
>>>
>>> This change follows Jens Axboe's suggestion to a more clear and simple
>>> version.
>>
>> This looks fine, but it doesn't apply to my current for-5.19/drivers
>> branch which the previous ones did. Did you spin this one without the
>> other patches, perhaps?
>>
>> One minor thing we might want to change if you're respinning it -
>> BCH_WBRATE_UPDATE_RETRY_MAX isn't really named for what it does, since
>> it doesn't retry anything, it simply allows updates to be skipped. Why
>> not call it BCH_WBRATE_UPDATE_MAX_SKIPS instead? I think that'd be
>> better convey what it does.
> 
> Naming is often challenge for me. Sure, _MAX_SKIPS is better. I will
> post another modified version.

It's hard for everyone :-)

Sounds good, I'll get it applied when it shows up.

-- 
Jens Axboe


  reply	other threads:[~2022-05-28 12:23 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-28  6:19 [PATCH 0/1] bcache fix for Linux v5.19 (3rd wave) Coly Li
2022-05-28  6:19 ` [PATCH 1/1] bcache: avoid unnecessary soft lockup in kworker update_writeback_rate() Coly Li
2022-05-28 12:20   ` Jens Axboe
2022-05-28 12:22     ` Coly Li
2022-05-28 12:23       ` Jens Axboe [this message]
  -- strict thread matches above, loose matches on Subject: below --
2022-05-28 12:45 [PATCH v2 0/1] bcache fix for Linux v5.19 (3rd wave) Coly Li
2022-05-28 12:45 ` [PATCH 1/1] bcache: avoid unnecessary soft lockup in kworker update_writeback_rate() Coly Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e0002f36-fb39-5d27-ce10-2278fd4a77bb@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=colyli@suse.de \
    --cc=linux-bcache@vger.kernel.org \
    --cc=linux-block@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox