From: Qi Zheng <zhengqi.arch@bytedance.com>
To: Dave Chinner <david@fromorbit.com>,
Vlastimil Babka <vbabka@suse.cz>,
paulmck@kernel.org
Cc: akpm@linux-foundation.org, tkhai@ya.ru, roman.gushchin@linux.dev,
djwong@kernel.org, brauner@kernel.org, tytso@mit.edu,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
linux-arm-msm@vger.kernel.org, dm-devel@redhat.com,
linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org,
virtualization@lists.linux-foundation.org,
linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org,
linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 24/29] mm: vmscan: make global slab shrink lockless
Date: Fri, 23 Jun 2023 21:10:57 +0800 [thread overview]
Message-ID: <a21047bb-3b87-a50a-94a7-f3fa4847bc08@bytedance.com> (raw)
In-Reply-To: <ZJU708VIyJ/3StAX@dread.disaster.area>
On 2023/6/23 14:29, Dave Chinner wrote:
> On Thu, Jun 22, 2023 at 05:12:02PM +0200, Vlastimil Babka wrote:
>> On 6/22/23 10:53, Qi Zheng wrote:
>>> @@ -1067,33 +1068,27 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
>>> if (!mem_cgroup_disabled() && !mem_cgroup_is_root(memcg))
>>> return shrink_slab_memcg(gfp_mask, nid, memcg, priority);
>>>
>>> - if (!down_read_trylock(&shrinker_rwsem))
>>> - goto out;
>>> -
>>> - list_for_each_entry(shrinker, &shrinker_list, list) {
>>> + rcu_read_lock();
>>> + list_for_each_entry_rcu(shrinker, &shrinker_list, list) {
>>> struct shrink_control sc = {
>>> .gfp_mask = gfp_mask,
>>> .nid = nid,
>>> .memcg = memcg,
>>> };
>>>
>>> + if (!shrinker_try_get(shrinker))
>>> + continue;
>>> + rcu_read_unlock();
>>
>> I don't think you can do this unlock?
>>
>>> +
>>> ret = do_shrink_slab(&sc, shrinker, priority);
>>> if (ret == SHRINK_EMPTY)
>>> ret = 0;
>>> freed += ret;
>>> - /*
>>> - * Bail out if someone want to register a new shrinker to
>>> - * prevent the registration from being stalled for long periods
>>> - * by parallel ongoing shrinking.
>>> - */
>>> - if (rwsem_is_contended(&shrinker_rwsem)) {
>>> - freed = freed ? : 1;
>>> - break;
>>> - }
>>> - }
>>>
>>> - up_read(&shrinker_rwsem);
>>> -out:
>>> + rcu_read_lock();
>>
>> That new rcu_read_lock() won't help AFAIK, the whole
>> list_for_each_entry_rcu() needs to be under the single rcu_read_lock() to be
>> safe.
>
> Yeah, that's the pattern we've been taught and the one we can look
> at and immediately say "this is safe".
>
> This is a different pattern, as has been explained bi Qi, and I
> think it *might* be safe.
>
> *However.*
>
> Right now I don't have time to go through a novel RCU list iteration
> pattern it one step at to determine the correctness of the
> algorithm. I'm mostly worried about list manipulations that can
> occur outside rcu_read_lock() section bleeding into the RCU
> critical section because rcu_read_lock() by itself is not a memory
> barrier.
>
> Maybe Paul has seen this pattern often enough he could simply tell
> us what conditions it is safe in. But for me to work that out from
> first principles? I just don't have the time to do that right now.
Hi Paul, can you help to confirm this?
>
>> IIUC this is why Dave in [4] suggests unifying shrink_slab() with
>> shrink_slab_memcg(), as the latter doesn't iterate the list but uses IDR.
>
> Yes, I suggested the IDR route because radix tree lookups under RCU
> with reference counted objects are a known safe pattern that we can
> easily confirm is correct or not. Hence I suggested the unification
> + IDR route because it makes the life of reviewers so, so much
> easier...
In fact, I originally planned to try the unification + IDR method you
suggested at the beginning. But in the case of CONFIG_MEMCG disabled,
the struct mem_cgroup is not even defined, and root_mem_cgroup and
shrinker_info will not be allocated. This required more code changes, so
I ended up keeping the shrinker_list and implementing the above pattern.
If the above pattern is not safe, I will go back to the unification +
IDR method.
Thanks,
Qi
>
> Cheers,
>
> Dave.
next prev parent reply other threads:[~2023-06-23 13:11 UTC|newest]
Thread overview: 55+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-06-22 8:53 [PATCH 00/29] use refcount+RCU method to implement lockless slab shrink Qi Zheng
2023-06-22 8:53 ` [PATCH 01/29] mm: shrinker: add shrinker::private_data field Qi Zheng
2023-06-22 14:47 ` Vlastimil Babka
2023-06-23 12:50 ` [External] " Qi Zheng
2023-06-22 8:53 ` [PATCH 02/29] mm: vmscan: introduce some helpers for dynamically allocating shrinker Qi Zheng
2023-06-23 6:12 ` Dave Chinner
2023-06-23 12:49 ` Qi Zheng
2023-06-22 8:53 ` [PATCH 03/29] drm/i915: dynamically allocate the i915_gem_mm shrinker Qi Zheng
2023-06-22 8:53 ` [PATCH 04/29] drm/msm: dynamically allocate the drm-msm_gem shrinker Qi Zheng
2023-06-22 8:53 ` [PATCH 05/29] drm/panfrost: dynamically allocate the drm-panfrost shrinker Qi Zheng
2023-06-23 13:33 ` Qi Zheng
2023-06-22 8:53 ` [PATCH 06/29] dm: dynamically allocate the dm-bufio shrinker Qi Zheng
2023-06-22 8:53 ` [PATCH 07/29] dm zoned: dynamically allocate the dm-zoned-meta shrinker Qi Zheng
2023-06-22 8:53 ` [PATCH 08/29] md/raid5: dynamically allocate the md-raid5 shrinker Qi Zheng
2023-06-22 8:53 ` [PATCH 09/29] bcache: dynamically allocate the md-bcache shrinker Qi Zheng
2023-06-22 8:53 ` [PATCH 10/29] vmw_balloon: dynamically allocate the vmw-balloon shrinker Qi Zheng
2023-06-22 8:53 ` [PATCH 11/29] virtio_balloon: dynamically allocate the virtio-balloon shrinker Qi Zheng
2023-06-22 8:53 ` [PATCH 12/29] mbcache: dynamically allocate the mbcache shrinker Qi Zheng
2023-06-22 8:53 ` [PATCH 13/29] ext4: dynamically allocate the ext4-es shrinker Qi Zheng
2023-06-22 8:53 ` [PATCH 14/29] jbd2,ext4: dynamically allocate the jbd2-journal shrinker Qi Zheng
2023-06-22 8:53 ` [PATCH 15/29] NFSD: dynamically allocate the nfsd-client shrinker Qi Zheng
2023-06-23 21:49 ` Chuck Lever
2023-06-24 11:17 ` Qi Zheng
2023-06-22 8:53 ` [PATCH 16/29] NFSD: dynamically allocate the nfsd-reply shrinker Qi Zheng
2023-06-22 8:53 ` [PATCH 17/29] xfs: dynamically allocate the xfs-buf shrinker Qi Zheng
2023-06-22 8:53 ` [PATCH 18/29] xfs: dynamically allocate the xfs-inodegc shrinker Qi Zheng
2023-06-22 8:53 ` [PATCH 19/29] xfs: dynamically allocate the xfs-qm shrinker Qi Zheng
2023-06-22 8:53 ` [PATCH 20/29] zsmalloc: dynamically allocate the mm-zspool shrinker Qi Zheng
2023-06-22 8:53 ` [PATCH 21/29] fs: super: dynamically allocate the s_shrink Qi Zheng
2023-06-22 8:53 ` [PATCH 22/29] drm/ttm: introduce pool_shrink_rwsem Qi Zheng
2023-06-22 8:53 ` [PATCH 23/29] mm: shrinker: add refcount and completion_wait fields Qi Zheng
2023-06-22 8:53 ` [PATCH 24/29] mm: vmscan: make global slab shrink lockless Qi Zheng
2023-06-22 15:12 ` Vlastimil Babka
2023-06-22 16:42 ` Qi Zheng
2023-06-22 17:41 ` Alan Huang
2023-06-22 18:18 ` Qi Zheng
2023-06-23 6:29 ` Dave Chinner
2023-06-23 13:10 ` Qi Zheng [this message]
2023-06-23 22:19 ` Dave Chinner
2023-06-24 11:08 ` Qi Zheng
2023-06-25 3:15 ` Qi Zheng
2023-07-04 4:20 ` Qi Zheng
2023-07-03 16:39 ` Paul E. McKenney
2023-07-04 3:45 ` Qi Zheng
2023-07-05 3:27 ` Qi Zheng
2023-06-22 8:53 ` [PATCH 25/29] mm: vmscan: make memcg " Qi Zheng
2023-06-22 8:53 ` [PATCH 26/29] mm: shrinker: make count and scan in shrinker debugfs lockless Qi Zheng
2023-06-22 8:53 ` [PATCH 27/29] mm: vmscan: hold write lock to reparent shrinker nr_deferred Qi Zheng
2023-06-22 8:53 ` [PATCH 28/29] mm: shrinkers: convert shrinker_rwsem to mutex Qi Zheng
2023-06-22 8:53 ` [PATCH 29/29] mm: shrinker: move shrinker-related code into a separate file Qi Zheng
2023-06-22 14:53 ` Vlastimil Babka
2023-06-23 13:12 ` Qi Zheng
2023-06-23 5:25 ` Sergey Senozhatsky
2023-06-23 13:24 ` Qi Zheng
2023-06-22 9:02 ` [PATCH 00/29] use refcount+RCU method to implement lockless slab shrink Qi Zheng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a21047bb-3b87-a50a-94a7-f3fa4847bc08@bytedance.com \
--to=zhengqi.arch@bytedance.com \
--cc=akpm@linux-foundation.org \
--cc=brauner@kernel.org \
--cc=david@fromorbit.com \
--cc=djwong@kernel.org \
--cc=dm-devel@redhat.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=intel-gfx@lists.freedesktop.org \
--cc=linux-arm-msm@vger.kernel.org \
--cc=linux-bcache@vger.kernel.org \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-nfs@vger.kernel.org \
--cc=linux-raid@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=paulmck@kernel.org \
--cc=roman.gushchin@linux.dev \
--cc=tkhai@ya.ru \
--cc=tytso@mit.edu \
--cc=vbabka@suse.cz \
--cc=virtualization@lists.linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox