From: kernel test robot <lkp@intel.com>
To: "Shakeel Butt" <shakeelb@google.com>,
"Michal Koutný" <mkoutny@suse.com>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Michal Hocko" <mhocko@kernel.org>,
"Roman Gushchin" <roman.gushchin@linux.dev>
Cc: kbuild-all@lists.01.org, Ivan Babrou <ivan@cloudflare.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linux Memory Management List <linux-mm@kvack.org>,
cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
Shakeel Butt <shakeelb@google.com>,
Daniel Dao <dqminh@cloudflare.com>,
stable@vger.kernel.org
Subject: Re: [PATCH] memcg: async flush memcg stats from perf sensitive codepaths
Date: Sat, 26 Feb 2022 10:32:55 +0800
Message-ID: <202202261045.FAsMZlyD-lkp@intel.com>
In-Reply-To: <20220226002412.113819-1-shakeelb@google.com>
Hi Shakeel,
Thank you for the patch! However, there is something to improve:
[auto build test ERROR on hnaz-mm/master]
[also build test ERROR on linus/master v5.17-rc5 next-20220224]
[If your patch was applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Shakeel-Butt/memcg-async-flush-memcg-stats-from-perf-sensitive-codepaths/20220226-082444
base: https://github.com/hnaz/linux-mm master
config: um-i386_defconfig (https://download.01.org/0day-ci/archive/20220226/202202261045.FAsMZlyD-lkp@intel.com/config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
# https://github.com/0day-ci/linux/commit/5dffeb24975bc4cbe99af650d833eb0183a4882f
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Shakeel-Butt/memcg-async-flush-memcg-stats-from-perf-sensitive-codepaths/20220226-082444
git checkout 5dffeb24975bc4cbe99af650d833eb0183a4882f
# save the config file to linux build tree
mkdir build_dir
make W=1 O=build_dir ARCH=um SUBARCH=i386 SHELL=/bin/bash
If you fix the issue, kindly add the following tag as appropriate:
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
   mm/vmscan.c: In function 'shrink_node':
>> mm/vmscan.c:3191:2: error: implicit declaration of function 'mem_cgroup_flush_stats_async'; did you mean 'mem_cgroup_flush_stats'? [-Werror=implicit-function-declaration]
    3191 |  mem_cgroup_flush_stats_async();
         |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
         |  mem_cgroup_flush_stats
   cc1: some warnings being treated as errors
--
   mm/workingset.c: In function 'workingset_refault':
>> mm/workingset.c:358:2: error: implicit declaration of function 'mem_cgroup_flush_stats_async'; did you mean 'mem_cgroup_flush_stats'? [-Werror=implicit-function-declaration]
     358 |  mem_cgroup_flush_stats_async();
         |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
         |  mem_cgroup_flush_stats
   cc1: some warnings being treated as errors
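Both errors say that no declaration of mem_cgroup_flush_stats_async() was visible at the call sites. The report does not state the cause. One plausible reading, offered here as an assumption, is that this config builds without CONFIG_MEMCG and the new function was declared only for the memcg-enabled case. A common pattern for such helpers is to pair the real declaration with an empty static inline stub, so callers compile either way. The following is a minimal user-space sketch of that pattern, not the actual patch or header: the CONFIG_MEMCG macro is a stand-in for the Kconfig symbol, and flush_requests and shrink_node_sketch() are hypothetical names used only for illustration.

```c
#include <stdio.h>

/*
 * Stand-in for the kernel's Kconfig symbol (assumption). Leave it
 * undefined to mimic a build where memcg support is compiled out.
 */
/* #define CONFIG_MEMCG 1 */

#ifdef CONFIG_MEMCG
/* Hypothetical counter standing in for "a flush was requested". */
static int flush_requests;

static inline void mem_cgroup_flush_stats_async(void)
{
	flush_requests++;
}
#else
/*
 * Empty stub for the memcg-disabled case. Without a definition like
 * this, every caller hits "implicit declaration of function".
 */
static inline void mem_cgroup_flush_stats_async(void)
{
}
#endif

/* Hypothetical caller mirroring the call site at mm/vmscan.c:3191. */
static int shrink_node_sketch(void)
{
	mem_cgroup_flush_stats_async();
	return 0;
}
```

With the stub in place, shrink_node_sketch() compiles whether or not CONFIG_MEMCG is defined. Deleting the #else branch while leaving the macro undefined reproduces the same class of error under -Werror=implicit-function-declaration.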
vim +3191 mm/vmscan.c
3175
3176 static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
3177 {
3178 struct reclaim_state *reclaim_state = current->reclaim_state;
3179 unsigned long nr_reclaimed, nr_scanned;
3180 struct lruvec *target_lruvec;
3181 bool reclaimable = false;
3182 unsigned long file;
3183
3184 target_lruvec = mem_cgroup_lruvec(sc->target_mem_cgroup, pgdat);
3185
3186 again:
3187 /*
3188 * Flush the memory cgroup stats, so that we read accurate per-memcg
3189 * lruvec stats for heuristics.
3190 */
> 3191 mem_cgroup_flush_stats_async();
3192
3193 memset(&sc->nr, 0, sizeof(sc->nr));
3194
3195 nr_reclaimed = sc->nr_reclaimed;
3196 nr_scanned = sc->nr_scanned;
3197
3198 /*
3199 * Determine the scan balance between anon and file LRUs.
3200 */
3201 spin_lock_irq(&target_lruvec->lru_lock);
3202 sc->anon_cost = target_lruvec->anon_cost;
3203 sc->file_cost = target_lruvec->file_cost;
3204 spin_unlock_irq(&target_lruvec->lru_lock);
3205
3206 /*
3207 * Target desirable inactive:active list ratios for the anon
3208 * and file LRU lists.
3209 */
3210 if (!sc->force_deactivate) {
3211 unsigned long refaults;
3212
3213 refaults = lruvec_page_state(target_lruvec,
3214 WORKINGSET_ACTIVATE_ANON);
3215 if (refaults != target_lruvec->refaults[0] ||
3216 inactive_is_low(target_lruvec, LRU_INACTIVE_ANON))
3217 sc->may_deactivate |= DEACTIVATE_ANON;
3218 else
3219 sc->may_deactivate &= ~DEACTIVATE_ANON;
3220
3221 /*
3222 * When refaults are being observed, it means a new
3223 * workingset is being established. Deactivate to get
3224 * rid of any stale active pages quickly.
3225 */
3226 refaults = lruvec_page_state(target_lruvec,
3227 WORKINGSET_ACTIVATE_FILE);
3228 if (refaults != target_lruvec->refaults[1] ||
3229 inactive_is_low(target_lruvec, LRU_INACTIVE_FILE))
3230 sc->may_deactivate |= DEACTIVATE_FILE;
3231 else
3232 sc->may_deactivate &= ~DEACTIVATE_FILE;
3233 } else
3234 sc->may_deactivate = DEACTIVATE_ANON | DEACTIVATE_FILE;
3235
3236 /*
3237 * If we have plenty of inactive file pages that aren't
3238 * thrashing, try to reclaim those first before touching
3239 * anonymous pages.
3240 */
3241 file = lruvec_page_state(target_lruvec, NR_INACTIVE_FILE);
3242 if (file >> sc->priority && !(sc->may_deactivate & DEACTIVATE_FILE))
3243 sc->cache_trim_mode = 1;
3244 else
3245 sc->cache_trim_mode = 0;
3246
3247 /*
3248 * Prevent the reclaimer from falling into the cache trap: as
3249 * cache pages start out inactive, every cache fault will tip
3250 * the scan balance towards the file LRU. And as the file LRU
3251 * shrinks, so does the window for rotation from references.
3252 * This means we have a runaway feedback loop where a tiny
3253 * thrashing file LRU becomes infinitely more attractive than
3254 * anon pages. Try to detect this based on file LRU size.
3255 */
3256 if (!cgroup_reclaim(sc)) {
3257 unsigned long total_high_wmark = 0;
3258 unsigned long free, anon;
3259 int z;
3260
3261 free = sum_zone_node_page_state(pgdat->node_id, NR_FREE_PAGES);
3262 file = node_page_state(pgdat, NR_ACTIVE_FILE) +
3263 node_page_state(pgdat, NR_INACTIVE_FILE);
3264
3265 for (z = 0; z < MAX_NR_ZONES; z++) {
3266 struct zone *zone = &pgdat->node_zones[z];
3267 if (!managed_zone(zone))
3268 continue;
3269
3270 total_high_wmark += high_wmark_pages(zone);
3271 }
3272
3273 /*
3274 * Consider anon: if that's low too, this isn't a
3275 * runaway file reclaim problem, but rather just
3276 * extreme pressure. Reclaim as per usual then.
3277 */
3278 anon = node_page_state(pgdat, NR_INACTIVE_ANON);
3279
3280 sc->file_is_tiny =
3281 file + free <= total_high_wmark &&
3282 !(sc->may_deactivate & DEACTIVATE_ANON) &&
3283 anon >> sc->priority;
3284 }
3285
3286 shrink_node_memcgs(pgdat, sc);
3287
3288 if (reclaim_state) {
3289 sc->nr_reclaimed += reclaim_state->reclaimed_slab;
3290 reclaim_state->reclaimed_slab = 0;
3291 }
3292
3293 /* Record the subtree's reclaim efficiency */
3294 vmpressure(sc->gfp_mask, sc->target_mem_cgroup, true,
3295 sc->nr_scanned - nr_scanned,
3296 sc->nr_reclaimed - nr_reclaimed);
3297
3298 if (sc->nr_reclaimed - nr_reclaimed)
3299 reclaimable = true;
3300
3301 if (current_is_kswapd()) {
3302 /*
3303 * If reclaim is isolating dirty pages under writeback,
3304 * it implies that the long-lived page allocation rate
3305 * is exceeding the page laundering rate. Either the
3306 * global limits are not being effective at throttling
3307 * processes due to the page distribution throughout
3308 * zones or there is heavy usage of a slow backing
3309 * device. The only option is to throttle from reclaim
3310 * context which is not ideal as there is no guarantee
3311 * the dirtying process is throttled in the same way
3312 * balance_dirty_pages() manages.
3313 *
3314 * Once a node is flagged PGDAT_WRITEBACK, kswapd will
3315 * count the number of pages under pages flagged for
3316 * immediate reclaim and stall if any are encountered
3317 * in the nr_immediate check below.
3318 */
3319 if (sc->nr.writeback && sc->nr.writeback == sc->nr.taken)
3320 set_bit(PGDAT_WRITEBACK, &pgdat->flags);
3321
3322 /* Allow kswapd to start writing pages during reclaim.*/
3323 if (sc->nr.unqueued_dirty == sc->nr.file_taken)
3324 set_bit(PGDAT_DIRTY, &pgdat->flags);
3325
3326 /*
3327 * If kswapd scans pages marked for immediate
3328 * reclaim and under writeback (nr_immediate), it
3329 * implies that pages are cycling through the LRU
3330 * faster than they are written so forcibly stall
3331 * until some pages complete writeback.
3332 */
3333 if (sc->nr.immediate)
3334 reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK);
3335 }
3336
3337 /*
3338 * Tag a node/memcg as congested if all the dirty pages were marked
3339 * for writeback and immediate reclaim (counted in nr.congested).
3340 *
3341 * Legacy memcg will stall in page writeback so avoid forcibly
3342 * stalling in reclaim_throttle().
3343 */
3344 if ((current_is_kswapd() ||
3345 (cgroup_reclaim(sc) && writeback_throttling_sane(sc))) &&
3346 sc->nr.dirty && sc->nr.dirty == sc->nr.congested)
3347 set_bit(LRUVEC_CONGESTED, &target_lruvec->flags);
3348
3349 /*
3350 * Stall direct reclaim for IO completions if the lruvec is
3351 * node is congested. Allow kswapd to continue until it
3352 * starts encountering unqueued dirty pages or cycling through
3353 * the LRU too quickly.
3354 */
3355 if (!current_is_kswapd() && current_may_throttle() &&
3356 !sc->hibernation_mode &&
3357 test_bit(LRUVEC_CONGESTED, &target_lruvec->flags))
3358 reclaim_throttle(pgdat, VMSCAN_THROTTLE_CONGESTED);
3359
3360 if (should_continue_reclaim(pgdat, sc->nr_reclaimed - nr_reclaimed,
3361 sc))
3362 goto again;
3363
3364 /*
3365 * Kswapd gives up on balancing particular nodes after too
3366 * many failures to reclaim anything from them and goes to
3367 * sleep. On reclaim progress, reset the failure counter. A
3368 * successful direct reclaim run will revive a dormant kswapd.
3369 */
3370 if (reclaimable)
3371 pgdat->kswapd_failures = 0;
3372 }
3373
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org