From: Johannes Weiner <hannes@cmpxchg.org>
To: Greg Thelen <gthelen@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Michal Hocko <mhocko@kernel.org>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
Tejun Heo <tj@kernel.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] writeback: sum memcg dirty counters as needed
Date: Thu, 28 Mar 2019 10:20:16 -0400
Message-ID: <20190328142016.GA15763@cmpxchg.org>
In-Reply-To: <20190307165632.35810-1-gthelen@google.com>

On Thu, Mar 07, 2019 at 08:56:32AM -0800, Greg Thelen wrote:
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3880,6 +3880,7 @@ struct wb_domain *mem_cgroup_wb_domain(struct bdi_writeback *wb)
>   * @pheadroom: out parameter for number of allocatable pages according to memcg
>   * @pdirty: out parameter for number of dirty pages
>   * @pwriteback: out parameter for number of pages under writeback
> + * @exact: determines exact counters are required, indicates more work.
>   *
>   * Determine the numbers of file, headroom, dirty, and writeback pages in
>   * @wb's memcg. File, dirty and writeback are self-explanatory. Headroom
> @@ -3890,18 +3891,29 @@ struct wb_domain *mem_cgroup_wb_domain(struct bdi_writeback *wb)
>   * ancestors. Note that this doesn't consider the actual amount of
>   * available memory in the system. The caller should further cap
>   * *@pheadroom accordingly.
> + *
> + * Return value is the error precision associated with *@pdirty
> + * and *@pwriteback. When @exact is set this a minimal value.
>   */
> -void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
> -			 unsigned long *pheadroom, unsigned long *pdirty,
> -			 unsigned long *pwriteback)
> +unsigned long
> +mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
> +		    unsigned long *pheadroom, unsigned long *pdirty,
> +		    unsigned long *pwriteback, bool exact)
>  {
>  	struct mem_cgroup *memcg = mem_cgroup_from_css(wb->memcg_css);
>  	struct mem_cgroup *parent;
> +	unsigned long precision;
>
> -	*pdirty = memcg_page_state(memcg, NR_FILE_DIRTY);
> -
> +	if (exact) {
> +		precision = 0;
> +		*pdirty = memcg_exact_page_state(memcg, NR_FILE_DIRTY);
> +		*pwriteback = memcg_exact_page_state(memcg, NR_WRITEBACK);
> +	} else {
> +		precision = MEMCG_CHARGE_BATCH * num_online_cpus();
> +		*pdirty = memcg_page_state(memcg, NR_FILE_DIRTY);
> +		*pwriteback = memcg_page_state(memcg, NR_WRITEBACK);
> +	}
>  	/* this should eventually include NR_UNSTABLE_NFS */
> -	*pwriteback = memcg_page_state(memcg, NR_WRITEBACK);
>  	*pfilepages = mem_cgroup_nr_lru_pages(memcg, (1 << LRU_INACTIVE_FILE) |
>  						     (1 << LRU_ACTIVE_FILE));
>  	*pheadroom = PAGE_COUNTER_MAX;
> @@ -3913,6 +3925,8 @@ void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
>  		*pheadroom = min(*pheadroom, ceiling - min(ceiling, used));
>  		memcg = parent;
>  	}
> +
> +	return precision;

Have you considered unconditionally using the exact version here?

It does for_each_online_cpu(), but until very, very recently we did
this by default for all stats, for years. It only became a problem in
conjunction with the for_each_memcg loops when frequently reading
memory stats at the top of a very large hierarchy.

balance_dirty_pages() is called against memcgs that actually own the
inodes/memory and doesn't do the additional recursive tree collection.

It's also not *that* hot of a function, and in the IO path...

It would simplify this patch immensely.
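
For readers following the thread, the trade-off between the two read paths
can be modelled in user space. The sketch below is not the kernel code: it
assumes a fixed NR_CPUS of 4, a CHARGE_BATCH of 32 standing in for
MEMCG_CHARGE_BATCH, and hypothetical names (struct batched_counter,
counter_mod, counter_read, counter_read_exact) chosen for illustration. It
only shows why the cheap read can lag by up to batch * CPUs while the exact
read costs one walk over the CPUs.

```c
#include <assert.h>

#define NR_CPUS      4   /* assumption: fixed CPU count for this model */
#define CHARGE_BATCH 32  /* assumption: stands in for MEMCG_CHARGE_BATCH */

/* Hypothetical user-space model of a batched per-CPU counter. */
struct batched_counter {
	long global;           /* shared total, updated only on flush */
	long percpu[NR_CPUS];  /* per-CPU deltas not yet folded in */
};

/*
 * Apply delta on the given CPU. The per-CPU delta is folded into the
 * shared total only once its magnitude exceeds the batch size.
 */
static void counter_mod(struct batched_counter *c, int cpu, long delta)
{
	long x;

	assert(cpu >= 0 && cpu < NR_CPUS);
	x = c->percpu[cpu] + delta;
	if (x > CHARGE_BATCH || x < -CHARGE_BATCH) {
		c->global += x;
		x = 0;
	}
	c->percpu[cpu] = x;
}

/* Cheap read: shared total only, may lag by CHARGE_BATCH * NR_CPUS. */
static long counter_read(const struct batched_counter *c)
{
	long x = c->global;

	return x < 0 ? 0 : x;
}

/* Exact read: shared total plus every CPU's pending delta. */
static long counter_read_exact(const struct batched_counter *c)
{
	long x = c->global;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		x += c->percpu[cpu];
	return x < 0 ? 0 : x;
}
```

In this model, if each of the four CPUs dirties 20 pages, no per-CPU delta
crosses the batch, so counter_read() still reports 0 while
counter_read_exact() reports 80. The gap stays within
CHARGE_BATCH * NR_CPUS = 128, which corresponds to the "precision" value the
patch returns for the non-exact path.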