From: Johannes Weiner <hannes@cmpxchg.org>
To: Josef Bacik <josef@toxicpanda.com>
Cc: axboe@kernel.dk, kernel-team@fb.com, linux-block@vger.kernel.org,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, tj@kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 07/13] memcontrol: schedule throttling if we are congested
Date: Wed, 30 May 2018 10:15:33 -0400
Message-ID: <20180530141533.GC4035@cmpxchg.org>
In-Reply-To: <20180529211724.4531-8-josef@toxicpanda.com>

On Tue, May 29, 2018 at 05:17:18PM -0400, Josef Bacik wrote:
> @@ -5458,6 +5458,30 @@ int mem_cgroup_try_charge(struct page *page, struct mm_struct *mm,
>  	return ret;
>  }
>  
> +int mem_cgroup_try_charge_delay(struct page *page, struct mm_struct *mm,
> +			  gfp_t gfp_mask, struct mem_cgroup **memcgp,
> +			  bool compound)
> +{
> +	struct mem_cgroup *memcg;
> +	struct block_device *bdev;
> +	int ret;
> +
> +	ret = mem_cgroup_try_charge(page, mm, gfp_mask, memcgp, compound);
> +	memcg = *memcgp;
> +
> +	if (!(gfp_mask & __GFP_IO) || !memcg)
> +		return ret;
> +#if defined(CONFIG_BLOCK) && defined(CONFIG_SWAP)
> +	if (atomic_read(&memcg->css.cgroup->congestion_count) &&
> +	    has_usable_swap()) {
> +		map_swap_page(page, &bdev);

This doesn't work, unfortunately - or it only works by accident.

map_swap_page() goes through page_private(), which is only valid for
pages in the swapcache. The newly allocated pages you call it against
aren't in the swapcache, but their page_private() is 0, which is
incorrectly interpreted as "first swap slot on the first swap device" -
which happens to give the right answer if you have only one swap device.
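
To illustrate the failure mode, here is a rough userspace model, not
kernel code. The model_* names and the 5-bit type field are made up for
the example (the real type/offset split depends on the architecture);
the only point is that a zero page_private() decodes to device 0, slot 0
unless the caller first checks that the page is in the swapcache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, for illustration only. */
#define MODEL_TYPE_BITS   5
#define MODEL_TYPE_SHIFT  (64 - MODEL_TYPE_BITS)
#define MODEL_OFFSET_MASK ((UINT64_C(1) << MODEL_TYPE_SHIFT) - 1)

struct model_swp_entry {
	uint64_t val;
};

/* Upper bits select the swap device. */
static unsigned int model_swp_type(struct model_swp_entry e)
{
	return (unsigned int)(e.val >> MODEL_TYPE_SHIFT);
}

/* Lower bits select the slot on that device. */
static uint64_t model_swp_offset(struct model_swp_entry e)
{
	return e.val & MODEL_OFFSET_MASK;
}

/* Stand-in for struct page: priv is a swap entry only if in_swapcache. */
struct model_page {
	int in_swapcache;
	uint64_t priv;
};

/* Returns the swap device index, or -1 if the page has no swap entry. */
static int model_swap_device_of(const struct model_page *page)
{
	struct model_swp_entry e;

	assert(page != NULL);
	if (!page->in_swapcache)
		return -1;	/* priv is not a swap entry here */
	e.val = page->priv;
	return (int)model_swp_type(e);
}
```

Decoding the private field of a fresh page directly gives type 0 and
offset 0, which is indistinguishable from a real entry on the first
device. The in_swapcache check is what the patch is missing.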

> +		blkcg_schedule_throttle(bdev_get_queue(bdev), true);

At the time we allocate, we simply cannot know which swap device the
page will end up on. However, we know what's likely: swap_avail_heads
is sorted in the order in which we try to allocate swap slots, so the
first device on there is where swap io will go. If we walk this list
and throttle on the first device that has built-up delay debt, we'll
throttle against the device that probably gets the current bulk of the
swap writes.

Also, if we have two swap devices with the same priority, swap
allocation will re-order the list for us automatically in order to do
round-robin loading of the devices. See get_swap_pages(). That should
work out nicely for throttling as well.

You can use page_to_nid() on the newly allocated page to index into
swap_avail_heads[].
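
Roughly what I have in mind, as a userspace sketch rather than a patch.
Everything prefixed model_ is hypothetical: the per-node arrays stand in
for swap_avail_heads[], and delay_debt stands in for whatever the
throttling code uses to track outstanding delay on a device.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_NODES 2
#define MODEL_MAX_SWAP  4

struct model_swap_dev {
	int id;
	long delay_debt;	/* hypothetical: built-up throttle debt */
};

/* One list per NUMA node, already sorted in swap allocation order. */
struct model_avail_list {
	struct model_swap_dev *devs[MODEL_MAX_SWAP];
	size_t nr;
};

static struct model_avail_list model_avail_heads[MODEL_MAX_NODES];

/*
 * Walk the node's list in allocation order and return the first device
 * with delay debt, or NULL if there is nothing to throttle against.
 */
static struct model_swap_dev *model_pick_throttle_dev(int nid)
{
	const struct model_avail_list *list;
	size_t i;

	assert(nid >= 0 && nid < MODEL_MAX_NODES);
	list = &model_avail_heads[nid];

	for (i = 0; i < list->nr; i++) {
		if (list->devs[i]->delay_debt > 0)
			return list->devs[i];
	}
	return NULL;
}
```

The caller would pass the node of the newly allocated page as nid and
schedule the throttle against the returned device, if any. In the
kernel the walk would need to happen under the lock that protects the
avail lists.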

On an unrelated note, mem_cgroup_try_charge_delay() isn't the most
descriptive name. Since the throttling isn't really page-specific, we
might want to move it out of the charge function and do something
similar to a stand-alone balance_dirty_pages() function.

mem_cgroup_balance_anon_pages()?

mem_cgroup_throttle_swaprate()?

mem_cgroup_anon_throttle()?

mem_cgroup_anon_allocwait()?

Something like that. I personally like balance_anon_pages the best,
not because it is the best name by itself, but because in the MM
"balance" already has the notion of throttling the creation of IO
liabilities to the write rate, which is what we're doing here as well.
