Linux block layer
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: Yu Kuai <yukuai1@huaweicloud.com>
Cc: hch@infradead.org, josef@toxicpanda.com, axboe@kernel.dk,
	cgroups@vger.kernel.org, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, yi.zhang@huawei.com,
	"yukuai (C)" <yukuai3@huawei.com>
Subject: Re: [PATCH v2 1/2] blk-iocost: add refcounting for iocg
Date: Mon, 9 Jan 2023 08:23:34 -1000	[thread overview]
Message-ID: <Y7xbpidpq7+DqJan@slm.duckdns.org> (raw)
In-Reply-To: <875eb43e-202d-5b81-0bff-ef0434358d99@huaweicloud.com>

Hello,

On Mon, Jan 09, 2023 at 09:32:46AM +0800, Yu Kuai wrote:
> >   59b57717fff8 ("blkcg: delay blkg destruction until after writeback has finished")
> >   d866dbf61787 ("blkcg: rename blkcg->cgwb_refcnt to ->online_pin and always use it")
> 
> These two commits are applied for three years, I don't check the details
> yet but they seem can't guarantee that no io will be handled by
> rq_qos_throttle() after pd_offline_fn(), because I just reproduced this
> in another problem:
> 
> f02be9002c48 ("block, bfq: fix null pointer dereference in bfq_bio_bfqg()")
> 
> User thread can issue async io, and io can be throttled by
> blk-throttle(not writeback), then user thread can exit and cgroup can be
> removed before such io is dispatched to rq_qos_throttle.

I see.

> > After the above two commits, ->pd_offline_fn() is called only after all
> > possible writebacks are complete, so it shouldn't allow mass escapes to
> > root. With writebacks out of the picture, it might be that there can be no
> > further IOs once ->pd_offline_fn() is called too as there can be no tasks
> > left in it and no dirty pages, but best to confirm that.
> > 
> > So, yeah, the original approach you took should work although I'm not sure
> > the patches that you added to make offline blkg to bypass are necessary
> > (that also contributed to my assumption that there will be more IOs on those
> > blkg's). Have you seen more IOs coming down the pipeline after offline? If
> > so, can you dump some backtraces and see where they're coming from?
> 
> Currently I'm sure such IOs can come from blk-throttle, and I'm not sure
> yet but I also suspect io_uring can do this.

Yeah, that's unfortunate. There are several options here:

1. Do what you originally suggested - bypass to root after offline. I feel
   uneasy about this. Both iolatency and throtl clear their configs on
   offline but that's punting to the parent. For iocost it'd be bypassing
   all controls, which can actually be exploited.

2. Make all possible IO issuers use blkcg_[un]pin_online() and shift the
   iocost shutdown to pd_offline_fn(). This likely is the most canonical
   solution given the current situation but it's kinda nasty to add another
   layer of refcnting all over the place.

3. Order blkg free so that parents are never freed before children. You did
   this by adding refcnts in iocost but shouldn't it be possible to simply
   shift blkg_put(blkg->parent) in __blkg_release() to blkg_free_workfn()?

#3 seems the most logical to me. What do you thinK?

Thanks.

-- 
tejun

  reply	other threads:[~2023-01-09 18:26 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-27 12:55 [PATCH v2 0/2] blk-iocost: add refcounting for iocg and ioc Yu Kuai
2022-12-27 12:55 ` [PATCH v2 1/2] blk-iocost: add refcounting for iocg Yu Kuai
2023-01-04 21:44   ` Tejun Heo
2023-01-05  1:14     ` Yu Kuai
2023-01-05 18:32       ` Tejun Heo
2023-01-06  1:08         ` Yu Kuai
2023-01-06 20:18           ` Tejun Heo
2023-01-09  1:32             ` Yu Kuai
2023-01-09 18:23               ` Tejun Heo [this message]
2023-01-10  1:39                 ` Yu Kuai
2023-01-10 18:36                   ` Tejun Heo
2023-01-11  1:36                     ` Yu Kuai
2023-01-11 17:07                       ` Tejun Heo
2023-01-12  6:18                         ` Yu Kuai
2023-01-13  0:53                           ` Tejun Heo
2023-01-13  1:10                             ` Yu Kuai
2023-01-13  1:15                               ` Tejun Heo
2023-01-13  1:25                                 ` Yu Kuai
2023-01-13 17:16                                   ` Tejun Heo
2023-01-16  3:25                                     ` Yu Kuai
2022-12-27 12:55 ` [PATCH v2 2/2] blk-iocost: add refcounting for ioc Yu Kuai
2023-01-04 21:45   ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y7xbpidpq7+DqJan@slm.duckdns.org \
    --to=tj@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=cgroups@vger.kernel.org \
    --cc=hch@infradead.org \
    --cc=josef@toxicpanda.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=yi.zhang@huawei.com \
    --cc=yukuai1@huaweicloud.com \
    --cc=yukuai3@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox