From: Johannes Weiner <hannes@cmpxchg.org>
To: Dan Schatzberg <schatzberg.dan@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>, Tejun Heo <tj@kernel.org>,
Li Zefan <lizefan@huawei.com>, Michal Hocko <mhocko@kernel.org>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Hugh Dickins <hughd@google.com>, Roman Gushchin <guro@fb.com>,
Shakeel Butt <shakeelb@google.com>,
Chris Down <chris@chrisdown.name>,
Yang Shi <yang.shi@linux.alibaba.com>,
Thomas Gleixner <tglx@linutronix.de>,
"open list:BLOCK LAYER" <linux-block@vger.kernel.org>,
open list <linux-kernel@vger.kernel.org>,
"open list:CONTROL GROUP (CGROUP)" <cgroups@vger.kernel.org>,
"open list:CONTROL GROUP - MEMORY RESOURCE CONTROLLER (MEMCG)"
<linux-mm@kvack.org>
Subject: Re: [PATCH v3 1/3] loop: Use worker per cgroup instead of kworker
Date: Thu, 20 Feb 2020 17:00:09 -0500 [thread overview]
Message-ID: <20200220220009.GA68937@cmpxchg.org> (raw)
In-Reply-To: <118a1bd99d12f1980c7fc01ab732b40ffd8f0537.1582216294.git.schatzberg.dan@gmail.com>
On Thu, Feb 20, 2020 at 11:51:51AM -0500, Dan Schatzberg wrote:
> Existing uses of loop device may have multiple cgroups reading/writing
> to the same device. Simply charging resources for I/O to the backing
> file could result in priority inversion where one cgroup gets
> synchronously blocked, holding up all other I/O to the loop device.
>
> In order to avoid this priority inversion, we use a single workqueue
> where each work item is a "struct loop_worker" which contains a queue of
> struct loop_cmds to issue. The loop device maintains a tree mapping blk
> css_id -> loop_worker. This allows each cgroup to independently make
> forward progress issuing I/O to the backing file.
>
> There is also a single queue for I/O associated with the rootcg which
> can be used in cases of extreme memory shortage where we cannot allocate
> a loop_worker.
>
> The locking for the tree and queues is fairly heavy handed - we acquire
> the per-loop-device spinlock any time either is accessed. The existing
> implementation serializes all I/O through a single thread anyways, so I
> don't believe this is any worse.
>
> Signed-off-by: Dan Schatzberg <schatzberg.dan@gmail.com>
FWIW, this looks good to me, please feel free to include:
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
I have only some minor style nitpicks (along with the other email I
sent earlier on this patch), that would be nice to get fixed:
> +static void loop_queue_work(struct loop_device *lo, struct loop_cmd *cmd)
> +{
> + struct rb_node **node = &(lo->worker_tree.rb_node), *parent = NULL;
> + struct loop_worker *cur_worker, *worker = NULL;
> + struct work_struct *work;
> + struct list_head *cmd_list;
> +
> + spin_lock_irq(&lo->lo_lock);
> +
> + if (!cmd->css)
> + goto queue_work;
> +
> + node = &(lo->worker_tree.rb_node);
-> and . are > &, the parentheses aren't necessary.
> + while (*node) {
> + parent = *node;
> + cur_worker = container_of(*node, struct loop_worker, rb_node);
> + if ((long)cur_worker->css == (long)cmd->css) {
The casts aren't necessary, but they made me doubt myself and look up
the types. I wouldn't add them just to be symmetrical with the other
arm of the branch.
> + worker = cur_worker;
> + break;
> + } else if ((long)cur_worker->css < (long)cmd->css) {
> + node = &((*node)->rb_left);
> + } else {
> + node = &((*node)->rb_right);
The outer parentheses aren't necessary.
> + }
> + }
> + if (worker)
> + goto queue_work;
> +
> + worker = kzalloc(sizeof(struct loop_worker),
> + GFP_NOWAIT | __GFP_NOWARN);
This fits on an 80 character line.
next prev parent reply other threads:[~2020-02-20 22:00 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-20 16:51 [PATCH v3 0/3] Charge loop device i/o to issuing cgroup Dan Schatzberg
2020-02-20 16:51 ` [PATCH v3 1/3] loop: Use worker per cgroup instead of kworker Dan Schatzberg
2020-02-20 17:50 ` Johannes Weiner
2020-02-20 22:00 ` Johannes Weiner [this message]
2020-02-20 16:51 ` [PATCH v3 2/3] mm: Charge active memcg when no mm is set Dan Schatzberg
2020-02-20 18:14 ` Shakeel Butt
2020-02-20 21:03 ` Johannes Weiner
2020-02-20 21:14 ` Shakeel Butt
2020-02-20 18:35 ` Chris Down
2020-02-20 21:15 ` Shakeel Butt
2020-02-23 19:08 ` Hugh Dickins
2020-02-24 1:11 ` Hugh Dickins
2020-02-24 21:37 ` Dan Schatzberg
2020-02-20 16:51 ` [PATCH v3 3/3] loop: Charge i/o to mem and blk cg Dan Schatzberg
2020-02-20 22:02 ` Johannes Weiner
-- strict thread matches above, loose matches on Subject: below --
2020-02-24 22:17 [PATCH v3 0/3] Charge loop device i/o to issuing cgroup Dan Schatzberg
2020-02-24 22:17 ` [PATCH v3 1/3] loop: Use worker per cgroup instead of kworker Dan Schatzberg
2020-02-26 17:02 ` Qian Cai
2020-02-27 18:14 ` Johannes Weiner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200220220009.GA68937@cmpxchg.org \
--to=hannes@cmpxchg.org \
--cc=akpm@linux-foundation.org \
--cc=axboe@kernel.dk \
--cc=cgroups@vger.kernel.org \
--cc=chris@chrisdown.name \
--cc=guro@fb.com \
--cc=hughd@google.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lizefan@huawei.com \
--cc=mhocko@kernel.org \
--cc=schatzberg.dan@gmail.com \
--cc=shakeelb@google.com \
--cc=tglx@linutronix.de \
--cc=tj@kernel.org \
--cc=vdavydov.dev@gmail.com \
--cc=yang.shi@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).