Date: Thu, 20 Feb 2020 17:00:09 -0500
From: Johannes Weiner
To: Dan Schatzberg
Cc: Jens Axboe, Tejun Heo, Li Zefan, Michal Hocko, Vladimir Davydov,
    Andrew Morton, Hugh Dickins, Roman Gushchin, Shakeel Butt, Chris Down,
    Yang Shi, Thomas Gleixner, "open list:BLOCK LAYER", open list,
    "open list:CONTROL GROUP (CGROUP)",
    "open list:CONTROL GROUP - MEMORY RESOURCE CONTROLLER (MEMCG)"
Subject: Re: [PATCH v3 1/3] loop: Use worker per cgroup instead of kworker
Message-ID: <20200220220009.GA68937@cmpxchg.org>
In-Reply-To: <118a1bd99d12f1980c7fc01ab732b40ffd8f0537.1582216294.git.schatzberg.dan@gmail.com>

On Thu, Feb 20, 2020 at 11:51:51AM -0500, Dan Schatzberg wrote:
> Existing uses of loop device may have multiple cgroups reading/writing
> to the same device. Simply charging resources for I/O to the backing
> file could result in priority inversion where one cgroup gets
> synchronously blocked, holding up all other I/O to the loop device.
>
> In order to avoid this priority inversion, we use a single workqueue
> where each work item is a "struct loop_worker" which contains a queue of
> struct loop_cmds to issue. The loop device maintains a tree mapping blk
> css_id -> loop_worker. This allows each cgroup to independently make
> forward progress issuing I/O to the backing file.
>
> There is also a single queue for I/O associated with the rootcg which
> can be used in cases of extreme memory shortage where we cannot allocate
> a loop_worker.
>
> The locking for the tree and queues is fairly heavy handed - we acquire
> the per-loop-device spinlock any time either is accessed. The existing
> implementation serializes all I/O through a single thread anyways, so I
> don't believe this is any worse.
>
> Signed-off-by: Dan Schatzberg

FWIW, this looks good to me, please feel free to include:

Acked-by: Johannes Weiner

I have only a few minor style nitpicks (in addition to the other email
I sent earlier on this patch) that would be nice to get fixed:

> +static void loop_queue_work(struct loop_device *lo, struct loop_cmd *cmd)
> +{
> +	struct rb_node **node = &(lo->worker_tree.rb_node), *parent = NULL;
> +	struct loop_worker *cur_worker, *worker = NULL;
> +	struct work_struct *work;
> +	struct list_head *cmd_list;
> +
> +	spin_lock_irq(&lo->lo_lock);
> +
> +	if (!cmd->css)
> +		goto queue_work;
> +
> +	node = &(lo->worker_tree.rb_node);

-> and . have higher precedence than unary &, so the parentheses
aren't necessary.

> +	while (*node) {
> +		parent = *node;
> +		cur_worker = container_of(*node, struct loop_worker, rb_node);
> +		if ((long)cur_worker->css == (long)cmd->css) {

The casts aren't necessary, but they made me doubt myself and look up
the types. I wouldn't add them just to be symmetrical with the other
arm of the branch.

> +			worker = cur_worker;
> +			break;
> +		} else if ((long)cur_worker->css < (long)cmd->css) {
> +			node = &((*node)->rb_left);
> +		} else {
> +			node = &((*node)->rb_right);

The outer parentheses aren't necessary.

> +		}
> +	}
> +	if (worker)
> +		goto queue_work;
> +
> +	worker = kzalloc(sizeof(struct loop_worker),
> +			 GFP_NOWAIT | __GFP_NOWARN);

This fits on a single 80-character line.
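
To illustrate the nitpicks above, here is a rough userspace sketch of the same lookup-or-allocate walk with the redundant parentheses and casts dropped. It is not the patch's code: struct css, struct node, struct worker, struct device, node_to_worker(), css_pool and queue_cmd() are all made-up stand-ins, a plain unbalanced binary tree replaces the rbtree, calloc() replaces kzalloc(), and there is no locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct css { int unused; };

struct node {
	struct node *left, *right;
};

struct worker {
	struct node node;	/* embedded tree linkage, like rb_node */
	const struct css *css;	/* lookup key */
	int queued;		/* commands queued on this worker */
};

struct device {
	struct node *root;	/* plays the role of worker_tree.rb_node */
	int root_queued;	/* fallback queue for the root cgroup */
};

/* Same idea as the kernel's container_of(), spelled out for userspace. */
#define node_to_worker(ptr) \
	((struct worker *)((char *)(ptr) - offsetof(struct worker, node)))

/*
 * Keys are taken from one array so that '<' on them is well defined in
 * ISO C; the kernel relies on its flat address space instead.
 */
static struct css css_pool[8];

/* Queue one command for css; returns the worker, or NULL for the fallback. */
static struct worker *queue_cmd(struct device *dev, const struct css *css)
{
	struct node **link;
	struct worker *cur, *worker = NULL;

	assert(dev);
	link = &dev->root;	/* -> binds tighter than &, no parentheses */

	if (!css)
		goto fallback;

	while (*link) {
		cur = node_to_worker(*link);
		if (cur->css == css) {		/* no casts needed */
			worker = cur;
			break;
		} else if (cur->css < css) {
			link = &(*link)->left;	/* only the inner pair is required */
		} else {
			link = &(*link)->right;
		}
	}

	if (!worker) {
		worker = calloc(1, sizeof(*worker));
		if (!worker)
			goto fallback;	/* allocation failed: use the root queue */
		worker->css = css;
		*link = &worker->node;
	}
	worker->queued++;
	return worker;

fallback:
	dev->root_queued++;
	return NULL;
}
```

Calling queue_cmd() twice with the same key returns the same worker, a different key gets its own worker, and a NULL key (or a failed allocation) lands on the shared root queue, which mirrors the behavior the commit message describes. The sketch never frees its workers.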