From mboxrd@z Thu Jan 1 00:00:00 1970
From: Tejun Heo
Subject: Re: [PATCH v2] make dm and dm-crypt forward cgroup context (was: dm-crypt parallelization patches)
Date: Thu, 11 Apr 2013 17:22:52 -0700
Message-ID: <20130412002252.GD11956@mtj.dyndns.org>
References: <20130409210735.GR6320@redhat.com> <20130410192427.GA14911@redhat.com> <20130410235009.GI17641@mtj.dyndns.org> <20130411195203.GA11956@mtj.dyndns.org> <20130411200005.GB11956@mtj.dyndns.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Content-Disposition: inline
In-Reply-To:
Sender: linux-kernel-owner@vger.kernel.org
To: Mikulas Patocka
Cc: Vivek Goyal, Jens Axboe, Mike Snitzer, Milan Broz, dm-devel@redhat.com, Andi Kleen, dm-crypt@saout.de, linux-kernel@vger.kernel.org, Christoph Hellwig, Christian Schmidt, "Alasdair G. Kergon"
List-Id: dm-devel.ids

On Thu, Apr 11, 2013 at 08:06:10PM -0400, Mikulas Patocka wrote:
> All that I can tell you is that adding an empty atomic operation
> "cmpxchg(&bio->bi_css->refcnt, bio->bi_css->refcnt, bio->bi_css->refcnt);"
> to bio_clone_context and bio_disassociate_task increases the time to run a
> benchmark from 23 to 40 seconds.

Right, linear target on ramdisk, very realistic, and you know what,
hell with dm, let's just hand code everything into submit_bio().  I'm
sure it will speed up your test case significantly.

If this actually matters, improve it in a *sane* way.  Make the refcnts
per-cpu and not use atomic ops.  In fact, we already have a proposed
implementation of percpu refcnt which is being used by aio restructure
patches and likely to be included in some form.  It's not quite ready
yet, so please work on something useful like that instead of
continuing this nonsense.

-- 
tejun
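
For illustration only, here is a minimal user-space sketch of the general idea behind "make the refcnts per-cpu and not use atomic ops". It is not the proposed kernel percpu refcnt implementation mentioned above; the names (sharded_ref, NR_SLOTS, the cpu argument) are hypothetical, and it assumes each slot is only ever touched by one thread at a time, which the kernel would get by pinning the update to the current CPU.

```c
#include <stddef.h>

#define NR_SLOTS 4      /* hypothetical CPU count, fixed for the sketch */
#define CACHE_LINE 64   /* assumed cache line size */

/* One plain counter per CPU, padded so slots do not share a cache line. */
struct ref_slot {
    long count;
    char pad[CACHE_LINE - sizeof(long)];
};

struct sharded_ref {
    struct ref_slot slot[NR_SLOTS];
};

/* Start with a single reference, accounted to slot 0. */
static void sharded_ref_init(struct sharded_ref *ref)
{
    for (size_t i = 0; i < NR_SLOTS; i++)
        ref->slot[i].count = 0;
    ref->slot[0].count = 1;
}

/*
 * Fast path: a plain increment on the caller's own slot.
 * No atomic instruction and no cache line shared between CPUs.
 */
static void sharded_ref_get(struct sharded_ref *ref, unsigned int cpu)
{
    ref->slot[cpu % NR_SLOTS].count++;
}

/*
 * Fast path: a plain decrement. The put may happen on a different CPU
 * than the get, so an individual slot is allowed to go negative.
 */
static void sharded_ref_put(struct sharded_ref *ref, unsigned int cpu)
{
    ref->slot[cpu % NR_SLOTS].count--;
}

/*
 * Slow path: only the sum over all slots is meaningful. Reading it
 * safely requires that no get/put is running concurrently.
 */
static long sharded_ref_sum(const struct sharded_ref *ref)
{
    long sum = 0;

    for (size_t i = 0; i < NR_SLOTS; i++)
        sum += ref->slot[i].count;
    return sum;
}
```

The trade-off is that get and put become cheap while "has the count reached zero?" becomes expensive, because it needs a consistent sum across every slot. A real implementation has to solve that part; this sketch deliberately does not.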