From: Jens Axboe <axboe@kernel.dk>
To: Shaohua Li <shli@kernel.org>
Cc: "Matias Bjørling" <m@bjorling.me>,
sbradshaw@micron.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] block: per-cpu counters for in-flight IO accounting
Date: Wed, 04 Jun 2014 08:29:07 -0600
Message-ID: <538F2D33.2070106@kernel.dk>
In-Reply-To: <20140604103901.GA14383@kernel.org>
On 2014-06-04 04:39, Shaohua Li wrote:
> On Fri, May 30, 2014 at 07:49:52AM -0600, Jens Axboe wrote:
>> On 2014-05-30 06:11, Shaohua Li wrote:
>>> On Fri, May 09, 2014 at 10:41:27AM -0600, Jens Axboe wrote:
>>>> On 05/09/2014 08:12 AM, Jens Axboe wrote:
>>>>> On 05/09/2014 03:17 AM, Matias Bjørling wrote:
>>>>>> With multi-million IOPS and multi-node workloads, the atomic_t in_flight
>>>>>> tracking becomes a bottleneck. Change the in-flight accounting to per-cpu
>>>>>> counters to alleviate this.
>>>>>
>>>>> The part stats are a pain in the butt, I've tried to come up with a
>>>>> great fix for them too. But I don't think the percpu conversion is
>>>>> necessarily the right one. The summing is part of the hotpath, so percpu
>>>>> counters aren't necessarily the right way to go. I don't have a better
>>>>> answer right now, otherwise it would have been fixed :-)
>>>>
>>>> Actual data point - this slows my test down ~14% compared to the stock
>>>> kernel. Also, if you experiment with this, you need to watch for the
>>>> out-of-core users of the part stats (like DM).
>>>
>>> I had a try with Matias's patch. Performance actually improved significantly.
>>> (there are other cache line issues though, e.g. hd_struct_get). Jens, what did
>>> you run? part_in_flight() has 3 usages. 2 are for status output, which are cold
>>> paths. part_round_stats_single() uses it too, but it's a cold path too as we
>>> sample data every jiffy. Are you using HZ=1000? Maybe we should sample the data
>>> every 10ms instead of every jiffy?
>>
>> I ran peak and normal benchmarks on a p320, on a 4 socket box (64
>> cores). The problem is the one hot path of part_in_flight(), summing
>> percpu for that is too expensive. On bigger systems than mine, it'd
>> be even worse.
>
> I ran a null_blk test with 4 sockets, and Matias's patch shows an improvement.
> And I didn't find part_in_flight() called in any hot path.
It's done for every IO completion, which is (by definition) a hot path. I
tested on two devices here, and it was definitely slower. And my kernel
had NR_CPUS set to just the right number for the machine; I suspect it'd
be much worse on bigger systems.
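
To make the trade-off under discussion concrete, here is a minimal user-space
sketch. It is not the kernel's percpu API and not Matias's patch; the names
inflight_counter, inflight_inc, inflight_dec, inflight_sum, NR_CPUS_SKETCH and
CACHE_LINE are all hypothetical, and the 64-byte cache line is an assumption.
It only illustrates why updates get cheaper with per-cpu slots while any
reader has to walk every slot:

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 64  /* hypothetical; mirrors the 64-core box above */
#define CACHE_LINE     64  /* assumed cache line size in bytes */

/* One slot per CPU, padded so two CPUs never share a cache line. */
struct pcpu_slot {
	long count;
	char pad[CACHE_LINE - sizeof(long)];
};

struct inflight_counter {
	struct pcpu_slot slot[NR_CPUS_SKETCH];
};

/* Submission side: touches only the submitting CPU's slot, O(1). */
static void inflight_inc(struct inflight_counter *c, unsigned int cpu)
{
	assert(cpu < NR_CPUS_SKETCH);
	c->slot[cpu].count++;
}

/*
 * Completion side: may run on a different CPU than the submission, so a
 * single slot can go negative. Only the sum over all slots is meaningful.
 */
static void inflight_dec(struct inflight_counter *c, unsigned int cpu)
{
	assert(cpu < NR_CPUS_SKETCH);
	c->slot[cpu].count--;
}

/* Read side: O(NR_CPUS_SKETCH), one cache line touched per CPU slot. */
static long inflight_sum(const struct inflight_counter *c)
{
	long sum = 0;

	for (size_t i = 0; i < NR_CPUS_SKETCH; i++)
		sum += c->slot[i].count;
	return sum;
}
```

In this sketch inflight_inc() and inflight_dec() stay cheap no matter how many
CPUs there are, but inflight_sum() grows linearly with the slot count. If the
sum is needed on every IO completion, as stated above for part_in_flight(),
that loop is paid on the hot path, and the cost rises with the CPU count.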
--
Jens Axboe
Thread overview: 14+ messages
2014-05-09 9:17 [PATCH] block: small performance optimization Matias Bjørling
2014-05-09 9:17 ` [PATCH] block: per-cpu counters for in-flight IO accounting Matias Bjørling
2014-05-09 14:12 ` Jens Axboe
2014-05-09 16:41 ` Jens Axboe
2014-05-30 12:11 ` Shaohua Li
2014-05-30 13:49 ` Jens Axboe
2014-06-04 10:39 ` Shaohua Li
2014-06-04 11:29 ` Matias Bjørling
2014-06-04 20:08 ` Jens Axboe
2014-06-05 2:09 ` Shaohua Li
2014-06-05 2:16 ` Jens Axboe
2014-06-05 2:33 ` Shaohua Li
2014-06-05 2:42 ` Jens Axboe
2014-06-04 14:29 ` Jens Axboe [this message]