From: Kashyap Desai <kashyap.desai@broadcom.com>
To: Ming Lei <ming.lei@redhat.com>, Hannes Reinecke <hare@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org,
Christoph Hellwig <hch@infradead.org>,
Mike Snitzer <snitzer@redhat.com>,
linux-scsi@vger.kernel.org, Arun Easi <arun.easi@cavium.com>,
Omar Sandoval <osandov@fb.com>,
"Martin K . Petersen" <martin.petersen@oracle.com>,
James Bottomley <james.bottomley@hansenpartnership.com>,
Christoph Hellwig <hch@lst.de>,
Don Brace <don.brace@microsemi.com>,
Peter Rivera <peter.rivera@broadcom.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Laurence Oberman <loberman@redhat.com>
Subject: RE: [PATCH 0/5] blk-mq/scsi-mq: support global tags & introduce force_blk_mq
Date: Fri, 9 Feb 2018 10:28:23 +0530
Message-ID: <14c596572dcf99c4343c7eba65a7c427@mail.gmail.com>
In-Reply-To: <20180208165313.GA31725@ming.t460p>
> -----Original Message-----
> From: Ming Lei [mailto:ming.lei@redhat.com]
> Sent: Thursday, February 8, 2018 10:23 PM
> To: Hannes Reinecke
> Cc: Kashyap Desai; Jens Axboe; linux-block@vger.kernel.org; Christoph
> Hellwig; Mike Snitzer; linux-scsi@vger.kernel.org; Arun Easi; Omar Sandoval;
> Martin K . Petersen; James Bottomley; Christoph Hellwig; Don Brace; Peter
> Rivera; Paolo Bonzini; Laurence Oberman
> Subject: Re: [PATCH 0/5] blk-mq/scsi-mq: support global tags & introduce
> force_blk_mq
>
> On Thu, Feb 08, 2018 at 08:00:29AM +0100, Hannes Reinecke wrote:
> > On 02/07/2018 03:14 PM, Kashyap Desai wrote:
> > >> -----Original Message-----
> > >> From: Ming Lei [mailto:ming.lei@redhat.com]
> > >> Sent: Wednesday, February 7, 2018 5:53 PM
> > >> To: Hannes Reinecke
> > >> Cc: Kashyap Desai; Jens Axboe; linux-block@vger.kernel.org;
> > >> Christoph Hellwig; Mike Snitzer; linux-scsi@vger.kernel.org; Arun
> > >> Easi; Omar Sandoval;
> > >> Martin K . Petersen; James Bottomley; Christoph Hellwig; Don Brace;
> > >> Peter Rivera; Paolo Bonzini; Laurence Oberman
> > >> Subject: Re: [PATCH 0/5] blk-mq/scsi-mq: support global tags &
> > >> introduce force_blk_mq
> > >>
> > >> On Wed, Feb 07, 2018 at 07:50:21AM +0100, Hannes Reinecke wrote:
> > >>> Hi all,
> > >>>
> > >>> [ .. ]
> > >>>>>
> > >>>>> Could you share with us your patch for enabling global_tags/MQ on
> > >>>>> megaraid_sas so that I can reproduce your test?
> > >>>>>
> > >>>>>> See below perf top data. "bt_iter" is consuming 4 times more CPU.
> > >>>>>
> > >>>>> Could you share with us what the IOPS/CPU utilization effect is
> > >>>>> after applying the patch V2? And your test script?
> > >>>> Regarding CPU utilization, I need to test one more time.
> > >>>> Currently the system is in use.
> > >>>>
> > >>>> I ran the fio test below on a total of 24 expander-attached SSDs.
> > >>>>
> > >>>> numactl -N 1 fio jbod.fio --rw=randread --iodepth=64 --bs=4k
> > >>>> --ioengine=libaio --rw=randread
> > >>>>
> > >>>> Performance dropped from 1.6M IOPS to 770K IOPS.
> > >>>>
> > >>> This is basically what we've seen with earlier iterations.
> > >>
> > >> Hi Hannes,
> > >>
> > >> As I mentioned in another mail[1], Kashyap's patch has a big issue,
> > >> which causes only reply queue 0 to be used.
> > >>
> > >> [1] https://marc.info/?l=linux-scsi&m=151793204014631&w=2
> > >>
> > >> So could you guys run your performance test again after fixing the
> > >> patch?
> > >
> > > Ming -
> > >
> > > I tried again after making the change you requested. The performance
> > > drop is still unresolved: from 1.6M IOPS to 770K IOPS.
> > >
> > > See the data below. All 24 reply queues are in use, as expected.
> > >
> > > IRQs / 1 second(s)
> > > IRQ# TOTAL NODE0 NODE1 NAME
> > > 360 16422 0 16422 IR-PCI-MSI 70254653-edge megasas
> > > 364 15980 0 15980 IR-PCI-MSI 70254657-edge megasas
> > > 362 15979 0 15979 IR-PCI-MSI 70254655-edge megasas
> > > 345 15696 0 15696 IR-PCI-MSI 70254638-edge megasas
> > > 341 15659 0 15659 IR-PCI-MSI 70254634-edge megasas
> > > 369 15656 0 15656 IR-PCI-MSI 70254662-edge megasas
> > > 359 15650 0 15650 IR-PCI-MSI 70254652-edge megasas
> > > 358 15596 0 15596 IR-PCI-MSI 70254651-edge megasas
> > > 350 15574 0 15574 IR-PCI-MSI 70254643-edge megasas
> > > 342 15532 0 15532 IR-PCI-MSI 70254635-edge megasas
> > > 344 15527 0 15527 IR-PCI-MSI 70254637-edge megasas
> > > 346 15485 0 15485 IR-PCI-MSI 70254639-edge megasas
> > > 361 15482 0 15482 IR-PCI-MSI 70254654-edge megasas
> > > 348 15467 0 15467 IR-PCI-MSI 70254641-edge megasas
> > > 368 15463 0 15463 IR-PCI-MSI 70254661-edge megasas
> > > 354 15420 0 15420 IR-PCI-MSI 70254647-edge megasas
> > > 351 15378 0 15378 IR-PCI-MSI 70254644-edge megasas
> > > 352 15377 0 15377 IR-PCI-MSI 70254645-edge megasas
> > > 356 15348 0 15348 IR-PCI-MSI 70254649-edge megasas
> > > 337 15344 0 15344 IR-PCI-MSI 70254630-edge megasas
> > > 343 15320 0 15320 IR-PCI-MSI 70254636-edge megasas
> > > 355 15266 0 15266 IR-PCI-MSI 70254648-edge megasas
> > > 335 15247 0 15247 IR-PCI-MSI 70254628-edge megasas
> > > 363 15233 0 15233 IR-PCI-MSI 70254656-edge megasas
> > >
> > >
> > > Average:  CPU    %usr   %nice    %sys %iowait  %steal    %irq   %soft  %guest  %gnice   %idle
> > > Average:   18    3.80    0.00   14.78   10.08    0.00    0.00    4.01    0.00    0.00   67.33
> > > Average:   19    3.26    0.00   15.35   10.62    0.00    0.00    4.03    0.00    0.00   66.74
> > > Average:   20    3.42    0.00   14.57   10.67    0.00    0.00    3.84    0.00    0.00   67.50
> > > Average:   21    3.19    0.00   15.60   10.75    0.00    0.00    4.16    0.00    0.00   66.30
> > > Average:   22    3.58    0.00   15.15   10.66    0.00    0.00    3.51    0.00    0.00   67.11
> > > Average:   23    3.34    0.00   15.36   10.63    0.00    0.00    4.17    0.00    0.00   66.50
> > > Average:   24    3.50    0.00   14.58   10.93    0.00    0.00    3.85    0.00    0.00   67.13
> > > Average:   25    3.20    0.00   14.68   10.86    0.00    0.00    4.31    0.00    0.00   66.95
> > > Average:   26    3.27    0.00   14.80   10.70    0.00    0.00    3.68    0.00    0.00   67.55
> > > Average:   27    3.58    0.00   15.36   10.80    0.00    0.00    3.79    0.00    0.00   66.48
> > > Average:   28    3.46    0.00   15.17   10.46    0.00    0.00    3.32    0.00    0.00   67.59
> > > Average:   29    3.34    0.00   14.42   10.72    0.00    0.00    3.34    0.00    0.00   68.18
> > > Average:   30    3.34    0.00   15.08   10.70    0.00    0.00    3.89    0.00    0.00   66.99
> > > Average:   31    3.26    0.00   15.33   10.47    0.00    0.00    3.33    0.00    0.00   67.61
> > > Average:   32    3.21    0.00   14.80   10.61    0.00    0.00    3.70    0.00    0.00   67.67
> > > Average:   33    3.40    0.00   13.88   10.55    0.00    0.00    4.02    0.00    0.00   68.15
> > > Average:   34    3.74    0.00   17.41   10.61    0.00    0.00    4.51    0.00    0.00   63.73
> > > Average:   35    3.35    0.00   14.37   10.74    0.00    0.00    3.84    0.00    0.00   67.71
> > > Average:   36    0.54    0.00    1.77    0.00    0.00    0.00    0.00    0.00    0.00   97.69
> > > ..
> > > Average:   54    3.60    0.00   15.17   10.39    0.00    0.00    4.22    0.00    0.00   66.62
> > > Average:   55    3.33    0.00   14.85   10.55    0.00    0.00    3.96    0.00    0.00   67.31
> > > Average:   56    3.40    0.00   15.19   10.54    0.00    0.00    3.74    0.00    0.00   67.13
> > > Average:   57    3.41    0.00   13.98   10.78    0.00    0.00    4.10    0.00    0.00   67.73
> > > Average:   58    3.32    0.00   15.16   10.52    0.00    0.00    4.01    0.00    0.00   66.99
> > > Average:   59    3.17    0.00   15.80   10.35    0.00    0.00    3.86    0.00    0.00   66.80
> > > Average:   60    3.00    0.00   14.63   10.59    0.00    0.00    3.97    0.00    0.00   67.80
> > > Average:   61    3.34    0.00   14.70   10.66    0.00    0.00    4.32    0.00    0.00   66.97
> > > Average:   62    3.34    0.00   15.29   10.56    0.00    0.00    3.89    0.00    0.00   66.92
> > > Average:   63    3.29    0.00   14.51   10.72    0.00    0.00    3.85    0.00    0.00   67.62
> > > Average:   64    3.48    0.00   15.31   10.65    0.00    0.00    3.97    0.00    0.00   66.60
> > > Average:   65    3.34    0.00   14.36   10.80    0.00    0.00    4.11    0.00    0.00   67.39
> > > Average:   66    3.13    0.00   14.94   10.70    0.00    0.00    4.10    0.00    0.00   67.13
> > > Average:   67    3.06    0.00   15.56   10.69    0.00    0.00    3.82    0.00    0.00   66.88
> > > Average:   68    3.33    0.00   14.98   10.61    0.00    0.00    3.81    0.00    0.00   67.27
> > > Average:   69    3.20    0.00   15.43   10.70    0.00    0.00    3.82    0.00    0.00   66.85
> > > Average:   70    3.34    0.00   17.14   10.59    0.00    0.00    3.00    0.00    0.00   65.92
> > > Average:   71    3.41    0.00   14.94   10.56    0.00    0.00    3.41    0.00    0.00   67.69
> > >
> > > Perf top -
> > >
> > > 64.33% [kernel] [k] bt_iter
> > > 4.86% [kernel] [k] blk_mq_queue_tag_busy_iter
> > > 4.23% [kernel] [k] _find_next_bit
> > > 2.40% [kernel] [k] native_queued_spin_lock_slowpath
> > > 1.09% [kernel] [k] sbitmap_any_bit_set
> > > 0.71% [kernel] [k] sbitmap_queue_clear
> > > 0.63% [kernel] [k] find_next_bit
> > > 0.54% [kernel] [k] _raw_spin_lock_irqsave
> > >
> > Ah. So we're spending quite some time in trying to find a free tag.
> > I guess this is due to every queue starting at the same position
> > trying to find a free tag, which inevitably leads to a contention.
>
> IMO, the above trace means that blk_mq_in_flight() may be the bottleneck,
> and it does not look related to tag allocation.
>
> Kashyap, could you run your performance test again after disabling iostats
> with the following command on all test devices, and after killing all
> utilities which may read I/O statistics (/proc/diskstats, ...)?
>
> echo 0 > /sys/block/sdN/queue/iostats

Ming - After setting iostats to 0, I see that the performance issue is resolved.

Below is the perf top output with iostats = 0:
23.45% [kernel] [k] bt_iter
2.27% [kernel] [k] blk_mq_queue_tag_busy_iter
2.18% [kernel] [k] _find_next_bit
2.06% [megaraid_sas] [k] complete_cmd_fusion
1.87% [kernel] [k] clflush_cache_range
1.70% [kernel] [k] dma_pte_clear_level
1.56% [kernel] [k] __domain_mapping
1.55% [kernel] [k] sbitmap_queue_clear
1.30% [kernel] [k] gup_pgd_range
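
For anyone who wants to compare the two perf top captures in this mail mechanically, here is a minimal Python sketch. It is only an illustration and not part of any patch: the helper names parse_perf_top and share_change are hypothetical, and the sample lines are copied from the two outputs in this mail. These are shares of sampled cycles, so a lower bt_iter share by itself does not quantify the IOPS change.

```python
import re

# First three lines of each perf top capture, copied from this mail.
WITH_IOSTATS = """\
64.33% [kernel] [k] bt_iter
4.86% [kernel] [k] blk_mq_queue_tag_busy_iter
4.23% [kernel] [k] _find_next_bit
"""
WITHOUT_IOSTATS = """\
23.45% [kernel] [k] bt_iter
2.27% [kernel] [k] blk_mq_queue_tag_busy_iter
2.18% [kernel] [k] _find_next_bit
"""

# Matches "<pct>% [<object>] [<type>] <symbol>".
LINE_RE = re.compile(r"^\s*([\d.]+)%\s+\[(\S+)\]\s+\[\S\]\s+(\S+)\s*$")


def parse_perf_top(text):
    """Map symbol name -> percentage of samples for each perf top line."""
    shares = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            shares[match.group(3)] = float(match.group(1))
    return shares


def share_change(before, after):
    """Return {symbol: (before %, after %)} for symbols present in both."""
    return {sym: (before[sym], after[sym]) for sym in before if sym in after}


if __name__ == "__main__":
    change = share_change(parse_perf_top(WITH_IOSTATS),
                          parse_perf_top(WITHOUT_IOSTATS))
    for sym, (b, a) in change.items():
        print(f"{sym}: {b:.2f}% -> {a:.2f}%")
```
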
>
> Thanks,
> Ming
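
In case it helps anyone repeating the test on many disks: below is a minimal Python sketch of how iostats could be switched off on every test device in one go. It assumes the standard sysfs attribute /sys/block/<dev>/queue/iostats. The function name set_iostats, the sysfs_root parameter and the "sd*" pattern are hypothetical choices for illustration, not something taken from the patches.

```python
import glob
import os


def set_iostats(enabled, pattern="sd*", sysfs_root="/sys/block"):
    """Write 1 or 0 to queue/iostats for every block device matching pattern.

    Returns the list of attribute files that were written. sysfs_root is a
    parameter only so the function can be tried against a scratch directory.
    """
    value = "1" if enabled else "0"
    written = []
    attr_glob = os.path.join(sysfs_root, pattern, "queue", "iostats")
    for path in sorted(glob.glob(attr_glob)):
        with open(path, "w") as attr:
            attr.write(value)
        written.append(path)
    return written
```

On a real system this needs root: call set_iostats(False) before the fio run and set_iostats(True) afterwards. Tools that poll /proc/diskstats still have to be stopped separately.
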