From: Vivek Goyal <vgoyal@redhat.com>
To: Tao Ma <tm@tao.ma>
Cc: linux-kernel@vger.kernel.org, Jens Axboe <axboe@kernel.dk>
Subject: Re: CFQ: async queue blocks the whole system
Date: Tue, 14 Jun 2011 17:14:11 -0400
Message-ID: <20110614211411.GI2525@redhat.com>
In-Reply-To: <4DF78157.6020907@tao.ma>
On Tue, Jun 14, 2011 at 11:42:15PM +0800, Tao Ma wrote:
> On 06/14/2011 09:30 PM, Vivek Goyal wrote:
> > On Tue, Jun 14, 2011 at 03:03:24PM +0800, Tao Ma wrote:
> >> Hi Vivek,
> >> On 06/14/2011 05:41 AM, Vivek Goyal wrote:
> >>> On Mon, Jun 13, 2011 at 06:08:40PM +0800, Tao Ma wrote:
> >>>
> >>> [..]
> >>>>> You can also run iostat on disk and should be able to see that with
> >>>>> the patch you are dispatching writes more often than before.
> >>>> Sorry, the patch doesn't work.
> >>>>
> >>>> I used trace event to capture all the blktraces since it doesn't
> >>>> interfere with the tests, hope it helps.
> >>>
> >>> Actually I was looking for CFQ traces. This seems to be generic block
> >>> layer trace points. May be you can use "blktrace -d /dev/<device>"
> >>> and then blkparse. It also gives the aggregate view which is helpful.
> >>>
> >>>>
> >>>> Please downloaded it from http://blog.coly.li/tmp/blktrace.tar.bz2
> >>>
> >>> What concerns me is following.
> >>>
> >>> 5255.521353: block_rq_issue: 8,0 W 0 () 571137153 + 8 [attr_set]
> >>> 5578.863871: block_rq_issue: 8,0 W 0 () 512950473 + 48 [kworker/0:1]
> >>>
> >>> IIUC, we dispatched second write more than 300 seconds after dispatching
> >>> 1 write. What happened in between. We should have dispatched more writes.
> >>>
> >>> CFQ traces might give better idea in terms of whether wl_type for async
> >>> queues was scheduled or not at all.
> >> I tried several times today, but it looks like that if I enable
> >> blktrace, the hung_task will not show up in the message. So do you think
> >> the blktrace at that time is still useful? If yes, I can capture 1
> >> minute for you. Thanks.
> >
> > Capturing 1 min output will also be good.
> OK, I captured 2 mins blkparse log before the hung. You can downloaded
> it from http://blog.coly.li/tmp/blkparse.tar.bz2
Thanks. I looked at this log, and it looks like we are no longer starving
WRITES.

I ran grep on the log:

  grep -e "wl_type:0" -e "cfq3068A / sl_used" blkparse.log | async-dispatch

I see that async WRITES are now being dispatched at regular intervals,
and we are not seeing long delays (at least in this log).

What I expect from the patch is that it avoids starvation of async
queues in the presence of lots of writers. It is a separate matter that
one might still not be able to push enough WRITES within the 120-second
window, and so can still get the hung task timeout message.

A sample of the output follows.

5.135877740 0 m N cfq3068A / set_active wl_prio:0 wl_type:0
5.231137776 0 m N cfq3068A / sl_used=95 disp=1 charge=95 iops=0 sect=16
13.311745653 0 m N cfq3068A / set_active wl_prio:0 wl_type:0
13.373046196 0 m N cfq3068A / sl_used=1 disp=16 charge=1 iops=0 sect=136
18.097413421 0 m N cfq3068A / set_active wl_prio:0 wl_type:0
18.097466598 0 m N cfq3068A / sl_used=1 disp=3 charge=1 iops=0 sect=32
18.119371182 0 m N cfq3068A / set_active wl_prio:0 wl_type:0
18.159420999 0 m N cfq3068A / sl_used=40 disp=1592 charge=40 iops=0 sect=14360
18.159424767 0 m N cfq3068A / set_active wl_prio:0 wl_type:0
18.199409182 0 m N cfq3068A / sl_used=40 disp=1646 charge=40 iops=0 sect=13584
18.199414996 0 m N cfq3068A / set_active wl_prio:0 wl_type:0
18.239374395 0 m N cfq3068A / sl_used=40 disp=1678 charge=40 iops=0 sect=13872
18.239378182 0 m N cfq3068A / set_active wl_prio:0 wl_type:0
18.254531670 0 m N cfq3068A / sl_used=15 disp=603 charge=15 iops=0 sect=4920
27.580961230 0 m N cfq3068A / set_active wl_prio:0 wl_type:0
27.653340897 0 m N cfq3068A / sl_used=72 disp=16 charge=72 iops=0 sect=128
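
The same check can be done mechanically rather than by reading the grep output. Below is a minimal Python sketch, not part of the patch or of blktrace, that measures the gap between successive async slices and totals the sectors dispatched. The function name summarize_async and the assumed line shape ("<time> <pid> m N <queue> / <message>", as in the sample above) are illustrative assumptions; raw blkparse output normally carries extra leading columns, which the regex tolerates because it searches rather than anchors at the start of the line.

```python
import re

# Assumed line shape, taken from the sample output in this mail:
#   <time> <pid> m N <queue> / <message>
MSG_RE = re.compile(r"(\d+\.\d+)\s+\d+\s+m\s+N\s+(cfq\S+)\s+/\s+(.*)$")
SECT_RE = re.compile(r"\bsect=(\d+)")

SAMPLE = """\
5.135877740 0 m N cfq3068A / set_active wl_prio:0 wl_type:0
5.231137776 0 m N cfq3068A / sl_used=95 disp=1 charge=95 iops=0 sect=16
13.311745653 0 m N cfq3068A / set_active wl_prio:0 wl_type:0
13.373046196 0 m N cfq3068A / sl_used=1 disp=16 charge=1 iops=0 sect=136
18.097413421 0 m N cfq3068A / set_active wl_prio:0 wl_type:0
18.097466598 0 m N cfq3068A / sl_used=1 disp=3 charge=1 iops=0 sect=32
18.119371182 0 m N cfq3068A / set_active wl_prio:0 wl_type:0
18.159420999 0 m N cfq3068A / sl_used=40 disp=1592 charge=40 iops=0 sect=14360
18.159424767 0 m N cfq3068A / set_active wl_prio:0 wl_type:0
18.199409182 0 m N cfq3068A / sl_used=40 disp=1646 charge=40 iops=0 sect=13584
18.199414996 0 m N cfq3068A / set_active wl_prio:0 wl_type:0
18.239374395 0 m N cfq3068A / sl_used=40 disp=1678 charge=40 iops=0 sect=13872
18.239378182 0 m N cfq3068A / set_active wl_prio:0 wl_type:0
18.254531670 0 m N cfq3068A / sl_used=15 disp=603 charge=15 iops=0 sect=4920
27.580961230 0 m N cfq3068A / set_active wl_prio:0 wl_type:0
27.653340897 0 m N cfq3068A / sl_used=72 disp=16 charge=72 iops=0 sect=128
"""


def summarize_async(lines, queue="cfq3068A"):
    """Summarize async slice activity for one CFQ queue.

    Returns the number of slices started with wl_type:0, the longest
    gap in seconds between two consecutive slice starts, and the total
    sectors reported by the sl_used lines.
    """
    starts = []
    sectors = 0
    for line in lines:
        m = MSG_RE.search(line)
        if m is None or m.group(2) != queue:
            continue
        when, msg = float(m.group(1)), m.group(3)
        if msg.startswith("set_active") and "wl_type:0" in msg:
            starts.append(when)
        elif msg.startswith("sl_used="):
            s = SECT_RE.search(msg)
            if s is not None:
                sectors += int(s.group(1))
    gaps = [later - earlier for earlier, later in zip(starts, starts[1:])]
    return {
        "slices": len(starts),
        "max_gap": max(gaps, default=0.0),
        "sectors": sectors,
    }


if __name__ == "__main__":
    result = summarize_async(SAMPLE.splitlines())
    print("slices=%(slices)d max_gap=%(max_gap).3fs sectors=%(sectors)d" % result)
```

A large max_gap (on the order of the 300 seconds seen in the earlier trace) would point at starvation of the async queue; a small max_gap together with a low sector total would instead suggest that WRITES are being scheduled but not pushed out fast enough.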
Thanks
Vivek