public inbox for linux-kernel@vger.kernel.org
From: "Jianpeng Ma" <majianpeng@gmail.com>
To: shli <shli@kernel.org>
Cc: axboe <axboe@kernel.dk>, linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: Re: [PATCH] block: Add blk_rq_pos(rq) to sort rq when plushing plug-list.
Date: Tue, 16 Oct 2012 14:38:34 +0800	[thread overview]
Message-ID: <201210161438319848315@gmail.com> (raw)
In-Reply-To: CANejiEWWnrY_dKKuKLvjRms85crO+A7hi8P6nTkGMX9O2TL4kQ@mail.gmail.com


On 2012-10-15 21:18 Shaohua Li <shli@kernel.org> Wrote:
>2012/10/15 Shaohua Li <shli@fusionio.com>:
>> 2012/10/15 Jianpeng Ma <majianpeng@gmail.com>:
>>> My workload is a raid5 array with 16 disks, written in direct-io mode
>>> through our filesystem.
>>> Using blktrace I found these messages:
>>>
>>> 8,16   0     3570     1.083923979  2519  I   W 144323176 + 24 [md127_raid5]
>>> 8,16   0        0     1.083926214     0  m   N cfq2519 insert_request
>>> 8,16   0     3571     1.083926586  2519  I   W 144323072 + 104 [md127_raid5]
>>> 8,16   0        0     1.083926952     0  m   N cfq2519 insert_request
>>> 8,16   0     3572     1.083927180  2519  U   N [md127_raid5] 2
>>> 8,16   0        0     1.083927870     0  m   N cfq2519 Not idling.st->count:1
>>> 8,16   0        0     1.083928320     0  m   N cfq2519 dispatch_insert
>>> 8,16   0        0     1.083928951     0  m   N cfq2519 dispatched a request
>>> 8,16   0        0     1.083929443     0  m   N cfq2519 activate rq,drv=1
>>> 8,16   0     3573     1.083929530  2519  D   W 144323176 + 24 [md127_raid5]
>>> 8,16   0        0     1.083933883     0  m   N cfq2519 Not idling.st->count:1
>>> 8,16   0        0     1.083934189     0  m   N cfq2519 dispatch_insert
>>> 8,16   0        0     1.083934654     0  m   N cfq2519 dispatched a request
>>> 8,16   0        0     1.083935014     0  m   N cfq2519 activate rq,drv=2
>>> 8,16   0     3574     1.083935101  2519  D   W 144323072 + 104 [md127_raid5]
>>> 8,16   0     3575     1.084196179     0  C   W 144323176 + 24 [0]
>>> 8,16   0        0     1.084197979     0  m   N cfq2519 complete rqnoidle 0
>>> 8,16   0     3576     1.084769073     0  C   W 144323072 + 104 [0]
>>>   ......
>>> 8,16   1     3596     1.091394357  2519  I   W 144322544 + 16 [md127_raid5]
>>> 8,16   1        0     1.091396181     0  m   N cfq2519 insert_request
>>> 8,16   1     3597     1.091396571  2519  I   W 144322520 + 24 [md127_raid5]
>>> 8,16   1        0     1.091396934     0  m   N cfq2519 insert_request
>>> 8,16   1     3598     1.091397165  2519  I   W 144322488 + 32 [md127_raid5]
>>> 8,16   1        0     1.091397477     0  m   N cfq2519 insert_request
>>> 8,16   1     3599     1.091397708  2519  I   W 144322432 + 56 [md127_raid5]
>>> 8,16   1        0     1.091398023     0  m   N cfq2519 insert_request
>>> 8,16   1     3600     1.091398284  2519  U   N [md127_raid5] 4
>>> 8,16   1        0     1.091398986     0  m   N cfq2519 Not idling. st->count:1
>>> 8,16   1        0     1.091399511     0  m   N cfq2519 dispatch_insert
>>> 8,16   1        0     1.091400217     0  m   N cfq2519 dispatched a request
>>> 8,16   1        0     1.091400688     0  m   N cfq2519 activate rq,drv=1
>>> 8,16   1     3601     1.091400766  2519  D   W 144322544 + 16 [md127_raid5]
>>> 8,16   1        0     1.091406151     0  m   N cfq2519 Not idling.st->count:1
>>> 8,16   1        0     1.091406460     0  m   N cfq2519 dispatch_insert
>>> 8,16   1        0     1.091406931     0  m   N cfq2519 dispatched a request
>>> 8,16   1        0     1.091407291     0  m   N cfq2519 activate rq,drv=2
>>> 8,16   1     3602     1.091407378  2519  D   W 144322520 + 24 [md127_raid5]
>>> 8,16   1        0     1.091414006     0  m   N cfq2519 Not idling.st->count:1
>>> 8,16   1        0     1.091414297     0  m   N cfq2519 dispatch_insert
>>> 8,16   1        0     1.091414702     0  m   N cfq2519 dispatched a request
>>> 8,16   1        0     1.091415047     0  m   N cfq2519 activate rq, drv=3
>>> 8,16   1     3603     1.091415125  2519  D   W 144322488 + 32 [md127_raid5]
>>> 8,16   1        0     1.091416469     0  m   N cfq2519 Not idling.st->count:1
>>> 8,16   1        0     1.091416754     0  m   N cfq2519 dispatch_insert
>>> 8,16   1        0     1.091417186     0  m   N cfq2519 dispatched a request
>>> 8,16   1        0     1.091417535     0  m   N cfq2519 activate rq,drv=4
>>> 8,16   1     3604     1.091417628  2519  D   W 144322432 + 56 [md127_raid5]
>>> 8,16   1     3605     1.091857225  4393  C   W 144322544 + 16 [0]
>>> 8,16   1        0     1.091858753     0  m   N cfq2519 complete rqnoidle 0
>>> 8,16   1     3606     1.092068456  4393  C   W 144322520 + 24 [0]
>>> 8,16   1        0     1.092069851     0  m   N cfq2519 complete rqnoidle 0
>>> 8,16   1     3607     1.092350440  4393  C   W 144322488 + 32 [0]
>>> 8,16   1        0     1.092351688     0  m   N cfq2519 complete rqnoidle 0
>>> 8,16   1     3608     1.093629323     0  C   W 144322432 + 56 [0]
>>> 8,16   1        0     1.093631151     0  m   N cfq2519 complete rqnoidle 0
>>> 8,16   1        0     1.093631574     0  m   N cfq2519 will busy wait
>>> 8,16   1        0     1.093631829     0  m   N cfq schedule dispatch
>>>
>>> Because elv_attempt_insert_merge() only tries to back-merge, the four
>>> requests cannot be merged with each other.
>>> I traced for ten minutes and counted these situations: they account
>>> for about 25% of cases.
>>>
>>> With the patch applied, I tested and did not see the situation above again.
>>>
>>> Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
>>> ---
>>>  block/blk-core.c |    3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>> index a33870b..3c95c4d 100644
>>> --- a/block/blk-core.c
>>> +++ b/block/blk-core.c
>>> @@ -2868,7 +2868,8 @@ static int plug_rq_cmp(void *priv, struct list_head *a, struct list_head *b)
>>>         struct request *rqa = container_of(a, struct request, queuelist);
>>>         struct request *rqb = container_of(b, struct request, queuelist);
>>>
>>> -       return !(rqa->q <= rqb->q);
>>> +       return !(rqa->q < rqb->q ||
>>> +               (rqa->q == rqb->q && blk_rq_pos(rqa) < blk_rq_pos(rqb)));
>>>  }
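The ordering the patched comparator produces can be checked outside the
kernel with a small stand-alone sketch (mock_queue, mock_rq, and
plug_rq_cmp_sketch below are illustrative stand-ins, not kernel API):

```c
/* Minimal stand-ins for struct request_queue and struct request,
 * just enough to exercise the comparator logic from the patch. */
struct mock_queue { int id; };
struct mock_rq {
	struct mock_queue *q;		/* queue this request belongs to */
	unsigned long long sector;	/* what blk_rq_pos(rq) would return */
};

/* Same ordering as the patched plug_rq_cmp(): group requests by queue
 * first (pointer comparison, as the kernel code does), then by starting
 * sector within a queue.  A nonzero return means rqa sorts after rqb,
 * matching the list_sort() comparator convention. */
static int plug_rq_cmp_sketch(const struct mock_rq *rqa,
			      const struct mock_rq *rqb)
{
	return !(rqa->q < rqb->q ||
		 (rqa->q == rqb->q && rqa->sector < rqb->sector));
}
```

With this ordering, a stable sort leaves the two requests from the trace
above (start sectors 144323072 and 144323176, same queue) adjacent and in
ascending sector order when the plug list is flushed.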
>>
>> Does this one help too?
>> http://marc.info/?l=linux-kernel&m=132399972114668&w=2
>>
>> I thought the real problem is that we don't do recursive request
>> merges. I have no objection to the patch itself, but I hope we can
>> make recursive merging work, which is more generic.
>
>Oh, wait, the 4 requests aren't merged completely in your case.
>And the requests are from one thread and plug context.
>Not the issue I mentioned. I'm wondering how this could happen;
>they should have been merged in attempt_plug_merge already.
Hi Shaohua,
	I think I omitted some blktrace messages earlier, which made this confusing.
Here is a fuller trace:
  8,16   0     6638     2.453619407  2579  Q   W 7493144 + 8 [md0_raid5]
  8,16   0     6639     2.453620460  2579  G   W 7493144 + 8 [md0_raid5]
  8,16   0     6640     2.453639311  2579  Q   W 7493120 + 8 [md0_raid5]
  8,16   0     6641     2.453639842  2579  G   W 7493120 + 8 [md0_raid5]
  8,16   0     6642     2.453647796  2579  Q   W 7493128 + 8 [md0_raid5]
  8,16   0     6643     2.453647940  2579  M   W 7493128 + 8 [md0_raid5]
  8,16   0     6644     2.453658249  2579  Q   W 7493136 + 8 [md0_raid5]
  8,16   0     6645     2.453658393  2579  M   W 7493136 + 8 [md0_raid5]
  8,16   0     6646     2.453665309  2579  Q   W 7493152 + 8 [md0_raid5]
  8,16   0     6647     2.453665504  2579  M   W 7493152 + 8 [md0_raid5]
  8,16   0     6648     2.453672411  2579  Q   W 7493160 + 8 [md0_raid5]
  8,16   0     6649     2.453672606  2579  M   W 7493160 + 8 [md0_raid5]
  8,16   0     6650     2.453679255  2579  Q   W 7493168 + 8 [md0_raid5]
  8,16   0     6651     2.453679441  2579  M   W 7493168 + 8 [md0_raid5]
  8,16   0     6652     2.453685948  2579  Q   W 7493176 + 8 [md0_raid5]
  8,16   0     6653     2.453686149  2579  M   W 7493176 + 8 [md0_raid5]
  8,16   0     6654     2.453693074  2579  Q   W 7493184 + 8 [md0_raid5]
  8,16   0     6655     2.453693254  2579  M   W 7493184 + 8 [md0_raid5]
  8,16   0     6656     2.453704290  2579  Q   W 7493192 + 8 [md0_raid5]
  8,16   0     6657     2.453704482  2579  M   W 7493192 + 8 [md0_raid5]
  8,16   0     6658     2.453715016  2579  Q   W 7493200 + 8 [md0_raid5]
  8,16   0     6659     2.453715247  2579  M   W 7493200 + 8 [md0_raid5]
  8,16   0     6660     2.453721730  2579  Q   W 7493208 + 8 [md0_raid5]
  8,16   0     6661     2.453721974  2579  M   W 7493208 + 8 [md0_raid5]
  8,16   0     6662     2.453728202  2579  Q   W 7493216 + 8 [md0_raid5]
  8,16   0     6663     2.453728436  2579  M   W 7493216 + 8 [md0_raid5]
  8,16   0     6664     2.453734782  2579  Q   W 7493224 + 8 [md0_raid5]
  8,16   0     6665     2.453735019  2579  M   W 7493224 + 8 [md0_raid5]
  8,16   0     6666     2.453741401  2579  Q   W 7493232 + 8 [md0_raid5]
  8,16   0     6667     2.453741632  2579  M   W 7493232 + 8 [md0_raid5]
  8,16   0     6668     2.453748148  2579  Q   W 7493240 + 8 [md0_raid5]
  8,16   0     6669     2.453748386  2579  M   W 7493240 + 8 [md0_raid5]
  8,16   0     6670     2.453851843  2579  I   W 7493144 + 104 [md0_raid5]
  8,16   0        0     2.453853661     0  m   N cfq2579 insert_request
  8,16   0     6671     2.453854064  2579  I   W 7493120 + 24 [md0_raid5]
  8,16   0        0     2.453854439     0  m   N cfq2579 insert_request
  8,16   0     6672     2.453854793  2579  U   N [md0_raid5] 2
  8,16   0        0     2.453855513     0  m   N cfq2579 Not idling. st->count:1
  8,16   0        0     2.453855927     0  m   N cfq2579 dispatch_insert
  8,16   0        0     2.453861771     0  m   N cfq2579 dispatched a request
  8,16   0        0     2.453862248     0  m   N cfq2579 activate rq, drv=1
  8,16   0     6673     2.453862332  2579  D   W 7493120 + 24 [md0_raid5]
  8,16   0        0     2.453865957     0  m   N cfq2579 Not idling. st->count:1
  8,16   0        0     2.453866269     0  m   N cfq2579 dispatch_insert
  8,16   0        0     2.453866707     0  m   N cfq2579 dispatched a request
  8,16   0        0     2.453867061     0  m   N cfq2579 activate rq, drv=2
  8,16   0     6674     2.453867145  2579  D   W 7493144 + 104 [md0_raid5]
  8,16   0     6675     2.454147608     0  C   W 7493120 + 24 [0]
  8,16   0        0     2.454149357     0  m   N cfq2579 complete rqnoidle 0
  8,16   0     6676     2.454791505     0  C   W 7493144 + 104 [0]
  8,16   0        0     2.454794803     0  m   N cfq2579 complete rqnoidle 0
  8,16   0        0     2.454795160     0  m   N cfq schedule dispatch

From the messages above, we can see why rq[W 7493144 + 104] and rq[W 7493120 + 24] do not merge.
The bio order was:
  8,16   0     6638     2.453619407  2579  Q   W 7493144 + 8 [md0_raid5]
  8,16   0     6639     2.453620460  2579  G   W 7493144 + 8 [md0_raid5]
  8,16   0     6640     2.453639311  2579  Q   W 7493120 + 8 [md0_raid5]
  8,16   0     6641     2.453639842  2579  G   W 7493120 + 8 [md0_raid5]
Because bio(7493144) arrived first and bio(7493120) later, the subsequent
bios were split between two requests.
As you mentioned, recursive merging does not help in this situation either.
Why can this occur? I think it is because of the raid operation, which
reorders some bios.
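The back-merge-only behaviour can be reproduced with a toy model
(toy_rq, back_merge_only, and replay_trace are hypothetical helpers,
not kernel code). Replaying the sixteen 8-sector bios from the trace
above in arrival order leaves exactly the two requests seen in the I
events:

```c
/* Toy model of the plug list.  Each request covers [start, start+len)
 * in sectors.  back_merge_only() mimics the limitation of
 * elv_attempt_insert_merge(): a bio merges only when it starts exactly
 * where an existing request ends; front merges are never tried. */
struct toy_rq { unsigned long long start, len; };

static int back_merge_only(struct toy_rq *rqs, int n,
			   unsigned long long s, unsigned long long l)
{
	for (int i = 0; i < n; i++) {
		if (rqs[i].start + rqs[i].len == s) {
			rqs[i].len += l;	/* bio appended to rq tail */
			return 1;
		}
	}
	return 0;			/* no back merge possible */
}

/* Replay the sixteen 8-sector bios from the trace in arrival order;
 * returns how many requests remain and fills out[] with them. */
static int replay_trace(struct toy_rq *out)
{
	static const unsigned long long bios[] = {
		7493144, 7493120, 7493128, 7493136, 7493152, 7493160,
		7493168, 7493176, 7493184, 7493192, 7493200, 7493208,
		7493216, 7493224, 7493232, 7493240,
	};
	int n = 0;

	for (int i = 0; i < 16; i++) {
		if (!back_merge_only(out, n, bios[i], 8)) {
			out[n].start = bios[i];	/* start a new request */
			out[n].len = 8;
			n++;
		}
	}
	return n;	/* leaves two requests: W 7493144+104 and W 7493120+24 */
}
```

Since the second request ends exactly where the first begins
(7493120 + 24 = 7493144), sorting the plug list by start sector before
flushing puts them next to each other and gives a later merge attempt
the chance the insertion path missed.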

Is this explanation OK? If so, I'll resend the patch with this more useful description.

Thanks!
Jianpeng




Thread overview: 7+ messages
2012-10-15 12:35 [PATCH] block: Add blk_rq_pos(rq) to sort rq when plushing plug-list Jianpeng Ma
2012-10-15 12:55 ` Shaohua Li
2012-10-15 13:18   ` Shaohua Li
2012-10-16  2:40     ` Jianpeng Ma
2012-10-16  6:38     ` Jianpeng Ma [this message]
2012-10-16  7:48       ` Shaohua Li
2012-10-17  1:00         ` Jianpeng Ma
