From: Ingo Molnar <mingo@elte.hu>
To: Andrew Morton <akpm@osdl.org>
Cc: axboe@suse.de, linux-kernel@vger.kernel.org
Subject: Re: [patch] max-sectors-2.6.9-rc1-bk14-A0
Date: Wed, 8 Sep 2004 14:38:21 +0200 [thread overview]
Message-ID: <20040908123821.GA17953@elte.hu> (raw)
In-Reply-To: <20040908044328.46eec88b.akpm@osdl.org>
* Andrew Morton <akpm@osdl.org> wrote:
> Still sounds a bit odd. How many cachelines can that CPU fetch in 8
> usecs? Several tens at least?
the CPU in question is a 600 MHz C3, so it should be dozens. Assuming a
conservative 200 nsec cacheline-fetch latency and ~2 nsecs per burst
byte, a 32-byte cacheline takes ~264 nsecs to fetch. So with ~8
cachelines touched that only explains 2-3 usecs of the overhead.
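The estimate above can be sketched as simple arithmetic. The numbers are the rough figures from the text (~200 ns fetch latency; the ~264 ns per 32-byte line implies ~2 ns per burst byte), not measured values:

```c
#include <assert.h>

/* Back-of-envelope figures from the text: ~200 ns initial
 * cacheline-fetch latency plus ~2 ns per burst byte. */
enum { FETCH_LATENCY_NS = 200, BURST_NS_PER_BYTE = 2 };

/* cost of fetching one cold cacheline of 'line_size' bytes, in nsecs */
static int fetch_ns(int line_size)
{
	return FETCH_LATENCY_NS + BURST_NS_PER_BYTE * line_size;
}

/* total cost of touching 'nr' cold cachelines, in nsecs */
static int touch_cost_ns(int nr, int line_size)
{
	return nr * fetch_ns(line_size);
}
```

With these figures, 8 cold 32-byte lines cost ~2.1 usecs, matching the 2-3 usec estimate.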
The bio itself is not laid out optimally: the bio and the vector are on
two different cachelines plus we have the buffer_head too (in the ext3
case) - all on different cachelines.
but the latency is real - it occurs even with tracing turned
completely off.
The main overhead is the completion path for a single page, which goes
like:
__end_that_request_first()
bio_endio()
end_bio_bh_io_sync()
journal_end_buffer_io_sync()
unlock_buffer()
wake_up_buffer()
bio_put()
bio_destructor()
mempool_free()
mempool_free_slab()
kmem_cache_free()
mempool_free()
mempool_free_slab()
kmem_cache_free()
this is quite fat just from an instruction-count POV - 14 functions with
at least ~20 instructions each, amounting to ~300 instructions per
iteration - that alone is quite an icache footprint.
Plus we could be thrashing the cache due to touching at least 3 new
cachelines per iteration - which is 192 new (dirty) cachelines for the
full completion or ~6K of new L1 cache contents. With 128 byte
cachelines it's much worse: at least 24K worth of new cache contents.
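The cache-footprint figures check out with the same sort of arithmetic. Note the iteration count of 64 is inferred here from 192/3 (one per-page completion per iteration) and is an assumption, not stated in the text:

```c
#include <assert.h>

/* Estimates from the text: >= 3 new cachelines dirtied per per-page
 * completion.  NR_ITERS = 64 is inferred from 192/3 and is an
 * assumption (one iteration per page of the full request). */
enum { LINES_PER_ITER = 3, NR_ITERS = 64 };

/* total new dirty cachelines for the full completion */
static int dirty_lines(void)
{
	return LINES_PER_ITER * NR_ITERS;
}

/* new L1 cache contents in bytes, for a given cacheline size */
static int dirty_bytes(int line_size)
{
	return dirty_lines() * line_size;
}
```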
I'd suggest at least attempting to merge bio and bio->bi_io_vec into a
single cacheline, for the simpler cases.
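One way the merge could look is a small inline vector embedded in the bio itself for the common single-segment case. This is a hypothetical sketch only - the field set and names are simplified, not the actual 2.6 structures:

```c
#include <assert.h>
#include <stddef.h>

struct page;	/* opaque here; only used as a pointer */

struct bio_vec {
	struct page *bv_page;
	unsigned int bv_len;
	unsigned int bv_offset;
};

/* Hypothetical layout: for single-segment bios, bi_io_vec points at
 * bi_inline_vec below, so the bio and its vector share cachelines
 * instead of living in two separate mempool allocations. */
struct bio_sketch {
	unsigned long	bi_sector;
	unsigned short	bi_vcnt;
	unsigned short	bi_idx;
	struct bio_vec *bi_io_vec;	/* == bi_inline_vec when vcnt == 1 */
	struct bio_vec	bi_inline_vec[1];
};
```

The win is that the completion path touches one allocation (and one or two cachelines) instead of two allocations on unrelated cachelines.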
another detail is the SLAB's FIFO logic memmove-ing the full array:
0.184ms (+0.000ms): kmem_cache_free (mempool_free)
0.185ms (+0.000ms): cache_flusharray (kmem_cache_free)
0.185ms (+0.000ms): free_block (cache_flusharray)
0.200ms (+0.014ms): memmove (cache_flusharray)
0.200ms (+0.000ms): memcpy (memmove)
that's 14 usecs a pop and quite likely a fair amount of new dirty cache
contents.
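The memmove in the trace comes from the FIFO nature of the per-CPU array cache: flushing the oldest batch of entries means sliding every surviving entry to the front. A minimal sketch, with free_block() elided and the array simplified to plain pointers:

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-in for the 2.6-era per-CPU slab array cache. */
struct array_cache_sketch {
	unsigned int avail;
	unsigned int batchcount;
	void *entry[128];
};

/* Sketch of cache_flusharray(): hand the oldest 'batchcount' entries
 * back to the slab lists, then memmove the survivors to the front -
 * the memmove/memcpy pair visible in the trace above. */
static void flusharray_sketch(struct array_cache_sketch *ac)
{
	unsigned int batch = ac->batchcount;

	if (ac->avail < batch)
		batch = ac->avail;
	/* free_block(cachep, ac->entry, batch) would go here */
	ac->avail -= batch;
	memmove(ac->entry, &ac->entry[batch],
		sizeof(void *) * ac->avail);
}
```

The cost is proportional to the number of surviving entries, and the moved pointers are freshly dirtied cachelines on top of everything else.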
The building of the sg-list of the next DMA request was responsible for
some of the latency as well:
0.571ms (+0.000ms): ide_build_dmatable (ide_start_dma)
0.571ms (+0.000ms): ide_build_sglist (ide_build_dmatable)
0.572ms (+0.000ms): blk_rq_map_sg (ide_build_sglist)
0.593ms (+0.021ms): do_IRQ (common_interrupt)
0.594ms (+0.000ms): mask_and_ack_8259A (do_IRQ)
this completion codepath isn't something people have really
profiled/measured before, because it's an irqs-off hardirq path that
triggers relatively rarely. But for scheduling latencies it can be
quite costly.
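For reference, the sg-list building in the trace is essentially a merge pass over the request's segments. A hedged sketch of what blk_rq_map_sg()-style merging does - 'phys' stands in for the physical address of each page-sized chunk, and the real kernel code additionally checks segment-size and DMA-boundary limits, elided here:

```c
#include <assert.h>

struct sg_sketch {
	unsigned long addr;
	unsigned int  len;
};

/* Walk each chunk of the request; either extend the previous
 * scatterlist entry (if physically contiguous) or start a new one.
 * Returns the number of scatterlist entries produced. */
static int map_sg_sketch(const unsigned long *phys, int nr_chunks,
			 unsigned int chunk_len, struct sg_sketch *sg)
{
	int nsegs = 0;

	for (int i = 0; i < nr_chunks; i++) {
		if (nsegs &&
		    sg[nsegs - 1].addr + sg[nsegs - 1].len == phys[i]) {
			sg[nsegs - 1].len += chunk_len;	/* merge */
		} else {
			sg[nsegs].addr = phys[i];
			sg[nsegs].len = chunk_len;
			nsegs++;
		}
	}
	return nsegs;
}
```

This pass touches every bio_vec of the outgoing request, which is why it shows up as tens of usecs in the irqs-off window for a large request.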
Ingo
Thread overview: 8+ messages
2004-09-08 10:04 [patch] max-sectors-2.6.9-rc1-bk14-A0 Ingo Molnar
2004-09-08 10:09 ` Andrew Morton
2004-09-08 10:49 ` Ingo Molnar
2004-09-08 11:43 ` Andrew Morton
2004-09-08 12:38 ` Ingo Molnar [this message]
2004-09-08 10:17 ` Jens Axboe
2004-09-08 10:54 ` Ingo Molnar
2004-09-08 11:05 ` Jens Axboe