public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Andrea Arcangeli <andrea@suse.de>
To: Lincoln Dale <ltd@cisco.com>
Cc: Andrew Morton <akpm@zip.com.au>,
	Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: O_DIRECT performance impact on 2.4.18 (was: Re: [PATCH] 2.5.14IDE  56)
Date: Mon, 13 May 2002 13:37:18 +0200	[thread overview]
Message-ID: <20020513133718.R13730@dualathlon.random> (raw)
In-Reply-To: <5.1.0.14.2.20020510191214.018915f0@mira-sjcm-3.cisco.com> <3CDAC4EB.FC4FE5CF@zip.com.au> <5.1.0.14.2.20020510155122.02d97910@mira-sjcm-3.cisco.com> <5.1.0.14.2.20020510191214.018915f0@mira-sjcm-3.cisco.com> <5.1.0.14.2.20020512204053.03cc2eb8@mira-sjcm-3.cisco.com>

On Sun, May 12, 2002 at 09:23:55PM +1000, Lincoln Dale wrote:
> O_DIRECT:
>         [root@mel-stglab-host1 src]# tail -20 /tmp/direct.txt
>         8012a670 follow_page                                  25   0.1202
>         8012a740 get_user_pages                               89   0.1918

follow-page and get_user_pages is the actual cpu cost of walking the
pagetables. that could be trimmed down by wasting some memory for an
efficient software-kernel-side tlb.

>         80136d40 __free_pages                                 10   0.2083
>         801d28b0 generic_make_request                         83   0.2730
>         8012aa50 mark_dirty_kiobuf                            35   0.3125
>         8013f0e0 set_bh_page                                  22   0.3438
>         8011f950 do_softirq                                   88   0.3929
>         8023d670 sd_find_queue                                26   0.4062
>         80142a10 max_block                                    54   0.4219
>         80200fb0 __scsi_end_request                          165   0.5428
>         80142c80 blkdev_get_block                             37   0.5781
>         801405d0 brw_kiovec                                  581   0.6371
>         80140560 wait_kio                                     90   0.8036
>         80152820 end_kio_request                              76   0.9500
>         801d29e0 submit_bh                                   181   1.6161
>         8013e950 init_buffer                                  55   1.7188
>         801d22a0 __make_request                             3073   1.9800
>         8013dd10 unlock_buffer                               189   2.3625
>         80140520 end_buffer_io_kiobuf                        408   6.3750
>         80106d20 default_idle                              45686 713.8438

the cpu cost is much smaller than base as you can see and most of it
will be optimized away with vary-io that should lead to follow_page and
get_user_pages to move down in the above profiling.

> 
> base:
>         [root@mel-stglab-host1 src]# tail -20 /tmp/base.txt
>         80133e60 kmem_cache_alloc                            249   0.9154
>         80200fb0 __scsi_end_request                          291   0.9572
>         80134fb0 delta_nr_inactive_pages                      93   0.9688
>         801288b0 _spin_unlock_                               131   1.0234
>         8013f380 create_empty_buffers                        107   1.1146
>         80135010 delta_nr_cache_pages                        119   1.2396
>         801d28b0 generic_make_request                        396   1.3026
>         8013f0e0 set_bh_page                                 102   1.5938
>         80108a48 system_call                                  91   1.6250
>         801d29e0 submit_bh                                   185   1.6518
>         801340e0 kmem_cache_free                             217   1.6953
>         80140ea0 try_to_free_buffers                         664   1.9762
>         801d22a0 __make_request                             3214   2.0709
>         8012e0c0 unlock_page                                 283   2.2109
>         801298cc .text.lock.lockmeter                        332   2.2432
>         80136d40 __free_pages                                125   2.6042
>         801287d0 _spin_lock_                                 585   5.2232
>         8013e970 end_buffer_io_async                        1234   6.4271
>         8012edd0 file_read_actor                            3732  33.3214
>         80106d20 default_idle                               8875 138.6719

as expected the biggest cost is file_read_actor.

Both profiling looks fine, as expected.


> /dev/raw/rawN:
>         [root@mel-stglab-host1 src]# tail -20 /tmp/raw.txt
>         80122c50 tqueue_bh                                     4   0.1250
>         8012a670 follow_page                                  33   0.1587
>         8012a740 get_user_pages                              118   0.2543
>         80203890 scsi_init_io_vc                             139   0.2555
>         8012aa50 mark_dirty_kiobuf                            36   0.3214
>         80136d40 __free_pages                                 22   0.4583
>         8011f950 do_softirq                                  113   0.5045
>         801d28b0 generic_make_request                        204   0.6711
>         8013e950 init_buffer                                  34   1.0625
>         8023d670 sd_find_queue                                70   1.0938
>         8013f0e0 set_bh_page                                  74   1.1562
>         80200fb0 __scsi_end_request                          365   1.2007
>         801405d0 brw_kiovec                                 1288   1.4123
>         80140560 wait_kio                                    193   1.7232
>         80152820 end_kio_request                             166   2.0750
>         801d29e0 submit_bh                                   347   3.0982
>         8013dd10 unlock_buffer                               357   4.4625
>         801d22a0 __make_request                            11014   7.0966
>         80140520 end_buffer_io_kiobuf                        835  13.0469
>         80106d20 default_idle                              45156 705.5625

expected again, as you can see the cost goes quite up for __make_request
compared to O_DIRECT due the 512 b_size (raw compared to base is unfair
because base uses 1k b_size and raw uses 512 b_size and that's why it's
wasting so much cpu time there, o_direct vs base is instead fair because
they both uses 1k of b_size, once o_direct will take advantage of varyio
in your upgraded driver, the comparison between o_direct vs base will
become unfair too, o_direct will be more advantaged by a virtual-common
4k b_size)


> nocopy hack:
>         [root@mel-stglab-host1 src]# tail -20 /tmp/nocopy.txt
>         8012dec0 page_cache_read                             197   0.7695
>         80134fb0 delta_nr_inactive_pages                      77   0.8021
>         80133e60 kmem_cache_alloc                            221   0.8125
>         8013f020 get_unused_buffer_head                      182   0.9479
>         801288b0 _spin_unlock_                               124   0.9688
>         80135010 delta_nr_cache_pages                        110   1.1458
>         8013f380 create_empty_buffers                        114   1.1875
>         801d28b0 generic_make_request                        375   1.2336
>         801d29e0 submit_bh                                   201   1.7946
>         8013f0e0 set_bh_page                                 121   1.8906
>         8012e0c0 unlock_page                                 254   1.9844
>         80108a48 system_call                                 116   2.0714
>         801d22a0 __make_request                             3234   2.0838
>         80140ea0 try_to_free_buffers                         707   2.1042
>         801340e0 kmem_cache_free                             272   2.1250
>         801298cc .text.lock.lockmeter                        392   2.6486
>         80136d40 __free_pages                                134   2.7917
>         801287d0 _spin_lock_                                 636   5.6786
>         8013e970 end_buffer_io_async                        1200   6.2500
>         80106d20 default_idle                               5226  81.6562

very similar to o_direct, notice the overhead in _spin_lock_ here (and
also in "base") it's certainly the page_cache lock that won't go away in
2.5 with radix tree because you're working on the same file, higher
bandwith because the disks runs at the same time and possibly because
read/write returns faster and part of the cost of the I/O happens
outside your time measurements (for a more fair comparison you can
benchmark the whole workload, not only how fast read/write returns).

Andrea

  reply	other threads:[~2002-05-13 11:37 UTC|newest]

Thread overview: 220+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2002-05-06  3:53 Linux-2.5.14 Linus Torvalds
2002-05-06  6:30 ` Linux-2.5.14 Daniel Pittman
2002-05-06  6:51   ` Linux-2.5.14 Andrew Morton
2002-05-06 15:13   ` Linux-2.5.14 Linus Torvalds
2002-05-07  4:28     ` Linux-2.5.14 Daniel Pittman
2002-05-09  3:53     ` Linux-2.5.14 Daniel Pittman
2002-05-09  4:34       ` Linux-2.5.14 Andrew Morton
2002-05-09  6:02         ` Linux-2.5.14 Daniel Pittman
2002-05-06  6:47 ` Linux-2.5.14 bert hubert
2002-05-06  7:07   ` Linux-2.5.14 Andrew Morton
2002-05-06 14:00     ` Linux-2.5.14 Rik van Riel
2002-05-06  9:09 ` [PATCH] 2.5.14 IDE 55 Martin Dalecki
2002-05-06 17:48   ` David Lang
2002-05-06 22:40   ` Roman Zippel
2002-05-07 10:10     ` Martin Dalecki
2002-05-07 11:31       ` Roman Zippel
2002-05-07 10:31         ` Martin Dalecki
2002-05-07 10:34           ` Martin Dalecki
2002-05-07 11:48             ` Roman Zippel
2002-05-07 11:19               ` Martin Dalecki
2002-05-07 12:35                 ` Roman Zippel
2002-05-07 12:36                 ` Andrey Panin
2002-05-07 11:32                   ` Martin Dalecki
2002-05-07 12:38                 ` Dave Jones
2002-05-07  0:03   ` Roman Zippel
2002-05-07 10:12     ` Martin Dalecki
2002-05-07 11:39       ` Roman Zippel
2002-05-07 10:40         ` Martin Dalecki
2002-05-07 12:42           ` Roman Zippel
2002-05-07 11:22 ` [PATCH] 2.5.14 IDE 56 Martin Dalecki
2002-05-07 14:02   ` Padraig Brady
2002-05-07 13:15     ` Martin Dalecki
2002-05-07 14:30       ` Padraig Brady
2002-05-07 15:08       ` Anton Altaparmakov
2002-05-07 15:36         ` Linus Torvalds
2002-05-07 16:20           ` Jan Harkes
2002-05-07 15:26             ` Martin Dalecki
2002-05-07 21:36               ` Jan Harkes
2002-05-08  0:25                 ` Guest section DW
2002-05-08  3:03                   ` Jan Harkes
2002-05-08  9:03                   ` Martin Dalecki
2002-05-08 12:10                     ` Alan Cox
2002-05-08 10:51                       ` Martin Dalecki
2002-05-07 16:29           ` Padraig Brady
2002-05-07 16:51             ` Linus Torvalds
2002-05-07 18:29               ` Kai Henningsen
2002-05-08  7:48               ` Juan Quintela
2002-05-08 16:54                 ` Linus Torvalds
2002-05-07 17:08             ` Alan Cox
2002-05-07 17:00               ` Linus Torvalds
2002-05-07 17:19                 ` benh
2002-05-07 17:24                   ` Linus Torvalds
2002-05-07 17:30                     ` benh
2002-05-10  1:45                       ` Mike Fedyk
2002-05-07 17:43                     ` Richard Gooch
2002-05-07 18:05                       ` Linus Torvalds
2002-05-07 18:26                         ` Alan Cox
2002-05-07 18:16                           ` Linus Torvalds
2002-05-07 18:40                             ` Richard Gooch
2002-05-07 18:46                               ` Linus Torvalds
2002-05-07 23:54                                 ` Roman Zippel
2002-05-08  6:57                                 ` Kai Henningsen
2002-05-08  9:37                                   ` Ian Molton
2002-05-09 13:58                                 ` Pavel Machek
2002-05-08  8:21                               ` Martin Dalecki
2002-05-07 17:27                   ` Jauder Ho
2002-05-08  8:13                     ` Martin Dalecki
2002-05-07 18:29                   ` Patrick Mochel
2002-05-07 18:02                     ` Greg KH
2002-05-07 18:44                     ` Richard Gooch
2002-05-07 18:44                       ` Patrick Mochel
2002-05-07 19:21                         ` Richard Gooch
2002-05-07 19:58                           ` Patrick Mochel
2002-05-07 18:49                     ` Thunder from the hill
2002-05-07 19:47                       ` Patrick Mochel
2002-05-07 22:03                         ` Richard Gooch
2002-05-08  8:14                           ` Russell King
2002-05-08 16:07                             ` Richard Gooch
2002-05-08 17:07                               ` Russell King
2002-05-08  8:18                     ` Martin Dalecki
2002-05-08  8:07                   ` Martin Dalecki
2002-05-08  7:58                 ` Martin Dalecki
2002-05-08 12:18                   ` Alan Cox
2002-05-08 11:09                     ` Martin Dalecki
2002-05-08 12:42                       ` Alan Cox
2002-05-08 11:23                         ` Martin Dalecki
2002-05-09  2:37                         ` Lincoln Dale
2002-05-09  3:10                           ` Andrew Morton
2002-05-09 10:05                             ` Lincoln Dale
2002-05-09 18:50                               ` Andrew Morton
2002-05-10  0:33                                 ` Andi Kleen
2002-05-10  0:48                                   ` Andrew Morton
2002-05-10  1:06                                     ` Andi Kleen
2002-05-13 17:51                                       ` Pavel Machek
2002-05-14 21:44                                         ` Andi Kleen
2002-05-10  6:50                                 ` O_DIRECT performance impact on 2.4.18 (was: Re: [PATCH] 2.5.14 IDE 56) Lincoln Dale
2002-05-10  7:15                                   ` O_DIRECT performance impact on 2.4.18 (was: Re: [PATCH] 2.5.14IDE 56) Andrew Morton
2002-05-10  7:21                                     ` Jens Axboe
2002-05-10  8:12                                     ` Andrea Arcangeli
2002-05-10 10:14                                     ` Lincoln Dale
2002-05-10 12:36                                       ` Andrea Arcangeli
2002-05-11  3:23                                         ` Lincoln Dale
2002-05-13 11:19                                           ` Andrea Arcangeli
2002-05-13 23:58                                             ` Lincoln Dale
2002-05-14  0:22                                               ` Andrea Arcangeli
2002-05-14  2:43                                                 ` O_DIRECT on 2.4.19pre8aa2 md device Lincoln Dale
2002-05-21 15:51                                                   ` Andrea Arcangeli
2002-05-22  1:18                                                     ` Lincoln Dale
2002-05-22  2:51                                                       ` Andrea Arcangeli
2002-06-03  4:53                                                         ` high-end i/o performance of 2.4.19pre8aa2 (was: Re: O_DIRECT on 2.4.19pre8aa2 device) Lincoln Dale
2002-05-12 11:23                                         ` O_DIRECT performance impact on 2.4.18 (was: Re: [PATCH] 2.5.14IDE 56) Lincoln Dale
2002-05-13 11:37                                           ` Andrea Arcangeli [this message]
2002-05-10 15:55                                   ` O_DIRECT performance impact on 2.4.18 (was: Re: [PATCH] 2.5.14 IDE 56) Linus Torvalds
2002-05-11  1:01                                     ` Gerrit Huizenga
2002-05-11 18:04                                       ` Linus Torvalds
2002-05-11 18:19                                         ` Larry McVoy
2002-05-11 18:35                                           ` Linus Torvalds
2002-05-11 18:37                                             ` Larry McVoy
2002-05-11 18:56                                               ` Linus Torvalds
2002-05-11 21:42                                                 ` Gerrit Huizenga
2002-05-11 18:43                                             ` Mr. James W. Laferriere
2002-05-11 23:38                                             ` Lincoln Dale
2002-05-12  0:36                                               ` yodaiken
2002-05-12  2:40                                                 ` Andrew Morton
2002-05-11 18:26                                         ` O_DIRECT performance impact on 2.4.18 (was: Re: [PATCH] 2.5.14 Alan Cox
2002-05-11 18:09                                           ` Linus Torvalds
2002-05-11 18:45                                         ` O_DIRECT performance impact on 2.4.18 (was: Re: [PATCH] 2.5.14 IDE 56) yodaiken
2002-05-11 19:55                                         ` O_DIRECT performance impact on 2.4.18 Bernd Eckenfels
2002-05-11 14:18                                     ` O_DIRECT performance impact on 2.4.18 (was: Re: [PATCH] 2.5.14 IDE 56) Roy Sigurd Karlsbakk
2002-05-11 14:24                                       ` Jens Axboe
2002-05-11 18:25                                         ` Gerrit Huizenga
2002-05-11 20:17                                           ` Jens Axboe
2002-05-11 22:27                                             ` Gerrit Huizenga
2002-05-11 23:17                                       ` Lincoln Dale
2002-05-09  4:16                           ` [PATCH] 2.5.14 IDE 56 Andre Hedrick
2002-05-09 13:32                             ` Alan Cox
2002-05-09 14:58                           ` Alan Cox
2002-05-08 18:21                     ` Erik Andersen
2002-05-08 18:59                       ` Dave Jones
2002-05-08 19:31                       ` Alan Cox
2002-05-08 21:16                         ` Erik Andersen
2002-05-08 22:14                           ` Alan Cox
2002-05-09 13:13                   ` Pavel Machek
2002-05-09 19:22                     ` Daniel Jacobowitz
2002-05-10 12:01                   ` Padraig Brady
2002-05-09 13:18                 ` Pavel Machek
2002-05-07 17:10             ` Richard B. Johnson
2002-05-08  7:36             ` Martin Dalecki
2002-05-08 17:22               ` Greg KH
2002-05-08 18:46   ` Denis Vlasenko
2002-05-07 11:27 ` [PATCH] 2.5.14 IDE 57 Martin Dalecki
2002-05-07 13:16   ` Anton Altaparmakov
2002-05-07 12:34     ` Martin Dalecki
2002-05-07 13:56       ` Mikael Pettersson
2002-05-07 14:04         ` Dave Jones
2002-05-07 13:57       ` Anton Altaparmakov
2002-05-07 14:08         ` Dave Jones
2002-05-07 13:11           ` Martin Dalecki
2002-05-07 14:29           ` Anton Altaparmakov
2002-05-07 13:36             ` Martin Dalecki
2002-05-07 15:08               ` Anton Altaparmakov
2002-05-07 16:51             ` Dave Jones
2002-05-08  3:38               ` Anton Altaparmakov
2002-05-08 11:47                 ` Dave Jones
2002-05-07 15:07           ` Padraig Brady
2002-05-07 17:21           ` Andre Hedrick
2002-05-11 14:09   ` Aaron Lehmann
2002-05-07 15:03 ` [PATCH] IDE 58 Martin Dalecki
2002-05-08  6:42   ` Paul Mackerras
2002-05-08  8:53     ` Martin Dalecki
2002-05-08 10:37       ` Bjorn Wesen
2002-05-08 10:16         ` Martin Dalecki
2002-05-08 19:06           ` Linus Torvalds
2002-05-08 19:10             ` Benjamin Herrenschmidt
2002-05-08 20:31               ` Alan Cox
2002-05-08 19:49                 ` Benjamin Herrenschmidt
2002-05-08 20:44                   ` Alan Cox
2002-05-08 20:04                     ` Benjamin Herrenschmidt
2002-05-09 20:20                       ` Ian Molton
2002-05-08 20:36                     ` Andre Hedrick
2002-05-08 20:29                 ` Andre Hedrick
2002-05-08 20:06                   ` Benjamin Herrenschmidt
2002-05-09 12:14                   ` Martin Dalecki
2002-05-09 15:19               ` Eric W. Biederman
2002-05-09 20:20               ` Ian Molton
2002-05-08 11:00         ` Benjamin Herrenschmidt
2002-05-09 19:58 ` [PATCH] 2.5.14 IDE 59 Martin Dalecki
2002-05-11  4:16   ` William Lee Irwin III
2002-05-11 16:59 ` [PATCH] 2.5.15 IDE 60 Martin Dalecki
2002-05-11 18:47   ` Pierre Rousselet
2002-05-11 19:12     ` Andre Hedrick
2002-05-11 19:52       ` Pierre Rousselet
2002-05-11 23:48         ` Andre Hedrick
2002-05-12 19:19   ` pdc202xx.c fails to compile in 2.5.15 Zlatko Calusic
2002-05-12 19:40     ` Jurriaan on Alpha
2002-05-12 22:00     ` Petr Vandrovec
2002-05-13 12:03       ` Alan Cox
2002-05-13  9:48 ` [PATCH] 2.5.15 IDE 61 Martin Dalecki
2002-05-13 12:17 ` [PATCH] 2.5.15 IDE 62 Martin Dalecki
2002-05-13 13:48   ` Jens Axboe
2002-05-13 13:02     ` Martin Dalecki
2002-05-13 15:38       ` Jens Axboe
2002-05-13 15:45         ` Martin Dalecki
2002-05-13 16:54           ` Linus Torvalds
2002-05-13 16:55             ` Jens Axboe
2002-05-13 16:00               ` Martin Dalecki
2002-05-13 18:02             ` benh
2002-05-13 15:50         ` Martin Dalecki
2002-05-13 17:52         ` benh
2002-05-13 15:55           ` Martin Dalecki
2002-05-13 19:13             ` benh
2002-05-14  8:48               ` Martin Dalecki
2002-05-17 11:40               ` Martin Dalecki
2002-05-17  2:27                 ` Benjamin Herrenschmidt
2002-05-13 15:36   ` Tom Rini
2002-05-14 10:26 ` [PATCH] 2.5.15 IDE 62a Martin Dalecki
2002-05-14 10:28 ` [PATCH] 2.5.15 IDE 63 Martin Dalecki
2002-05-15 12:04 ` [PATCH] 2.5.15 IDE 64 Martin Dalecki
2002-05-15 13:12   ` Russell King
2002-05-15 12:14     ` Martin Dalecki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20020513133718.R13730@dualathlon.random \
    --to=andrea@suse.de \
    --cc=akpm@zip.com.au \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ltd@cisco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox