From: Wu Fengguang <fengguang.wu@intel.com>
To: Minchan Kim <minchan.kim@gmail.com>
Cc: Taras Glek <tglek@mozilla.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: Downsides to madvise/fadvise(willneed) for application startup
Date: Mon, 12 Apr 2010 12:58:00 +0800
Message-ID: <20100412045800.GB18099@localhost>
In-Reply-To: <l2z28c262361004112025ydabc82ceyfa21cff9debc85b3@mail.gmail.com>

Hi Minchan,

> > Yes, every binary/library starts with this 512b read.  It is requested
> > by ld.so/ld-linux.so, and will trigger a 4-page readahead. This is not
> > good readahead. I wonder if ld.so can switch to mmap read for the
> > first read, in order to trigger a larger 128kb readahead. However this
> > will introduce a little overhead on VMA operations.

Correction, with data: on my system, ld.so does one 832-byte initial read for each library:

        $ strace true
        execve("/bin/true", ["true"], [/* 44 vars */]) = 0
        brk(0)                                  = 0x608000
        mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb3b3ea0000
        access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
        mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb3b3e9e000
        access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
        open("/etc/ld.so.cache", O_RDONLY)      = 3
        fstat(3, {st_mode=S_IFREG|0644, st_size=140899, ...}) = 0
        mmap(NULL, 140899, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fb3b3e7b000
        close(3)                                = 0
        access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
        open("/lib/libc.so.6", O_RDONLY)        = 3
==>     read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320\353\1\0\0\0\0\0@"..., 832) = 832
        fstat(3, {st_mode=S_IFREG|0755, st_size=1379752, ...}) = 0
        mmap(NULL, 3487784, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fb3b3931000
        mprotect(0x7fb3b3a7b000, 2097152, PROT_NONE) = 0
        mmap(0x7fb3b3c7b000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x14a000) = 0x7fb3b3c7b000
        mmap(0x7fb3b3c80000, 18472, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fb3b3c80000
        close(3)                                = 0
        mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb3b3e7a000
        mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb3b3e79000
        arch_prctl(ARCH_SET_FS, 0x7fb3b3e796f0) = 0
        mprotect(0x7fb3b3c7b000, 16384, PROT_READ) = 0
        mprotect(0x7fb3b3ea1000, 4096, PROT_READ) = 0
        munmap(0x7fb3b3e7b000, 140899)          = 0
        brk(0)                                  = 0x608000
        brk(0x629000)                           = 0x629000
        open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
        fstat(3, {st_mode=S_IFREG|0644, st_size=4332320, ...}) = 0
        mmap(NULL, 4332320, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fb3b350f000
        close(3)                                = 0
        close(1)                                = 0
        close(2)                                = 0
        exit_group(0)                           = ?
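
For reference, the marked read() can be reproduced from user space. This is only
a minimal sketch, assuming the 832-byte size seen in the trace above; the helper
names read_initial_header() and has_elf_magic() are made up for illustration and
are not taken from ld.so.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Size of the first read() seen in the strace output (assumed, system dependent). */
#define INITIAL_READ_SIZE 832

/* Return 1 if buf starts with the ELF magic "\177ELF", else 0. */
int has_elf_magic(const unsigned char *buf, ssize_t n)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    return n >= 4 && memcmp(buf, magic, sizeof(magic)) == 0;
}

/*
 * Open path and issue a single read() of up to len bytes from offset 0,
 * the same small buffered read that shows up in the trace.
 * Returns the number of bytes read, or -1 on error.
 */
ssize_t read_initial_header(const char *path, unsigned char *buf, size_t len)
{
    int fd;
    ssize_t n;

    assert(path != NULL && buf != NULL && len > 0);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    n = read(fd, buf, len);
    close(fd);
    return n;
}
```

Calling read_initial_header("/lib/libc.so.6", buf, INITIAL_READ_SIZE) under
strace should show the same open/read/close pattern as the loader.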

> AFAIK, the kernel reads the first sector (ELF header and so on) of the binary
> in the case of a binary.
> in fs/exec.c,
> prepare_binprm()
> {
> ...
> return kernel_read(bprm->file, 0, bprm->buf, BINPRM_BUF_SIZE);
> }

Thanks for pointing this out. Yes, we may optimize the binary part by
adding a readahead call before the kernel_read().
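
The same idea can be tried from user space. The sketch below is not a kernel
patch, only an analogue: it passes a POSIX_FADV_WILLNEED hint for the first
128KB before doing the small header read. The 128-byte HEADER_SIZE, the 128KB
hint length and the name read_header_with_hint() are assumptions made for
illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define HEADER_SIZE     128             /* illustrative small header read */
#define READAHEAD_HINT  (128 * 1024)    /* illustrative hint length */

/*
 * Hint that the first READAHEAD_HINT bytes will be needed soon, then read
 * up to len bytes from offset 0.
 * Returns the number of bytes read, or -1 on error.
 */
ssize_t read_header_with_hint(const char *path, void *buf, size_t len)
{
    int fd;
    ssize_t n;

    assert(path != NULL && buf != NULL && len > 0);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /*
     * posix_fadvise() returns an error number instead of setting errno.
     * The hint is advisory, so a failure here is deliberately ignored.
     */
    (void)posix_fadvise(fd, 0, READAHEAD_HINT, POSIX_FADV_WILLNEED);

    n = pread(fd, buf, len, 0);
    close(fd);
    return n;
}
```

Whether the hint helps depends on the kernel and on what is already in the page
cache, so it needs to be measured with a cold cache.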
 
> But the dynamic loader uses libc_read to read the shared libraries.
> 
> So you may have a chance to increase the readahead size for the binary, but it
> is hard to do for shared libraries. Many apps have lots of shared libraries,
> so a solution for the binary alone won't help performance much. :(

Yeah, it won't be a big optimization..
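
For the shared library side, the mmap-based first read mentioned earlier would
look roughly like this in user space. It is only a sketch under assumptions:
copy_header_via_mmap() is a hypothetical helper, not what ld.so does today, and
the MADV_WILLNEED call is an optional hint.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map the whole file read-only, hint that it will be needed, and copy the
 * first len bytes out of the mapping instead of using read().
 * Returns 0 on success, -1 on error or if the file is shorter than len.
 */
int copy_header_via_mmap(const char *path, unsigned char *out, size_t len)
{
    struct stat st;
    void *map;
    int fd;

    assert(path != NULL && out != NULL && len > 0);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    if (fstat(fd, &st) != 0 || st.st_size < (off_t)len) {
        close(fd);
        return -1;
    }

    map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (map == MAP_FAILED)
        return -1;

    /* Advisory only: a failure does not prevent the copy below. */
    (void)madvise(map, (size_t)st.st_size, MADV_WILLNEED);

    /* Touching the first page goes through the page-fault path, not read(). */
    memcpy(out, map, len);

    munmap(map, (size_t)st.st_size);
    return 0;
}
```

The extra mmap()/munmap() pair is the VMA overhead mentioned earlier in the
thread.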

Thanks,
Fengguang

