From: Jan Kara <jack@suse.cz>
To: Andiry Xu <andiry@gmail.com>
Cc: Jan Kara <jack@suse.cz>,
Wang Shilong <wangsl-fnst@cn.fujitsu.com>,
linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org,
Andiry Xu <andiry.xu@gmail.com>
Subject: Re: [BUG][ext2] XIP does not work on ext2
Date: Mon, 11 Nov 2013 11:14:09 +0100
Message-ID: <20131111101409.GB19893@quack.suse.cz>
In-Reply-To: <CAOvWMLbtBa4XSOJytZLR-=YMkF=RUMHZTYT+Gt+ZBKpHTYyw0A@mail.gmail.com>
On Fri 08-11-13 16:28:15, Andiry Xu wrote:
> On Thu, Nov 7, 2013 at 2:45 PM, Andiry Xu <andiry@gmail.com> wrote:
> > On Thu, Nov 7, 2013 at 2:20 PM, Jan Kara <jack@suse.cz> wrote:
> >> On Thu 07-11-13 13:50:09, Andiry Xu wrote:
> >>> On Thu, Nov 7, 2013 at 1:07 PM, Jan Kara <jack@suse.cz> wrote:
> >>> > On Thu 07-11-13 12:14:13, Andiry Xu wrote:
> >>> >> On Wed, Nov 6, 2013 at 1:18 PM, Jan Kara <jack@suse.cz> wrote:
> >>> >> > On Tue 05-11-13 17:28:35, Andiry Xu wrote:
> >>> >> >> >> Do you know the reason why write() outperforms mmap() in some cases? I
> >>> >> >> >> know it's not related to the thread, but I would really appreciate it
> >>> >> >> >> if you could answer my question.
> >>> >> >> > Well, I'm not completely sure. mmap()ed memory always works on page-by-page
> >>> >> >> > basis - you first access the page, it gets faulted in and you can further
> >>> >> >> > access it. So for small (sub page size) accesses this is a win because you
> >>> >> >> > don't have an overhead of syscall and fs write path. For accesses larger
> >>> >> >> > than page size the overhead of syscall and some initial checks is well
> >>> >> >> > hidden by other things. I guess write() ends up being more efficient
> >>> >> >> > because write path taken for each page is somewhat lighter than full page
> >>> >> >> > fault. But you'd need to look into perf data to get some hard numbers on
> >>> >> >> > where the time is spent.
> >>> >> >> >
> >>> >> >>
> >>> >> >> Thanks for the reply. However I have filled up the whole RAM disk
> >>> >> >> before doing the test, i.e. asked the brd driver to allocate all the
> >>> >> >> pages initially.
> >>> >> > Well, pages in ramdisk are always present, that's not an issue. But you
> >>> >> > will get a page fault to map a particular physical page in process'
> >>> >> > virtual address space when you first access that virtual address in the
> >>> >> > mapping from the process. The cost of setting up this virtual->physical
> >>> >> > mapping is what I'm talking about.
> >>> >> >
> >>> >>
> >>> >> Yes, you are right, there are page faults observed with perf. I
> >>> >> misunderstood page fault as copying pages between backing store and
> >>> >> physical memory.
> >>> >>
> >>> >> > If you had a process which first mmaps the file and writes to all pages in
> >>> >> > the mapping and *then* measure the cost of another round of writing to the
> >>> >> > mapping, I would expect you should see speeds close to those of memory bus.
> >>> >> >
> >>> >>
> >>> >> I've tried this as well. mmap() performance improves but still not as
> >>> >> good as write().
> >>> >> I used the perf report to compare write() and mmap() applications. For
> >>> >> write() version, top of perf report shows as:
> >>> >> 33.33% __copy_user_nocache
> >>> >> 4.72% ext2_get_blocks
> >>> >> 4.42% mutex_unlock
> >>> >> 3.59% __find_get_block
> >>> >>
> >>> >> which looks reasonable.
> >>> >>
> >>> >> However, for mmap() version, the perf report looks strange:
> >>> >> 94.98% libc-2.15.so [.] 0x000000000014698d
> >>> >> 2.25% page_fault
> >>> >> 0.18% handle_mm_fault
> >>> >>
> >>> >> I don't know what the first item is but it took the majority of cycles.
> >>> > The first item means that it's some userspace code in libc. My guess
> >>> > would be that it's libc's memcpy() function (or whatever you use to write
> >>> > to mmap). How do you access the mmap?
> >>> >
> >>>
> >>> Like this:
> >>>
> >>> fd = open(file_name, O_CREAT | O_RDWR | O_DIRECT, 0755);
> >>> dest = (char *)mmap(NULL, FILE_SIZE, PROT_WRITE, MAP_SHARED, fd, 0);
> >>> for (i = 0; i < count; i++)
> >>> {
> >>>         memcpy(dest, src, request_size);
> >>>         dest += request_size;
> >>> }
> >> OK, maybe libc memcpy isn't very well optimized for your CPU? Not sure how
> >> to tune that though...
> >>
> >
> > Hmm, I will try some different kinds of memcpy to see if there is a
> > difference. Just want to make sure I do not make some stupid mistakes
> > before trying that.
> > Thanks a lot for your help!
> >
>
> Your advice does make a difference. I used an optimized version of memcpy
> and it does improve the mmap application performance: on a ramdisk
> with ext2 XIP, the mmap() version now achieves 11GB/s of bandwidth,
> compared to 7GB/s for the POSIX write() version.
Good :).
> Now I wonder if they have a plan to update the memcpy() in libc..
You'd better ask on the glibc devel list... I've googled for a while whether
memcpy() in glibc can be somehow tuned (for a particular instruction set)
but didn't find anything useful.
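
For reference, the experiment discussed earlier in the thread (write to every
page of the mapping once, then measure a second round of writes) could be
sketched roughly as below. This is only an illustration under assumptions: the
names mmap_two_pass() and struct pass_stats are made up for this sketch,
O_DIRECT is dropped, PROT_READ is added to the mapping, and minor faults are
counted with getrusage() rather than perf. The first pass should pay for
setting up the virtual->physical mappings; the second pass mostly measures
memcpy() itself.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper type: time and minor page faults of one pass. */
struct pass_stats {
	double seconds;
	long minor_faults;
};

static double now_sec(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec + ts.tv_nsec / 1e9;
}

static long minor_faults(void)
{
	struct rusage ru;

	getrusage(RUSAGE_SELF, &ru);
	return ru.ru_minflt;
}

/* Copy src into the mapping chunk by chunk, like the loop quoted above. */
static void write_pass(char *dest, const char *src, size_t chunk,
		       size_t total, struct pass_stats *st)
{
	long f0 = minor_faults();
	double t0 = now_sec();

	for (size_t off = 0; off + chunk <= total; off += chunk)
		memcpy(dest + off, src, chunk);

	st->seconds = now_sec() - t0;
	st->minor_faults = minor_faults() - f0;
}

/*
 * Map 'total' bytes of 'path' and write the mapping twice. 'first' includes
 * the cost of faulting pages into the address space, 'second' runs on an
 * already populated mapping. Returns 0 on success, -1 on error.
 */
int mmap_two_pass(const char *path, size_t total, size_t chunk,
		  struct pass_stats *first, struct pass_stats *second)
{
	char *dest, *src;
	int fd;

	if (chunk == 0 || total < chunk)
		return -1;

	fd = open(path, O_CREAT | O_RDWR, 0644);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)total) != 0) {
		close(fd);
		return -1;
	}

	dest = mmap(NULL, total, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (dest == MAP_FAILED) {
		close(fd);
		return -1;
	}

	src = malloc(chunk);
	if (!src) {
		munmap(dest, total);
		close(fd);
		return -1;
	}
	memset(src, 0xab, chunk);

	write_pass(dest, src, chunk, total, first);	/* faults pages in */
	write_pass(dest, src, chunk, total, second);	/* mapping populated */

	free(src);
	munmap(dest, total);
	close(fd);
	return 0;
}
```

Calling mmap_two_pass() on a file on the XIP mount and comparing first and
second (both seconds and minor_faults) should show how much of the mmap() cost
is fault handling and how much is the copy routine. The numbers will depend on
the machine and the libc in use.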
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR