From: Aaron Lu <aaron.lu@intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@intel.com>,
linux-mm <linux-mm@kvack.org>,
lkml <linux-kernel@vger.kernel.org>,
Andi Kleen <ak@linux.intel.com>,
Huang Ying <ying.huang@intel.com>,
Tim Chen <tim.c.chen@linux.intel.com>,
Kemi Wang <kemi.wang@intel.com>,
Anshuman Khandual <khandual@linux.vnet.ibm.com>
Subject: Re: [PATCH v2] mm/page_alloc.c: inline __rmqueue()
Date: Wed, 11 Oct 2017 10:34:02 +0800
Message-ID: <20171011023402.GC27907@intel.com>
In-Reply-To: <20171010144545.c87a28b0f3c4e475305254ab@linux-foundation.org>
On Tue, Oct 10, 2017 at 02:45:45PM -0700, Andrew Morton wrote:
> On Tue, 10 Oct 2017 13:43:43 +0800 Aaron Lu <aaron.lu@intel.com> wrote:
>
> > On Mon, Oct 09, 2017 at 10:19:52PM -0700, Dave Hansen wrote:
> > > On 10/09/2017 07:56 PM, Aaron Lu wrote:
> > > > This patch adds inline to __rmqueue() and vmlinux' size doesn't have any
> > > > change after this patch according to size(1).
> > > >
> > > > without this patch:
> > > >    text    data      bss      dec     hex filename
> > > > 9968576 5793372 17715200 33477148 1fed21c vmlinux
> > > >
> > > > with this patch:
> > > >    text    data      bss      dec     hex filename
> > > > 9968576 5793372 17715200 33477148 1fed21c vmlinux
> > >
> > > This is unexpected. Could you double-check this, please?
> >
> > mm/page_alloc.o has size changes:
> >
> > Without this patch:
> > $ size mm/page_alloc.o
> >  text data  bss   dec  hex filename
> > 36695 9792 8396 54883 d663 mm/page_alloc.o
> >
> > With this patch:
> > $ size mm/page_alloc.o
> >  text data  bss   dec  hex filename
> > 37511 9792 8396 55699 d993 mm/page_alloc.o
> >
> > But vmlinux doesn't.
> >
> > It's not clear to me what happened, do you want to me dig this out?
>
> There's weird stuff going on.
>
> With x86_64 gcc-4.8.4
>
> Patch not applied:
>
> akpm3:/usr/local/google/home/akpm/k/25> nm mm/page_alloc.o|grep __rmqueue
> 0000000000002a00 t __rmqueue
>
> Patch applied:
>
> akpm3:/usr/local/google/home/akpm/k/25> nm mm/page_alloc.o|grep __rmqueue
> 000000000000039f t __rmqueue_fallback
> 0000000000001220 t __rmqueue_smallest
>
> So inlining __rmqueue has caused the compiler to decide to uninline
> __rmqueue_fallback and __rmqueue_smallest, which largely undoes the
> effect of your patch.
>
> `inline' is basically advisory (or ignored) in modern gcc's. So gcc
> has felt free to ignore it in __rmqueue_fallback and __rmqueue_smallest
> because gcc thinks it knows best. That's why we created
> __always_inline, to grab gcc by the scruff of its neck.

This is a good point, and I agree with Andi that we should use
__always_inline for the functions that we really want inlined.

>
> So... I think this patch could do with quite a bit more care, tuning
> and testing with various gcc versions.

I did some more testing.

With x86_64 gcc-4.6.3 available from kernel.org crosstool:

Patch not applied:
[aaron@aaronlu linux]$ nm mm/page_alloc.o |grep __rmqueue
00000000000023f0 t __rmqueue
00000000000027c0 t __rmqueue_pcplist.isra.95

Patch applied:
[aaron@aaronlu linux]$ nm mm/page_alloc.o |grep __rmqueue
0000000000002950 t __rmqueue_pcplist.isra.95

Works as expected.

With self-built x86_64 gcc-4.8.4:

Patch not applied:
[aaron@aaronlu linux]$ nm mm/page_alloc.o |grep __rmqueue
0000000000001f20 t __rmqueue

Patch applied:
[aaron@aaronlu linux]$ nm mm/page_alloc.o |grep __rmqueue

Works as expected (this conflicts with your result, though).

I also tested gcc-4.9.4, gcc-5.3.1, gcc-6.4.0 and gcc-7.2.1; all give
the same output as gcc-4.8.4 above.

Then I thought of CONFIG_OPTIMIZE_INLINING, which I have always disabled
as suggested by its help message ("If unsure, say N"). Turning that
config on indeed caused gcc-4.8.4 to emit __rmqueue_fallback here.
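
For illustration, here is a minimal userspace sketch of why that config
can change the nm output. It assumes, as the kernel's gcc compiler
header did around this time, that plain inline is redefined to carry
always_inline when CONFIG_OPTIMIZE_INLINING is disabled. The macro
DEMO_OPTIMIZE_INLINING and all demo_* names are hypothetical stand-ins,
not kernel symbols.

```c
/* Userspace sketch only; DEMO_OPTIMIZE_INLINING and demo_* are hypothetical. */
#include <assert.h>

#ifndef DEMO_OPTIMIZE_INLINING
/* "Config off": every inline is upgraded to always_inline. */
#define demo_inline inline __attribute__((__always_inline__))
#else
/* "Config on": inline stays a hint and gcc is free to ignore it. */
#define demo_inline inline
#endif

/* Stand-in for a helper such as __rmqueue_fallback(). */
static demo_inline int demo_helper(int order)
{
	return order * 2;
}

/* Out-of-line caller; whether demo_helper survives as a symbol depends on the macro. */
int demo_caller(int order)
{
	assert(order >= 0);
	return demo_helper(order) + 1;
}
```

Building this with and without -DDEMO_OPTIMIZE_INLINING and comparing
nm on the two objects is one way to see the difference, although for a
helper this small gcc will most likely inline it in both cases.
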
I think I'll just mark those functions always_inline.
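
As a rough sketch of what marking them would look like (this is not the
actual patch): the demo_* functions below are hypothetical stand-ins for
__rmqueue_smallest(), __rmqueue_fallback() and __rmqueue(), and
__always_inline is defined locally, roughly as the kernel defines it
for gcc, in case the C library does not provide it.

```c
/* Userspace sketch only; the demo_* functions are hypothetical stand-ins. */
#include <assert.h>

#ifndef __always_inline
#define __always_inline inline __attribute__((__always_inline__))
#endif

/* Pretend that orders of 4 and above have no free page on the preferred list. */
static __always_inline int demo_smallest(int order)
{
	return order < 4 ? order : -1;
}

/* Pretend fallback that always succeeds, with a distinguishable result. */
static __always_inline int demo_fallback(int order)
{
	return order + 100;
}

/* Same shape as __rmqueue(): try the preferred list, then fall back. */
static __always_inline int demo_rmqueue(int order)
{
	int page = demo_smallest(order);

	if (page < 0)
		page = demo_fallback(order);
	return page;
}

/* The only function that is not forced inline. */
int demo_alloc(int order)
{
	assert(order >= 0);
	return demo_rmqueue(order);
}
```

Compiled with gcc -O2 -c, nm on the object should list demo_alloc but
none of the three helpers, regardless of how gcc's own inlining
heuristics would have judged them.
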