From: Vineet Gupta <vgupta@synopsys.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Hugh Dickins <hughd@google.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Michal Hocko <mhocko@suse.cz>,
	Jennifer Herbert <jennifer.herbert@citrix.com>,
	Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: optimize PageHighMem() check
Date: Fri, 2 Oct 2015 12:45:53 +0530	[thread overview]
Message-ID: <560E2F29.5070807@synopsys.com> (raw)
In-Reply-To: <20151001162528.32c5338efdff2bdea838befd@linux-foundation.org>

On Friday 02 October 2015 04:55 AM, Andrew Morton wrote:
> On Tue, 29 Sep 2015 13:24:20 +0530 Vineet Gupta <Vineet.Gupta1@synopsys.com> wrote:
> 
>> > This came up when implementing HIGHMEM/PAE40 for ARC.
>> > The kmap() / kmap_atomic() generated code seemed needlessly bloated due
>> > to the way PageHighMem() macro is implemented.
>> > It derives the exact zone for page and then does pointer subtraction
>> > with first zone to infer the zone_type.
>> > The pointer arithmetic in turn generates the code bloat.
>> > 
>> > PageHighMem(page)
>> >   is_highmem(page_zone(page))
>> >      zone_off = (char *)zone - (char *)zone->zone_pgdat->node_zones
>> > 
>> > Instead use is_highmem_idx() to work on zone_type available in page flags
>> > 
>> >    ----- Before -----
>> > 80756348:	mov_s      r13,r0
>> > 8075634a:	ld_s       r2,[r13,0]
>> > 8075634c:	lsr_s      r2,r2,30
>> > 8075634e:	mpy        r2,r2,0x2a4
>> > 80756352:	add_s      r2,r2,0x80aef880
>> > 80756358:	ld_s       r3,[r2,28]
>> > 8075635a:	sub_s      r2,r2,r3
>> > 8075635c:	breq       r2,0x2a4,80756378 <kmap+0x48>
>> > 80756364:	breq       r2,0x548,80756378 <kmap+0x48>
>> > 
>> >    ----- After  -----
>> > 80756330:	mov_s      r13,r0
>> > 80756332:	ld_s       r2,[r13,0]
>> > 80756334:	lsr_s      r2,r2,30
>> > 80756336:	sub_s      r2,r2,1
>> > 80756338:	brlo       r2,2,80756348 <kmap+0x30>
>> > 
>> > For x86 defconfig build (32 bit only) it saves around 900 bytes.
>> > For ARC defconfig with HIGHMEM, it saved around 2K bytes.
>> > 
>> >    ---->8-------
>> > ./scripts/bloat-o-meter x86/vmlinux-defconfig-pre x86/vmlinux-defconfig-post
>> > add/remove: 0/0 grow/shrink: 0/36 up/down: 0/-934 (-934)
>> > function                                     old     new   delta
>> > saveable_page                                162     154      -8
>> > saveable_highmem_page                        154     146      -8
>> > skb_gro_reset_offset                         147     131     -16
>> > ...
>> > ...
>> > __change_page_attr_set_clr                  1715    1678     -37
>> > setup_data_read                              434     394     -40
>> > mon_bin_event                               1967    1927     -40
>> > swsusp_save                                 1148    1105     -43
>> > _set_pages_array                             549     493     -56
>> >    ---->8-------
>> > 
>> > e.g. For ARC kmap()
>> > 
> is_highmem() is deranged.  Can't we use a bit in zone->flags or
> something?

It can't be "a" bit since zone_type is an enum. zone->flags could, however, be
split into 2 bitfields, one holding enum zone_flags and the other enum zone_type.
Either way, this patch is independent of that: we have struct page as the
starting point, and zone_type is available directly from the page flags w/o
monkeying around with any zone structs.
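
To make the difference concrete, here is a rough standalone userspace sketch of
the two paths. It is not the kernel code: the struct layouts, the pad size, the
single static node0 and the helpers init_node0(), make_page(),
page_highmem_old() and page_highmem_new() are simplified stand-ins made up for
illustration. It assumes a 3-zone ARC-like layout with the zone index in the
top 2 bits of page->flags, and it treats ZONE_MOVABLE as highmem
unconditionally, which the real is_highmem_idx() does not.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed zone layout (no ZONE_DMA), loosely matching the ARC case above. */
enum zone_type { ZONE_NORMAL, ZONE_HIGHMEM, ZONE_MOVABLE, MAX_NR_ZONES };

struct pglist_data;

struct zone {
	unsigned long flags;
	struct pglist_data *zone_pgdat;	/* back-pointer to the owning node */
	char pad[64];			/* stand-in for the rest of struct zone */
};

struct pglist_data {
	struct zone node_zones[MAX_NR_ZONES];
};

struct page {
	unsigned long flags;		/* zone index lives in the top bits */
};

/* Assumption: 2 bits of zone index at the top of page->flags. */
#define ZONES_WIDTH	2
#define ZONES_PGSHIFT	(sizeof(unsigned long) * 8 - ZONES_WIDTH)
#define ZONES_MASK	((1UL << ZONES_WIDTH) - 1)

static struct pglist_data node0;	/* hypothetical single-node system */

static void init_node0(void)
{
	for (int i = 0; i < MAX_NR_ZONES; i++)
		node0.node_zones[i].zone_pgdat = &node0;
}

/* Illustration-only helper: build a page that belongs to zone idx. */
static struct page make_page(enum zone_type idx)
{
	struct page p = { .flags = (unsigned long)idx << ZONES_PGSHIFT };
	return p;
}

static enum zone_type page_zonenum(const struct page *page)
{
	unsigned long idx = (page->flags >> ZONES_PGSHIFT) & ZONES_MASK;

	assert(idx < MAX_NR_ZONES);
	return (enum zone_type)idx;
}

static struct zone *page_zone(const struct page *page)
{
	return &node0.node_zones[page_zonenum(page)];
}

/* Old path: rebuild the zone index from a byte offset between zone structs. */
static int is_highmem(const struct zone *zone)
{
	ptrdiff_t zone_off = (const char *)zone -
			     (const char *)zone->zone_pgdat->node_zones;

	return zone_off == ZONE_HIGHMEM * (ptrdiff_t)sizeof(*zone) ||
	       zone_off == ZONE_MOVABLE * (ptrdiff_t)sizeof(*zone);
}

/* New path: test the zone index directly (simplified, see note above). */
static int is_highmem_idx(enum zone_type idx)
{
	return idx == ZONE_HIGHMEM || idx == ZONE_MOVABLE;
}

static int page_highmem_old(const struct page *page)
{
	return is_highmem(page_zone(page));
}

static int page_highmem_new(const struct page *page)
{
	return is_highmem_idx(page_zonenum(page));
}
```

Both variants give the same answer for any page; the second never touches a
struct zone, so the multiply, add, load and subtract needed to go from index to
zone pointer and back to an offset drop out, leaving only the shift and a
compare on the index.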

-Vineet
