From: Alexander Duyck <alexander.h.duyck@intel.com>
To: tglx@linutronix.de, mingo@redhat.com, hpa@zytor.com, andi@firstfloor.org
Cc: linux-kernel@vger.kernel.org, x86@kernel.org
Subject: [PATCH v3 0/8] Improve performance of VM translation on x86_64
Date: Mon, 05 Nov 2012 11:03:25 -0800
Message-ID: <20121105185657.10205.27419.stgit@gitlad.jf.intel.com>

This patch series is meant to address several issues I encountered with VM
translations on x86_64.  In my testing I found that swiotlb was incurring up
to a 5% processing overhead due to calls to __phys_addr.  To address that, I
have updated swiotlb to use physical addresses instead of virtual addresses
to reduce the need to call __phys_addr.  However, those patches didn't address
the other callers.  With these patches applied I am able to achieve an
additional 1% to 2% performance gain on top of the changes to swiotlb.

The first 2 patches are the performance optimizations that result in the 1% to
2% increase in overall performance.  The remaining patches are various
cleanups for a number of spots where __pa or virt_to_phys was being called
and was not needed or __pa_symbol could have been used.
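
To illustrate the kind of optimization the first patch title refers to
(using the carry from a subtraction in place of an explicit range check),
here is a minimal user-space sketch.  It is not the code from the patch:
the function names, the layout constants and the phys_base value are all
assumptions chosen only to make the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed address-space layout, for illustration only. */
#define START_KERNEL_MAP UINT64_C(0xffffffff80000000) /* kernel image mapping */
#define PAGE_OFFSET_BASE UINT64_C(0xffff880000000000) /* direct mapping */

/* Hypothetical physical load offset of the kernel image. */
static uint64_t phys_base = UINT64_C(0x1000000);

/* Straightforward version: compare first, then pick the right offset. */
static uint64_t phys_addr_branchy(uint64_t x)
{
	if (x >= START_KERNEL_MAP)
		return x - START_KERNEL_MAP + phys_base;
	return x - PAGE_OFFSET_BASE;
}

/*
 * Carry-based version: do the subtraction unconditionally.  If it wrapped
 * (x was below START_KERNEL_MAP) the result y is larger than x, so the
 * comparison x > y tells us which mapping x belonged to.  A compiler can
 * derive that condition from the carry flag of the subtraction.
 */
static uint64_t phys_addr_carry(uint64_t x)
{
	uint64_t y = x - START_KERNEL_MAP;

	return y + ((x > y) ? phys_base
			    : (START_KERNEL_MAP - PAGE_OFFSET_BASE));
}

/* Check that both versions agree on a few representative addresses. */
static void check_equivalence(void)
{
	static const uint64_t samples[] = {
		PAGE_OFFSET_BASE,
		PAGE_OFFSET_BASE + 0x1234,
		START_KERNEL_MAP - 1,
		START_KERNEL_MAP,
		START_KERNEL_MAP + 0x200000,
	};
	size_t i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		assert(phys_addr_carry(samples[i]) ==
		       phys_addr_branchy(samples[i]));
}
```

Calling check_equivalence() from a small main() is enough to confirm that
the two versions agree for these sample addresses under the assumed layout.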

It doesn't seem like the v2 patch set was accepted so I am submitting an
updated v3 set that is rebased off of linux-next with a few additional
improvements to the existing patches.  Specifically the first patch now also
updates __virt_addr_valid so that it is almost identical in layout to
__phys_addr.  Also I found one additional spot in init_64.c that could use
__pa_symbol instead of virt_to_page calls so I updated the first __pa_symbol
patch for the x86 init calls.

With this patch set applied I am noticing a 1-2% improvement in performance in
my routing tests.  Without my earlier swiotlb changes applied the gain was as
high as 6-7% because that code originally relied heavily on virt_to_phys.

The overall effect on size varies depending on what kernel options are
enabled.  I have noticed that almost all of the network device drivers have
dropped in size by around 100 bytes.  I suspect this is due to the fact that
the virt_to_page call in dma_map_single is now less expensive.  However, the
default build for x86_64 increases the vmlinux size by 3.5K with this change
applied.

---

Alexander Duyck (8):
      x86/lguest: Use __pa_symbol instead of __pa on C visible symbols
      x86/acpi: Use __pa_symbol instead of __pa on C visible symbols
      x86/xen: Use __pa_symbol instead of __pa on C visible symbols
      x86/ftrace: Use __pa_symbol instead of __pa on C visible symbols
      x86: Use __pa_symbol instead of __pa on C visible symbols
      x86: Drop 4 unnecessary calls to __pa_symbol
      x86: Make it so that __pa_symbol can only process kernel symbols on x86_64
      x86: Improve __phys_addr performance by making use of carry flags and inlining


 arch/x86/include/asm/page.h          |    3 +-
 arch/x86/include/asm/page_32.h       |    1 +
 arch/x86/include/asm/page_64_types.h |   20 +++++++++++-
 arch/x86/kernel/acpi/sleep.c         |    2 +
 arch/x86/kernel/cpu/intel.c          |    2 +
 arch/x86/kernel/ftrace.c             |    4 +-
 arch/x86/kernel/head32.c             |    4 +-
 arch/x86/kernel/head64.c             |    4 +-
 arch/x86/kernel/setup.c              |   16 +++++-----
 arch/x86/kernel/x8664_ksyms_64.c     |    3 ++
 arch/x86/lguest/boot.c               |    3 +-
 arch/x86/mm/init_64.c                |   18 +++++------
 arch/x86/mm/pageattr.c               |    8 ++---
 arch/x86/mm/physaddr.c               |   55 +++++++++++++++++++++++++---------
 arch/x86/platform/efi/efi.c          |    4 +-
 arch/x86/realmode/init.c             |    8 ++---
 arch/x86/xen/mmu.c                   |   21 +++++++------
 17 files changed, 111 insertions(+), 65 deletions(-)

-- 

Thread overview: 15+ messages
2012-11-05 19:03 Alexander Duyck [this message]
2012-11-05 19:04 ` [PATCH v3 1/8] x86: Improve __phys_addr performance by making use of carry flags and inlining Alexander Duyck
2012-11-05 20:24   ` Kirill A. Shutemov
2012-11-05 21:56     ` Alexander Duyck
2012-11-05 22:08       ` Kirill A. Shutemov
2012-11-16 19:35         ` Alexander Duyck
2012-11-05 19:04 ` [PATCH v3 2/8] x86: Make it so that __pa_symbol can only process kernel symbols on x86_64 Alexander Duyck
2012-11-05 19:04 ` [PATCH v3 3/8] x86: Drop 4 unnecessary calls to __pa_symbol Alexander Duyck
2012-11-05 19:05 ` [PATCH v3 4/8] x86: Use __pa_symbol instead of __pa on C visible symbols Alexander Duyck
2012-11-05 19:05 ` [PATCH v3 5/8] x86/ftrace: " Alexander Duyck
2012-11-05 19:06 ` [PATCH v3 6/8] x86/xen: " Alexander Duyck
2012-11-06 15:45   ` Konrad Rzeszutek Wilk
2012-11-05 19:06 ` [PATCH v3 7/8] x86/acpi: " Alexander Duyck
2012-11-05 19:06 ` [PATCH v3 8/8] x86/lguest: " Alexander Duyck
2012-11-06  1:11   ` Rusty Russell
