public inbox for linux-kernel@vger.kernel.org
From: Avi Kivity <avi@redhat.com>
To: Nick Piggin <npiggin@suse.de>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	KVM list <kvm@vger.kernel.org>
Subject: Slow vmalloc in 2.6.35-rc3
Date: Thu, 24 Jun 2010 12:19:32 +0300
Message-ID: <4C232324.7070305@redhat.com>

I see really slow vmalloc performance on 2.6.35-rc3:

# tracer: function_graph
#
# CPU  DURATION                  FUNCTION CALLS
# |     |   |                     |   |   |   |
  3)   3.581 us    |  vfree();
  3)               |  msr_io() {
  3) ! 523.880 us  |    vmalloc();
  3)   1.702 us    |    vfree();
  3) ! 529.960 us  |  }
  3)               |  msr_io() {
  3) ! 564.200 us  |    vmalloc();
  3)   1.429 us    |    vfree();
  3) ! 568.080 us  |  }
  3)               |  msr_io() {
  3) ! 578.560 us  |    vmalloc();
  3)   1.697 us    |    vfree();
  3) ! 584.791 us  |  }
  3)               |  msr_io() {
  3) ! 559.657 us  |    vmalloc();
  3)   1.566 us    |    vfree();
  3) ! 575.948 us  |  }
  3)               |  msr_io() {
  3) ! 536.558 us  |    vmalloc();
  3)   1.553 us    |    vfree();
  3) ! 542.243 us  |  }
  3)               |  msr_io() {
  3) ! 560.086 us  |    vmalloc();
  3)   1.448 us    |    vfree();
  3) ! 569.387 us  |  }

msr_io() is from arch/x86/kvm/x86.c, allocating at most 4K (yes, it
should use kmalloc()).  The memory is immediately vfree()ed.  There are
96 entries in /proc/vmallocinfo, and the whole thing is single-threaded,
so there should be no contention.

Here's the perf report:

     63.97%     qemu  [kernel]  [k] rb_next
                        |
                        --- rb_next
                           |
                           |--70.75%-- alloc_vmap_area
                           |          __get_vm_area_node
                           |          __vmalloc_node
                           |          vmalloc
                           |          |
                           |          |--99.15%-- msr_io
                           |          |          kvm_arch_vcpu_ioctl
                           |          |          kvm_vcpu_ioctl
                           |          |          vfs_ioctl
                           |          |          do_vfs_ioctl
                           |          |          sys_ioctl
                           |          |          system_call
                           |          |          __GI_ioctl
                           |          |          |
                           |          |           --100.00%-- 0x1dfc4a8878e71362
                           |          |
                           |           --0.85%-- __kvm_set_memory_region
                           |                     kvm_set_memory_region
                           |                     kvm_vm_ioctl_set_memory_region
                           |                     kvm_vm_ioctl
                           |                     vfs_ioctl
                           |                     do_vfs_ioctl
                           |                     sys_ioctl
                           |                     system_call
                           |                     __GI_ioctl
                           |
                            --29.25%-- __get_vm_area_node
                                      __vmalloc_node
                                      vmalloc
                                      |
                                      |--98.89%-- msr_io
                                      |          kvm_arch_vcpu_ioctl
                                      |          kvm_vcpu_ioctl
                                      |          vfs_ioctl
                                      |          do_vfs_ioctl
                                      |          sys_ioctl
                                      |          system_call
                                      |          __GI_ioctl
                                      |          |
                                      |           --100.00%-- 0x1dfc4a8878e71362


It seems completely wrong - iterating 8 levels of a binary tree 
shouldn't take half a millisecond.
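
To put rough numbers on that expectation, here is a minimal userspace
sketch, not kernel code.  struct node, build(), next_node(),
max_descent_steps() and full_walk_steps() are all hypothetical names
made up for illustration, and the tree is perfectly balanced rather
than a real red-black tree (which may be a level or so deeper).  It
compares the cost of a root-to-leaf descent with the cost of stepping
through the tree one in-order successor at a time, which is the kind of
step rb_next() performs.  Whether alloc_vmap_area() really walks that
way is an assumption here, not something the trace proves.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type for illustration; not the kernel's struct rb_node. */
struct node {
	struct node *parent, *left, *right;
	size_t key;
};

/* Link nodes[lo..hi) into a balanced search tree; keys are already sorted. */
static struct node *build(struct node *nodes, size_t lo, size_t hi,
			  struct node *parent)
{
	struct node *n;
	size_t mid;

	if (lo >= hi)
		return NULL;
	mid = lo + (hi - lo) / 2;
	n = &nodes[mid];
	n->parent = parent;
	n->left = build(nodes, lo, mid, n);
	n->right = build(nodes, mid + 1, hi, n);
	return n;
}

/* In-order successor: the same idea as rb_next(), on the toy node type. */
static struct node *next_node(struct node *n)
{
	if (n->right) {
		for (n = n->right; n->left; n = n->left)
			;
		return n;
	}
	while (n->parent && n == n->parent->right)
		n = n->parent;
	return n->parent;
}

/* Worst-case number of nodes visited when descending from the root to a key. */
static unsigned max_descent_steps(size_t count)
{
	struct node *nodes = calloc(count, sizeof(*nodes));
	struct node *root, *n;
	unsigned worst = 0;
	size_t i;

	if (!nodes)
		return 0;
	for (i = 0; i < count; i++)
		nodes[i].key = i;
	root = build(nodes, 0, count, NULL);
	for (i = 0; i < count; i++) {
		unsigned steps = 0;

		for (n = root; n; n = i < n->key ? n->left : n->right) {
			steps++;
			if (n->key == i)
				break;
		}
		if (steps > worst)
			worst = steps;
	}
	free(nodes);
	return worst;
}

/* Number of nodes visited when walking the whole tree via next_node(). */
static unsigned full_walk_steps(size_t count)
{
	struct node *nodes = calloc(count, sizeof(*nodes));
	struct node *n;
	unsigned steps = 0;
	size_t i;

	if (!nodes || count == 0) {
		free(nodes);
		return 0;
	}
	for (i = 0; i < count; i++)
		nodes[i].key = i;
	build(nodes, 0, count, NULL);
	/* nodes[0] holds the smallest key, so it is the leftmost node. */
	for (n = &nodes[0]; n; n = next_node(n))
		steps++;
	free(nodes);
	return steps;
}
```

For the 96 entries mentioned above, max_descent_steps(96) visits 7
nodes and full_walk_steps(96) visits 96.  Neither count comes anywhere
near half a millisecond of work, so if the time really is going into
successor steps, the tree being walked would have to hold far more
nodes than /proc/vmallocinfo lists.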

-- 
error compiling committee.c: too many arguments to function

