public inbox for kvm@vger.kernel.org
From: bugzilla-daemon@kernel.org
To: kvm@vger.kernel.org
Subject: [Bug 217380] New: Latency issues in creating kvm-nx-lpage-recovery kthread
Date: Fri, 28 Apr 2023 07:30:14 +0000	[thread overview]
Message-ID: <bug-217380-28872@https.bugzilla.kernel.org/> (raw)

https://bugzilla.kernel.org/show_bug.cgi?id=217380

            Bug ID: 217380
           Summary: Latency issues in creating kvm-nx-lpage-recovery
                    kthread
           Product: Virtualization
           Version: unspecified
          Hardware: All
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P3
         Component: kvm
          Assignee: virtualization_kvm@kernel-bugs.osdl.org
          Reporter: zhuangel570@gmail.com
        Regression: No

Hi

We found latency issues in high-density, high-concurrency scenarios. We are
using Cloud Hypervisor as the VMM for lightweight VMs, with VIRTIO net and
block devices. In our tests, we saw about 50ms to 100ms+ of latency when
creating a VM and registering an irqfd. After tracing with funclatency (a tool
from bcc-tools, https://github.com/iovisor/bcc), we found the latency is
introduced by the following function:

- kvm_vm_create_worker_thread introduces tail latency of more than 100ms.
  This function is called to create the "kvm-nx-lpage-recovery" kthread when
  a new VM is created. The kthread was introduced to recover large pages and
  relieve the performance loss caused by the software mitigation of
  ITLB_MULTIHIT; see b8e8c8303ff2 ("kvm: mmu: ITLB_MULTIHIT mitigation") and
  1aa9b9572b10 ("kvm: x86: mmu: Recovery of shattered NX large pages").

Here is a simple test case which reproduces the latency issue (the real
latency is larger). The case creates 800 idle VMs as background load, then
repeatedly creates 20 VMs and destroys them after 400ms, while tracing the
latency of the functions involved; this reproduces the issue. Here is a trace
log from a Xeon(R) Platinum 8255C server (96 cores, 2 sockets) running
Linux 6.2.20.

Reproduce Case
https://github.com/zhuangel/misc/blob/main/test/kvm_irqfd_fork/kvm_irqfd_fork.c
Reproduce log
https://github.com/zhuangel/misc/blob/main/test/kvm_irqfd_fork/test.log
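
For reference, the tracing step above can be sketched roughly as follows
(assuming bcc-tools is installed; the exact flags and tool name, which may be
funclatency-bpfcc on some distributions, are an assumption, not taken from the
test log):

```shell
# While the reproduce case is running, histogram the latency of
# kvm_vm_create_worker_thread in milliseconds for 60 seconds.
# Requires root and bcc-tools; function name as reported above.
sudo funclatency -m -d 60 kvm_vm_create_worker_thread
```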

I don't have a graceful method to fix these latencies; my simple idea is to
give the user a chance to avoid them, e.g. a module parameter to disable the
"kvm-nx-lpage-recovery" kthread.

Any suggestions on fixing this issue are welcome.
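
One possible shape for such a knob, as a rough sketch only (the parameter name
nx_lpage_recovery and its placement are hypothetical, not existing KVM code;
only the hook point kvm_mmu_post_init_vm and the kthread name come from the
report above):

```c
/* Hypothetical sketch: a module parameter letting the admin skip
 * creation of the per-VM "kvm-nx-lpage-recovery" kthread entirely,
 * trading NX large-page recovery for lower VM-creation latency.
 * Parameter name and exact call details are illustrative. */
static bool nx_lpage_recovery = true;
module_param(nx_lpage_recovery, bool, 0444);
MODULE_PARM_DESC(nx_lpage_recovery,
		 "Create the NX large-page recovery kthread per VM (default: true)");

int kvm_mmu_post_init_vm(struct kvm *kvm)
{
	if (!nx_lpage_recovery)
		return 0;	/* skip kthread creation, avoid the latency */

	return kvm_vm_create_worker_thread(kvm, kvm_nx_lpage_recovery_worker,
					   0, "kvm-nx-lpage-recovery",
					   &kvm->arch.nx_lpage_recovery_thread);
}
```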

Thanks!

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.
