Re: [PATCH v4 2/3] LoongArch: KVM: Implement guest-side PV TLB flush

From: Tao Cui <cui.tao@linux.dev>
To: Bibo Mao <maobibo@loongson.cn>,
	zhaotianrui@loongson.cn, chenhuacai@kernel.org,
	loongarch@lists.linux.dev
Cc: cui.tao@linux.dev, kernel@xen0n.name, kvm@vger.kernel.org,
	Tao Cui <cuitao@kylinos.cn>
Subject: Re: [PATCH v4 2/3] LoongArch: KVM: Implement guest-side PV TLB flush
Date: Thu, 25 Jun 2026 20:51:45 +0800
Message-ID: <416dfbf8-f765-442e-b6de-6fc0fe1a4b5f@linux.dev>
In-Reply-To: <835a9c5c-2f66-293b-d093-fc59bac26a01@loongson.cn>



On 2026/6/25 15:36, Bibo Mao wrote:
> 
> 
> On 2026/6/25 3:15 PM, Tao Cui wrote:
>>
>>
>> On 2026/6/25 14:11, Bibo Mao wrote:
>>>
>>>
>>> On 2026/6/25 11:31 AM, Bibo Mao wrote:
>>>>
>>>>
>>>> On 2026/6/25 10:27 AM, Tao Cui wrote:
>>>>>
>>>>> Hi Bibo,
>>>>>
>>>>> On 2026/6/17 09:05, Bibo Mao wrote:
>>>>>>
>>>>>>> Rather than argue from intuition, I'd like to try the hypercall approach
>>>>>>> you suggested and measure the performance improvement against the current
>>>>>>> path. I'll share the results with you once the testing is done, so we
>>>>>>> can decide the direction based on the numbers.
>>>>>> Well, that would be best. It is my pleasure to discuss this with you.
>>>>>>
>>>>>
>>>>> A quick update on the testing. I put both the hypercall and the
>>>>> steal-time variants through two benchmarks on an 8-core host with a
>>>>> 4:1 overcommitted guest (32 vCPUs), and wanted to share where things
>>>>> stand.
>>>>>
>>>>> The two workloads:
>>>>>    - ebizzy (all threads busy, mm-flush heavy)
>>>>>    - tlb_bench in sleep-idle mode (1 flusher + 31 sleeping idle threads,
>>>>>      so the idle vCPUs get preempted)
>>>>>
>>>>> ebizzy (records/s, higher is better), 32 vCPUs:
>>>>>     no-PV       ~103,737
>>>>>     hypercall   ~105,779
>>>>>     steal-time  ~105,872
>>>>>     -> all within noise (±2%); no measurable difference.
>>>> Hi Tao,
>>>>
>>>> What is the ebizzy command? ebizzy -m or ebizzy -M?
>>>>
>>>> Could you try the command on the host and on one VM without overcommit first, and then with two VMs and three VMs?
>>>>
>>>> Here are the results on my 3C5000 Dual-way machine with 32 cores and two NUMA nodes:
>>>>                   ./ebizzy -m          ./ebizzy -M
>>>> host             8633                 158898
>>>> VM(32 vCPUs)     6610                 133153
>>>> VM/host          76%                  83%
>>>>
>>> Just ./ebizzy -M is enough; it seems that the CPU count is one key factor.
>>>
>>
>> Sorry for the delay — it turned out my ebizzy command was wrong. I had
>> been running `ebizzy -t <vcpus> -S 10`, which is neither -m nor -M, so
>> neither mmap mode was active and the workload wasn't really stressing
>> the TLB-flush path. Thanks for catching it.
>>
>> I re-ran with -m and -M on host and a single VM (8-core LoongArch
>> KVM host, 8 vCPU guest, 1:1, no overcommit).
>>
>>                   ebizzy -m        ebizzy -M
>> host             ~20,000          ~55,000
>> VM (8 vCPU,1:1)  ~17,000          ~53,000
>> VM/host          ~86%             ~97%
>>
>> The -m ratio (86%) is close to your 76%.
> On my 3C5000 Dual-way machine, the VM has the same CPU/memory topology as the physical machine, and the kernel is mainline without any patches.
>                           ./ebizzy -M
> Host (32 pCPUs)           158898
> One VM(32 vCPUs)          133153                 83% of host
> Two VMs(32 vCPUs each)    9083 + 9630 = 18713    11% of host
> 
> It seems that with the ebizzy benchmark there is a big difference when vCPUs are preempted. Even when no vCPU is preempted, performance is only 83% of the host on my 3C5000 Dual-way machine.

After fixing the ebizzy command, I now have multi-VM overcommit
results for both approaches.

Setup: 8-core LoongArch host (single socket, single NUMA node), KVM,
linux-next-20260623, 8 vCPUs per VM. All VMs run ebizzy -M
simultaneously, 3 runs each. The PV-off baseline uses a guest kernel
with CONFIG_PARAVIRT=y but without the PV TLB flush patches, so
PV IPI and steal-time are active in all three columns.

ebizzy -M, total records/s across all VMs:

              PV-off      steal-time   hypercall
  1:1 (1VM)   53,600      53,800       53,900
  2:1 (2VM)    2,600      42,600       45,300
  3:1 (3VM)    2,800      44,700       46,000

At 1:1 there is no difference, since no vCPU gets preempted. Under
overcommit, without PV TLB flush the total throughput drops to ~5% of
the single-VM case, because every remote TLB flush sends IPIs to
preempted vCPUs and then waits for them to run again. With either PV
TLB flush variant, preempted vCPUs are skipped, and total throughput
stays at ~79-85% of single-VM (bounded by physical cores).
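
For anyone who wants to recheck the percentages, here is a small
Python sketch that recomputes them from the table above. The numbers
are copied from the table; the dictionary keys and function names are
only labels chosen for this example.

```python
# Total ebizzy -M records/s across all VMs, copied from the table above.
RESULTS = {
    "1:1": {"pv_off": 53_600, "steal_time": 53_800, "hypercall": 53_900},
    "2:1": {"pv_off": 2_600, "steal_time": 42_600, "hypercall": 45_300},
    "3:1": {"pv_off": 2_800, "steal_time": 44_700, "hypercall": 46_000},
}


def retained(ratio: str, variant: str) -> float:
    """Throughput at `ratio` as a percentage of the same variant's 1:1 figure."""
    return 100.0 * RESULTS[ratio][variant] / RESULTS["1:1"][variant]


def hypercall_gain(ratio: str) -> float:
    """How far hypercall is above steal-time at `ratio`, in percent."""
    row = RESULTS[ratio]
    return 100.0 * (row["hypercall"] / row["steal_time"] - 1.0)


if __name__ == "__main__":
    for ratio in ("2:1", "3:1"):
        cols = ", ".join(
            f"{variant}={retained(ratio, variant):.1f}%"
            for variant in RESULTS[ratio]
        )
        gain = hypercall_gain(ratio)
        print(f"{ratio}: {cols}; hypercall vs steal-time {gain:+.1f}%")
```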

Hypercall is consistently 3-6% above steal-time in the overcommit
cases. A possible reason is that the hypercall hands the entire
target set to the host in one call, while steal-time still IPIs the
running vCPUs and only defers the preempted ones.
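
To make that guess concrete, here is a toy Python model of the
guest-side work in the two variants. It is only a sketch of the
reasoning above, not the patch code: the VCpu class, its fields and
both function names are hypothetical, and what the host does with
the target set after the hypercall is deliberately not modelled.

```python
from dataclasses import dataclass


@dataclass
class VCpu:
    """Hypothetical per-vCPU state; the field names are illustrative only."""
    vid: int
    preempted: bool               # assumed to be published by the host
    flush_pending: bool = False   # assumed to be honoured before the vCPU runs again


def steal_time_flush(targets: list[VCpu]) -> dict[str, int]:
    """Guest walks the targets: IPI the running ones, defer the preempted ones."""
    ipis = 0
    deferred = 0
    for vcpu in targets:
        if vcpu.preempted:
            vcpu.flush_pending = True   # no IPI; flush happens when it is rescheduled
            deferred += 1
        else:
            ipis += 1                   # running vCPU still gets an IPI from the guest
    return {"guest_ipis": ipis, "deferred": deferred, "hypercalls": 0}


def hypercall_flush(targets: list[VCpu]) -> dict[str, int]:
    """Guest hands the whole target set to the host in one call."""
    # Assumption: the guest sends no IPIs itself on this path. Host-side
    # handling of each target is outside this model.
    return {"guest_ipis": 0, "deferred": 0, "hypercalls": 1}


if __name__ == "__main__":
    # 8 targets, every second one preempted (an arbitrary example state).
    targets = [VCpu(vid=i, preempted=(i % 2 == 0)) for i in range(8)]
    print("steal-time:", steal_time_flush(targets))
    print("hypercall: ", hypercall_flush(targets))
```

The model only counts operations issued by the guest; it says nothing
about their relative cost, which is what the measurements are for.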

On our 8-core machine the collapse is more severe than on your
3C5000 (~5% vs 11% of host at 2 VMs), likely due to the smaller
core count and single-NUMA topology.

Thanks,
Tao

> 
> Regards
> Bibo Mao
>>
>> I then tried multi-VM overcommit (2 and 3 VMs, all running ebizzy -M
>> simultaneously). The initial result showed a large gap between the
>> PV-TLB-flush kernel and the baseline under overcommit. Both kernels
>> have CONFIG_PARAVIRT enabled (PV IPI and steal-time are active in
>> both), so PV TLB flush should be the main differentiator — but since
>> they are two separate kernel images rather than a clean on/off toggle,
>> there may be some noise from other differences. I'm now re-running
>> with a QEMU CPU property (kvm-pv-tlb-flush on/off) on the same kernel
>> to isolate the effect cleanly.
>>
>> I'll share the verified numbers once I have them.
>>
>> Thanks,
>> Tao
>>
>>>> Regards
>>>> Bibo Mao
>>>>
>>>>>
>>>>> tlb_bench sleep-idle (ns/flush, lower is better), 1 flusher + 31 idle:
>>>>>     no-PV       ~166,536
>>>>>     steal-time  ~149,553
>>>>>     hypercall    ~88,686
>>>>>
>>>>> ebizzy's workload is mostly threads staying busy with alloc/copy/free,
>>>>> which drives remote TLB flushes against running vCPUs — that may not be
>>>>> the path this feature is meant to optimize, so the flat result there
>>>>> probably says more about the workload mismatch than about the feature
>>>>> itself. I need to take another look at whether the benchmark actually
>>>>> exercises the cases PV TLB flush targets before reading too much into
>>>>> the numbers, including the tlb_bench figure above.
>>>>>
>>>>> So I'd hold off on any conclusion for now. Next I'll re-examine the
>>>>> test setup / pick a workload that better matches the feature, and keep
>>>>> you posted once I have something more representative.
>>>>>
>>>>> Best,
>>>>> Tao
>>>>>
>>>
>
