* Question: 'pmu' kvm unit test fails when run nested with NMI watchdog on the host
From: mlevitsk @ 2025-11-05 20:29 UTC
To: kvm; +Cc: Sean Christopherson

Hi,

I have a small, a bit philosophical question about the pmu kvm unit test:

One of the subtests of this test, tests all GP counters at once, and it depends on the NMI watchdog being disabled,
because it occupies one GP counter.

This works fine, except when this test is run nested. In this case, assuming that the host has the NMI watchdog enabled,
the L1 still can’t use all counters and has no way of working this around.

Since AFAIK the current long term direction is vPMU, which is especially designed to address those kinds of issues,
I am not sure it is worthy to attempt to fix this at L0 level (by reducing the number of counters that the guest can see for example,
which also won’t always fix the issue, since there could be more perf users on the host, and NMI watchdog can also
get dynamically enabled and disabled).

My question is: Since the test fails and since it interferes with CI, does it make sense to add a workaround to the test,
by making it use 1 counter less if run nested?

As a bonus the test can also check the NMI watchdog state and also reduce the number of tested counters instead of being skipped,
improving coverage.

Does all this make sense? If not, what about making the ‘all_counters’ testcase optional (only print a warning) in case the test is run nested?

Best regards,
Maxim Levitsky

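For readers who want the proposal spelled out, the following is a minimal sketch of the counter arithmetic described in the message above. It is illustrative only: the helper names are hypothetical and are not part of kvm-unit-tests, and it assumes a Linux system where `/proc/sys/kernel/nmi_watchdog` reads `1` while the watchdog is enabled. A nested guest cannot read the L0 host's copy of that file, so in the nested case the "one counter less" rule would have to be applied blindly.

```python
from pathlib import Path

# Linux sysctl file that reports the NMI watchdog state of the *local* kernel.
NMI_WATCHDOG_PATH = Path("/proc/sys/kernel/nmi_watchdog")


def nmi_watchdog_enabled(path: Path = NMI_WATCHDOG_PATH) -> bool:
    """Return True if the sysctl file at `path` reports an enabled watchdog.

    A missing or unreadable file is treated as "disabled"; that is an
    assumption made for this sketch, not something stated in the thread.
    """
    try:
        return path.read_text().strip() == "1"
    except OSError:
        return False


def gp_counters_to_test(total_gp: int, watchdog_on: bool) -> int:
    """Hypothetical helper: leave one GP counter for the NMI watchdog."""
    reserved = 1 if watchdog_on else 0
    return max(total_gp - reserved, 0)
```

The sketch also shows the limitation raised in the message itself: subtracting a fixed count of one does not account for other perf users on the host, or for the watchdog being toggled while the test runs.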
* Re: Question: 'pmu' kvm unit test fails when run nested with NMI watchdog on the host
From: mlevitsk @ 2025-11-10 19:51 UTC
To: kvm; +Cc: Sean Christopherson

On Wed, 2025-11-05 at 15:29 -0500, mlevitsk@redhat.com wrote:
> [...]

Kind ping on this question.

Best regards,
Maxim Levitsky

* Re: Question: 'pmu' kvm unit test fails when run nested with NMI watchdog on the host
From: mlevitsk @ 2025-11-26 18:14 UTC
To: kvm; +Cc: Sean Christopherson

On Mon, 2025-11-10 at 14:51 -0500, mlevitsk@redhat.com wrote:
> On Wed, 2025-11-05 at 15:29 -0500, mlevitsk@redhat.com wrote:
> > [...]
>
> Kind ping on this question.

Another kind ping on this question.

Best regards,
Maxim Levitsky

* Re: Question: 'pmu' kvm unit test fails when run nested with NMI watchdog on the host
From: mlevitsk @ 2026-02-10 15:23 UTC
To: kvm; +Cc: Sean Christopherson

On Wed, 2025-11-26 at 13:14 -0500, mlevitsk@redhat.com wrote:
> On Mon, 2025-11-10 at 14:51 -0500, mlevitsk@redhat.com wrote:
> > On Wed, 2025-11-05 at 15:29 -0500, mlevitsk@redhat.com wrote:
> > > [...]
> >
> > Kind ping on this question.
>
> Another kind ping on this question.

A ping on this question.

Best regards,
Maxim Levitsky

* Re: Question: 'pmu' kvm unit test fails when run nested with NMI watchdog on the host
From: Sean Christopherson @ 2026-02-25 1:07 UTC
To: mlevitsk; +Cc: kvm

On Wed, Nov 05, 2025, mlevitsk@redhat.com wrote:
> Hi,
>
> I have a small, a bit philosophical question about the pmu kvm unit test:

The problem with philosophical questions is that they're never small :-)

> One of the subtests of this test, tests all GP counters at once, and it
> depends on the NMI watchdog being disabled, because it occupies one GP
> counter.
>
> This works fine, except when this test is run nested. In this case, assuming
> that the host has the NMI watchdog enabled, the L1 still can’t use all
> counters and has no way of working this around.
>
> Since AFAIK the current long term direction is vPMU, which is especially
> designed to address those kinds of issues, I am not sure it is worthy to
> attempt to fix this at L0 level (by reducing the number of counters that the
> guest can see for example, which also won’t always fix the issue, since there
> could be more perf users on the host, and NMI watchdog can also get
> dynamically enabled and disabled).

Agreed. For the emulated PMU, I think the only reasonable answer is that the
admin needs to understand the ramifications of exposing a PMU to the guest.

> My question is: Since the test fails and since it interferes with CI, does it
> make sense to add a workaround to the test, by making it use 1 counter less
> if run nested?

Hrm. I'd prefer not to? Mainly because reducing the number of used counters
seems fragile as it relies heavily on implementation details of pieces of the
stack beyond the current environment (the VM).

I don't suppose there's any way to configure your CI pipeline to disable the
host NMI watchdog?

> As a bonus the test can also check the NMI watchdog state and also reduce the
> number of tested counters instead of being skipped, improving coverage.

I don't think I followed this part. How would a test that runs nested be able
to query the host's NMI watchdog state?

Oh, you're saying in a non-nested scenario to reduce the number of counters.
For me personally, I prefer the SKIP, because it's noisier, i.e. tells me pretty
loudly that I forgot to turn off the watchdog. It's saved me from debugging
false failures at least once when running tests in a VM on the same host.

> Does all this make sense? If not, what about making the ‘all_counters’
> testcase optional (only print a warning) in case the test is run nested?

Printing a warning would definitely be my least favorite option. Tests that
print warns on failure inevitably get ignored. :-/

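The "SKIP loudly" preference in the reply above can be sketched as a pre-flight step that a CI job might run on the L0 host before launching the test. This is a hypothetical illustration, not the kvm-unit-tests implementation: the function name and the status strings are invented here, and the only external fact relied on is that Linux exposes the watchdog state in `/proc/sys/kernel/nmi_watchdog`.

```python
from pathlib import Path
from typing import Tuple


def preflight(watchdog_file: Path) -> Tuple[str, str]:
    """Return (status, message) for a hypothetical CI pre-flight step.

    The status is "RUN" only when the file reads "0". Any other value, or a
    file that cannot be read, yields "SKIP" with an explicit reason, so a
    forgotten watchdog is reported loudly instead of being worked around.
    """
    try:
        state = watchdog_file.read_text().strip()
    except OSError as err:
        return "SKIP", f"cannot read {watchdog_file}: {err}"
    if state != "0":
        return "SKIP", f"{watchdog_file} reads {state!r}; disable the NMI watchdog first"
    return "RUN", "NMI watchdog is disabled"
```

A CI wrapper could call `preflight(Path("/proc/sys/kernel/nmi_watchdog"))` on the host and mark the job as skipped, rather than passed with a warning, whenever the status is not `"RUN"`.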
* Re: Question: 'pmu' kvm unit test fails when run nested with NMI watchdog on the host
From: mlevitsk @ 2026-02-25 16:02 UTC
To: Sean Christopherson; +Cc: kvm

On Tue, 2026-02-24 at 17:07 -0800, Sean Christopherson wrote:
> On Wed, Nov 05, 2025, mlevitsk@redhat.com wrote:
> > Hi,
> >
> > I have a small, a bit philosophical question about the pmu kvm unit test:
>
> The problem with philosophical questions is that they're never small :-)
>
> > One of the subtests of this test, tests all GP counters at once, and it
> > depends on the NMI watchdog being disabled, because it occupies one GP
> > counter.
> >
> > This works fine, except when this test is run nested. In this case, assuming
> > that the host has the NMI watchdog enabled, the L1 still can’t use all
> > counters and has no way of working this around.
> >
> > Since AFAIK the current long term direction is vPMU, which is especially
> > designed to address those kinds of issues, I am not sure it is worthy to
> > attempt to fix this at L0 level (by reducing the number of counters that the
> > guest can see for example, which also won’t always fix the issue, since there
> > could be more perf users on the host, and NMI watchdog can also get
> > dynamically enabled and disabled).
>
> Agreed. For the emulated PMU, I think the only reasonable answer is that the
> admin needs to understand the ramifications of exposing a PMU to the guest.
>
> > My question is: Since the test fails and since it interferes with CI, does it
> > make sense to add a workaround to the test, by making it use 1 counter less
> > if run nested?
>
> Hrm. I'd prefer not to? Mainly because reducing the number of used counters
> seems fragile as it relies heavily on implementation details of pieces of the
> stack beyond the current environment (the VM).

OK, then I'll leave it as is.

> I don't suppose there's any way to configure your CI pipeline to disable the
> host NMI watchdog?

It is probably possible, I'll ask the CI people.

> > As a bonus the test can also check the NMI watchdog state and also reduce the
> > number of tested counters instead of being skipped, improving coverage.
>
> I don't think I followed this part. How would a test that runs nested be able
> to query the host's NMI watchdog state?
>
> Oh, you're saying in a non-nested scenario to reduce the number of counters.
> For me personally, I prefer the SKIP, because it's noisier, i.e. tells me pretty
> loudly that I forgot to turn off the watchdog. It's saved me from debugging
> false failures at least once when running tests in a VM on the same host.

Yes, I am talking here about non nested scenario.

> > Does all this make sense? If not, what about making the ‘all_counters’
> > testcase optional (only print a warning) in case the test is run nested?
>
> Printing a warning would definitely be my least favorite option. Tests that
> print warns on failure inevitably get ignored. :-/

OK, let it be as it is now.

Thanks,
Best regards,
Maxim Levitsky