From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-pj1-f73.google.com (mail-pj1-f73.google.com [209.85.216.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 27E33312815 for ; Mon, 27 Oct 2025 16:52:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.73 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1761583977; cv=none; b=Q+oukf9bUH1bjD64+oXanWxpmVlyMFbQzpcwc9kbG2Vo4BMi0/QUSqL5BjFHCXdJ9DKCkMkw1IbZf5qj+RPixnhP4Pp8aB7uyJQ2Rg1ta0i3pC24ufi+K5pF3TyW3Fo7J6BKmMLImkQrfS3b3XoazGDMjbSnLbAZeRbYPMvTeFw= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1761583977; c=relaxed/simple; bh=QmotaY3r0h035ni9QQnUzfLBB1F8gP8eFddY8fyI+JQ=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=WvPZjS6J4cpmLIv1+/YEEy5QOVNID0qsmKameQ4NzPEF39fPNSz1XgqTK+N/sQt6fAlqeZ9gSdpD/WgUFjHsbqqWcT7a8YDUmb8LuoN3mi3Gn+qQ8C2r1AT2tLT4LMlIW1tDtPxtPnXdqJqNFuJqKIjaAELRc7yqnuivFAHS4SI= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=p0SE22xG; arc=none smtp.client-ip=209.85.216.73 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="p0SE22xG" Received: by mail-pj1-f73.google.com with SMTP id 98e67ed59e1d1-336b646768eso5178181a91.1 for ; Mon, 27 Oct 2025 09:52:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1761583974; x=1762188774; darn=vger.kernel.org; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:from:to:cc:subject:date:message-id :reply-to; bh=nci3vD9+uQYKkn0RLtYJrNDkeeIN+3Boy1aojm3SBYc=; b=p0SE22xGt+VlAqWqlPOzUOHNnUjZgdlYbBVPMP5DXsTAv+lzkAfm8mVdY0Y5bsVomT YLsBQn/M7Gg/S9yUJbgUwhPSHXaPQ9ge5F4IwIwA2cqB6vsSIV97ilQ79sIkOD/2oBIx FywvfoaV3ANnzp7qCS7+P+qn1w051c/A4JO4X5kde5HNg2Gr/RJM7Fg+jitb0Dz5al36 /ALVx32L1UxiTj+wz9BbnVvNdEmW0kxbIg20kclmL8dU0F2Ql9E9rq5OMDCYiisNg9Rc gQaD5TgcD232lQnHpf9lIZuSebijfIV00eFtCs+AhyAZhuzilNerFGURy4AMUVp9dDrT UAnQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1761583974; x=1762188774; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:x-gm-message-state:from:to:cc:subject :date:message-id:reply-to; bh=nci3vD9+uQYKkn0RLtYJrNDkeeIN+3Boy1aojm3SBYc=; b=Yco7l21zYB1xZt0siDCtkeQ7nQ5S9IwgVoDRCL6YS0HQHbNIInhJ55MhswlsglIVI2 wIt2/8bFw6xKZsSrL30G6B5btOMLVzrwpSyNyzhGkFpqsZ0bGgcHFtawvFoDlZ0eJFeu l+1pRNkQajryVYReneIs5/rGHX0dG7umfyG8ASjw/Szsevf8QrMGrKJITAzRajZD5L8c Smj8wYtfZnHL+I/5vlHgCiDb0i5v7CXRNpheKFTHFQ+wsW4tgY/NFlRufqUnFIovzKro UTW1hQXVaprbkMGAER5ay2pmYnfod1+DOiIe2053RHYJx8dGfnbUuyL/10ZPeusgEqFC nweA== X-Forwarded-Encrypted: i=1; AJvYcCUAszEjJd8qk6T4vON7zM2+u8Kd2XsPs+UNEA8LnbpkLDdcS00x3MGzOPXnjlXdYBP2Wq4=@vger.kernel.org X-Gm-Message-State: AOJu0YwIOAKBBxaRiM3SIRAx2T+iWFJ2xOm7lh+2ollGl2deVkhhCth6 SZuvXfU0kfCknXHEuw12M9M364zawgouiuKyCwtyHE06KBum1P8ht3SN1+49J9RRlTbGDlxLcYQ HxWAV3w== X-Google-Smtp-Source: AGHT+IFcqiR+FzM+C6QnC6le2s5MaEaeD2bzjCHYTABVRUvKH+HTCSUhlvM0hPFJOWEAvlnisl+RfK7Q2jY= X-Received: from pjyd7.prod.google.com ([2002:a17:90a:dfc7:b0:33b:be14:2b6b]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:1a89:b0:327:f216:4360 with SMTP id 98e67ed59e1d1-340279f2bf7mr578346a91.8.1761583974543; Mon, 27 Oct 2025 09:52:54 -0700 (PDT) Date: Mon, 27 Oct 2025 09:52:52 -0700 In-Reply-To: <20250912222525.2515416-3-dmatlack@google.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250912222525.2515416-1-dmatlack@google.com> <20250912222525.2515416-3-dmatlack@google.com> Message-ID: Subject: Re: [PATCH 2/2] KVM: selftests: Add a test for vfio-pci device IRQ delivery to vCPUs From: Sean Christopherson To: David Matlack Cc: Paolo Bonzini , Alex Williamson , kvm@vger.kernel.org Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable On Fri, Sep 12, 2025, David Matlack wrote:=20 > Add a new selftest called vfio_pci_irq_test that routes and delivers an > MSI from a vfio-pci device into a guest. This test builds on top of the > VFIO selftests library, which provides helpers for interacting with VFIO > devices and drivers for generating interrupts with specific devices. >=20 > This test sets up a configurable number of vCPUs in separate threads > that all spin in guest-mode or halt. Then the test round robin routes > the device's interrupt to different CPUs, triggers it, and then verifies > the guest received it. The test supports several options to enable > affinitizing the host IRQ handler to different CPUs, pinning vCPU > threads to different CPUs, and more. >=20 > This test also measure and reports the number of times the device IRQ > was handled by the host. This can be used to confirm whether > device-posted interrupts are working as expected. >=20 > Running this test requires a PCI device bound to the vfio-pci driver, > and then passing the BDF of the device to the test, e.g.: >=20 > $ ./vfio_pci_irq_test 0000:6a:01.0 >=20 > To run the test with real device-sent MSIs (-d option), the PCI device > must also have a supported driver in > tools/testing/selftests/vfio/lib/drivers/. >=20 > This test only supports x86_64 for now, but can be ported to other > architectures in the future. Can it though? There are bits and pieces that can be reused, but this test= is x86 through and through. > Suggested-by: Sean Christopherson > Link: https://lore.kernel.org/kvm/20250404193923.1413163-68-seanjc@google= .com/ > Signed-off-by: David Matlack > --- > tools/testing/selftests/kvm/Makefile.kvm | 1 + > .../testing/selftests/kvm/vfio_pci_irq_test.c | 507 ++++++++++++++++++ Please break this into multiple patches, e.g. a "basic" patch and and then = roughly one per "advanced" command line option. There is a _lot_ going on here, with very little documentation, e.g. the ch= angelog just says: The test supports several options to enable affinitizing the host IRQ han= dler to different CPUs, pinning vCPU threads to different CPUs, and more.=20 which is rather useless for readers that aren't already familiar with much = of what is being tested, and why. > 2 files changed, 508 insertions(+) > create mode 100644 tools/testing/selftests/kvm/vfio_pci_irq_test.c >=20 > diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/sel= ftests/kvm/Makefile.kvm > index ac283eddb66c..fc1fb91a6810 100644 > --- a/tools/testing/selftests/kvm/Makefile.kvm > +++ b/tools/testing/selftests/kvm/Makefile.kvm > @@ -148,6 +148,7 @@ TEST_GEN_PROGS_x86 +=3D rseq_test > TEST_GEN_PROGS_x86 +=3D steal_time > TEST_GEN_PROGS_x86 +=3D system_counter_offset_test > TEST_GEN_PROGS_x86 +=3D pre_fault_memory_test > +TEST_GEN_PROGS_x86 +=3D vfio_pci_irq_test > =20 > # Compiled outputs used by test targets > TEST_GEN_PROGS_EXTENDED_x86 +=3D x86/nx_huge_pages_test > diff --git a/tools/testing/selftests/kvm/vfio_pci_irq_test.c b/tools/test= ing/selftests/kvm/vfio_pci_irq_test.c > new file mode 100644 > index 000000000000..ed6baa8f9d74 > --- /dev/null > +++ b/tools/testing/selftests/kvm/vfio_pci_irq_test.c > @@ -0,0 +1,507 @@ > +// SPDX-License-Identifier: GPL-2.0 > +#include "kvm_util.h" > +#include "test_util.h" > +#include "apic.h" > +#include "processor.h" > + > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include > + > +static bool x2apic =3D true; > +static bool done; > +static bool block; > + > +static bool guest_ready_for_irqs[KVM_MAX_VCPUS]; > +static bool guest_received_irq[KVM_MAX_VCPUS]; > +static bool guest_received_nmi[KVM_MAX_VCPUS]; > + > +static pid_t vcpu_tids[KVM_MAX_VCPUS]; s/tids/pids? > +#define TIMEOUT_NS (2ULL * 1000 * 1000 * 1000) The timeout should be configurable via command line. > +#define READ_FROM_GUEST(_vm, _variable) ({ \ > + sync_global_from_guest(_vm, _variable); \ > + READ_ONCE(_variable); \ > +}) > + > +#define WRITE_TO_GUEST(_vm, _variable, _value) do { \ > + WRITE_ONCE(_variable, _value); \ > + sync_global_to_guest(_vm, _variable); \ > +} while (0) These belong in a separate patch, and in tools/testing/selftests/kvm/includ= e/kvm_util.h. > + > +static u32 guest_get_vcpu_id(void) > +{ > + if (x2apic) > + return x2apic_read_reg(APIC_ID); > + else > + return xapic_read_reg(APIC_ID) >> 24; > +} > + > +static void guest_enable_interrupts(void) This is a misleading name, e.g. I would expect it to _just_ be sti_nop(). = If the intent is to provide a hook for a potential non-x86 implementation, then it= should probably be split into guest_arch_test_setup() and guest_arch_irq_enable() = or so (to align with the kernel's local_irq_{dis,en}able()). As is, I would just omit the helper. > +{ > + if (x2apic) > + x2apic_enable(); > + else > + xapic_enable(); > + > + sti_nop(); > +} > + > +static void guest_irq_handler(struct ex_regs *regs) > +{ > + WRITE_ONCE(guest_received_irq[guest_get_vcpu_id()], true); Hmm, using APID ID works, but I don't like the hidden dependency on the lib= rary using ascending IDs starting from '0'. This would also be a good opportuni= ty to improve the core infrastructure. Rather than use APIC IDs, vm_arch_vcpu_add() could set MSR_TSC_AUX if RDTSC= P or RDPID is supported, and then make guest_get_vcpu_id() a common API that use= s RDPID or RDTSCP (prefer RDPID when possible). Then tests that want to use guest_get_vcpu_id() can do a TEST_REQUIRE(). R= DTSCP exists at least as far back as Haswell, so I don't think it's unreasonable = to start depending on MSR_TSC_AUX. Hmm, or we could steal an idea from the kernel and use the LDT to stash inf= o in a segment limit as a fallback. > + > + if (x2apic) > + x2apic_write_reg(APIC_EOI, 0); > + else > + xapic_write_reg(APIC_EOI, 0); > +} > + > +static void guest_nmi_handler(struct ex_regs *regs) > +{ > + WRITE_ONCE(guest_received_nmi[guest_get_vcpu_id()], true); > +} > + > +static void guest_code(void) > +{ > + guest_enable_interrupts(); > + WRITE_ONCE(guest_ready_for_irqs[guest_get_vcpu_id()], true); > + > + while (!READ_ONCE(done)) { > + if (block) > + hlt(); else cpu_relax(); > + } > + > + GUEST_DONE(); > +} > + > +static void *vcpu_thread_main(void *arg) > +{ > + struct kvm_vcpu *vcpu =3D arg; > + struct ucall uc; > + > + WRITE_ONCE(vcpu_tids[vcpu->id], syscall(__NR_gettid)); Please add wrapper in tools/testing/selftests/kvm/include/kvm_syscalls.h. > + > + vcpu_run(vcpu); > + TEST_ASSERT_EQ(UCALL_DONE, get_ucall(vcpu, &uc)); > + > + return NULL; > +} > + > +static int get_cpu(struct kvm_vcpu *vcpu) > +{ > + pid_t tid =3D vcpu_tids[vcpu->id]; > + cpu_set_t cpus; > + int cpu =3D -1; > + int i, ret; > + > + ret =3D sched_getaffinity(tid, sizeof(cpus), &cpus); > + TEST_ASSERT(ret =3D=3D 0, "sched_getaffinity() failed"); > + > + for (i =3D 0; i < get_nprocs(); i++) { > + if (!CPU_ISSET(i, &cpus)) > + continue; > + > + if (cpu !=3D -1) { > + cpu =3D i; > + } else { > + /* vCPU is pinned to multiple CPUs */ > + return -1; > + } > + } > + > + return cpu; > +} > + > +static void pin_vcpu_threads(int nr_vcpus, int start_cpu, cpu_set_t *ava= ilable_cpus) > +{ > + const size_t size =3D sizeof(cpu_set_t); > + int nr_cpus, cpu, vcpu_index =3D 0; > + cpu_set_t target_cpu; > + int r; > + > + nr_cpus =3D get_nprocs(); Generally speaking, KVM selftests try to avoid affining tasks to CPUs that = are outside of the original affinity list. See various usage of sched_getaffin= ity(). > + CPU_ZERO(&target_cpu); > + > + for (cpu =3D start_cpu;; cpu =3D (cpu + 1) % nr_cpus) { > + if (vcpu_index =3D=3D nr_vcpus) > + break; > + > + if (!CPU_ISSET(cpu, available_cpus)) > + continue; > + > + CPU_SET(cpu, &target_cpu); > + > + r =3D sched_setaffinity(vcpu_tids[vcpu_index], size, &target_cpu); > + TEST_ASSERT(r =3D=3D 0, "sched_setaffinity() failed (cpu: %d)", cpu); > + > + CPU_CLR(cpu, &target_cpu); > + > + vcpu_index++; > + } > +} > + > +static FILE *open_proc_interrupts(void) > +{ > + FILE *fp; > + > + fp =3D fopen("/proc/interrupts", "r"); > + TEST_ASSERT(fp, "fopen(/proc/interrupts) failed"); open_path_or_exit()? > + > + return fp; > +} And all of these /proc helpers belong in library code. > +static int get_irq_number(const char *device_bdf, int msi) > +{ > + char search_string[64]; > + char line[4096]; > + int irq =3D -1; > + FILE *fp; > + > + fp =3D open_proc_interrupts(); > + > + snprintf(search_string, sizeof(search_string), "vfio-msix[%d]", msi); > + > + while (fgets(line, sizeof(line), fp)) { > + if (strstr(line, device_bdf) && strstr(line, search_string)) { > + TEST_ASSERT_EQ(1, sscanf(line, "%d:", &irq)); > + break; > + } > + } > + > + fclose(fp); > + > + TEST_ASSERT(irq !=3D -1, "Failed to locate IRQ for %s %s", device_bdf, = search_string); > + return irq; > +} > + > +static int parse_interrupt_count(char *token) > +{ > + char *c; > + > + for (c =3D token; *c; c++) { > + if (!isdigit(*c)) > + return 0; > + } > + > + return atoi_non_negative("interrupt count", token); > +} > + > +static u64 __get_irq_count(const char *search_string) > +{ > + u64 total_count =3D 0; > + char line[4096]; > + FILE *fp; > + > + fp =3D open_proc_interrupts(); > + > + while (fgets(line, sizeof(line), fp)) { > + char *token =3D strtok(line, " "); > + > + if (strcmp(token, search_string)) > + continue; > + > + while ((token =3D strtok(NULL, " "))) > + total_count +=3D parse_interrupt_count(token); > + > + break; > + } > + > + fclose(fp); > + return total_count; > +} > + > +static u64 get_irq_count(int irq) > +{ > + char search_string[32]; > + > + snprintf(search_string, sizeof(search_string), "%d:", irq); > + return __get_irq_count(search_string); > +} > + > +static void kvm_clear_gsi_routes(struct kvm_vm *vm) > +{ > + struct kvm_irq_routing routes =3D {}; > + > + vm_ioctl(vm, KVM_SET_GSI_ROUTING, &routes); > +} > + > +static void kvm_route_msi(struct kvm_vm *vm, u32 gsi, struct kvm_vcpu *v= cpu, > + u8 vector, bool do_nmi) > +{ > + u8 buf[sizeof(struct kvm_irq_routing) + sizeof(struct kvm_irq_routing_e= ntry)] =3D {}; > + struct kvm_irq_routing *routes =3D (void *)&buf; > + > + routes->nr =3D 1; > + routes->entries[0].gsi =3D gsi; > + routes->entries[0].type =3D KVM_IRQ_ROUTING_MSI; > + routes->entries[0].u.msi.address_lo =3D 0xFEE00000 | (vcpu->id << 12); > + routes->entries[0].u.msi.data =3D do_nmi ? NMI_VECTOR | (4 << 8) : vect= or; > + > + vm_ioctl(vm, KVM_SET_GSI_ROUTING, routes); > +} > + > +static int setup_msi(struct vfio_pci_device *device, bool use_device_msi= ) > +{ > + const int flags =3D MAP_SHARED | MAP_ANONYMOUS; > + const int prot =3D PROT_READ | PROT_WRITE; > + struct vfio_dma_region *region; > + > + if (use_device_msi) { > + /* A driver is required to generate an MSI. */ > + TEST_REQUIRE(device->driver.ops); > + > + /* Set up a DMA-able region for the driver to use. */ Why? > + region =3D &device->driver.region; > + region->iova =3D 0; > + region->size =3D SZ_2M; > + region->vaddr =3D mmap(NULL, region->size, prot, flags, -1, 0); kvm_mmap() > + TEST_ASSERT(region->vaddr !=3D MAP_FAILED, "mmap() failed\n"); > + vfio_pci_dma_map(device, region); > + > + vfio_pci_driver_init(device); > + > + return device->driver.msi; > + } > + > + TEST_REQUIRE(device->msix_info.count > 0); > + vfio_pci_msix_enable(device, 0, 1); > + return 0; > +} > + > +static void send_msi(struct vfio_pci_device *device, bool use_device_msi= , int msi) IMO, this helper does more harm than good. There is only one real user, th= e second call unconditionally passes %false for @use_device_msi. If you drop the helper, than there should be no need to assert that the MSI= is the device MSI on *every* send via device. > +{ > + if (use_device_msi) { > + TEST_ASSERT_EQ(msi, device->driver.msi); > + vfio_pci_driver_send_msi(device); > + } else { > + vfio_pci_irq_trigger(device, VFIO_PCI_MSIX_IRQ_INDEX, msi); > + } > +} > + > +static void help(const char *name) > +{ > + printf("Usage: %s [-a] [-b] [-d] [-e] [-h] [-i nr_irqs] [-n] [-p] [-v n= r_vcpus] [-x] segment:bus:device.function\n", > + name); > + printf("\n"); > + printf(" -a: Randomly affinitize the device IRQ to different CPUs\n" > + " throughout the test.\n"); > + printf(" -b: Block vCPUs (e.g. HLT) instead of spinning in guest-mode\= n"); > + printf(" -d: Use the device to trigger the IRQ instead of emulating\n" > + " it with an eventfd write.\n"); > + printf(" -e: Destroy and recreate KVM's GSI routing table in between\n= " > + " some interrupts.\n"); > + printf(" -i: The number of IRQs to generate during the test.\n"); > + printf(" -n: Route some of the device interrupts to be delivered as\n" > + " an NMI into the guest.\n"); > + printf(" -p: Pin vCPU threads to random pCPUs throughout the test.\n")= ; > + printf(" -v: Set the number of vCPUs that the test should create.\n" > + " Interrupts will be round-robined among vCPUs.\n"); > + printf(" -x: Use xAPIC mode instead of x2APIC mode in the guest.\n"); > + printf("\n"); > + exit(KSFT_FAIL); > +} > + > +int main(int argc, char **argv) > +{ > + /* Random non-reserved vector and GSI to use for the device IRQ */ > + const u8 vector =3D 0xe0; s/random/arbitrary Why not make it truly random? > + const u32 gsi =3D 32; Ditto here. > + /* Test configuration (overridable by command line flags). */ > + bool use_device_msi =3D false, irq_affinity =3D false, pin_vcpus =3D fa= lse; > + bool empty =3D false, nmi =3D false; > + int nr_irqs =3D 1000; > + int nr_vcpus =3D 1; > + > + struct kvm_vcpu *vcpus[KVM_MAX_VCPUS]; > + pthread_t vcpu_threads[KVM_MAX_VCPUS]; > + u64 irq_count, pin_count, piw_count; > + struct vfio_pci_device *device; > + cpu_set_t available_cpus; > + const char *device_bdf; > + FILE *irq_affinity_fp; > + int i, j, c, msi, irq; > + struct kvm_vm *vm; > + int irq_cpu; > + int ret; > + > + device_bdf =3D vfio_selftests_get_bdf(&argc, argv); > + > + while ((c =3D getopt(argc, argv, "abdehi:npv:x")) !=3D -1) { > + switch (c) { > + case 'a': > + irq_affinity =3D true; > + break; > + case 'b': > + block =3D true; > + break; > + case 'd': > + use_device_msi =3D true; > + break; > + case 'e': > + empty =3D true; > + break; > + case 'i': > + nr_irqs =3D atoi_positive("Number of IRQs", optarg); > + break; > + case 'n': > + nmi =3D true; > + break; > + case 'p': > + pin_vcpus =3D true; > + break; > + case 'v': > + nr_vcpus =3D atoi_positive("nr_vcpus", optarg); > + break; > + case 'x': > + x2apic =3D false; > + break; > + case 'h': > + default: > + help(argv[0]); > + } > + } > + > + vm =3D vm_create_with_vcpus(nr_vcpus, guest_code, vcpus); > + vm_install_exception_handler(vm, vector, guest_irq_handler); > + vm_install_exception_handler(vm, NMI_VECTOR, guest_nmi_handler); > + > + if (!x2apic) > + virt_pg_map(vm, APIC_DEFAULT_GPA, APIC_DEFAULT_GPA); > + > + device =3D vfio_pci_device_init(device_bdf, default_iommu_mode); > + msi =3D setup_msi(device, use_device_msi); > + irq =3D get_irq_number(device_bdf, msi); > + > + irq_count =3D get_irq_count(irq); > + pin_count =3D __get_irq_count("PIN:"); > + piw_count =3D __get_irq_count("PIW:"); This is obviously very Intel specific information. If you're going to prin= t the posted IRQ info, then the test should also print e.g. AMD GALogIntr events. > + printf("%s %s MSI-X[%d] (IRQ-%d) %d times\n", > + use_device_msi ? "Triggering" : "Notifying the eventfd for", > + device_bdf, msi, irq, nr_irqs); > + > + kvm_assign_irqfd(vm, gsi, device->msi_eventfds[msi]); > + > + sync_global_to_guest(vm, x2apic); > + sync_global_to_guest(vm, block); > + > + for (i =3D 0; i < nr_vcpus; i++) > + pthread_create(&vcpu_threads[i], NULL, vcpu_thread_main, vcpus[i]); > + > + for (i =3D 0; i < nr_vcpus; i++) { > + struct kvm_vcpu *vcpu =3D vcpus[i]; > + > + while (!READ_FROM_GUEST(vm, guest_ready_for_irqs[vcpu->id])) > + continue; > + } > + > + if (pin_vcpus) { > + ret =3D sched_getaffinity(vcpu_tids[0], sizeof(available_cpus), &avail= able_cpus); > + TEST_ASSERT(ret =3D=3D 0, "sched_getaffinity() failed"); !ret Though this is another syscall that deserves a wrapper in kvm_syscalls.h. > + > + if (nr_vcpus > CPU_COUNT(&available_cpus)) { > + printf("There are more vCPUs than pCPUs; refusing to pin.\n"); > + pin_vcpus =3D false; Why is this not an assertion? Alternatively, why not double/triple/quadrup= le up as needed? > + } > + } > + > + if (irq_affinity) { > + char path[PATH_MAX]; > + > + snprintf(path, sizeof(path), "/proc/irq/%d/smp_affinity_list", irq); > + irq_affinity_fp =3D fopen(path, "w"); > + TEST_ASSERT(irq_affinity_fp, "fopen(%s) failed", path); More code that belongs in the library. > + } > + > + /* Set a consistent seed so that test are repeatable. */ > + srand(0); We should really figure out a solution for reproducible random numbers in t= he host. Ah, and kvm_selftest_init()'s handling of guest random seeds is flaw= ed, because it does random() without srand() and so AFAICT, gets the same seed = every time. E.g. seems like we want something like this, but with a way to overr= ide "random_seed" from a test. diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/sel= ftests/kvm/lib/kvm_util.c index 5744643d9ec3..0118fd2ba56b 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -2310,6 +2310,7 @@ void __attribute((constructor)) kvm_selftest_init(voi= d) struct sigaction sig_sa =3D { .sa_handler =3D report_unexpected_signal, }; + int random_seed; =20 /* Tell stdout not to buffer its content. */ setbuf(stdout, NULL); @@ -2319,8 +2320,13 @@ void __attribute((constructor)) kvm_selftest_init(vo= id) sigaction(SIGILL, &sig_sa, NULL); sigaction(SIGFPE, &sig_sa, NULL); =20 + random_seed =3D time(NULL); + srand(random_seed); + guest_random_seed =3D last_guest_seed =3D random(); - pr_info("Random seed: 0x%x\n", guest_random_seed); + + pr_info("Guest random seed: 0x%x (srand: 0x%x)\n", + guest_random_seed, random_seed); =20 kvm_selftest_arch_init(); } > + for (i =3D 0; i < nr_irqs; i++) { > + struct kvm_vcpu *vcpu =3D vcpus[i % nr_vcpus]; > + const bool do_nmi =3D nmi && (i & BIT(2)); > + const bool do_empty =3D empty && (i & BIT(3)); > + struct timespec start; > + > + if (do_empty) > + kvm_clear_gsi_routes(vm); > + > + kvm_route_msi(vm, gsi, vcpu, vector, do_nmi); > + > + if (irq_affinity && vcpu->id =3D=3D 0) { Please add comments explaining why affinity related stuff is only applied t= o vCPU0. I could probably figure it out, but I really shouldn't have to. > + irq_cpu =3D rand() % get_nprocs(); > + > + ret =3D fprintf(irq_affinity_fp, "%d\n", irq_cpu); > + TEST_ASSERT(ret > 0, "Failed to affinitize IRQ-%d to CPU %d", irq, ir= q_cpu); > + } > + > + if (pin_vcpus && vcpu->id =3D=3D 0) > + pin_vcpu_threads(nr_vcpus, rand() % get_nprocs(), &available_cpus); > + > + for (j =3D 0; j < nr_vcpus; j++) { > + TEST_ASSERT_EQ(READ_FROM_GUEST(vm, guest_received_irq[vcpu->id]), fal= se); > + TEST_ASSERT_EQ(READ_FROM_GUEST(vm, guest_received_nmi[vcpu->id]), fal= se); These won't generate helpful assert messages. > + } > + > + send_msi(device, use_device_msi, msi); > + > + clock_gettime(CLOCK_MONOTONIC, &start); > + for (;;) { > + if (do_nmi && READ_FROM_GUEST(vm, guest_received_nmi[vcpu->id])) > + break; > + > + if (!do_nmi && READ_FROM_GUEST(vm, guest_received_irq[vcpu->id])) > + break; received_irq =3D do_nmi ? &guest_received_nmi[vcpu->id] : &guest_received_irq[vcpu->id]; while (!READ_FROM_GUEST(vm, *received_irq)) if (timespec_to_ns(timespec_elapsed(start)) > TIMEOUT_NS) { ... }=09 cpu_relax(); } > + > + if (timespec_to_ns(timespec_elapsed(start)) > TIMEOUT_NS) { > + printf("Timeout waiting for interrupt!\n"); > + printf(" vCPU: %d\n", vcpu->id); > + printf(" do_nmi: %d\n", do_nmi); > + printf(" do_empty: %d\n", do_empty); > + if (irq_affinity) > + printf(" irq_cpu: %d\n", irq_cpu); vfio_pci_irq_test.c: In function =E2=80=98main=E2=80=99: vfio_pci_irq_test.c:469:41: error: =E2=80=98irq_cpu=E2=80=99 may be used un= initialized [-Werror=3Dmaybe-uninitialized] 469 | printf(" irq_cpu: %d\n", i= rq_cpu); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~= ~~~~~~~ vfio_pci_irq_test.c:332:13: note: =E2=80=98irq_cpu=E2=80=99 was declared he= re 332 | int irq_cpu; | ^~~~~~~ > + if (pin_vcpus) > + printf(" vcpu_cpu: %d\n", get_cpu(vcpu)); > + > + TEST_FAIL("vCPU never received IRQ!\n"); > + } > + } TEST_ASSERT(guest_received_irq[vcpu->id] !=3D guest_received_nmi[vcpu->id], "blah blah blah"); WRITE_TO_GUEST(vm, *received_irq, false); > + if (do_nmi) > + WRITE_TO_GUEST(vm, guest_received_nmi[vcpu->id], false); > + else > + WRITE_TO_GUEST(vm, guest_received_irq[vcpu->id], false); > + } > + > + WRITE_TO_GUEST(vm, done, true); > + > + for (i =3D 0; i < nr_vcpus; i++) { > + if (block) { > + kvm_route_msi(vm, gsi, vcpus[i], vector, false); > + send_msi(device, false, msi); > + } > + > + pthread_join(vcpu_threads[i], NULL); > + } > + > + if (irq_affinity) > + fclose(irq_affinity_fp); > + > + printf("Host interrupts handled:\n"); > + printf(" IRQ-%d: %lu\n", irq, get_irq_count(irq) - irq_count); > + printf(" Posted-interrupt notification events: %lu\n", > + __get_irq_count("PIN:") - pin_count); > + printf(" Posted-interrupt wakeup events: %lu\n", > + __get_irq_count("PIW:") - piw_count); > + > + vfio_pci_device_cleanup(device); > + > + return 0; > +} > --=20 > 2.51.0.384.g4c02a37b29-goog >=20