From: Sean Christopherson
To: Maxim Levitsky
Cc: kvm@vger.kernel.org, Paolo Bonzini, Henry Huang, linux-mm@kvack.org
Subject: Re: access_tracking_perf_test kvm selftest doesn't work when Multi-Gen LRU is in use
Date: Tue, 21 May 2024 16:29:54 -0700
In-Reply-To: <7a46456d6750ea682ba321ad09541fa81677b81a.camel@redhat.com>
References: <7a46456d6750ea682ba321ad09541fa81677b81a.camel@redhat.com>

On Wed, May 15, 2024, Maxim Levitsky wrote:
> Small note on why we started seeing this failure on RHEL 9 and only on some
> machines:
>
> - RHEL9 has MGLRU enabled, RHEL8 doesn't.

For a stopgap in KVM selftests, or possibly even a long term solution in case
the decision is that page_idle will simply have different behavior for MGLRU,
couldn't we tweak the test to not assert if MGLRU is enabled?  E.g. refactor
get_module_param_integer() and/or get_module_param() to add
get_sysfs_value_integer() or so, and then do this?

diff --git a/tools/testing/selftests/kvm/access_tracking_perf_test.c b/tools/testing/selftests/kvm/access_tracking_perf_test.c
index 3c7defd34f56..1e759df36098 100644
--- a/tools/testing/selftests/kvm/access_tracking_perf_test.c
+++ b/tools/testing/selftests/kvm/access_tracking_perf_test.c
@@ -123,6 +123,11 @@ static void mark_page_idle(int page_idle_fd, uint64_t pfn)
 		    "Set page_idle bits for PFN 0x%" PRIx64, pfn);
 }
 
+static bool is_lru_gen_enabled(void)
+{
+	return !!get_sysfs_value_integer("/sys/kernel/mm/lru_gen/enabled");
+}
+
 static void mark_vcpu_memory_idle(struct kvm_vm *vm,
 				  struct memstress_vcpu_args *vcpu_args)
 {
@@ -185,7 +190,8 @@ static void mark_vcpu_memory_idle(struct kvm_vm *vm,
 	 */
 	if (still_idle >= pages / 10) {
 #ifdef __x86_64__
-		TEST_ASSERT(this_cpu_has(X86_FEATURE_HYPERVISOR),
+		TEST_ASSERT(this_cpu_has(X86_FEATURE_HYPERVISOR) ||
+			    is_lru_gen_enabled(),
 			    "vCPU%d: Too many pages still idle (%lu out of %lu)",
 			    vcpu_idx, still_idle, pages);
 #endif
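For reference, a minimal sketch of what that helper could look like, e.g. in
tools/testing/selftests/kvm/lib/test_util.c.  Only the name
get_sysfs_value_integer() comes from the suggestion above; the signature and
body are assumptions, not actual selftests code:

	#include <fcntl.h>
	#include <stdint.h>
	#include <stdlib.h>
	#include <unistd.h>

	#include "test_util.h"	/* TEST_ASSERT() */

	/*
	 * Hypothetical helper: read a single integer from a sysfs/procfs
	 * file.  Parse with base 0 so that both plain decimal values and
	 * hex bitmasks (lru_gen's "enabled" file prints one, e.g. "0x0007")
	 * are accepted.
	 */
	uint64_t get_sysfs_value_integer(const char *path)
	{
		char buf[64];
		ssize_t len;
		int fd;

		fd = open(path, O_RDONLY);
		TEST_ASSERT(fd >= 0, "Failed to open '%s'", path);

		len = read(fd, buf, sizeof(buf) - 1);
		TEST_ASSERT(len > 0, "Failed to read '%s'", path);
		buf[len] = '\0';
		close(fd);

		return strtoull(buf, NULL, 0);
	}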
> - machine needs to have more than one NUMA node, because NUMA balancing
>   (enabled by default) apparently tries to write-protect the primary PTEs
>   of (all?) processes every few seconds, and that causes KVM to flush the
>   secondary PTEs (at least with the new TDP MMU):
>
>  access_tracking-3448    [091] ....1..  1380.244666: handle_changed_spte <-tdp_mmu_set_spte
>  access_tracking-3448    [091] ....1..  1380.244667:
>  => cdc_driver_init
>  => handle_changed_spte
>  => tdp_mmu_set_spte
>  => tdp_mmu_zap_leafs
>  => kvm_tdp_mmu_unmap_gfn_range
>  => kvm_unmap_gfn_range
>  => kvm_mmu_notifier_invalidate_range_start
>  => __mmu_notifier_invalidate_range_start
>  => change_p4d_range
>  => change_protection
>  => change_prot_numa
>  => task_numa_work
>  => task_work_run
>  => exit_to_user_mode_prepare
>  => syscall_exit_to_user_mode
>  => do_syscall_64
>  => entry_SYSCALL_64_after_hwframe
>
> It's a separate question whether NUMA balancing should do this, or whether
> NUMA balancing should be enabled by default,

FWIW, IMO, enabling NUMA balancing on a system whose primary purpose is to run
VMs is a bad idea.  NUMA balancing operates under the assumption that a
!PRESENT #PF is relatively cheap.  When secondary MMUs are involved, that is
simply not the case, e.g. to honor the mmu_notifier event, KVM zaps _and_ does
a remote TLB flush.  Even if we reworked KVM and/or the mmu_notifiers so that
KVM didn't need to do such a heavy operation, the cost of a page fault VM-Exit
is significantly higher than the cost of a host #PF.

> because there are other reasons that can force KVM to invalidate the
> secondary mappings and trigger this issue.

Ya.
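Tangentially, and purely as an illustration (nothing above proposes this): the
same hypothetical get_sysfs_value_integer() helper would also let the test
detect the NUMA balancing interference described here, and relax the assert in
that case too, e.g.:

	/*
	 * Illustrative only: /proc/sys/kernel/numa_balancing reads as 0 when
	 * NUMA balancing is disabled and non-zero when some form of it is
	 * active.
	 */
	static bool is_numa_balancing_enabled(void)
	{
		return !!get_sysfs_value_integer("/proc/sys/kernel/numa_balancing");
	}

The TEST_ASSERT() in the diff above could then grow an
"|| is_numa_balancing_enabled()" arm, at the cost of masking genuine failures
whenever NUMA balancing is on.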