Linux Trace Kernel
 help / color / mirror / Atom feed
* Re: [PATCH v8 19/46] KVM: guest_memfd: Use actual size for invalidation in kvm_gmem_release()
From: Fuad Tabba @ 2026-06-19 10:46 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	jmattson, jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, willy, wyihan,
	yan.y.zhao, forkloop, pratyush, suzuki.poulose, aneesh.kumar,
	liam, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Jonathan Corbet, Shuah Khan, Shuah Khan, Vishal Annapurve,
	Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Barry Song, Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park,
	Qi Zheng, Shakeel Butt, Kiryl Shutsemau, Baoquan He,
	Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-19-9d2959357853@google.com>

On Fri, 19 Jun 2026 at 01:31, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Ackerley Tng <ackerleytng@google.com>
>
> __kvm_gmem_invalidate_begin() and __kvm_gmem_invalidate_end() actually do
> not specially handle -1ul. -1ul is used as a huge number, which legal
> indices do not exceed, and hence the invalidation works as expected.
>
> Since a later patch is going to make use of the exact range, calculate the
> size of the guest_memfd inode and use it as the end range for invalidating
> SPTEs.
>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>
> ---

Reviewed-by: Fuad Tabba <tabba@google.com>

Cheers,
/fuad

>  virt/kvm/guest_memfd.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index d163559da0235..d72ecbfcc3144 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -366,6 +366,7 @@ static long kvm_gmem_fallocate(struct file *file, int mode, loff_t offset,
>
>  static int kvm_gmem_release(struct inode *inode, struct file *file)
>  {
> +       pgoff_t end = i_size_read(inode) >> PAGE_SHIFT;
>         struct gmem_file *f = file->private_data;
>         struct kvm_memory_slot *slot;
>         struct kvm *kvm = f->kvm;
> @@ -396,9 +397,9 @@ static int kvm_gmem_release(struct inode *inode, struct file *file)
>          * Zap all SPTEs pointed at by this file.  Do not free the backing
>          * memory, as its lifetime is associated with the inode, not the file.
>          */
> -       __kvm_gmem_invalidate_start(f, 0, -1ul,
> +       __kvm_gmem_invalidate_start(f, 0, end,
>                                     kvm_gmem_get_invalidate_filter(inode));
> -       __kvm_gmem_invalidate_end(f, 0, -1ul);
> +       __kvm_gmem_invalidate_end(f, 0, end);
>
>         list_del(&f->entry);
>
>
> --
> 2.55.0.rc0.738.g0c8ab3ebcc-goog
>
>

^ permalink raw reply

* Re: [PATCH v8 17/46] KVM: guest_memfd: Advertise KVM_SET_MEMORY_ATTRIBUTES2 ioctl
From: Fuad Tabba @ 2026-06-19 10:35 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	jmattson, jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, willy, wyihan,
	yan.y.zhao, forkloop, pratyush, suzuki.poulose, aneesh.kumar,
	liam, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Jonathan Corbet, Shuah Khan, Shuah Khan, Vishal Annapurve,
	Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Barry Song, Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park,
	Qi Zheng, Shakeel Butt, Kiryl Shutsemau, Baoquan He,
	Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-17-9d2959357853@google.com>

On Fri, 19 Jun 2026 at 01:31, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Ackerley Tng <ackerleytng@google.com>
>
> Introduce KVM_CAP_GUEST_MEMFD_MEMORY_ATTRIBUTES to advertise the
> availability of the KVM_SET_MEMORY_ATTRIBUTES2 ioctl.
>
> KVM_SET_MEMORY_ATTRIBUTES2 is a guest_memfd-scoped version of the existing
> KVM_SET_MEMORY_ATTRIBUTES VM ioctl. It allows userspace to manage memory
> attributes, such as KVM_MEMORY_ATTRIBUTE_PRIVATE, directly on a guest_memfd
> file descriptor.
>
> This new version uses struct kvm_memory_attributes2, which adds an
> error_offset field to the output. This allows KVM to return the specific
> offset that triggered an error, which is especially useful for handling
> EAGAIN results caused by transient page reference counts during attribute
> conversions.
>
> Update the KVM API documentation to define the new ioctl and its behavior,
> and add the necessary UAPI definitions and capability checks.
>
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Suggested-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>

Reviewed-by: Fuad Tabba <tabba@google.com>

Cheers,
/fuad
> ---
>  Documentation/virt/kvm/api.rst | 78 +++++++++++++++++++++++++++++++++++++++++-
>  include/uapi/linux/kvm.h       |  2 ++
>  virt/kvm/kvm_main.c            | 23 +++++++++----
>  3 files changed, 95 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index a833d90845b95..73878f34f6d2e 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -117,7 +117,7 @@ description:
>        x86 includes both i386 and x86_64.
>
>    Type:
> -      system, vm, or vcpu.
> +      system, vm, vcpu or guest_memfd.
>
>    Parameters:
>        what parameters are accepted by the ioctl.
> @@ -6373,6 +6373,8 @@ S390:
>  Returns -EINVAL if the VM has the KVM_VM_S390_UCONTROL flag set.
>  Returns -EINVAL if called on a protected VM.
>
> +.. _KVM_SET_MEMORY_ATTRIBUTES:
> +
>  4.141 KVM_SET_MEMORY_ATTRIBUTES
>  -------------------------------
>
> @@ -6566,6 +6568,80 @@ KVM_S390_KEYOP_SSKE
>    Sets the storage key for the guest address ``guest_addr`` to the key
>    specified in ``key``, returning the previous value in ``key``.
>
> +4.145 KVM_SET_MEMORY_ATTRIBUTES2
> +---------------------------------
> +
> +:Capability: KVM_CAP_GUEST_MEMFD_MEMORY_ATTRIBUTES
> +:Architectures: all
> +:Type: guest_memfd ioctl
> +:Parameters: struct kvm_memory_attributes2 (in/out)
> +:Returns: 0 on success, <0 on error
> +
> +Errors:
> +
> +  ========== ===============================================================
> +  EINVAL     The specified `offset` or `size` were invalid (e.g. not
> +             page aligned, causes an overflow, or size is zero).
> +  EFAULT     The parameter address was invalid.
> +  EAGAIN     Some page within requested range had unexpected refcounts. The
> +             offset of the page will be returned in `error_offset`.
> +  ENOMEM     Ran out of memory trying to track private/shared state
> +  ========== ===============================================================
> +
> +KVM_SET_MEMORY_ATTRIBUTES2 is an extension to
> +KVM_SET_MEMORY_ATTRIBUTES that supports returning (writing) values to
> +userspace.  The original (pre-extension) fields are shared with
> +KVM_SET_MEMORY_ATTRIBUTES identically.
> +
> +Attribute values are shared with KVM_SET_MEMORY_ATTRIBUTES.
> +
> +::
> +
> +  struct kvm_memory_attributes2 {
> +       /* in */
> +       union {
> +               __u64 address;
> +               __u64 offset;
> +       };
> +       __u64 size;
> +       __u64 attributes;
> +       __u64 flags;
> +       /* out */
> +       __u64 error_offset;
> +       __u64 reserved[11];
> +  };
> +
> +  #define KVM_MEMORY_ATTRIBUTE_PRIVATE           (1ULL << 3)
> +
> +Set attributes for a range of offsets within a guest_memfd to
> +KVM_MEMORY_ATTRIBUTE_PRIVATE to limit the specified guest_memfd backed
> +memory range for guest_use. Even if KVM_CAP_GUEST_MEMFD_MMAP is
> +supported, after a successful call to set
> +KVM_MEMORY_ATTRIBUTE_PRIVATE, the requested range will not be mappable
> +into host userspace and will only be mappable by the guest.
> +
> +To allow the range to be mappable into host userspace again, call
> +KVM_SET_MEMORY_ATTRIBUTES2 on the guest_memfd again with
> +KVM_MEMORY_ATTRIBUTE_PRIVATE unset.
> +
> +KVM does not directly manipulate the memory contents of pages during
> +attribute updates. However, the process of setting these attributes,
> +which includes operations such as unmapping pages from the host or
> +stage-2 page tables, may result in side effects on memory contents
> +that vary across different trusted firmware implementations.
> +
> +If this ioctl returns -EAGAIN, the offset of the page with unexpected
> +refcounts will be returned in `error_offset`. This can occur if there
> +are transient refcounts on the pages, taken by other parts of the
> +kernel.
> +
> +Userspace is expected to figure out how to remove all known refcounts
> +on the shared pages, such as refcounts taken by get_user_pages(), and
> +try the ioctl again. A possible source of these long term refcounts is
> +if the guest_memfd memory was pinned in IOMMU page tables.
> +
> +See also: :ref: `KVM_SET_MEMORY_ATTRIBUTES`.
> +
>  .. _kvm_run:
>
>  5. The kvm_run structure
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 876c0429f9d4e..129d6f6303251 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -997,6 +997,7 @@ struct kvm_enable_cap {
>  #define KVM_CAP_S390_KEYOP 247
>  #define KVM_CAP_S390_VSIE_ESAMODE 248
>  #define KVM_CAP_S390_HPAGE_2G 249
> +#define KVM_CAP_GUEST_MEMFD_MEMORY_ATTRIBUTES 250
>
>  struct kvm_irq_routing_irqchip {
>         __u32 irqchip;
> @@ -1649,6 +1650,7 @@ struct kvm_memory_attributes {
>         __u64 flags;
>  };
>
> +/* Available with KVM_CAP_GUEST_MEMFD_MEMORY_ATTRIBUTES */
>  #define KVM_SET_MEMORY_ATTRIBUTES2              _IOWR(KVMIO,  0xd2, struct kvm_memory_attributes2)
>
>  struct kvm_memory_attributes2 {
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index a08b518cdb175..044486f128c37 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2434,18 +2434,22 @@ static int kvm_vm_ioctl_clear_dirty_log(struct kvm *kvm,
>  }
>  #endif /* CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT */
>
> +#ifdef kvm_arch_has_private_mem
> +static u64 kvm_supports_private_mem(struct kvm *kvm)
> +{
> +       return !kvm || kvm_arch_has_private_mem(kvm);
> +}
> +#else
> +#define kvm_supports_private_mem(kvm) false
> +#endif
> +
>  #ifdef CONFIG_KVM_VM_MEMORY_ATTRIBUTES
>  static u64 kvm_supported_vm_mem_attributes(struct kvm *kvm)
>  {
> -#ifdef kvm_arch_has_private_mem
> -       if (gmem_in_place_conversion)
> +       if (gmem_in_place_conversion || !kvm_supports_private_mem(kvm))
>                 return 0;
>
> -       if (!kvm || kvm_arch_has_private_mem(kvm))
> -               return KVM_MEMORY_ATTRIBUTE_PRIVATE;
> -#endif
> -
> -       return 0;
> +       return KVM_MEMORY_ATTRIBUTE_PRIVATE;
>  }
>
>  /*
> @@ -4969,6 +4973,11 @@ static int kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
>                 return 1;
>         case KVM_CAP_GUEST_MEMFD_FLAGS:
>                 return kvm_gmem_get_supported_flags(kvm);
> +       case KVM_CAP_GUEST_MEMFD_MEMORY_ATTRIBUTES:
> +               if (!gmem_in_place_conversion || !kvm_supports_private_mem(kvm))
> +                       return 0;
> +
> +               return KVM_MEMORY_ATTRIBUTE_PRIVATE;
>  #endif
>         default:
>                 break;
>
> --
> 2.55.0.rc0.738.g0c8ab3ebcc-goog
>
>

^ permalink raw reply

* Re: [PATCH v8 15/46] KVM: guest_memfd: Call arch invalidate hooks on conversion
From: Fuad Tabba @ 2026-06-19 10:09 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	jmattson, jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, willy, wyihan,
	yan.y.zhao, forkloop, pratyush, suzuki.poulose, aneesh.kumar,
	liam, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Jonathan Corbet, Shuah Khan, Shuah Khan, Vishal Annapurve,
	Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Barry Song, Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park,
	Qi Zheng, Shakeel Butt, Kiryl Shutsemau, Baoquan He,
	Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-15-9d2959357853@google.com>

On Fri, 19 Jun 2026 at 01:31, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Ackerley Tng <ackerleytng@google.com>
>
> When memory in guest_memfd is converted from private to shared, the
> platform-specific state associated with the guest-private pages must be
> invalidated or cleaned up.
>
> Iterate over the folios in the affected range and call the
> kvm_arch_gmem_invalidate() hook for each PFN range. This allows
> architectures to perform necessary teardown, such as updating hardware
> metadata or encryption states, before the pages are transitioned to the
> shared state.
>
> Invoke this helper after indicating to KVM's mmu code that an invalidation
> is in progress to stop in-flight page faults from succeeding.
>
> Reviewed-by: Fuad Tabba <tabba@google.com>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>

Coming back to this after working through the arm64/pKVM side. My
Reviewed-by here is from the previous round and the patch hasn't
changed, but I missed an implication for arm64.

kvm_arch_gmem_invalidate() is now called from two paths with the same
(start, end) signature: folio teardown (kvm_gmem_free_folio) and
private->shared conversion (here). For SNP/TDX that's fine, conversion is
destructive anyway. For pKVM the two need opposite content semantics:
conversion must preserve the page in place (same physical page, the point
of in-place conversion without encryption), while teardown must scrub it
before returning it to the host.

The hook gets only a pfn range with no indication of which caller it's
serving, so arm64 can't give the two paths the behaviour they need. It
would help to signal intent on the conversion path: a reason/flag, a
separate hook, or not routing non-destructive conversion through the
teardown hook.

arm64 isn't here yet, so this isn't urgent, but the hook is gaining a
second caller now, and it's cheaper to leave room for the distinction
than to change a generic contract other arches depend on later.

Cheers,
/fuad


> ---
>  virt/kvm/guest_memfd.c | 41 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 41 insertions(+)
>
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index 433f79047b9d1..3c94442bc8131 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -607,6 +607,42 @@ static bool kvm_gmem_is_safe_for_conversion(struct inode *inode, pgoff_t start,
>         return safe;
>  }
>
> +#ifdef CONFIG_HAVE_KVM_ARCH_GMEM_INVALIDATE
> +static void kvm_gmem_invalidate(struct inode *inode, pgoff_t start, pgoff_t end)
> +{
> +       struct folio_batch fbatch;
> +       pgoff_t next = start;
> +       int i;
> +
> +       folio_batch_init(&fbatch);
> +       while (filemap_get_folios(inode->i_mapping, &next, end - 1, &fbatch)) {
> +               for (i = 0; i < folio_batch_count(&fbatch); ++i) {
> +                       struct folio *folio = fbatch.folios[i];
> +                       pgoff_t start_index, end_index;
> +                       kvm_pfn_t start_pfn, end_pfn;
> +
> +                       start_index = max(start, folio->index);
> +                       end_index = min(end, folio_next_index(folio));
> +                       /*
> +                        * end_index is either in folio or points to
> +                        * the first page of the next folio. Hence,
> +                        * all pages in range [start_index, end_index)
> +                        * are contiguous.
> +                        */
> +                       start_pfn = folio_file_pfn(folio, start_index);
> +                       end_pfn = start_pfn + end_index - start_index;
> +
> +                       kvm_arch_gmem_invalidate(start_pfn, end_pfn);
> +               }
> +
> +               folio_batch_release(&fbatch);
> +               cond_resched();
> +       }
> +}
> +#else
> +static void kvm_gmem_invalidate(struct inode *inode, pgoff_t start, pgoff_t end) {}
> +#endif
> +
>  static int __kvm_gmem_set_attributes(struct inode *inode, pgoff_t start,
>                                      size_t nr_pages, uint64_t attrs,
>                                      pgoff_t *err_index)
> @@ -647,7 +683,12 @@ static int __kvm_gmem_set_attributes(struct inode *inode, pgoff_t start,
>          */
>
>         kvm_gmem_invalidate_start(inode, start, end);
> +
> +       if (!to_private)
> +               kvm_gmem_invalidate(inode, start, end);
> +
>         mas_store_prealloc(&mas, xa_mk_value(attrs));
> +
>         kvm_gmem_invalidate_end(inode, start, end);
>  out:
>         filemap_invalidate_unlock(mapping);
>
> --
> 2.55.0.rc0.738.g0c8ab3ebcc-goog
>
>

^ permalink raw reply

* Re: [PATCH v8 08/46] KVM: Provide generic interface for checking memory private/shared status
From: Suzuki K Poulose @ 2026-06-19  9:57 UTC (permalink / raw)
  To: Fuad Tabba, ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	jmattson, jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, willy, wyihan,
	yan.y.zhao, forkloop, pratyush, aneesh.kumar, liam, Paolo Bonzini,
	Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Steven Rostedt,
	Masami Hiramatsu, Mathieu Desnoyers, Jonathan Corbet, Shuah Khan,
	Shuah Khan, Vishal Annapurve, Andrew Morton, Chris Li,
	Kairui Song, Kemeng Shi, Nhat Pham, Barry Song, Axel Rasmussen,
	Yuanchu Xie, Wei Xu, Youngjun Park, Qi Zheng, Shakeel Butt,
	Kiryl Shutsemau, Baoquan He, Jason Gunthorpe, Vlastimil Babka,
	kvm, linux-kernel, linux-trace-kernel, linux-doc, linux-kselftest,
	linux-mm, linux-coco
In-Reply-To: <CA+EHjTxu32nQ+vPV7Zmcw76_-V4g2_g=P_UzRnnO2dP1PFO2ww@mail.gmail.com>

On 19/06/2026 09:21, Fuad Tabba wrote:
> On Fri, 19 Jun 2026 at 09:19, Fuad Tabba <tabba@google.com> wrote:
>>
>> On Fri, 19 Jun 2026 at 01:31, Ackerley Tng via B4 Relay
>> <devnull+ackerleytng.google.com@kernel.org> wrote:
>>>
>>> From: Sean Christopherson <seanjc@google.com>
>>>
>>> Introduce a generic kvm_mem_is_private() interface using a static call to
>>> determine if a GFN is private. This allows the implementation for checking
>>> a GFN's private/shared status to be set at runtime.
>>>
>>> In preparation for choosing implementations between a guest_memfd lookup
>>> and the existing VM attribute lookup, rename the existing
>>> VM-attribute-based check to kvm_vm_mem_is_private to emphasize that it
>>> looks up VM attributes.
>>>
>>> Signed-off-by: Sean Christopherson <seanjc@google.com>
>>
>> (SoB fix plz)
>>
>> Reviewed-by: Fuad Tabba <tabba@google.com>
>>
>> Cheers,
>> /fuad
>>> ---
>>>   include/linux/kvm_host.h | 12 +++++++++++-
>>>   virt/kvm/kvm_main.c      | 15 +++++++++++++++
>>>   2 files changed, 26 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>>> index eb26d4ea8945a..3915da2a61778 100644
>>> --- a/include/linux/kvm_host.h
>>> +++ b/include/linux/kvm_host.h
>>> @@ -2546,7 +2546,7 @@ bool kvm_arch_pre_set_memory_attributes(struct kvm *kvm,
>>>   bool kvm_arch_post_set_memory_attributes(struct kvm *kvm,
>>>                                           struct kvm_gfn_range *range);
>>>
>>> -static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
>>> +static inline bool kvm_vm_mem_is_private(struct kvm *kvm, gfn_t gfn)
> 
> Should have read the Sashiko review first, but where is this used?
> It's not used at all in this series...

See below:

> 
> /fuad
> 
>>>   {
>>>          return kvm_get_vm_memory_attributes(kvm, gfn) & KVM_MEMORY_ATTRIBUTE_PRIVATE;
>>>   }
>>> @@ -2557,6 +2557,16 @@ static inline bool kvm_mem_range_is_private(struct kvm *kvm, gfn_t start,
>>>                                                    KVM_MEMORY_ATTRIBUTE_PRIVATE,
>>>                                                    KVM_MEMORY_ATTRIBUTE_PRIVATE);
>>>   }
>>> +#endif  /* CONFIG_KVM_VM_MEMORY_ATTRIBUTES */
>>> +
>>> +#ifdef kvm_arch_has_private_mem
>>> +typedef bool (kvm_mem_is_private_t)(struct kvm *kvm, gfn_t gfn);
>>> +DECLARE_STATIC_CALL(__kvm_mem_is_private, kvm_mem_is_private_t);
>>> +
>>> +static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
>>> +{
>>> +       return static_call(__kvm_mem_is_private)(kvm, gfn);
>>> +}
>>>   #else
>>>   static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
>>>   {
>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>> index 6669f1477013c..8b238e461b854 100644
>>> --- a/virt/kvm/kvm_main.c
>>> +++ b/virt/kvm/kvm_main.c
>>> @@ -2627,6 +2627,20 @@ static int kvm_vm_ioctl_set_mem_attributes(struct kvm *kvm,
>>>   }
>>>   #endif /* CONFIG_KVM_VM_MEMORY_ATTRIBUTES */
>>>
>>> +#ifdef kvm_arch_has_private_mem
>>> +DEFINE_STATIC_CALL_RET0(__kvm_mem_is_private, kvm_mem_is_private_t);
>>> +EXPORT_STATIC_CALL_GPL(__kvm_mem_is_private);
>>> +
>>> +static void kvm_init_memory_attributes(void)
>>> +{
>>> +#ifdef CONFIG_KVM_VM_MEMORY_ATTRIBUTES
>>> +       static_call_update(__kvm_mem_is_private, kvm_vm_mem_is_private);
>>> +#endif
>>> +}


Here ^^ as the static call update ?


Suzuki

^ permalink raw reply

* Re: [PATCH v8 13/46] KVM: guest_memfd: Add base support for KVM_SET_MEMORY_ATTRIBUTES2
From: Fuad Tabba @ 2026-06-19  9:25 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	jmattson, jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, willy, wyihan,
	yan.y.zhao, forkloop, pratyush, suzuki.poulose, aneesh.kumar,
	liam, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Jonathan Corbet, Shuah Khan, Shuah Khan, Vishal Annapurve,
	Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Barry Song, Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park,
	Qi Zheng, Shakeel Butt, Kiryl Shutsemau, Baoquan He,
	Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-13-9d2959357853@google.com>

On Fri, 19 Jun 2026 at 01:31, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Ackerley Tng <ackerleytng@google.com>
>
> Introduce base support for KVM_SET_MEMORY_ATTRIBUTES2 in guest_memfd, which
> just updates attributes tracked by guest_memfd.
>
> Validate input fields in general. Guard usage of KVM_SET_MEMORY_ATTRIBUTES2
> by making sure requested attributes are supported for this instance of kvm.
>
> A new KVM_SET_MEMORY_ATTRIBUTES2 is defined to support writes (unlike
> KVM_SET_MEMORY_ATTRIBUTES) in addition to reads so it can provide error
> details to userspace. This will be used in a later patch.
>
> The two ioctls use their corresponding structs with no overlap, but
> backward compatibility is baked in for future support of
> KVM_SET_MEMORY_ATTRIBUTES2 and struct kvm_memory_attributes2 in the VM
> ioctl.
>
> The process of setting memory attributes is set up such that the later half
> will not fail due to allocation. Any necessary checks are performed before
> the point of no return.
>
> Co-developed-by: Vishal Annapurve <vannapurve@google.com>
> Signed-off-by: Vishal Annapurve <vannapurve@google.com>
> Co-developed-by: Sean Christoperson <seanjc@google.com>
> Signed-off-by: Sean Christoperson <seanjc@google.com>
> Reviewed-by: Fuad Tabba <tabba@google.com>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>

Note sure if it's user error on my part, if I'm applying this to the
wrong base, but I found a build break here on patch 13:
kvm_gmem_invalidate_start() doesn't exist in the base tree. The
function is kvm_gmem_invalidate_begin() here. The rename
(190cc5370a8b6) landed via a different merge path and isn't an
ancestor of the stated base.

Patches 19 and 20 have the same mismatch. Fix for all three is
s/kvm_gmem_invalidate_start/kvm_gmem_invalidate_begin/.

Cheers,
/fuad

> ---
>  include/uapi/linux/kvm.h |  13 ++++++
>  virt/kvm/Kconfig         |   1 +
>  virt/kvm/guest_memfd.c   | 116 +++++++++++++++++++++++++++++++++++++++++++++++
>  virt/kvm/kvm_main.c      |  12 +++++
>  4 files changed, 142 insertions(+)
>
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 419011097fa8e..956877a6aab05 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1649,6 +1649,19 @@ struct kvm_memory_attributes {
>         __u64 flags;
>  };
>
> +#define KVM_SET_MEMORY_ATTRIBUTES2              _IOWR(KVMIO,  0xd2, struct kvm_memory_attributes2)
> +
> +struct kvm_memory_attributes2 {
> +       union {
> +               __u64 address;
> +               __u64 offset;
> +       };
> +       __u64 size;
> +       __u64 attributes;
> +       __u64 flags;
> +       __u64 reserved[12];
> +};
> +
>  #define KVM_MEMORY_ATTRIBUTE_PRIVATE           (1ULL << 3)
>
>  #define KVM_CREATE_GUEST_MEMFD _IOWR(KVMIO,  0xd4, struct kvm_create_guest_memfd)
> diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
> index 297e4399fbd49..cfa2c78ba5fb9 100644
> --- a/virt/kvm/Kconfig
> +++ b/virt/kvm/Kconfig
> @@ -102,6 +102,7 @@ config KVM_MMU_LOCKLESS_AGING
>
>  config KVM_GUEST_MEMFD
>         select XARRAY_MULTI
> +       select KVM_MEMORY_ATTRIBUTES
>         bool
>
>  config HAVE_KVM_ARCH_GMEM_PREPARE
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index 65ce795c090d9..0d14548c1ed22 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -541,11 +541,127 @@ bool kvm_gmem_is_private(struct kvm *kvm, gfn_t gfn)
>  }
>  EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_gmem_is_private);
>
> +/*
> + * Preallocate memory for attributes to be stored on a maple tree, pointed to
> + * by mas.  Adjacent ranges with attributes identical to the new attributes
> + * will be merged.  Also sets mas's bounds up for storing attributes.
> + *
> + * This maintains the invariant that ranges with the same attributes will
> + * always be merged.
> + */
> +static int kvm_gmem_mas_preallocate(struct ma_state *mas, u64 attributes,
> +                                   pgoff_t start, size_t nr_pages)
> +{
> +       pgoff_t end = start + nr_pages;
> +       pgoff_t last = end - 1;
> +       void *entry;
> +
> +       /* Try extending range. entry is NULL on overflow/wrap-around. */
> +       mas_set(mas, end);
> +       entry = mas_find(mas, end);
> +       if (entry && xa_to_value(entry) == attributes)
> +               last = mas->last;
> +
> +       if (start > 0) {
> +               mas_set(mas, start - 1);
> +               entry = mas_find(mas, start - 1);
> +               if (entry && xa_to_value(entry) == attributes)
> +                       start = mas->index;
> +       }
> +
> +       mas_set_range(mas, start, last);
> +       return mas_preallocate(mas, xa_mk_value(attributes), GFP_KERNEL);
> +}
> +
> +static int __kvm_gmem_set_attributes(struct inode *inode, pgoff_t start,
> +                                    size_t nr_pages, uint64_t attrs)
> +{
> +       struct address_space *mapping = inode->i_mapping;
> +       struct gmem_inode *gi = GMEM_I(inode);
> +       pgoff_t end = start + nr_pages;
> +       struct maple_tree *mt;
> +       struct ma_state mas;
> +       int r;
> +
> +       mt = &gi->attributes;
> +
> +       filemap_invalidate_lock(mapping);
> +
> +       mas_init(&mas, mt, start);
> +       r = kvm_gmem_mas_preallocate(&mas, attrs, start, nr_pages);
> +       if (r)
> +               goto out;
> +
> +       /*
> +        * From this point on guest_memfd has performed necessary
> +        * checks and can proceed to do guest-breaking changes.
> +        */
> +
> +       kvm_gmem_invalidate_start(inode, start, end);
> +       mas_store_prealloc(&mas, xa_mk_value(attrs));
> +       kvm_gmem_invalidate_end(inode, start, end);
> +out:
> +       filemap_invalidate_unlock(mapping);
> +       return r;
> +}
> +
> +static long kvm_gmem_set_attributes(struct file *file, void __user *argp)
> +{
> +       struct gmem_file *f = file->private_data;
> +       struct inode *inode = file_inode(file);
> +       struct kvm_memory_attributes2 attrs;
> +       size_t nr_pages;
> +       pgoff_t index;
> +       int i;
> +
> +       if (copy_from_user(&attrs, argp, sizeof(attrs)))
> +               return -EFAULT;
> +
> +       if (attrs.flags)
> +               return -EINVAL;
> +       for (i = 0; i < ARRAY_SIZE(attrs.reserved); i++) {
> +               if (attrs.reserved[i])
> +                       return -EINVAL;
> +       }
> +       if (!kvm_arch_has_private_mem(f->kvm))
> +               return -EINVAL;
> +       if (attrs.attributes & ~KVM_MEMORY_ATTRIBUTE_PRIVATE)
> +               return -EINVAL;
> +       if (attrs.size == 0 || attrs.offset + attrs.size < attrs.offset)
> +               return -EINVAL;
> +       if (!PAGE_ALIGNED(attrs.offset) || !PAGE_ALIGNED(attrs.size))
> +               return -EINVAL;
> +
> +       if (attrs.offset >= i_size_read(inode) ||
> +           attrs.offset + attrs.size > i_size_read(inode))
> +               return -EINVAL;
> +
> +       nr_pages = attrs.size >> PAGE_SHIFT;
> +       index = attrs.offset >> PAGE_SHIFT;
> +       return __kvm_gmem_set_attributes(inode, index, nr_pages,
> +                                        attrs.attributes);
> +}
> +
> +static long kvm_gmem_ioctl(struct file *file, unsigned int ioctl,
> +                          unsigned long arg)
> +{
> +       switch (ioctl) {
> +       case KVM_SET_MEMORY_ATTRIBUTES2:
> +               if (!gmem_in_place_conversion)
> +                       return -ENOTTY;
> +
> +               return kvm_gmem_set_attributes(file, (void __user *)arg);
> +       default:
> +               return -ENOTTY;
> +       }
> +}
> +
>  static struct file_operations kvm_gmem_fops = {
>         .mmap           = kvm_gmem_mmap,
>         .open           = generic_file_open,
>         .release        = kvm_gmem_release,
>         .fallocate      = kvm_gmem_fallocate,
> +       .unlocked_ioctl = kvm_gmem_ioctl,
>  };
>
>  static int kvm_gmem_migrate_folio(struct address_space *mapping,
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 01761f6e25d25..a08b518cdb175 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -105,6 +105,18 @@ module_param(allow_unsafe_mappings, bool, 0444);
>  bool __ro_after_init gmem_in_place_conversion = false;
>  #endif
>
> +#define MEMORY_ATTRIBUTES_MATCH(one, two)                              \
> +       static_assert(offsetof(struct kvm_memory_attributes, one) ==    \
> +                     offsetof(struct kvm_memory_attributes2, two));    \
> +       static_assert(sizeof_field(struct kvm_memory_attributes, one) ==\
> +                     sizeof_field(struct kvm_memory_attributes2, two))
> +
> +/* Ensure the common parts of the two structs are identical. */
> +MEMORY_ATTRIBUTES_MATCH(address, address);
> +MEMORY_ATTRIBUTES_MATCH(size, size);
> +MEMORY_ATTRIBUTES_MATCH(attributes, attributes);
> +MEMORY_ATTRIBUTES_MATCH(flags, flags);
> +
>  /*
>   * Ordering of locks:
>   *
>
> --
> 2.55.0.rc0.738.g0c8ab3ebcc-goog
>
>

^ permalink raw reply

* Re: [PATCH] rv: update rvgen monitor synthesis documentation path
From: Gabriele Monaco @ 2026-06-19  8:52 UTC (permalink / raw)
  To: lucayu.alight; +Cc: linux-trace-kernel, linux-kernel, Steven Rostedt
In-Reply-To: <20260618-rvgen-doc-path-v1-1-9bcf0148417a@gmail.com>

On Thu, 2026-06-18 at 21:45 +0800, Yu Chuanyu via B4 Relay wrote:
> From: Yu Chuanyu <lucayu.alight@gmail.com>
> 
> The rvgen source comments still refer to da_monitor_synthesis.rst,
> which
> no longer exists. The documentation is now available in
> monitor_synthesis.rst. Update both references to point to the current
> file.
> 
> Signed-off-by: Yu Chuanyu <lucayu.alight@gmail.com>

Thanks for catching this!

Acked-by: Gabriele Monaco <gmonaco@redhat.com>

> ---
>  tools/verification/rvgen/__main__.py    | 2 +-
>  tools/verification/rvgen/rvgen/dot2k.py | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/verification/rvgen/__main__.py
> b/tools/verification/rvgen/__main__.py
> index 5c923dc..2a2bb03 100644
> --- a/tools/verification/rvgen/__main__.py
> +++ b/tools/verification/rvgen/__main__.py
> @@ -6,7 +6,7 @@
>  # dot2k: transform dot files into a monitor for the Linux kernel.
>  #
>  # For further information, see:
> -#   Documentation/trace/rv/da_monitor_synthesis.rst
> +#   Documentation/trace/rv/monitor_synthesis.rst
>  
>  if __name__ == '__main__':
>      from rvgen.dot2k import da2k, ha2k
> diff --git a/tools/verification/rvgen/rvgen/dot2k.py
> b/tools/verification/rvgen/rvgen/dot2k.py
> index 110cfd6..326984f 100644
> --- a/tools/verification/rvgen/rvgen/dot2k.py
> +++ b/tools/verification/rvgen/rvgen/dot2k.py
> @@ -6,7 +6,7 @@
>  # dot2k: transform dot files into a monitor for the Linux kernel.
>  #
>  # For further information, see:
> -#   Documentation/trace/rv/da_monitor_synthesis.rst
> +#   Documentation/trace/rv/monitor_synthesis.rst
>  
>  from collections import deque
>  from .dot2c import Dot2c
> 
> ---
> base-commit: e771677c937da5808f7b6c1f0e4a97ec1a84f8a8
> change-id: 20260618-rvgen-doc-path-11695c57153d
> 
> Best regards,
> --  
> Yu Chuanyu <lucayu.alight@gmail.com>
> 
> 


^ permalink raw reply

* Re: [PATCH v8 10/46] KVM: guest_memfd: Wire up core private/shared attribute interfaces
From: Fuad Tabba @ 2026-06-19  8:34 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	jmattson, jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, willy, wyihan,
	yan.y.zhao, forkloop, pratyush, suzuki.poulose, aneesh.kumar,
	liam, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Jonathan Corbet, Shuah Khan, Shuah Khan, Vishal Annapurve,
	Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Barry Song, Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park,
	Qi Zheng, Shakeel Butt, Kiryl Shutsemau, Baoquan He,
	Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-10-9d2959357853@google.com>

On Fri, 19 Jun 2026 at 01:31, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Sean Christopherson <seanjc@google.com>
>
> With in-place conversion, guest_memfd is able to track the private/shared
> status of memory. Use a global flag to toggle between tracking
> private/shared status per-vm or within guest_memfd.
>
> When queried for supported vm memory attributes, return 0 if attributes are
> tracked in guest_memfd.
>
> When querying for memory attributes over a range, look up memory attributes
> based on the flag's state at query time.
>
> For per-GFN memory attribute queries, choosing an implementation (VM or
> guest_memfd lookup) at KVM load time.
>
> The flag is always false for now and will be made toggle-able after all
> in-place conversion features are added in subsequent patches.
>
> If/since the flag is false, if CONFIG_KVM_VM_MEMORY_ATTRIBUTES is also not
> selected, the per-GFN memory attribute query defaults to returning
> 0 (false/not private).
>
> Co-developed-by: Ackerley Tng <ackerleytng@google.com>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Fuad Tabba <tabba@google.com>

Cheers,
/fuad

> ---
>  include/linux/kvm_host.h |  4 ++++
>  virt/kvm/guest_memfd.c   | 22 +++++++++++++++++++---
>  virt/kvm/kvm_main.c      | 12 +++++++++++-
>  3 files changed, 34 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 27687fb9d5201..acb552745b428 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -2560,6 +2560,8 @@ static inline bool kvm_mem_range_is_private(struct kvm *kvm, gfn_t start,
>  #endif  /* CONFIG_KVM_VM_MEMORY_ATTRIBUTES */
>
>  #ifdef kvm_arch_has_private_mem
> +extern bool gmem_in_place_conversion;
> +
>  typedef bool (kvm_mem_is_private_t)(struct kvm *kvm, gfn_t gfn);
>  DECLARE_STATIC_CALL(__kvm_mem_is_private, kvm_mem_is_private_t);
>
> @@ -2568,6 +2570,8 @@ static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
>         return static_call(__kvm_mem_is_private)(kvm, gfn);
>  }
>  #else
> +#define gmem_in_place_conversion false
> +
>  static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
>  {
>         return false;
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index bca912db5be6e..e0e544ef47d69 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -926,6 +926,24 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
>  EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_gmem_get_pfn);
>
>  #ifdef CONFIG_HAVE_KVM_ARCH_GMEM_POPULATE
> +static bool kvm_gmem_range_is_private(struct file *file, pgoff_t index,
> +                                     size_t nr_pages, struct kvm *kvm, gfn_t gfn)
> +{
> +       struct maple_tree *mt = &GMEM_I(file_inode(file))->attributes;
> +       pgoff_t end = index + nr_pages - 1;
> +       void *entry;
> +
> +       if (!gmem_in_place_conversion)
> +               return kvm_range_has_vm_memory_attributes(kvm, gfn, gfn + nr_pages,
> +                                                         KVM_MEMORY_ATTRIBUTE_PRIVATE,
> +                                                         KVM_MEMORY_ATTRIBUTE_PRIVATE);
> +
> +       mt_for_each(mt, entry, index, end) {
> +               if (xa_to_value(entry) != KVM_MEMORY_ATTRIBUTE_PRIVATE)
> +                       return false;
> +       }
> +       return true;
> +}
>
>  static long __kvm_gmem_populate(struct kvm *kvm, struct kvm_memory_slot *slot,
>                                 struct file *file, gfn_t gfn, struct page *src_page,
> @@ -946,9 +964,7 @@ static long __kvm_gmem_populate(struct kvm *kvm, struct kvm_memory_slot *slot,
>
>         folio_unlock(folio);
>
> -       if (!kvm_range_has_vm_memory_attributes(kvm, gfn, gfn + 1,
> -                                               KVM_MEMORY_ATTRIBUTE_PRIVATE,
> -                                               KVM_MEMORY_ATTRIBUTE_PRIVATE)) {
> +       if (!kvm_gmem_range_is_private(file, index, 1, kvm, gfn)) {
>                 ret = -EINVAL;
>                 goto out_put_folio;
>         }
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 8b238e461b854..01761f6e25d25 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -101,6 +101,10 @@ EXPORT_SYMBOL_FOR_KVM_INTERNAL(halt_poll_ns_shrink);
>  static bool __ro_after_init allow_unsafe_mappings;
>  module_param(allow_unsafe_mappings, bool, 0444);
>
> +#ifdef kvm_arch_has_private_mem
> +bool __ro_after_init gmem_in_place_conversion = false;
> +#endif
> +
>  /*
>   * Ordering of locks:
>   *
> @@ -2422,6 +2426,9 @@ static int kvm_vm_ioctl_clear_dirty_log(struct kvm *kvm,
>  static u64 kvm_supported_vm_mem_attributes(struct kvm *kvm)
>  {
>  #ifdef kvm_arch_has_private_mem
> +       if (gmem_in_place_conversion)
> +               return 0;
> +
>         if (!kvm || kvm_arch_has_private_mem(kvm))
>                 return KVM_MEMORY_ATTRIBUTE_PRIVATE;
>  #endif
> @@ -2633,8 +2640,11 @@ EXPORT_STATIC_CALL_GPL(__kvm_mem_is_private);
>
>  static void kvm_init_memory_attributes(void)
>  {
> +       if (gmem_in_place_conversion)
> +               static_call_update(__kvm_mem_is_private, kvm_gmem_is_private);
>  #ifdef CONFIG_KVM_VM_MEMORY_ATTRIBUTES
> -       static_call_update(__kvm_mem_is_private, kvm_vm_mem_is_private);
> +       else
> +               static_call_update(__kvm_mem_is_private, kvm_vm_mem_is_private);
>  #endif
>  }
>  #else
>
> --
> 2.55.0.rc0.738.g0c8ab3ebcc-goog
>
>

^ permalink raw reply

* Re: [PATCH v5 1/7] tracing/events: Fix to check the simple_tsk_fn creation
From: Masami Hiramatsu @ 2026-06-19  8:33 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Steven Rostedt, Mathieu Desnoyers, Jonathan Corbet, Shuah Khan,
	linux-kernel, linux-trace-kernel, linux-doc, linux-kselftest
In-Reply-To: <178165817322.269421.3992299509400184196.stgit@devnote2>

Let me pick this fix to probes/core.

Thanks,

On Wed, 17 Jun 2026 10:02:53 +0900
"Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:

> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 
> Sashiko pointed that this sample code does not correctly handle the
> failure of thread creation because kthread_run() can return -errno.
> 
> Check the simple_tsk_fn is correctly initialized (created) or not.
> 
> Link: https://sashiko.dev/#/patchset/178092865666.163648.10457567771536160909.stgit%40devnote2
> 
> Fixes: 9cfe06f8cd5c ("tracing/events: add trace-events-sample")
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> ---
>  Changes in v4:
>    - Fix to remove decrementing counter in error path, since foo_bar_reg() always returns 0.
>    - Add a newline to error message.
>  Changes in v3:
>    - Recover the usage counter.
> ---
>  samples/trace_events/trace-events-sample.c |    4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/samples/trace_events/trace-events-sample.c b/samples/trace_events/trace-events-sample.c
> index ecc7db237f2e..0b7a6efdb247 100644
> --- a/samples/trace_events/trace-events-sample.c
> +++ b/samples/trace_events/trace-events-sample.c
> @@ -107,6 +107,10 @@ int foo_bar_reg(void)
>  	 * for consistency sake, we still take the thread_mutex.
>  	 */
>  	simple_tsk_fn = kthread_run(simple_thread_fn, NULL, "event-sample-fn");
> +	if (IS_ERR_OR_NULL(simple_tsk_fn)) {
> +		pr_err("Failed to create simple_thread_fn\n");
> +		simple_tsk_fn = NULL;
> +	}
>   out:
>  	mutex_unlock(&thread_mutex);
>  	return 0;
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply

* Re: [PATCH v8 09/46] KVM: guest_memfd: Introduce function to check GFN private/shared status
From: Fuad Tabba @ 2026-06-19  8:25 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	jmattson, jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, willy, wyihan,
	yan.y.zhao, forkloop, pratyush, suzuki.poulose, aneesh.kumar,
	liam, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Jonathan Corbet, Shuah Khan, Shuah Khan, Vishal Annapurve,
	Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Barry Song, Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park,
	Qi Zheng, Shakeel Butt, Kiryl Shutsemau, Baoquan He,
	Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-9-9d2959357853@google.com>

On Fri, 19 Jun 2026 at 01:31, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Ackerley Tng <ackerleytng@google.com>
>
> Introduce function for KVM to check the private/shared status of guest
> memory at a given GFN.
>
> This will be used in a later patch.
>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>
> Co-developed-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Fuad Tabba <tabba@google.com>

Cheers,
/fuad
> ---
>  include/linux/kvm_host.h |  2 ++
>  virt/kvm/guest_memfd.c   | 31 +++++++++++++++++++++++++++++++
>  2 files changed, 33 insertions(+)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 3915da2a61778..27687fb9d5201 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -2575,6 +2575,8 @@ static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
>  #endif /* CONFIG_KVM_VM_MEMORY_ATTRIBUTES */
>
>  #ifdef CONFIG_KVM_GUEST_MEMFD
> +bool kvm_gmem_is_private(struct kvm *kvm, gfn_t gfn);
> +
>  int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
>                      gfn_t gfn, kvm_pfn_t *pfn, struct page **page,
>                      int *max_order);
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index 8101f64e0366f..bca912db5be6e 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -510,6 +510,37 @@ static int kvm_gmem_mmap(struct file *file, struct vm_area_struct *vma)
>         return 0;
>  }
>
> +bool kvm_gmem_is_private(struct kvm *kvm, gfn_t gfn)
> +{
> +       struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn);
> +       struct inode *inode;
> +
> +       /*
> +        * If this gfn has no associated memslot, there's no chance of the gfn
> +        * being backed by private memory, since guest_memfd must be used for
> +        * private memory, and guest_memfd must be associated with some memslot.
> +        */
> +       if (!slot)
> +               return 0;
> +
> +       CLASS(gmem_get_file, file)(slot);
> +       if (!file)
> +               return 0;
> +
> +       inode = file_inode(file);
> +
> +       /*
> +        * Rely on the maple tree's internal RCU lock to ensure a
> +        * stable result. This result can become stale as soon as the
> +        * lock is dropped, so the caller _must_ still protect
> +        * consumption of private vs. shared by checking
> +        * mmu_invalidate_retry_gfn() under mmu_lock to serialize
> +        * against ongoing attribute updates.
> +        */
> +       return kvm_gmem_is_private_mem(inode, kvm_gmem_get_index(slot, gfn));
> +}
> +EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_gmem_is_private);
> +
>  static struct file_operations kvm_gmem_fops = {
>         .mmap           = kvm_gmem_mmap,
>         .open           = generic_file_open,
>
> --
> 2.55.0.rc0.738.g0c8ab3ebcc-goog
>
>

^ permalink raw reply

* Re: [PATCH v8 08/46] KVM: Provide generic interface for checking memory private/shared status
From: Fuad Tabba @ 2026-06-19  8:21 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	jmattson, jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, willy, wyihan,
	yan.y.zhao, forkloop, pratyush, suzuki.poulose, aneesh.kumar,
	liam, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Jonathan Corbet, Shuah Khan, Shuah Khan, Vishal Annapurve,
	Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Barry Song, Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park,
	Qi Zheng, Shakeel Butt, Kiryl Shutsemau, Baoquan He,
	Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <CA+EHjTw6x-mxDnJjnhE-6SV73tMrb0paKDTtOC2j6zJ1fXZDLA@mail.gmail.com>

On Fri, 19 Jun 2026 at 09:19, Fuad Tabba <tabba@google.com> wrote:
>
> On Fri, 19 Jun 2026 at 01:31, Ackerley Tng via B4 Relay
> <devnull+ackerleytng.google.com@kernel.org> wrote:
> >
> > From: Sean Christopherson <seanjc@google.com>
> >
> > Introduce a generic kvm_mem_is_private() interface using a static call to
> > determine if a GFN is private. This allows the implementation for checking
> > a GFN's private/shared status to be set at runtime.
> >
> > In preparation for choosing implementations between a guest_memfd lookup
> > and the existing VM attribute lookup, rename the existing
> > VM-attribute-based check to kvm_vm_mem_is_private to emphasize that it
> > looks up VM attributes.
> >
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
>
> (SoB fix plz)
>
> Reviewed-by: Fuad Tabba <tabba@google.com>
>
> Cheers,
> /fuad
> > ---
> >  include/linux/kvm_host.h | 12 +++++++++++-
> >  virt/kvm/kvm_main.c      | 15 +++++++++++++++
> >  2 files changed, 26 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index eb26d4ea8945a..3915da2a61778 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -2546,7 +2546,7 @@ bool kvm_arch_pre_set_memory_attributes(struct kvm *kvm,
> >  bool kvm_arch_post_set_memory_attributes(struct kvm *kvm,
> >                                          struct kvm_gfn_range *range);
> >
> > -static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
> > +static inline bool kvm_vm_mem_is_private(struct kvm *kvm, gfn_t gfn)

Should have read the Sashiko review first, but where is this used?
It's not used at all in this series...

/fuad

> >  {
> >         return kvm_get_vm_memory_attributes(kvm, gfn) & KVM_MEMORY_ATTRIBUTE_PRIVATE;
> >  }
> > @@ -2557,6 +2557,16 @@ static inline bool kvm_mem_range_is_private(struct kvm *kvm, gfn_t start,
> >                                                   KVM_MEMORY_ATTRIBUTE_PRIVATE,
> >                                                   KVM_MEMORY_ATTRIBUTE_PRIVATE);
> >  }
> > +#endif  /* CONFIG_KVM_VM_MEMORY_ATTRIBUTES */
> > +
> > +#ifdef kvm_arch_has_private_mem
> > +typedef bool (kvm_mem_is_private_t)(struct kvm *kvm, gfn_t gfn);
> > +DECLARE_STATIC_CALL(__kvm_mem_is_private, kvm_mem_is_private_t);
> > +
> > +static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
> > +{
> > +       return static_call(__kvm_mem_is_private)(kvm, gfn);
> > +}
> >  #else
> >  static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
> >  {
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index 6669f1477013c..8b238e461b854 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -2627,6 +2627,20 @@ static int kvm_vm_ioctl_set_mem_attributes(struct kvm *kvm,
> >  }
> >  #endif /* CONFIG_KVM_VM_MEMORY_ATTRIBUTES */
> >
> > +#ifdef kvm_arch_has_private_mem
> > +DEFINE_STATIC_CALL_RET0(__kvm_mem_is_private, kvm_mem_is_private_t);
> > +EXPORT_STATIC_CALL_GPL(__kvm_mem_is_private);
> > +
> > +static void kvm_init_memory_attributes(void)
> > +{
> > +#ifdef CONFIG_KVM_VM_MEMORY_ATTRIBUTES
> > +       static_call_update(__kvm_mem_is_private, kvm_vm_mem_is_private);
> > +#endif
> > +}
> > +#else
> > +static void kvm_init_memory_attributes(void) { }
> > +#endif
> > +
> >  struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn)
> >  {
> >         return __gfn_to_memslot(kvm_memslots(kvm), gfn);
> > @@ -6528,6 +6542,7 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
> >         kvm_preempt_ops.sched_in = kvm_sched_in;
> >         kvm_preempt_ops.sched_out = kvm_sched_out;
> >
> > +       kvm_init_memory_attributes();
> >         kvm_init_debug();
> >
> >         r = kvm_vfio_ops_init();
> >
> > --
> > 2.55.0.rc0.738.g0c8ab3ebcc-goog
> >
> >

^ permalink raw reply

* Re: [PATCH v8 08/46] KVM: Provide generic interface for checking memory private/shared status
From: Fuad Tabba @ 2026-06-19  8:19 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	jmattson, jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, willy, wyihan,
	yan.y.zhao, forkloop, pratyush, suzuki.poulose, aneesh.kumar,
	liam, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Jonathan Corbet, Shuah Khan, Shuah Khan, Vishal Annapurve,
	Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Barry Song, Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park,
	Qi Zheng, Shakeel Butt, Kiryl Shutsemau, Baoquan He,
	Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-8-9d2959357853@google.com>

On Fri, 19 Jun 2026 at 01:31, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Sean Christopherson <seanjc@google.com>
>
> Introduce a generic kvm_mem_is_private() interface using a static call to
> determine if a GFN is private. This allows the implementation for checking
> a GFN's private/shared status to be set at runtime.
>
> In preparation for choosing implementations between a guest_memfd lookup
> and the existing VM attribute lookup, rename the existing
> VM-attribute-based check to kvm_vm_mem_is_private to emphasize that it
> looks up VM attributes.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

(SoB fix plz)

Reviewed-by: Fuad Tabba <tabba@google.com>

Cheers,
/fuad
> ---
>  include/linux/kvm_host.h | 12 +++++++++++-
>  virt/kvm/kvm_main.c      | 15 +++++++++++++++
>  2 files changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index eb26d4ea8945a..3915da2a61778 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -2546,7 +2546,7 @@ bool kvm_arch_pre_set_memory_attributes(struct kvm *kvm,
>  bool kvm_arch_post_set_memory_attributes(struct kvm *kvm,
>                                          struct kvm_gfn_range *range);
>
> -static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
> +static inline bool kvm_vm_mem_is_private(struct kvm *kvm, gfn_t gfn)
>  {
>         return kvm_get_vm_memory_attributes(kvm, gfn) & KVM_MEMORY_ATTRIBUTE_PRIVATE;
>  }
> @@ -2557,6 +2557,16 @@ static inline bool kvm_mem_range_is_private(struct kvm *kvm, gfn_t start,
>                                                   KVM_MEMORY_ATTRIBUTE_PRIVATE,
>                                                   KVM_MEMORY_ATTRIBUTE_PRIVATE);
>  }
> +#endif  /* CONFIG_KVM_VM_MEMORY_ATTRIBUTES */
> +
> +#ifdef kvm_arch_has_private_mem
> +typedef bool (kvm_mem_is_private_t)(struct kvm *kvm, gfn_t gfn);
> +DECLARE_STATIC_CALL(__kvm_mem_is_private, kvm_mem_is_private_t);
> +
> +static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
> +{
> +       return static_call(__kvm_mem_is_private)(kvm, gfn);
> +}
>  #else
>  static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
>  {
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 6669f1477013c..8b238e461b854 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2627,6 +2627,20 @@ static int kvm_vm_ioctl_set_mem_attributes(struct kvm *kvm,
>  }
>  #endif /* CONFIG_KVM_VM_MEMORY_ATTRIBUTES */
>
> +#ifdef kvm_arch_has_private_mem
> +DEFINE_STATIC_CALL_RET0(__kvm_mem_is_private, kvm_mem_is_private_t);
> +EXPORT_STATIC_CALL_GPL(__kvm_mem_is_private);
> +
> +static void kvm_init_memory_attributes(void)
> +{
> +#ifdef CONFIG_KVM_VM_MEMORY_ATTRIBUTES
> +       static_call_update(__kvm_mem_is_private, kvm_vm_mem_is_private);
> +#endif
> +}
> +#else
> +static void kvm_init_memory_attributes(void) { }
> +#endif
> +
>  struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn)
>  {
>         return __gfn_to_memslot(kvm_memslots(kvm), gfn);
> @@ -6528,6 +6542,7 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module)
>         kvm_preempt_ops.sched_in = kvm_sched_in;
>         kvm_preempt_ops.sched_out = kvm_sched_out;
>
> +       kvm_init_memory_attributes();
>         kvm_init_debug();
>
>         r = kvm_vfio_ops_init();
>
> --
> 2.55.0.rc0.738.g0c8ab3ebcc-goog
>
>

^ permalink raw reply

* Re: [PATCH v8 07/46] KVM: Rename memory attribute APIs to prepare for in-place gmem conversion
From: Fuad Tabba @ 2026-06-19  8:16 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	jmattson, jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, willy, wyihan,
	yan.y.zhao, forkloop, pratyush, suzuki.poulose, aneesh.kumar,
	liam, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Jonathan Corbet, Shuah Khan, Shuah Khan, Vishal Annapurve,
	Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Barry Song, Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park,
	Qi Zheng, Shakeel Butt, Kiryl Shutsemau, Baoquan He,
	Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-7-9d2959357853@google.com>

On Fri, 19 Jun 2026 at 01:31, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Sean Christopherson <seanjc@google.com>
>
> Rename memory attribute APIs to add a "vm_" in the name in anticipation of
> moving PRIVATE tracking into guest_memfd, to allow in-place conversion
> between SHARED and PRIVATE.  At that point, there will effectively be two
> (potential) sources of memory attributes: the VM and guest_memfd.
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Missing SoB (other patches as well, I won't mention it again). But for
this (and other patches I review with a missing SoB fixed):

Reviewed-by: Fuad Tabba <tabba@google.com>

Cheers,
/fuad
> ---
>  arch/x86/kvm/mmu/mmu.c   |  6 +++---
>  include/linux/kvm_host.h | 15 +++++++++++----
>  virt/kvm/guest_memfd.c   |  6 +++---
>  virt/kvm/kvm_main.c      | 16 ++++++++--------
>  4 files changed, 25 insertions(+), 18 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index e0005a21b6e22..cbc50aef801fb 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -8087,11 +8087,11 @@ static bool hugepage_has_attrs(struct kvm *kvm, struct kvm_memory_slot *slot,
>         const unsigned long end = start + KVM_PAGES_PER_HPAGE(level);
>
>         if (level == PG_LEVEL_2M)
> -               return kvm_range_has_memory_attributes(kvm, start, end, ~0, attrs);
> +               return kvm_range_has_vm_memory_attributes(kvm, start, end, ~0, attrs);
>
>         for (gfn = start; gfn < end; gfn += KVM_PAGES_PER_HPAGE(level - 1)) {
>                 if (hugepage_test_mixed(slot, gfn, level - 1) ||
> -                   attrs != kvm_get_memory_attributes(kvm, gfn))
> +                   attrs != kvm_get_vm_memory_attributes(kvm, gfn))
>                         return false;
>         }
>         return true;
> @@ -8191,7 +8191,7 @@ void kvm_mmu_init_memslot_memory_attributes(struct kvm *kvm,
>                  * be manually checked as the attributes may already be mixed.
>                  */
>                 for (gfn = start; gfn < end; gfn += nr_pages) {
> -                       unsigned long attrs = kvm_get_memory_attributes(kvm, gfn);
> +                       unsigned long attrs = kvm_get_vm_memory_attributes(kvm, gfn);
>
>                         if (hugepage_has_attrs(kvm, slot, gfn, level, attrs))
>                                 hugepage_clear_mixed(slot, gfn, level);
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index d370e834d619e..eb26d4ea8945a 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -2534,13 +2534,13 @@ static inline bool kvm_memslot_is_gmem_only(const struct kvm_memory_slot *slot)
>  }
>
>  #ifdef CONFIG_KVM_VM_MEMORY_ATTRIBUTES
> -static inline unsigned long kvm_get_memory_attributes(struct kvm *kvm, gfn_t gfn)
> +static inline unsigned long kvm_get_vm_memory_attributes(struct kvm *kvm, gfn_t gfn)
>  {
>         return xa_to_value(xa_load(&kvm->mem_attr_array, gfn));
>  }
>
> -bool kvm_range_has_memory_attributes(struct kvm *kvm, gfn_t start, gfn_t end,
> -                                    unsigned long mask, unsigned long attrs);
> +bool kvm_range_has_vm_memory_attributes(struct kvm *kvm, gfn_t start, gfn_t end,
> +                                       unsigned long mask, unsigned long attrs);
>  bool kvm_arch_pre_set_memory_attributes(struct kvm *kvm,
>                                         struct kvm_gfn_range *range);
>  bool kvm_arch_post_set_memory_attributes(struct kvm *kvm,
> @@ -2548,7 +2548,14 @@ bool kvm_arch_post_set_memory_attributes(struct kvm *kvm,
>
>  static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
>  {
> -       return kvm_get_memory_attributes(kvm, gfn) & KVM_MEMORY_ATTRIBUTE_PRIVATE;
> +       return kvm_get_vm_memory_attributes(kvm, gfn) & KVM_MEMORY_ATTRIBUTE_PRIVATE;
> +}
> +static inline bool kvm_mem_range_is_private(struct kvm *kvm, gfn_t start,
> +                                           gfn_t end)
> +{
> +       return kvm_range_has_vm_memory_attributes(kvm, start, end,
> +                                                 KVM_MEMORY_ATTRIBUTE_PRIVATE,
> +                                                 KVM_MEMORY_ATTRIBUTE_PRIVATE);
>  }
>  #else
>  static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index b4c24fdf159f6..8101f64e0366f 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -915,9 +915,9 @@ static long __kvm_gmem_populate(struct kvm *kvm, struct kvm_memory_slot *slot,
>
>         folio_unlock(folio);
>
> -       if (!kvm_range_has_memory_attributes(kvm, gfn, gfn + 1,
> -                                            KVM_MEMORY_ATTRIBUTE_PRIVATE,
> -                                            KVM_MEMORY_ATTRIBUTE_PRIVATE)) {
> +       if (!kvm_range_has_vm_memory_attributes(kvm, gfn, gfn + 1,
> +                                               KVM_MEMORY_ATTRIBUTE_PRIVATE,
> +                                               KVM_MEMORY_ATTRIBUTE_PRIVATE)) {
>                 ret = -EINVAL;
>                 goto out_put_folio;
>         }
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 7b989b659cf82..6669f1477013c 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2419,7 +2419,7 @@ static int kvm_vm_ioctl_clear_dirty_log(struct kvm *kvm,
>  #endif /* CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT */
>
>  #ifdef CONFIG_KVM_VM_MEMORY_ATTRIBUTES
> -static u64 kvm_supported_mem_attributes(struct kvm *kvm)
> +static u64 kvm_supported_vm_mem_attributes(struct kvm *kvm)
>  {
>  #ifdef kvm_arch_has_private_mem
>         if (!kvm || kvm_arch_has_private_mem(kvm))
> @@ -2433,19 +2433,19 @@ static u64 kvm_supported_mem_attributes(struct kvm *kvm)
>   * Returns true if _all_ gfns in the range [@start, @end) have attributes
>   * such that the bits in @mask match @attrs.
>   */
> -bool kvm_range_has_memory_attributes(struct kvm *kvm, gfn_t start, gfn_t end,
> -                                    unsigned long mask, unsigned long attrs)
> +bool kvm_range_has_vm_memory_attributes(struct kvm *kvm, gfn_t start, gfn_t end,
> +                                       unsigned long mask, unsigned long attrs)
>  {
>         XA_STATE(xas, &kvm->mem_attr_array, start);
>         unsigned long index;
>         void *entry;
>
> -       mask &= kvm_supported_mem_attributes(kvm);
> +       mask &= kvm_supported_vm_mem_attributes(kvm);
>         if (attrs & ~mask)
>                 return false;
>
>         if (end == start + 1)
> -               return (kvm_get_memory_attributes(kvm, start) & mask) == attrs;
> +               return (kvm_get_vm_memory_attributes(kvm, start) & mask) == attrs;
>
>         guard(rcu)();
>         if (!attrs)
> @@ -2567,7 +2567,7 @@ static int kvm_vm_set_mem_attributes(struct kvm *kvm, gfn_t start, gfn_t end,
>         mutex_lock(&kvm->slots_lock);
>
>         /* Nothing to do if the entire range has the desired attributes. */
> -       if (kvm_range_has_memory_attributes(kvm, start, end, ~0, attributes))
> +       if (kvm_range_has_vm_memory_attributes(kvm, start, end, ~0, attributes))
>                 goto out_unlock;
>
>         /*
> @@ -2606,7 +2606,7 @@ static int kvm_vm_ioctl_set_mem_attributes(struct kvm *kvm,
>         /* flags is currently not used. */
>         if (attrs->flags)
>                 return -EINVAL;
> -       if (attrs->attributes & ~kvm_supported_mem_attributes(kvm))
> +       if (attrs->attributes & ~kvm_supported_vm_mem_attributes(kvm))
>                 return -EINVAL;
>         if (attrs->size == 0 || attrs->address + attrs->size < attrs->address)
>                 return -EINVAL;
> @@ -4926,7 +4926,7 @@ static int kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
>                 return 1;
>  #ifdef CONFIG_KVM_VM_MEMORY_ATTRIBUTES
>         case KVM_CAP_MEMORY_ATTRIBUTES:
> -               return kvm_supported_mem_attributes(kvm);
> +               return kvm_supported_vm_mem_attributes(kvm);
>  #endif
>  #ifdef CONFIG_KVM_GUEST_MEMFD
>         case KVM_CAP_GUEST_MEMFD:
>
> --
> 2.55.0.rc0.738.g0c8ab3ebcc-goog
>
>

^ permalink raw reply

* Re: [PATCH v8 05/46] KVM: Make CONFIG_KVM_VM_MEMORY_ATTRIBUTES selectable
From: Fuad Tabba @ 2026-06-19  8:12 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	jmattson, jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, willy, wyihan,
	yan.y.zhao, forkloop, pratyush, suzuki.poulose, aneesh.kumar,
	liam, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Jonathan Corbet, Shuah Khan, Shuah Khan, Vishal Annapurve,
	Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Barry Song, Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park,
	Qi Zheng, Shakeel Butt, Kiryl Shutsemau, Baoquan He,
	Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-5-9d2959357853@google.com>

On Fri, 19 Jun 2026 at 01:31, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Ackerley Tng <ackerleytng@google.com>
>
> Make CONFIG_KVM_VM_MEMORY_ATTRIBUTES selectable, only for (CoCo) VM types
> that might use vm_memory_attributes.
>
> Also document CONFIG_KVM_VM_MEMORY_ATTRIBUTES to specifically be about the
> private/shared attribute.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

You're missing a SoB, but with that fixed:

Reviewed-by: Fuad Tabba <tabba@google.com>

Cheers,
/fuad

> ---
>  arch/x86/kvm/Kconfig | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
> index 24f96396cfa1c..c28393dc664eb 100644
> --- a/arch/x86/kvm/Kconfig
> +++ b/arch/x86/kvm/Kconfig
> @@ -81,13 +81,16 @@ config KVM_WERROR
>           If in doubt, say "N".
>
>  config KVM_VM_MEMORY_ATTRIBUTES
> -       bool
> +       depends on KVM_SW_PROTECTED_VM || KVM_INTEL_TDX || KVM_AMD_SEV
> +       bool "Enable per-VM PRIVATE vs. SHARED attributes (for CoCo VMs)"
> +       help
> +         Enable support for tracking PRIVATE vs. SHARED memory using per-VM
> +         memory attributes.
>
>  config KVM_SW_PROTECTED_VM
>         bool "Enable support for KVM software-protected VMs"
>         depends on EXPERT
>         depends on KVM_X86 && X86_64
> -       select KVM_VM_MEMORY_ATTRIBUTES
>         help
>           Enable support for KVM software-protected VMs.  Currently, software-
>           protected VMs are purely a development and testing vehicle for
> @@ -138,7 +141,6 @@ config KVM_INTEL_TDX
>         bool "Intel Trust Domain Extensions (TDX) support"
>         default y
>         depends on INTEL_TDX_HOST
> -       select KVM_VM_MEMORY_ATTRIBUTES
>         select HAVE_KVM_ARCH_GMEM_POPULATE
>         help
>           Provides support for launching Intel Trust Domain Extensions (TDX)
> @@ -162,7 +164,6 @@ config KVM_AMD_SEV
>         depends on KVM_AMD && X86_64
>         depends on CRYPTO_DEV_SP_PSP && !(KVM_AMD=y && CRYPTO_DEV_CCP_DD=m)
>         select ARCH_HAS_CC_PLATFORM
> -       select KVM_VM_MEMORY_ATTRIBUTES
>         select HAVE_KVM_ARCH_GMEM_PREPARE
>         select HAVE_KVM_ARCH_GMEM_INVALIDATE
>         select HAVE_KVM_ARCH_GMEM_POPULATE
>
> --
> 2.55.0.rc0.738.g0c8ab3ebcc-goog
>
>

^ permalink raw reply

* Re: [PATCH v8 04/46] KVM: Decouple kvm_has_arch_private_mem from CONFIG_KVM_VM_MEMORY_ATTRIBUTES
From: Fuad Tabba @ 2026-06-19  8:10 UTC (permalink / raw)
  To: ackerleytng
  Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
	jmattson, jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
	rick.p.edgecombe, rientjes, shivankg, steven.price, willy, wyihan,
	yan.y.zhao, forkloop, pratyush, suzuki.poulose, aneesh.kumar,
	liam, Paolo Bonzini, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
	Jonathan Corbet, Shuah Khan, Shuah Khan, Vishal Annapurve,
	Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham,
	Barry Song, Axel Rasmussen, Yuanchu Xie, Wei Xu, Youngjun Park,
	Qi Zheng, Shakeel Butt, Kiryl Shutsemau, Baoquan He,
	Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
	linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
	linux-coco
In-Reply-To: <20260618-gmem-inplace-conversion-v8-4-9d2959357853@google.com>

On Fri, 19 Jun 2026 at 01:31, Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org> wrote:
>
> From: Sean Christopherson <seanjc@google.com>
>
> When memory attributes become trackable in guest_memfd, the concept of
> having private memory is no longer dependent on
> CONFIG_KVM_VM_MEMORY_ATTRIBUTES.
>
> With this, on x86, kvm_arch_has_private_mem() is defined if some CoCo
> platform support (or the testing CONFIG_KVM_SW_PROTECTED_VM) is compiled
> in.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> Co-developed-by: Ackerley Tng <ackerleytng@google.com>
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>

Reviewed-by: Fuad Tabba <tabba@google.com>

Cheers,
/fuad
> ---
>  arch/x86/include/asm/kvm_host.h | 4 +++-
>  include/linux/kvm_host.h        | 2 +-
>  2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 8e8eb8a5e8a6b..1bde67cf6eb0e 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -2394,7 +2394,9 @@ void kvm_configure_mmu(bool enable_tdp, int tdp_forced_root_level,
>                        int tdp_max_root_level, int tdp_huge_page_level);
>
>
> -#ifdef CONFIG_KVM_VM_MEMORY_ATTRIBUTES
> +#if defined(CONFIG_KVM_SW_PROTECTED_VM) ||     \
> +       defined(CONFIG_KVM_INTEL_TDX) ||        \
> +       defined(CONFIG_KVM_AMD_SEV)
>  #define kvm_arch_has_private_mem(kvm) ((kvm)->arch.has_private_mem)
>  #endif
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 201d0f2143976..d370e834d619e 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -722,7 +722,7 @@ static inline int kvm_arch_vcpu_memslots_id(struct kvm_vcpu *vcpu)
>  }
>  #endif
>
> -#ifndef CONFIG_KVM_VM_MEMORY_ATTRIBUTES
> +#ifndef kvm_arch_has_private_mem
>  static inline bool kvm_arch_has_private_mem(struct kvm *kvm)
>  {
>         return false;
>
> --
> 2.55.0.rc0.738.g0c8ab3ebcc-goog
>
>

^ permalink raw reply

* [PATCH v2 4/4] rv/rtapp: Add wakeup monitor
From: Nam Cao @ 2026-06-19  7:21 UTC (permalink / raw)
  To: Gabriele Monaco, Steven Rostedt, linux-trace-kernel, linux-doc,
	linux-kernel
  Cc: Nam Cao
In-Reply-To: <cover.1781852967.git.namcao@linutronix.de>

Add a wakeup monitor to detect a lower-priority task waking up a
higher-priority task.

The rtapp/sleep monitor already detects this. However, that monitor
triggers an error in the context of the wakee task and user only gets
the stacktrace of that task. It is also extremely useful to get the
stacktrace of the waker task, which this monitor offers. In other
words, this monitor complements the rtapp/sleep monitor.

Signed-off-by: Nam Cao <namcao@linutronix.de>
---
 Documentation/trace/rv/monitor_rtapp.rst      |  20 +++
 kernel/trace/rv/Kconfig                       |   1 +
 kernel/trace/rv/Makefile                      |   1 +
 kernel/trace/rv/monitors/rtapp/Kconfig        |   2 +-
 kernel/trace/rv/monitors/wakeup/Kconfig       |  16 ++
 kernel/trace/rv/monitors/wakeup/wakeup.c      | 153 ++++++++++++++++++
 kernel/trace/rv/monitors/wakeup/wakeup.h      |  92 +++++++++++
 .../trace/rv/monitors/wakeup/wakeup_trace.h   |  14 ++
 kernel/trace/rv/rv_trace.h                    |   1 +
 tools/verification/models/rtapp/wakeup.ltl    |   5 +
 10 files changed, 304 insertions(+), 1 deletion(-)
 create mode 100644 kernel/trace/rv/monitors/wakeup/Kconfig
 create mode 100644 kernel/trace/rv/monitors/wakeup/wakeup.c
 create mode 100644 kernel/trace/rv/monitors/wakeup/wakeup.h
 create mode 100644 kernel/trace/rv/monitors/wakeup/wakeup_trace.h
 create mode 100644 tools/verification/models/rtapp/wakeup.ltl

diff --git a/Documentation/trace/rv/monitor_rtapp.rst b/Documentation/trace/rv/monitor_rtapp.rst
index 502d3ea412eb..238b59395ff5 100644
--- a/Documentation/trace/rv/monitor_rtapp.rst
+++ b/Documentation/trace/rv/monitor_rtapp.rst
@@ -124,3 +124,23 @@ to handle some special cases:
     real-time-safe because preemption is disabled for the duration.
   - `FUTEX_LOCK_PI` is included in the allowlist for the same reason as
     `BLOCK_ON_RT_MUTEX`.
+
+Monitor wakeup
+++++++++++++++
+
+The `wakeup` monitor reports real-time threads being woken by lower-priority threads,
+which is a hint of priority inversion. Its specification is::
+
+  RULE = always (((RT and USER_THREAD) imply
+                (not (WOKEN_BY_LOWER_PRIO or WOKEN_BY_SOFTIRQ)) or ALLOWLIST))
+
+  ALLOWLIST = BLOCK_ON_RT_MUTEX
+           or FUTEX_LOCK_PI
+
+The `sleep` monitor already reports this type of problem. The difference is the
+context in which the problem is reported. While the `sleep` monitor reports the problem
+in the context of the wakee, this `wakeup` monitor reports the problem in the context of
+the waker. This monitor complement the `sleep` monitor, giving user better
+understanding of the issue. For instance, to debug a lower-priority task waking a
+higher-priority task scenario, user can enable both `wakeup` monitor and `sleep`
+monitor to get the stack traces of both tasks.
diff --git a/kernel/trace/rv/Kconfig b/kernel/trace/rv/Kconfig
index 3884b14df375..4d3a14a0bac2 100644
--- a/kernel/trace/rv/Kconfig
+++ b/kernel/trace/rv/Kconfig
@@ -76,6 +76,7 @@ source "kernel/trace/rv/monitors/opid/Kconfig"
 source "kernel/trace/rv/monitors/rtapp/Kconfig"
 source "kernel/trace/rv/monitors/pagefault/Kconfig"
 source "kernel/trace/rv/monitors/sleep/Kconfig"
+source "kernel/trace/rv/monitors/wakeup/Kconfig"
 # Add new rtapp monitors here
 
 source "kernel/trace/rv/monitors/stall/Kconfig"
diff --git a/kernel/trace/rv/Makefile b/kernel/trace/rv/Makefile
index 94498da35b37..c2c0e4142eb4 100644
--- a/kernel/trace/rv/Makefile
+++ b/kernel/trace/rv/Makefile
@@ -20,6 +20,7 @@ obj-$(CONFIG_RV_MON_OPID) += monitors/opid/opid.o
 obj-$(CONFIG_RV_MON_STALL) += monitors/stall/stall.o
 obj-$(CONFIG_RV_MON_DEADLINE) += monitors/deadline/deadline.o
 obj-$(CONFIG_RV_MON_NOMISS) += monitors/nomiss/nomiss.o
+obj-$(CONFIG_RV_MON_WAKEUP) += monitors/wakeup/wakeup.o
 # Add new monitors here
 obj-$(CONFIG_RV_REACTORS) += rv_reactors.o
 obj-$(CONFIG_RV_REACT_PRINTK) += reactor_printk.o
diff --git a/kernel/trace/rv/monitors/rtapp/Kconfig b/kernel/trace/rv/monitors/rtapp/Kconfig
index 1ce9370a9ba8..1fcd7a400ded 100644
--- a/kernel/trace/rv/monitors/rtapp/Kconfig
+++ b/kernel/trace/rv/monitors/rtapp/Kconfig
@@ -1,6 +1,6 @@
 config RV_MON_RTAPP
 	depends on RV
-	depends on RV_PER_TASK_MONITORS >= 2
+	depends on RV_PER_TASK_MONITORS >= 3
 	bool "rtapp monitor"
 	help
 	  Collection of monitors to check for common problems with real-time
diff --git a/kernel/trace/rv/monitors/wakeup/Kconfig b/kernel/trace/rv/monitors/wakeup/Kconfig
new file mode 100644
index 000000000000..ec3a5c06a8c4
--- /dev/null
+++ b/kernel/trace/rv/monitors/wakeup/Kconfig
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+config RV_MON_WAKEUP
+	depends on RV
+	depends on RV_MON_RTAPP
+	depends on HAVE_SYSCALL_TRACEPOINTS
+	default y
+	select LTL_MON_EVENTS_ID
+	bool "wakeup monitor"
+	help
+	  This monitor detects a lower-priority task waking up a
+	  higher-priority task. The RV_MON_SLEEP monitor already
+	  detects this case, but this monitor detects in the context
+	  of the waker task instead. This and RV_MON_SLEEP can be
+	  enabled together to get the stacktrace of both the waker
+	  task and the wakee task.
diff --git a/kernel/trace/rv/monitors/wakeup/wakeup.c b/kernel/trace/rv/monitors/wakeup/wakeup.c
new file mode 100644
index 000000000000..01b47416f24e
--- /dev/null
+++ b/kernel/trace/rv/monitors/wakeup/wakeup.c
@@ -0,0 +1,153 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/ftrace.h>
+#include <linux/tracepoint.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/rv.h>
+#include <rv/instrumentation.h>
+
+#define MODULE_NAME "wakeup"
+
+#include <trace/events/syscalls.h>
+#include <trace/events/sched.h>
+#include <trace/events/lock.h>
+#include <uapi/linux/futex.h>
+
+#include <rv_trace.h>
+#include <monitors/rtapp/rtapp.h>
+
+
+#ifndef __NR_futex
+#define __NR_futex (-__COUNTER__)
+#endif
+#ifndef __NR_futex_time64
+#define __NR_futex_time64 (-__COUNTER__)
+#endif
+
+#include "wakeup.h"
+#include <rv/ltl_monitor.h>
+
+static void ltl_atoms_fetch(struct task_struct *task, struct ltl_monitor *mon)
+{
+	/*
+	 * This includes "actual" real-time tasks and also PI-boosted
+	 * tasks. A task being PI-boosted means it is blocking an "actual"
+	 * real-task, therefore it should also obey the monitor's rule,
+	 * otherwise the "actual" real-task may be delayed.
+	 */
+	ltl_atom_set(mon, LTL_RT, rt_or_dl_task(task));
+}
+
+static void ltl_atoms_init(struct task_struct *task, struct ltl_monitor *mon, bool task_creation)
+{
+	ltl_atom_set(mon, LTL_WOKEN_BY_LOWER_PRIO, false);
+	ltl_atom_set(mon, LTL_WOKEN_BY_SOFTIRQ, false);
+
+	if (task_creation) {
+		ltl_atom_set(mon, LTL_BLOCK_ON_RT_MUTEX, false);
+		ltl_atom_set(mon, LTL_FUTEX_LOCK_PI, false);
+	}
+
+	ltl_atom_set(mon, LTL_USER_THREAD, !(task->flags & PF_KTHREAD));
+}
+
+static void handle_sched_waking(void *data, struct task_struct *task)
+{
+	if (in_task()) {
+		if (current->prio > task->prio)
+			ltl_atom_pulse(task, LTL_WOKEN_BY_LOWER_PRIO, true);
+	} else if (in_serving_softirq()) {
+		ltl_atom_pulse(task, LTL_WOKEN_BY_SOFTIRQ, true);
+	}
+}
+
+static void handle_contention_begin(void *data, void *lock, unsigned int flags)
+{
+	if (flags & LCB_F_RT)
+		ltl_atom_update(current, LTL_BLOCK_ON_RT_MUTEX, true);
+}
+
+static void handle_contention_end(void *data, void *lock, int ret)
+{
+	ltl_atom_update(current, LTL_BLOCK_ON_RT_MUTEX, false);
+}
+
+static void handle_sys_enter(void *data, struct pt_regs *regs, long id)
+{
+	unsigned long args[6];
+	int op, cmd;
+
+	switch (id) {
+	case __NR_futex:
+	case __NR_futex_time64:
+		syscall_get_arguments(current, regs, args);
+		op = args[1];
+		cmd = op & FUTEX_CMD_MASK;
+
+		switch (cmd) {
+		case FUTEX_LOCK_PI:
+		case FUTEX_LOCK_PI2:
+			ltl_atom_update(current, LTL_FUTEX_LOCK_PI, true);
+			break;
+		}
+		break;
+	}
+}
+
+static void handle_sys_exit(void *data, struct pt_regs *regs, long ret)
+{
+	ltl_atom_update(current, LTL_FUTEX_LOCK_PI, false);
+}
+
+static int enable_wakeup(void)
+{
+	int retval;
+
+	retval = ltl_monitor_init();
+	if (retval)
+		return retval;
+
+	rv_attach_trace_probe("rtapp_wakeup", sched_waking, handle_sched_waking);
+	rv_attach_trace_probe("rtapp_wakeup", contention_begin, handle_contention_begin);
+	rv_attach_trace_probe("rtapp_wakeup", contention_end, handle_contention_end);
+	rv_attach_trace_probe("rtapp_wakeup", sys_enter, handle_sys_enter);
+	rv_attach_trace_probe("rtapp_wakeup", sys_exit, handle_sys_exit);
+
+	return 0;
+}
+
+static void disable_wakeup(void)
+{
+	rv_detach_trace_probe("rtapp_wakeup", sched_waking, handle_sched_waking);
+	rv_detach_trace_probe("rtapp_wakeup", contention_begin, handle_contention_begin);
+	rv_detach_trace_probe("rtapp_wakeup", contention_end, handle_contention_end);
+	rv_detach_trace_probe("rtapp_wakeup", sys_enter, handle_sys_enter);
+	rv_detach_trace_probe("rtapp_wakeup", sys_exit, handle_sys_exit);
+
+	ltl_monitor_destroy();
+}
+
+static struct rv_monitor rv_wakeup = {
+	.name = "wakeup",
+	.description = "Monitor that real-time tasks are not woken by lower-priority tasks",
+	.enable = enable_wakeup,
+	.disable = disable_wakeup,
+};
+
+static int __init register_wakeup(void)
+{
+	return rv_register_monitor(&rv_wakeup, &rv_rtapp);
+}
+
+static void __exit unregister_wakeup(void)
+{
+	rv_unregister_monitor(&rv_wakeup);
+}
+
+module_init(register_wakeup);
+module_exit(unregister_wakeup);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Nam Cao <namcao@linutronix.de>");
+MODULE_DESCRIPTION("Monitor that real-time tasks are not woken by lower-priority tasks");
diff --git a/kernel/trace/rv/monitors/wakeup/wakeup.h b/kernel/trace/rv/monitors/wakeup/wakeup.h
new file mode 100644
index 000000000000..6f80da64e0e1
--- /dev/null
+++ b/kernel/trace/rv/monitors/wakeup/wakeup.h
@@ -0,0 +1,92 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * C implementation of Buchi automaton, automatically generated by
+ * tools/verification/rvgen from the linear temporal logic specification.
+ * For further information, see kernel documentation:
+ *   Documentation/trace/rv/linear_temporal_logic.rst
+ */
+
+#include <linux/rv.h>
+
+#define MONITOR_NAME wakeup
+
+enum ltl_atom {
+	LTL_BLOCK_ON_RT_MUTEX,
+	LTL_FUTEX_LOCK_PI,
+	LTL_RT,
+	LTL_USER_THREAD,
+	LTL_WOKEN_BY_LOWER_PRIO,
+	LTL_WOKEN_BY_SOFTIRQ,
+	LTL_NUM_ATOM
+};
+static_assert(LTL_NUM_ATOM <= RV_MAX_LTL_ATOM);
+
+static const char *ltl_atom_str(enum ltl_atom atom)
+{
+	static const char *const names[] = {
+		"bl_on_rt_mu",
+		"fu_lo_pi",
+		"rt",
+		"us_th",
+		"wo_lo_pr",
+		"wo_so",
+	};
+
+	return names[atom];
+}
+
+enum ltl_buchi_state {
+	S0,
+	RV_NUM_BA_STATES
+};
+static_assert(RV_NUM_BA_STATES <= RV_MAX_BA_STATES);
+
+static void ltl_start(struct task_struct *task, struct ltl_monitor *mon)
+{
+	bool woken_by_softirq = test_bit(LTL_WOKEN_BY_SOFTIRQ, mon->atoms);
+	bool woken_by_lower_prio = test_bit(LTL_WOKEN_BY_LOWER_PRIO, mon->atoms);
+	bool user_thread = test_bit(LTL_USER_THREAD, mon->atoms);
+	bool rt = test_bit(LTL_RT, mon->atoms);
+	bool futex_lock_pi = test_bit(LTL_FUTEX_LOCK_PI, mon->atoms);
+	bool block_on_rt_mutex = test_bit(LTL_BLOCK_ON_RT_MUTEX, mon->atoms);
+	bool val9 = block_on_rt_mutex || futex_lock_pi;
+	bool val6 = !woken_by_softirq;
+	bool val5 = !woken_by_lower_prio;
+	bool val8 = val5 && val6;
+	bool val10 = val8 || val9;
+	bool val3 = !user_thread;
+	bool val2 = !rt;
+	bool val4 = val2 || val3;
+	bool val11 = val4 || val10;
+
+	if (val11)
+		__set_bit(S0, mon->states);
+}
+
+static void
+ltl_possible_next_states(struct ltl_monitor *mon, unsigned int state, unsigned long *next)
+{
+	bool woken_by_softirq = test_bit(LTL_WOKEN_BY_SOFTIRQ, mon->atoms);
+	bool woken_by_lower_prio = test_bit(LTL_WOKEN_BY_LOWER_PRIO, mon->atoms);
+	bool user_thread = test_bit(LTL_USER_THREAD, mon->atoms);
+	bool rt = test_bit(LTL_RT, mon->atoms);
+	bool futex_lock_pi = test_bit(LTL_FUTEX_LOCK_PI, mon->atoms);
+	bool block_on_rt_mutex = test_bit(LTL_BLOCK_ON_RT_MUTEX, mon->atoms);
+	bool val9 = block_on_rt_mutex || futex_lock_pi;
+	bool val6 = !woken_by_softirq;
+	bool val5 = !woken_by_lower_prio;
+	bool val8 = val5 && val6;
+	bool val10 = val8 || val9;
+	bool val3 = !user_thread;
+	bool val2 = !rt;
+	bool val4 = val2 || val3;
+	bool val11 = val4 || val10;
+
+	switch (state) {
+	case S0:
+		if (val11)
+			__set_bit(S0, next);
+		break;
+	}
+}
diff --git a/kernel/trace/rv/monitors/wakeup/wakeup_trace.h b/kernel/trace/rv/monitors/wakeup/wakeup_trace.h
new file mode 100644
index 000000000000..7e056183f920
--- /dev/null
+++ b/kernel/trace/rv/monitors/wakeup/wakeup_trace.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Snippet to be included in rv_trace.h
+ */
+
+#ifdef CONFIG_RV_MON_WAKEUP
+DEFINE_EVENT(event_ltl_monitor_id, event_wakeup,
+	     TP_PROTO(struct task_struct *task, char *states, char *atoms, char *next),
+	     TP_ARGS(task, states, atoms, next));
+DEFINE_EVENT(error_ltl_monitor_id, error_wakeup,
+	     TP_PROTO(struct task_struct *task),
+	     TP_ARGS(task));
+#endif /* CONFIG_RV_MON_WAKEUP */
diff --git a/kernel/trace/rv/rv_trace.h b/kernel/trace/rv/rv_trace.h
index 9622c269789c..2f8a932432c9 100644
--- a/kernel/trace/rv/rv_trace.h
+++ b/kernel/trace/rv/rv_trace.h
@@ -241,6 +241,7 @@ DECLARE_EVENT_CLASS(error_ltl_monitor_id,
 );
 #include <monitors/pagefault/pagefault_trace.h>
 #include <monitors/sleep/sleep_trace.h>
+#include <monitors/wakeup/wakeup_trace.h>
 // Add new monitors based on CONFIG_LTL_MON_EVENTS_ID here
 #endif /* CONFIG_LTL_MON_EVENTS_ID */
 
diff --git a/tools/verification/models/rtapp/wakeup.ltl b/tools/verification/models/rtapp/wakeup.ltl
new file mode 100644
index 000000000000..a5d63ca0811a
--- /dev/null
+++ b/tools/verification/models/rtapp/wakeup.ltl
@@ -0,0 +1,5 @@
+RULE = always (((RT and USER_THREAD) imply
+		(not (WOKEN_BY_LOWER_PRIO or WOKEN_BY_SOFTIRQ)) or ALLOWLIST))
+
+ALLOWLIST = BLOCK_ON_RT_MUTEX
+         or FUTEX_LOCK_PI
-- 
2.47.3


^ permalink raw reply related

* [PATCH v2 3/4] rv/rtapp/sleep: Stop monitoring kernel threads
From: Nam Cao @ 2026-06-19  7:21 UTC (permalink / raw)
  To: Gabriele Monaco, Steven Rostedt, linux-trace-kernel, linux-doc,
	linux-kernel
  Cc: Nam Cao, Sebastian Andrzej Siewior
In-Reply-To: <cover.1781852967.git.namcao@linutronix.de>

The rtapp/sleep monitor's primary purpose is detecting common mistakes
with user-space real-time design. Monitoring real-time issues with
kernel threads is a bonus.

However, accomodating kernel threads complicates the monitor due to
the edge cases which is seen by the monitor as lower-priority task
waking higher-priority task:

  - kthread_stop() wakes up the task in order to stop it.

  - The rcu thread and migration thread can be woken by any task.

  - The ktimerd thread is woken near the end of irq_exit_rcu(), where
    the preempt counter is "broken" and falsely says this is task
    context. This requires the monitor to use the hardirq_context flag
    instead of the preempt counter.

Beside complicating the monitor, the final case also requires enabling
CONFIG_TRACE_IRQFLAGS (so that "hardirq_context" can be used). This
adds overhead to the kernel even when the monitor is not active. This
may be an obstacle to enabling this monitor in distros' kernels.

Furthermore, kernel threads usually are started before the monitor is
enabled. Consequently, the threads' states (i.o.w. the monitor's
atomic propositions for the threads) are not fully known to the
monitor. As a result, the kernel threads mostly cannot be monitored.

Overall, the downsides of accomodating kernel threads outweights the
benefits. Thus, exclude kernel threads to simplify the monitor.

Signed-off-by: Nam Cao <namcao@linutronix.de>
---
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 Documentation/trace/rv/monitor_rtapp.rst  |  22 ++---
 kernel/trace/rv/monitors/sleep/Kconfig    |   1 -
 kernel/trace/rv/monitors/sleep/sleep.c    |  39 +-------
 kernel/trace/rv/monitors/sleep/sleep.h    | 104 +++++++++-------------
 tools/verification/models/rtapp/sleep.ltl |   7 +-
 5 files changed, 54 insertions(+), 119 deletions(-)

diff --git a/Documentation/trace/rv/monitor_rtapp.rst b/Documentation/trace/rv/monitor_rtapp.rst
index 570be67a8f3b..502d3ea412eb 100644
--- a/Documentation/trace/rv/monitor_rtapp.rst
+++ b/Documentation/trace/rv/monitor_rtapp.rst
@@ -93,9 +93,9 @@ assessment.
 
 The monitor's specification is::
 
-  RULE = always ((RT and SLEEP) imply (RT_FRIENDLY_SLEEP or ALLOWLIST))
+  RULE = always ((RT and SLEEP and USER_THREAD) imply (RT_FRIENDLY_SLEEP or ALLOWLIST))
 
-  RT_FRIENDLY_SLEEP = (RT_VALID_SLEEP_REASON or KERNEL_THREAD)
+  RT_FRIENDLY_SLEEP = RT_VALID_SLEEP_REASON
                   and ((not SCHEDULE_IN) until RT_FRIENDLY_WAKE)
 
   RT_VALID_SLEEP_REASON = FUTEX_WAIT
@@ -110,23 +110,13 @@ The monitor's specification is::
                   or WOKEN_BY_HARDIRQ
                   or WOKEN_BY_NMI
                   or ABORT_SLEEP
-                  or KTHREAD_SHOULD_STOP
 
   ALLOWLIST = BLOCK_ON_RT_MUTEX
            or FUTEX_LOCK_PI
-           or TASK_IS_RCU
-           or TASK_IS_MIGRATION
-
-Beside the scenarios described above, this specification also handle some
-special cases:
-
-  - `KERNEL_THREAD`: kernel tasks do not have any pattern that can be recognized
-    as valid real-time sleeping reasons. Therefore sleeping reason is not
-    checked for kernel tasks.
-  - `KTHREAD_SHOULD_STOP`: a non-real-time thread may stop a real-time kernel
-    thread by waking it and waiting for it to exit (`kthread_stop()`). This
-    wakeup is safe for real-time.
-  - `ALLOWLIST`: to handle known false positives with the kernel.
+
+Beside the scenarios described above, this specification also defines an allow list
+to handle some special cases:
+
   - `BLOCK_ON_RT_MUTEX` is included in the allowlist due to its implementation.
     In the release path of rt_mutex, a boosted task is de-boosted before waking
     the rt_mutex's waiter. Consequently, the monitor may see a real-time-unsafe
diff --git a/kernel/trace/rv/monitors/sleep/Kconfig b/kernel/trace/rv/monitors/sleep/Kconfig
index 6b7a122e7b47..d6ec3e9a91b6 100644
--- a/kernel/trace/rv/monitors/sleep/Kconfig
+++ b/kernel/trace/rv/monitors/sleep/Kconfig
@@ -5,7 +5,6 @@ config RV_MON_SLEEP
 	select RV_LTL_MONITOR
 	depends on HAVE_SYSCALL_TRACEPOINTS
 	depends on RV_MON_RTAPP
-	select TRACE_IRQFLAGS
 	default y
 	select LTL_MON_EVENTS_ID
 	bool "sleep monitor"
diff --git a/kernel/trace/rv/monitors/sleep/sleep.c b/kernel/trace/rv/monitors/sleep/sleep.c
index 638be7d8747f..aa5a984853b5 100644
--- a/kernel/trace/rv/monitors/sleep/sleep.c
+++ b/kernel/trace/rv/monitors/sleep/sleep.c
@@ -43,7 +43,6 @@ static void ltl_atoms_init(struct task_struct *task, struct ltl_monitor *mon, bo
 	ltl_atom_set(mon, LTL_WOKEN_BY_EQUAL_OR_HIGHER_PRIO, false);
 
 	if (task_creation) {
-		ltl_atom_set(mon, LTL_KTHREAD_SHOULD_STOP, false);
 		ltl_atom_set(mon, LTL_NANOSLEEP_CLOCK_REALTIME, false);
 		ltl_atom_set(mon, LTL_NANOSLEEP_TIMER_ABSTIME, false);
 		ltl_atom_set(mon, LTL_CLOCK_NANOSLEEP, false);
@@ -53,33 +52,7 @@ static void ltl_atoms_init(struct task_struct *task, struct ltl_monitor *mon, bo
 		ltl_atom_set(mon, LTL_BLOCK_ON_RT_MUTEX, false);
 	}
 
-	if (task->flags & PF_KTHREAD) {
-		ltl_atom_set(mon, LTL_KERNEL_THREAD, true);
-
-		/* kernel tasks do not do syscall */
-		ltl_atom_set(mon, LTL_FUTEX_WAIT, false);
-		ltl_atom_set(mon, LTL_FUTEX_LOCK_PI, false);
-		ltl_atom_set(mon, LTL_NANOSLEEP_CLOCK_REALTIME, false);
-		ltl_atom_set(mon, LTL_NANOSLEEP_TIMER_ABSTIME, false);
-		ltl_atom_set(mon, LTL_CLOCK_NANOSLEEP, false);
-		ltl_atom_set(mon, LTL_EPOLL_WAIT, false);
-
-		if (strstarts(task->comm, "migration/"))
-			ltl_atom_set(mon, LTL_TASK_IS_MIGRATION, true);
-		else
-			ltl_atom_set(mon, LTL_TASK_IS_MIGRATION, false);
-
-		if (strstarts(task->comm, "rcu"))
-			ltl_atom_set(mon, LTL_TASK_IS_RCU, true);
-		else
-			ltl_atom_set(mon, LTL_TASK_IS_RCU, false);
-	} else {
-		ltl_atom_set(mon, LTL_KTHREAD_SHOULD_STOP, false);
-		ltl_atom_set(mon, LTL_KERNEL_THREAD, false);
-		ltl_atom_set(mon, LTL_TASK_IS_RCU, false);
-		ltl_atom_set(mon, LTL_TASK_IS_MIGRATION, false);
-	}
-
+	ltl_atom_set(mon, LTL_USER_THREAD, !(task->flags & PF_KTHREAD));
 }
 
 static void handle_sched_set_state(void *data, struct task_struct *task, int state)
@@ -97,7 +70,7 @@ static void handle_sched_exit(void *data, bool is_switch)
 
 static void handle_sched_waking(void *data, struct task_struct *task)
 {
-	if (this_cpu_read(hardirq_context)) {
+	if (in_hardirq()) {
 		ltl_atom_pulse(task, LTL_WOKEN_BY_HARDIRQ, true);
 	} else if (in_task()) {
 		if (current->prio <= task->prio)
@@ -181,12 +154,6 @@ static void handle_sys_exit(void *data, struct pt_regs *regs, long ret)
 	ltl_atom_update(current, LTL_CLOCK_NANOSLEEP, false);
 }
 
-static void handle_kthread_stop(void *data, struct task_struct *task)
-{
-	/* FIXME: this could race with other tracepoint handlers */
-	ltl_atom_update(task, LTL_KTHREAD_SHOULD_STOP, true);
-}
-
 static int enable_sleep(void)
 {
 	int retval;
@@ -200,7 +167,6 @@ static int enable_sleep(void)
 	rv_attach_trace_probe("rtapp_sleep", sched_set_state_tp, handle_sched_set_state);
 	rv_attach_trace_probe("rtapp_sleep", contention_begin, handle_contention_begin);
 	rv_attach_trace_probe("rtapp_sleep", contention_end, handle_contention_end);
-	rv_attach_trace_probe("rtapp_sleep", sched_kthread_stop, handle_kthread_stop);
 	rv_attach_trace_probe("rtapp_sleep", sys_enter, handle_sys_enter);
 	rv_attach_trace_probe("rtapp_sleep", sys_exit, handle_sys_exit);
 	return 0;
@@ -213,7 +179,6 @@ static void disable_sleep(void)
 	rv_detach_trace_probe("rtapp_sleep", sched_set_state_tp, handle_sched_set_state);
 	rv_detach_trace_probe("rtapp_sleep", contention_begin, handle_contention_begin);
 	rv_detach_trace_probe("rtapp_sleep", contention_end, handle_contention_end);
-	rv_detach_trace_probe("rtapp_sleep", sched_kthread_stop, handle_kthread_stop);
 	rv_detach_trace_probe("rtapp_sleep", sys_enter, handle_sys_enter);
 	rv_detach_trace_probe("rtapp_sleep", sys_exit, handle_sys_exit);
 
diff --git a/kernel/trace/rv/monitors/sleep/sleep.h b/kernel/trace/rv/monitors/sleep/sleep.h
index 2fe2ec7edae8..44e593f41e6a 100644
--- a/kernel/trace/rv/monitors/sleep/sleep.h
+++ b/kernel/trace/rv/monitors/sleep/sleep.h
@@ -18,15 +18,12 @@ enum ltl_atom {
 	LTL_EPOLL_WAIT,
 	LTL_FUTEX_LOCK_PI,
 	LTL_FUTEX_WAIT,
-	LTL_KERNEL_THREAD,
-	LTL_KTHREAD_SHOULD_STOP,
 	LTL_NANOSLEEP_CLOCK_REALTIME,
 	LTL_NANOSLEEP_TIMER_ABSTIME,
 	LTL_RT,
 	LTL_SCHEDULE_IN,
 	LTL_SLEEP,
-	LTL_TASK_IS_MIGRATION,
-	LTL_TASK_IS_RCU,
+	LTL_USER_THREAD,
 	LTL_WOKEN_BY_EQUAL_OR_HIGHER_PRIO,
 	LTL_WOKEN_BY_HARDIRQ,
 	LTL_WOKEN_BY_NMI,
@@ -43,15 +40,12 @@ static const char *ltl_atom_str(enum ltl_atom atom)
 		"ep_wa",
 		"fu_lo_pi",
 		"fu_wa",
-		"ker_th",
-		"kth_sh_st",
 		"na_cl_re",
 		"na_ti_ab",
 		"rt",
 		"sch_in",
 		"sle",
-		"ta_mi",
-		"ta_rc",
+		"us_th",
 		"wo_eq_hi_pr",
 		"wo_ha",
 		"wo_nm",
@@ -79,46 +73,41 @@ static void ltl_start(struct task_struct *task, struct ltl_monitor *mon)
 	bool woken_by_hardirq = test_bit(LTL_WOKEN_BY_HARDIRQ, mon->atoms);
 	bool woken_by_equal_or_higher_prio = test_bit(LTL_WOKEN_BY_EQUAL_OR_HIGHER_PRIO,
 	     mon->atoms);
-	bool task_is_rcu = test_bit(LTL_TASK_IS_RCU, mon->atoms);
-	bool task_is_migration = test_bit(LTL_TASK_IS_MIGRATION, mon->atoms);
+	bool user_thread = test_bit(LTL_USER_THREAD, mon->atoms);
 	bool sleep = test_bit(LTL_SLEEP, mon->atoms);
 	bool schedule_in = test_bit(LTL_SCHEDULE_IN, mon->atoms);
 	bool rt = test_bit(LTL_RT, mon->atoms);
 	bool nanosleep_timer_abstime = test_bit(LTL_NANOSLEEP_TIMER_ABSTIME, mon->atoms);
 	bool nanosleep_clock_realtime = test_bit(LTL_NANOSLEEP_CLOCK_REALTIME, mon->atoms);
-	bool kthread_should_stop = test_bit(LTL_KTHREAD_SHOULD_STOP, mon->atoms);
-	bool kernel_thread = test_bit(LTL_KERNEL_THREAD, mon->atoms);
 	bool futex_wait = test_bit(LTL_FUTEX_WAIT, mon->atoms);
 	bool futex_lock_pi = test_bit(LTL_FUTEX_LOCK_PI, mon->atoms);
 	bool epoll_wait = test_bit(LTL_EPOLL_WAIT, mon->atoms);
 	bool clock_nanosleep = test_bit(LTL_CLOCK_NANOSLEEP, mon->atoms);
 	bool block_on_rt_mutex = test_bit(LTL_BLOCK_ON_RT_MUTEX, mon->atoms);
 	bool abort_sleep = test_bit(LTL_ABORT_SLEEP, mon->atoms);
-	bool val41 = task_is_rcu || task_is_migration;
-	bool val42 = futex_lock_pi || val41;
-	bool val5 = block_on_rt_mutex || val42;
-	bool val33 = abort_sleep || kthread_should_stop;
-	bool val34 = woken_by_nmi || val33;
-	bool val35 = woken_by_hardirq || val34;
-	bool val14 = woken_by_equal_or_higher_prio || val35;
+	bool val7 = block_on_rt_mutex || futex_lock_pi;
+	bool val32 = woken_by_nmi || abort_sleep;
+	bool val33 = woken_by_hardirq || val32;
+	bool val14 = woken_by_equal_or_higher_prio || val33;
 	bool val13 = !schedule_in;
 	bool val25 = !nanosleep_clock_realtime;
 	bool val26 = nanosleep_timer_abstime && val25;
 	bool val18 = clock_nanosleep && val26;
 	bool val20 = val18 || epoll_wait;
-	bool val9 = futex_wait || val20;
-	bool val11 = val9 || kernel_thread;
+	bool val11 = futex_wait || val20;
+	bool val3 = !user_thread;
 	bool val2 = !sleep;
+	bool val4 = val2 || val3;
 	bool val1 = !rt;
-	bool val3 = val1 || val2;
+	bool val5 = val1 || val4;
 
-	if (val3)
+	if (val5)
 		__set_bit(S0, mon->states);
 	if (val11 && val13)
 		__set_bit(S1, mon->states);
 	if (val11 && val14)
 		__set_bit(S4, mon->states);
-	if (val5)
+	if (val7)
 		__set_bit(S5, mon->states);
 }
 
@@ -129,130 +118,125 @@ ltl_possible_next_states(struct ltl_monitor *mon, unsigned int state, unsigned l
 	bool woken_by_hardirq = test_bit(LTL_WOKEN_BY_HARDIRQ, mon->atoms);
 	bool woken_by_equal_or_higher_prio = test_bit(LTL_WOKEN_BY_EQUAL_OR_HIGHER_PRIO,
 	     mon->atoms);
-	bool task_is_rcu = test_bit(LTL_TASK_IS_RCU, mon->atoms);
-	bool task_is_migration = test_bit(LTL_TASK_IS_MIGRATION, mon->atoms);
+	bool user_thread = test_bit(LTL_USER_THREAD, mon->atoms);
 	bool sleep = test_bit(LTL_SLEEP, mon->atoms);
 	bool schedule_in = test_bit(LTL_SCHEDULE_IN, mon->atoms);
 	bool rt = test_bit(LTL_RT, mon->atoms);
 	bool nanosleep_timer_abstime = test_bit(LTL_NANOSLEEP_TIMER_ABSTIME, mon->atoms);
 	bool nanosleep_clock_realtime = test_bit(LTL_NANOSLEEP_CLOCK_REALTIME, mon->atoms);
-	bool kthread_should_stop = test_bit(LTL_KTHREAD_SHOULD_STOP, mon->atoms);
-	bool kernel_thread = test_bit(LTL_KERNEL_THREAD, mon->atoms);
 	bool futex_wait = test_bit(LTL_FUTEX_WAIT, mon->atoms);
 	bool futex_lock_pi = test_bit(LTL_FUTEX_LOCK_PI, mon->atoms);
 	bool epoll_wait = test_bit(LTL_EPOLL_WAIT, mon->atoms);
 	bool clock_nanosleep = test_bit(LTL_CLOCK_NANOSLEEP, mon->atoms);
 	bool block_on_rt_mutex = test_bit(LTL_BLOCK_ON_RT_MUTEX, mon->atoms);
 	bool abort_sleep = test_bit(LTL_ABORT_SLEEP, mon->atoms);
-	bool val41 = task_is_rcu || task_is_migration;
-	bool val42 = futex_lock_pi || val41;
-	bool val5 = block_on_rt_mutex || val42;
-	bool val33 = abort_sleep || kthread_should_stop;
-	bool val34 = woken_by_nmi || val33;
-	bool val35 = woken_by_hardirq || val34;
-	bool val14 = woken_by_equal_or_higher_prio || val35;
+	bool val7 = block_on_rt_mutex || futex_lock_pi;
+	bool val32 = woken_by_nmi || abort_sleep;
+	bool val33 = woken_by_hardirq || val32;
+	bool val14 = woken_by_equal_or_higher_prio || val33;
 	bool val13 = !schedule_in;
 	bool val25 = !nanosleep_clock_realtime;
 	bool val26 = nanosleep_timer_abstime && val25;
 	bool val18 = clock_nanosleep && val26;
 	bool val20 = val18 || epoll_wait;
-	bool val9 = futex_wait || val20;
-	bool val11 = val9 || kernel_thread;
+	bool val11 = futex_wait || val20;
+	bool val3 = !user_thread;
 	bool val2 = !sleep;
+	bool val4 = val2 || val3;
 	bool val1 = !rt;
-	bool val3 = val1 || val2;
+	bool val5 = val1 || val4;
 
 	switch (state) {
 	case S0:
-		if (val3)
+		if (val5)
 			__set_bit(S0, next);
 		if (val11 && val13)
 			__set_bit(S1, next);
 		if (val11 && val14)
 			__set_bit(S4, next);
-		if (val5)
+		if (val7)
 			__set_bit(S5, next);
 		break;
 	case S1:
 		if (val11 && val13)
 			__set_bit(S1, next);
-		if (val13 && val3)
+		if (val13 && val5)
 			__set_bit(S2, next);
-		if (val14 && val3)
+		if (val14 && val5)
 			__set_bit(S3, next);
 		if (val11 && val14)
 			__set_bit(S4, next);
-		if (val13 && val5)
+		if (val13 && val7)
 			__set_bit(S6, next);
-		if (val14 && val5)
+		if (val14 && val7)
 			__set_bit(S7, next);
 		break;
 	case S2:
 		if (val11 && val13)
 			__set_bit(S1, next);
-		if (val13 && val3)
+		if (val13 && val5)
 			__set_bit(S2, next);
-		if (val14 && val3)
+		if (val14 && val5)
 			__set_bit(S3, next);
 		if (val11 && val14)
 			__set_bit(S4, next);
-		if (val13 && val5)
+		if (val13 && val7)
 			__set_bit(S6, next);
-		if (val14 && val5)
+		if (val14 && val7)
 			__set_bit(S7, next);
 		break;
 	case S3:
-		if (val3)
+		if (val5)
 			__set_bit(S0, next);
 		if (val11 && val13)
 			__set_bit(S1, next);
 		if (val11 && val14)
 			__set_bit(S4, next);
-		if (val5)
+		if (val7)
 			__set_bit(S5, next);
 		break;
 	case S4:
-		if (val3)
+		if (val5)
 			__set_bit(S0, next);
 		if (val11 && val13)
 			__set_bit(S1, next);
 		if (val11 && val14)
 			__set_bit(S4, next);
-		if (val5)
+		if (val7)
 			__set_bit(S5, next);
 		break;
 	case S5:
-		if (val3)
+		if (val5)
 			__set_bit(S0, next);
 		if (val11 && val13)
 			__set_bit(S1, next);
 		if (val11 && val14)
 			__set_bit(S4, next);
-		if (val5)
+		if (val7)
 			__set_bit(S5, next);
 		break;
 	case S6:
 		if (val11 && val13)
 			__set_bit(S1, next);
-		if (val13 && val3)
+		if (val13 && val5)
 			__set_bit(S2, next);
-		if (val14 && val3)
+		if (val14 && val5)
 			__set_bit(S3, next);
 		if (val11 && val14)
 			__set_bit(S4, next);
-		if (val13 && val5)
+		if (val13 && val7)
 			__set_bit(S6, next);
-		if (val14 && val5)
+		if (val14 && val7)
 			__set_bit(S7, next);
 		break;
 	case S7:
-		if (val3)
+		if (val5)
 			__set_bit(S0, next);
 		if (val11 && val13)
 			__set_bit(S1, next);
 		if (val11 && val14)
 			__set_bit(S4, next);
-		if (val5)
+		if (val7)
 			__set_bit(S5, next);
 		break;
 	}
diff --git a/tools/verification/models/rtapp/sleep.ltl b/tools/verification/models/rtapp/sleep.ltl
index 5923e58d7810..4d78fdd204c0 100644
--- a/tools/verification/models/rtapp/sleep.ltl
+++ b/tools/verification/models/rtapp/sleep.ltl
@@ -1,6 +1,6 @@
-RULE = always ((RT and SLEEP) imply (RT_FRIENDLY_SLEEP or ALLOWLIST))
+RULE = always ((RT and SLEEP and USER_THREAD) imply (RT_FRIENDLY_SLEEP or ALLOWLIST))
 
-RT_FRIENDLY_SLEEP = (RT_VALID_SLEEP_REASON or KERNEL_THREAD)
+RT_FRIENDLY_SLEEP = RT_VALID_SLEEP_REASON
                 and ((not SCHEDULE_IN) until RT_FRIENDLY_WAKE)
 
 RT_VALID_SLEEP_REASON = FUTEX_WAIT
@@ -15,9 +15,6 @@ RT_FRIENDLY_WAKE = WOKEN_BY_EQUAL_OR_HIGHER_PRIO
                 or WOKEN_BY_HARDIRQ
                 or WOKEN_BY_NMI
                 or ABORT_SLEEP
-                or KTHREAD_SHOULD_STOP
 
 ALLOWLIST = BLOCK_ON_RT_MUTEX
          or FUTEX_LOCK_PI
-         or TASK_IS_RCU
-         or TASK_IS_MIGRATION
-- 
2.47.3


^ permalink raw reply related

* [PATCH v2 2/4] rv/rtapp/sleep: Update nanosleep rule
From: Nam Cao @ 2026-06-19  7:21 UTC (permalink / raw)
  To: Gabriele Monaco, Steven Rostedt, linux-trace-kernel, linux-doc,
	linux-kernel
  Cc: Nam Cao
In-Reply-To: <cover.1781852967.git.namcao@linutronix.de>

CLOCK_REALTIME is the only clock that often is misused in real-time
applications. The other clocks either are safe for real-time uses
(CLOCK_TAI, CLOCK_MONOTONIC, CLOCK_BOOTTIME) or are unlikely to be misused
(CLOCK_AUX, CLOCK_PROCESS_CPUTIME_ID).

Update the monitor to only warn about CLOCK_REALTIME.

While at it, update the out-of-sync documentation.

Signed-off-by: Nam Cao <namcao@linutronix.de>
---
 Documentation/trace/rv/monitor_rtapp.rst  | 17 +++++---
 kernel/trace/rv/monitors/sleep/sleep.c    | 12 ++----
 kernel/trace/rv/monitors/sleep/sleep.h    | 52 +++++++++++------------
 tools/verification/models/rtapp/sleep.ltl |  2 +-
 4 files changed, 39 insertions(+), 44 deletions(-)

diff --git a/Documentation/trace/rv/monitor_rtapp.rst b/Documentation/trace/rv/monitor_rtapp.rst
index 01656bf7080a..570be67a8f3b 100644
--- a/Documentation/trace/rv/monitor_rtapp.rst
+++ b/Documentation/trace/rv/monitor_rtapp.rst
@@ -51,12 +51,13 @@ The `sleep` monitor reports real-time threads sleeping in a manner that may
 cause undesirable latency. Real-time applications should only put a real-time
 thread to sleep for one of the following reasons:
 
-  - Cyclic work: real-time thread sleeps waiting for the next cycle. For this
-    case, only the `clock_nanosleep` syscall should be used with `TIMER_ABSTIME`
-    (to avoid time drift) and `CLOCK_MONOTONIC` (to avoid the clock being
-    changed). No other method is safe for real-time. For example, threads
-    waiting for timerfd can be woken by softirq which provides no real-time
-    guarantee.
+  - Cyclic work: real-time thread sleeps waiting for the next
+    cycle. For this case, only the `clock_nanosleep` syscall should be
+    used with `TIMER_ABSTIME` (to avoid time drift). Additionally,
+    `CLOCK_REALTIME` should not be used (to avoid the clock being
+    changed). No other method is safe for real-time. For example,
+    threads waiting for timerfd can be woken by softirq which provides
+    no real-time guarantee.
   - Real-time thread waiting for something to happen (e.g. another thread
     releasing shared resources, or a completion signal from another thread). In
     this case, only futexes (FUTEX_LOCK_PI, FUTEX_LOCK_PI2 or one of
@@ -99,14 +100,16 @@ The monitor's specification is::
 
   RT_VALID_SLEEP_REASON = FUTEX_WAIT
                        or RT_FRIENDLY_NANOSLEEP
+                       or EPOLL_WAIT
 
   RT_FRIENDLY_NANOSLEEP = CLOCK_NANOSLEEP
                       and NANOSLEEP_TIMER_ABSTIME
-                      and NANOSLEEP_CLOCK_MONOTONIC
+                      and not NANOSLEEP_CLOCK_REALTIME
 
   RT_FRIENDLY_WAKE = WOKEN_BY_EQUAL_OR_HIGHER_PRIO
                   or WOKEN_BY_HARDIRQ
                   or WOKEN_BY_NMI
+                  or ABORT_SLEEP
                   or KTHREAD_SHOULD_STOP
 
   ALLOWLIST = BLOCK_ON_RT_MUTEX
diff --git a/kernel/trace/rv/monitors/sleep/sleep.c b/kernel/trace/rv/monitors/sleep/sleep.c
index d6b677fab8f8..638be7d8747f 100644
--- a/kernel/trace/rv/monitors/sleep/sleep.c
+++ b/kernel/trace/rv/monitors/sleep/sleep.c
@@ -44,8 +44,7 @@ static void ltl_atoms_init(struct task_struct *task, struct ltl_monitor *mon, bo
 
 	if (task_creation) {
 		ltl_atom_set(mon, LTL_KTHREAD_SHOULD_STOP, false);
-		ltl_atom_set(mon, LTL_NANOSLEEP_CLOCK_MONOTONIC, false);
-		ltl_atom_set(mon, LTL_NANOSLEEP_CLOCK_TAI, false);
+		ltl_atom_set(mon, LTL_NANOSLEEP_CLOCK_REALTIME, false);
 		ltl_atom_set(mon, LTL_NANOSLEEP_TIMER_ABSTIME, false);
 		ltl_atom_set(mon, LTL_CLOCK_NANOSLEEP, false);
 		ltl_atom_set(mon, LTL_FUTEX_WAIT, false);
@@ -60,8 +59,7 @@ static void ltl_atoms_init(struct task_struct *task, struct ltl_monitor *mon, bo
 		/* kernel tasks do not do syscall */
 		ltl_atom_set(mon, LTL_FUTEX_WAIT, false);
 		ltl_atom_set(mon, LTL_FUTEX_LOCK_PI, false);
-		ltl_atom_set(mon, LTL_NANOSLEEP_CLOCK_MONOTONIC, false);
-		ltl_atom_set(mon, LTL_NANOSLEEP_CLOCK_TAI, false);
+		ltl_atom_set(mon, LTL_NANOSLEEP_CLOCK_REALTIME, false);
 		ltl_atom_set(mon, LTL_NANOSLEEP_TIMER_ABSTIME, false);
 		ltl_atom_set(mon, LTL_CLOCK_NANOSLEEP, false);
 		ltl_atom_set(mon, LTL_EPOLL_WAIT, false);
@@ -136,8 +134,7 @@ static void handle_sys_enter(void *data, struct pt_regs *regs, long id)
 	case __NR_clock_nanosleep_time64:
 #endif
 		syscall_get_arguments(current, regs, args);
-		ltl_atom_set(mon, LTL_NANOSLEEP_CLOCK_MONOTONIC, args[0] == CLOCK_MONOTONIC);
-		ltl_atom_set(mon, LTL_NANOSLEEP_CLOCK_TAI, args[0] == CLOCK_TAI);
+		ltl_atom_set(mon, LTL_NANOSLEEP_CLOCK_REALTIME, args[0] == CLOCK_REALTIME);
 		ltl_atom_set(mon, LTL_NANOSLEEP_TIMER_ABSTIME, args[1] == TIMER_ABSTIME);
 		ltl_atom_update(current, LTL_CLOCK_NANOSLEEP, true);
 		break;
@@ -178,8 +175,7 @@ static void handle_sys_exit(void *data, struct pt_regs *regs, long ret)
 
 	ltl_atom_set(mon, LTL_FUTEX_LOCK_PI, false);
 	ltl_atom_set(mon, LTL_FUTEX_WAIT, false);
-	ltl_atom_set(mon, LTL_NANOSLEEP_CLOCK_MONOTONIC, false);
-	ltl_atom_set(mon, LTL_NANOSLEEP_CLOCK_TAI, false);
+	ltl_atom_set(mon, LTL_NANOSLEEP_CLOCK_REALTIME, false);
 	ltl_atom_set(mon, LTL_NANOSLEEP_TIMER_ABSTIME, false);
 	ltl_atom_set(mon, LTL_EPOLL_WAIT, false);
 	ltl_atom_update(current, LTL_CLOCK_NANOSLEEP, false);
diff --git a/kernel/trace/rv/monitors/sleep/sleep.h b/kernel/trace/rv/monitors/sleep/sleep.h
index 403dc2852c52..2fe2ec7edae8 100644
--- a/kernel/trace/rv/monitors/sleep/sleep.h
+++ b/kernel/trace/rv/monitors/sleep/sleep.h
@@ -20,8 +20,7 @@ enum ltl_atom {
 	LTL_FUTEX_WAIT,
 	LTL_KERNEL_THREAD,
 	LTL_KTHREAD_SHOULD_STOP,
-	LTL_NANOSLEEP_CLOCK_MONOTONIC,
-	LTL_NANOSLEEP_CLOCK_TAI,
+	LTL_NANOSLEEP_CLOCK_REALTIME,
 	LTL_NANOSLEEP_TIMER_ABSTIME,
 	LTL_RT,
 	LTL_SCHEDULE_IN,
@@ -46,8 +45,7 @@ static const char *ltl_atom_str(enum ltl_atom atom)
 		"fu_wa",
 		"ker_th",
 		"kth_sh_st",
-		"na_cl_mo",
-		"na_cl_ta",
+		"na_cl_re",
 		"na_ti_ab",
 		"rt",
 		"sch_in",
@@ -87,8 +85,7 @@ static void ltl_start(struct task_struct *task, struct ltl_monitor *mon)
 	bool schedule_in = test_bit(LTL_SCHEDULE_IN, mon->atoms);
 	bool rt = test_bit(LTL_RT, mon->atoms);
 	bool nanosleep_timer_abstime = test_bit(LTL_NANOSLEEP_TIMER_ABSTIME, mon->atoms);
-	bool nanosleep_clock_tai = test_bit(LTL_NANOSLEEP_CLOCK_TAI, mon->atoms);
-	bool nanosleep_clock_monotonic = test_bit(LTL_NANOSLEEP_CLOCK_MONOTONIC, mon->atoms);
+	bool nanosleep_clock_realtime = test_bit(LTL_NANOSLEEP_CLOCK_REALTIME, mon->atoms);
 	bool kthread_should_stop = test_bit(LTL_KTHREAD_SHOULD_STOP, mon->atoms);
 	bool kernel_thread = test_bit(LTL_KERNEL_THREAD, mon->atoms);
 	bool futex_wait = test_bit(LTL_FUTEX_WAIT, mon->atoms);
@@ -97,17 +94,17 @@ static void ltl_start(struct task_struct *task, struct ltl_monitor *mon)
 	bool clock_nanosleep = test_bit(LTL_CLOCK_NANOSLEEP, mon->atoms);
 	bool block_on_rt_mutex = test_bit(LTL_BLOCK_ON_RT_MUTEX, mon->atoms);
 	bool abort_sleep = test_bit(LTL_ABORT_SLEEP, mon->atoms);
-	bool val42 = task_is_rcu || task_is_migration;
-	bool val43 = futex_lock_pi || val42;
-	bool val5 = block_on_rt_mutex || val43;
-	bool val34 = abort_sleep || kthread_should_stop;
-	bool val35 = woken_by_nmi || val34;
-	bool val36 = woken_by_hardirq || val35;
-	bool val14 = woken_by_equal_or_higher_prio || val36;
+	bool val41 = task_is_rcu || task_is_migration;
+	bool val42 = futex_lock_pi || val41;
+	bool val5 = block_on_rt_mutex || val42;
+	bool val33 = abort_sleep || kthread_should_stop;
+	bool val34 = woken_by_nmi || val33;
+	bool val35 = woken_by_hardirq || val34;
+	bool val14 = woken_by_equal_or_higher_prio || val35;
 	bool val13 = !schedule_in;
-	bool val26 = nanosleep_clock_monotonic || nanosleep_clock_tai;
-	bool val27 = nanosleep_timer_abstime && val26;
-	bool val18 = clock_nanosleep && val27;
+	bool val25 = !nanosleep_clock_realtime;
+	bool val26 = nanosleep_timer_abstime && val25;
+	bool val18 = clock_nanosleep && val26;
 	bool val20 = val18 || epoll_wait;
 	bool val9 = futex_wait || val20;
 	bool val11 = val9 || kernel_thread;
@@ -138,8 +135,7 @@ ltl_possible_next_states(struct ltl_monitor *mon, unsigned int state, unsigned l
 	bool schedule_in = test_bit(LTL_SCHEDULE_IN, mon->atoms);
 	bool rt = test_bit(LTL_RT, mon->atoms);
 	bool nanosleep_timer_abstime = test_bit(LTL_NANOSLEEP_TIMER_ABSTIME, mon->atoms);
-	bool nanosleep_clock_tai = test_bit(LTL_NANOSLEEP_CLOCK_TAI, mon->atoms);
-	bool nanosleep_clock_monotonic = test_bit(LTL_NANOSLEEP_CLOCK_MONOTONIC, mon->atoms);
+	bool nanosleep_clock_realtime = test_bit(LTL_NANOSLEEP_CLOCK_REALTIME, mon->atoms);
 	bool kthread_should_stop = test_bit(LTL_KTHREAD_SHOULD_STOP, mon->atoms);
 	bool kernel_thread = test_bit(LTL_KERNEL_THREAD, mon->atoms);
 	bool futex_wait = test_bit(LTL_FUTEX_WAIT, mon->atoms);
@@ -148,17 +144,17 @@ ltl_possible_next_states(struct ltl_monitor *mon, unsigned int state, unsigned l
 	bool clock_nanosleep = test_bit(LTL_CLOCK_NANOSLEEP, mon->atoms);
 	bool block_on_rt_mutex = test_bit(LTL_BLOCK_ON_RT_MUTEX, mon->atoms);
 	bool abort_sleep = test_bit(LTL_ABORT_SLEEP, mon->atoms);
-	bool val42 = task_is_rcu || task_is_migration;
-	bool val43 = futex_lock_pi || val42;
-	bool val5 = block_on_rt_mutex || val43;
-	bool val34 = abort_sleep || kthread_should_stop;
-	bool val35 = woken_by_nmi || val34;
-	bool val36 = woken_by_hardirq || val35;
-	bool val14 = woken_by_equal_or_higher_prio || val36;
+	bool val41 = task_is_rcu || task_is_migration;
+	bool val42 = futex_lock_pi || val41;
+	bool val5 = block_on_rt_mutex || val42;
+	bool val33 = abort_sleep || kthread_should_stop;
+	bool val34 = woken_by_nmi || val33;
+	bool val35 = woken_by_hardirq || val34;
+	bool val14 = woken_by_equal_or_higher_prio || val35;
 	bool val13 = !schedule_in;
-	bool val26 = nanosleep_clock_monotonic || nanosleep_clock_tai;
-	bool val27 = nanosleep_timer_abstime && val26;
-	bool val18 = clock_nanosleep && val27;
+	bool val25 = !nanosleep_clock_realtime;
+	bool val26 = nanosleep_timer_abstime && val25;
+	bool val18 = clock_nanosleep && val26;
 	bool val20 = val18 || epoll_wait;
 	bool val9 = futex_wait || val20;
 	bool val11 = val9 || kernel_thread;
diff --git a/tools/verification/models/rtapp/sleep.ltl b/tools/verification/models/rtapp/sleep.ltl
index 464c84b9df87..5923e58d7810 100644
--- a/tools/verification/models/rtapp/sleep.ltl
+++ b/tools/verification/models/rtapp/sleep.ltl
@@ -9,7 +9,7 @@ RT_VALID_SLEEP_REASON = FUTEX_WAIT
 
 RT_FRIENDLY_NANOSLEEP = CLOCK_NANOSLEEP
                     and NANOSLEEP_TIMER_ABSTIME
-                    and (NANOSLEEP_CLOCK_MONOTONIC or NANOSLEEP_CLOCK_TAI)
+                    and not NANOSLEEP_CLOCK_REALTIME
 
 RT_FRIENDLY_WAKE = WOKEN_BY_EQUAL_OR_HIGHER_PRIO
                 or WOKEN_BY_HARDIRQ
-- 
2.47.3


^ permalink raw reply related

* [PATCH v2 1/4] rv/rtapp/sleep: Make the error more informative for user
From: Nam Cao @ 2026-06-19  7:21 UTC (permalink / raw)
  To: Gabriele Monaco, Steven Rostedt, linux-trace-kernel, linux-doc,
	linux-kernel
  Cc: Nam Cao
In-Reply-To: <cover.1781852967.git.namcao@linutronix.de>

The rtapp/sleep monitor detects real-time tasks which go to sleep in an
real-time-unsafe manner. If this happen, the monitor triggers a trace event
in the sched_wakeup tracepoint's handler.

However, the invoking context of that trace event is not the most
informative, because of the stack trace of that event is the wakeup's code
path which is not very helpful:

74.669317: rv:error_sleep: condvar[254]: violation detected
    ltl_validate+0x345 ([kernel.kallsyms])
    handle_sched_wakeup+0x34 ([kernel.kallsyms])
    ttwu_do_activate+0xff ([kernel.kallsyms])
    sched_ttwu_pending+0x104 ([kernel.kallsyms])
    __flush_smp_call_function_queue+0x15b ([kernel.kallsyms])
    __sysvec_call_function_single+0x18 ([kernel.kallsyms])
    sysvec_call_function_single+0x66 ([kernel.kallsyms])
    asm_sysvec_call_function_single+0x1a ([kernel.kallsyms])
    pv_native_safe_halt+0xf ([kernel.kallsyms])
    default_idle+0x9 ([kernel.kallsyms])
    default_idle_call+0x33 ([kernel.kallsyms])
    do_idle+0x234 ([kernel.kallsyms])
    cpu_startup_entry+0x24 ([kernel.kallsyms])
    start_secondary+0xf8 ([kernel.kallsyms])
    common_startup_64+0x13e ([kernel.kallsyms])

What would be much more valuable is the stack trace of the task itself.

Instead of using the sched_wakeup tracepoint, use the sched_exit
tracepoint. This makes the event happen in the task's context, making
the stack trace far more informative for user:

rv:error_sleep: condvar[254]: violation detected
    ltl_validate+0x345 ([kernel.kallsyms])
    handle_sched_exit+0x39 ([kernel.kallsyms])
    __schedule+0x80f ([kernel.kallsyms])
    schedule+0x22 ([kernel.kallsyms])
    futex_do_wait+0x33 ([kernel.kallsyms])
    __futex_wait+0x8c ([kernel.kallsyms])
    futex_wait+0x73 ([kernel.kallsyms])
    do_futex+0xc6 ([kernel.kallsyms])
    __x64_sys_futex+0x121 ([kernel.kallsyms])
    do_syscall_64+0xf3 ([kernel.kallsyms])
    entry_SYSCALL_64_after_hwframe+0x77 ([kernel.kallsyms])
    __futex_abstimed_wait_common64+0xc6 (inlined)
    __futex_abstimed_wait_common+0xc6 (/usr/lib/x86_64-linux-gnu/libc.so.6)

Signed-off-by: Nam Cao <namcao@linutronix.de>
---
 Documentation/trace/rv/monitor_rtapp.rst  |  2 +-
 kernel/trace/rv/monitors/sleep/sleep.c    | 10 +++++-----
 kernel/trace/rv/monitors/sleep/sleep.h    | 14 +++++++-------
 tools/verification/models/rtapp/sleep.ltl |  2 +-
 4 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/Documentation/trace/rv/monitor_rtapp.rst b/Documentation/trace/rv/monitor_rtapp.rst
index c8104eda924a..01656bf7080a 100644
--- a/Documentation/trace/rv/monitor_rtapp.rst
+++ b/Documentation/trace/rv/monitor_rtapp.rst
@@ -95,7 +95,7 @@ The monitor's specification is::
   RULE = always ((RT and SLEEP) imply (RT_FRIENDLY_SLEEP or ALLOWLIST))
 
   RT_FRIENDLY_SLEEP = (RT_VALID_SLEEP_REASON or KERNEL_THREAD)
-                  and ((not WAKE) until RT_FRIENDLY_WAKE)
+                  and ((not SCHEDULE_IN) until RT_FRIENDLY_WAKE)
 
   RT_VALID_SLEEP_REASON = FUTEX_WAIT
                        or RT_FRIENDLY_NANOSLEEP
diff --git a/kernel/trace/rv/monitors/sleep/sleep.c b/kernel/trace/rv/monitors/sleep/sleep.c
index 8dfe5ec13e19..d6b677fab8f8 100644
--- a/kernel/trace/rv/monitors/sleep/sleep.c
+++ b/kernel/trace/rv/monitors/sleep/sleep.c
@@ -36,7 +36,7 @@ static void ltl_atoms_fetch(struct task_struct *task, struct ltl_monitor *mon)
 static void ltl_atoms_init(struct task_struct *task, struct ltl_monitor *mon, bool task_creation)
 {
 	ltl_atom_set(mon, LTL_SLEEP, false);
-	ltl_atom_set(mon, LTL_WAKE, false);
+	ltl_atom_set(mon, LTL_SCHEDULE_IN, false);
 	ltl_atom_set(mon, LTL_ABORT_SLEEP, false);
 	ltl_atom_set(mon, LTL_WOKEN_BY_HARDIRQ, false);
 	ltl_atom_set(mon, LTL_WOKEN_BY_NMI, false);
@@ -92,9 +92,9 @@ static void handle_sched_set_state(void *data, struct task_struct *task, int sta
 		ltl_atom_pulse(task, LTL_ABORT_SLEEP, true);
 }
 
-static void handle_sched_wakeup(void *data, struct task_struct *task)
+static void handle_sched_exit(void *data, bool is_switch)
 {
-	ltl_atom_pulse(task, LTL_WAKE, true);
+	ltl_atom_pulse(current, LTL_SCHEDULE_IN, true);
 }
 
 static void handle_sched_waking(void *data, struct task_struct *task)
@@ -200,7 +200,7 @@ static int enable_sleep(void)
 		return retval;
 
 	rv_attach_trace_probe("rtapp_sleep", sched_waking, handle_sched_waking);
-	rv_attach_trace_probe("rtapp_sleep", sched_wakeup, handle_sched_wakeup);
+	rv_attach_trace_probe("rtapp_sleep", sched_exit_tp, handle_sched_exit);
 	rv_attach_trace_probe("rtapp_sleep", sched_set_state_tp, handle_sched_set_state);
 	rv_attach_trace_probe("rtapp_sleep", contention_begin, handle_contention_begin);
 	rv_attach_trace_probe("rtapp_sleep", contention_end, handle_contention_end);
@@ -213,7 +213,7 @@ static int enable_sleep(void)
 static void disable_sleep(void)
 {
 	rv_detach_trace_probe("rtapp_sleep", sched_waking, handle_sched_waking);
-	rv_detach_trace_probe("rtapp_sleep", sched_wakeup, handle_sched_wakeup);
+	rv_detach_trace_probe("rtapp_sleep", sched_exit_tp, handle_sched_exit);
 	rv_detach_trace_probe("rtapp_sleep", sched_set_state_tp, handle_sched_set_state);
 	rv_detach_trace_probe("rtapp_sleep", contention_begin, handle_contention_begin);
 	rv_detach_trace_probe("rtapp_sleep", contention_end, handle_contention_end);
diff --git a/kernel/trace/rv/monitors/sleep/sleep.h b/kernel/trace/rv/monitors/sleep/sleep.h
index 95dc2727c059..403dc2852c52 100644
--- a/kernel/trace/rv/monitors/sleep/sleep.h
+++ b/kernel/trace/rv/monitors/sleep/sleep.h
@@ -24,10 +24,10 @@ enum ltl_atom {
 	LTL_NANOSLEEP_CLOCK_TAI,
 	LTL_NANOSLEEP_TIMER_ABSTIME,
 	LTL_RT,
+	LTL_SCHEDULE_IN,
 	LTL_SLEEP,
 	LTL_TASK_IS_MIGRATION,
 	LTL_TASK_IS_RCU,
-	LTL_WAKE,
 	LTL_WOKEN_BY_EQUAL_OR_HIGHER_PRIO,
 	LTL_WOKEN_BY_HARDIRQ,
 	LTL_WOKEN_BY_NMI,
@@ -50,10 +50,10 @@ static const char *ltl_atom_str(enum ltl_atom atom)
 		"na_cl_ta",
 		"na_ti_ab",
 		"rt",
-		"sl",
+		"sch_in",
+		"sle",
 		"ta_mi",
 		"ta_rc",
-		"wak",
 		"wo_eq_hi_pr",
 		"wo_ha",
 		"wo_nm",
@@ -81,10 +81,10 @@ static void ltl_start(struct task_struct *task, struct ltl_monitor *mon)
 	bool woken_by_hardirq = test_bit(LTL_WOKEN_BY_HARDIRQ, mon->atoms);
 	bool woken_by_equal_or_higher_prio = test_bit(LTL_WOKEN_BY_EQUAL_OR_HIGHER_PRIO,
 	     mon->atoms);
-	bool wake = test_bit(LTL_WAKE, mon->atoms);
 	bool task_is_rcu = test_bit(LTL_TASK_IS_RCU, mon->atoms);
 	bool task_is_migration = test_bit(LTL_TASK_IS_MIGRATION, mon->atoms);
 	bool sleep = test_bit(LTL_SLEEP, mon->atoms);
+	bool schedule_in = test_bit(LTL_SCHEDULE_IN, mon->atoms);
 	bool rt = test_bit(LTL_RT, mon->atoms);
 	bool nanosleep_timer_abstime = test_bit(LTL_NANOSLEEP_TIMER_ABSTIME, mon->atoms);
 	bool nanosleep_clock_tai = test_bit(LTL_NANOSLEEP_CLOCK_TAI, mon->atoms);
@@ -104,7 +104,7 @@ static void ltl_start(struct task_struct *task, struct ltl_monitor *mon)
 	bool val35 = woken_by_nmi || val34;
 	bool val36 = woken_by_hardirq || val35;
 	bool val14 = woken_by_equal_or_higher_prio || val36;
-	bool val13 = !wake;
+	bool val13 = !schedule_in;
 	bool val26 = nanosleep_clock_monotonic || nanosleep_clock_tai;
 	bool val27 = nanosleep_timer_abstime && val26;
 	bool val18 = clock_nanosleep && val27;
@@ -132,10 +132,10 @@ ltl_possible_next_states(struct ltl_monitor *mon, unsigned int state, unsigned l
 	bool woken_by_hardirq = test_bit(LTL_WOKEN_BY_HARDIRQ, mon->atoms);
 	bool woken_by_equal_or_higher_prio = test_bit(LTL_WOKEN_BY_EQUAL_OR_HIGHER_PRIO,
 	     mon->atoms);
-	bool wake = test_bit(LTL_WAKE, mon->atoms);
 	bool task_is_rcu = test_bit(LTL_TASK_IS_RCU, mon->atoms);
 	bool task_is_migration = test_bit(LTL_TASK_IS_MIGRATION, mon->atoms);
 	bool sleep = test_bit(LTL_SLEEP, mon->atoms);
+	bool schedule_in = test_bit(LTL_SCHEDULE_IN, mon->atoms);
 	bool rt = test_bit(LTL_RT, mon->atoms);
 	bool nanosleep_timer_abstime = test_bit(LTL_NANOSLEEP_TIMER_ABSTIME, mon->atoms);
 	bool nanosleep_clock_tai = test_bit(LTL_NANOSLEEP_CLOCK_TAI, mon->atoms);
@@ -155,7 +155,7 @@ ltl_possible_next_states(struct ltl_monitor *mon, unsigned int state, unsigned l
 	bool val35 = woken_by_nmi || val34;
 	bool val36 = woken_by_hardirq || val35;
 	bool val14 = woken_by_equal_or_higher_prio || val36;
-	bool val13 = !wake;
+	bool val13 = !schedule_in;
 	bool val26 = nanosleep_clock_monotonic || nanosleep_clock_tai;
 	bool val27 = nanosleep_timer_abstime && val26;
 	bool val18 = clock_nanosleep && val27;
diff --git a/tools/verification/models/rtapp/sleep.ltl b/tools/verification/models/rtapp/sleep.ltl
index 6f26c4810f78..464c84b9df87 100644
--- a/tools/verification/models/rtapp/sleep.ltl
+++ b/tools/verification/models/rtapp/sleep.ltl
@@ -1,7 +1,7 @@
 RULE = always ((RT and SLEEP) imply (RT_FRIENDLY_SLEEP or ALLOWLIST))
 
 RT_FRIENDLY_SLEEP = (RT_VALID_SLEEP_REASON or KERNEL_THREAD)
-                and ((not WAKE) until RT_FRIENDLY_WAKE)
+                and ((not SCHEDULE_IN) until RT_FRIENDLY_WAKE)
 
 RT_VALID_SLEEP_REASON = FUTEX_WAIT
                      or RT_FRIENDLY_NANOSLEEP
-- 
2.47.3


^ permalink raw reply related

* [PATCH v2 0/4] rv: rtapp monitor update
From: Nam Cao @ 2026-06-19  7:21 UTC (permalink / raw)
  To: Gabriele Monaco, Steven Rostedt, linux-trace-kernel, linux-doc,
	linux-kernel
  Cc: Nam Cao

A couple of minor improvements to the rtapp monitor:

  - Making the monitor more informative to user by changing the
    context of the tracepoint into the monitored task itself, not the
    IPI wakeup path.

  - and update the allow list regarding clock_nanosleep syscall.

  - Stop monitoring the kernel threads to simplify the monitors.

  - Add a new rtapp/wakeup monitor to give complement the rtapp/sleep
    monitor.

v2..v1 https://lore.kernel.org/lkml/cover.1779176466.git.namcao@linutronix.de/
  - Use clearer waker/wakee terminologies
  - Fix build issue
  - Add new patch "rv/rtapp/sleep: Stop monitoring kernel threads"
  - Require RV_PER_TASK_MONITORS >= 3

Nam Cao (4):
  rv/rtapp/sleep: Make the error more informative for user
  rv/rtapp/sleep: Update nanosleep rule
  rv/rtapp/sleep: Stop monitoring kernel threads
  rv/rtapp: Add wakeup monitor

 Documentation/trace/rv/monitor_rtapp.rst      |  61 ++++---
 kernel/trace/rv/Kconfig                       |   1 +
 kernel/trace/rv/Makefile                      |   1 +
 kernel/trace/rv/monitors/rtapp/Kconfig        |   2 +-
 kernel/trace/rv/monitors/sleep/Kconfig        |   1 -
 kernel/trace/rv/monitors/sleep/sleep.c        |  59 ++-----
 kernel/trace/rv/monitors/sleep/sleep.h        | 142 +++++++---------
 kernel/trace/rv/monitors/wakeup/Kconfig       |  16 ++
 kernel/trace/rv/monitors/wakeup/wakeup.c      | 153 ++++++++++++++++++
 kernel/trace/rv/monitors/wakeup/wakeup.h      |  92 +++++++++++
 .../trace/rv/monitors/wakeup/wakeup_trace.h   |  14 ++
 kernel/trace/rv/rv_trace.h                    |   1 +
 tools/verification/models/rtapp/sleep.ltl     |  11 +-
 tools/verification/models/rtapp/wakeup.ltl    |   5 +
 14 files changed, 396 insertions(+), 163 deletions(-)
 create mode 100644 kernel/trace/rv/monitors/wakeup/Kconfig
 create mode 100644 kernel/trace/rv/monitors/wakeup/wakeup.c
 create mode 100644 kernel/trace/rv/monitors/wakeup/wakeup.h
 create mode 100644 kernel/trace/rv/monitors/wakeup/wakeup_trace.h
 create mode 100644 tools/verification/models/rtapp/wakeup.ltl

-- 
2.47.3


^ permalink raw reply

* [PATCH v4 13/13] verification/rvgen: Remove dead code
From: Nam Cao @ 2026-06-19  5:52 UTC (permalink / raw)
  To: Steven Rostedt, Gabriele Monaco, Wander Lairson Costa,
	linux-trace-kernel, linux-kernel
  Cc: Nam Cao
In-Reply-To: <cover.1781847583.git.namcao@linutronix.de>

The conversion to use Lark left some dead code behind. Remove them.

Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Nam Cao <namcao@linutronix.de>
---
 tools/verification/rvgen/rvgen/automata.py | 157 ---------------------
 tools/verification/rvgen/rvgen/dot2k.py    |  29 +---
 2 files changed, 1 insertion(+), 185 deletions(-)

diff --git a/tools/verification/rvgen/rvgen/automata.py b/tools/verification/rvgen/rvgen/automata.py
index 2cb7443ea00f..fd37ce304276 100644
--- a/tools/verification/rvgen/rvgen/automata.py
+++ b/tools/verification/rvgen/rvgen/automata.py
@@ -9,9 +9,6 @@
 #   Documentation/trace/rv/deterministic_automata.rst
 
 import ntpath
-import re
-from typing import Iterator
-from itertools import islice
 
 import lark
 
@@ -356,19 +353,6 @@ class State:
         self.name = name
         self.inv = inv
 
-class _ConstraintKey:
-    """Base class for constraint keys."""
-
-class _StateConstraintKey(_ConstraintKey, int):
-    """Key for a state constraint. Under the hood just state_id."""
-    def __new__(cls, state_id: int):
-        return super().__new__(cls, state_id)
-
-class _EventConstraintKey(_ConstraintKey, tuple):
-    """Key for an event constraint. Under the hood just tuple(state_id,event_id)."""
-    def __new__(cls, state_id: int, event_id: int):
-        return super().__new__(cls, (state_id, event_id))
-
 class AutomataError(Exception):
     """Exception raised for errors in automata parsing and validation.
 
@@ -387,28 +371,10 @@ class Automata:
 
     invalid_state_str = "INVALID_STATE"
     init_marker = "__init_"
-    node_marker = "{node"
-    # val can be numerical, uppercase (constant or macro), lowercase (parameter or function)
-    # only numerical values should have units
-    constraint_rule = re.compile(r"""
-        ^
-        (?P<env>[a-zA-Z_][a-zA-Z0-9_]+)  # C-like identifier for the env var
-        (?P<op>[!<=>]{1,2})              # operator
-        (?P<val>
-            [0-9]+ |                     # numerical value
-            [A-Z_]+\(\) |                # macro
-            [A-Z_]+ |                    # constant
-            [a-z_]+\(\) |                # function
-            [a-z_]+                      # parameter
-        )
-        (?P<unit>[a-z]{1,2})?            # optional unit for numerical values
-        """, re.VERBOSE)
-    constraint_reset = re.compile(r"^reset\((?P<env>[a-zA-Z_][a-zA-Z0-9_]+)\)")
 
     def __init__(self, file_path, model_name=None):
         self.__dot_path = file_path
         self.name = model_name or self.__get_model_name()
-        self.__dot_lines = self.__open_dot()
         self.__parse_tree = ParseTree(file_path)
         self.transitions = self.__parse_transitions()
         self.states, self.initial_state, self.final_states = self.__parse_states()
@@ -435,57 +401,6 @@ class Automata:
 
         return model_name
 
-    def __open_dot(self) -> list[str]:
-        dot_lines = []
-        try:
-            with open(self.__dot_path) as dot_file:
-                dot_lines = dot_file.readlines()
-        except OSError as exc:
-            raise AutomataError(exc.strerror) from exc
-
-        if not dot_lines:
-            raise AutomataError(f"{self.__dot_path} is empty")
-
-        # checking the first line:
-        line = dot_lines[0].split()
-
-        if len(line) < 2 or line[0] != "digraph" or line[1] != "state_automaton":
-            raise AutomataError(f"Not a valid .dot format: {self.__dot_path}")
-
-        return dot_lines
-
-    def __get_cursor_begin_states(self) -> int:
-        for cursor, line in enumerate(self.__dot_lines):
-            split_line = line.split()
-
-            if len(split_line) and split_line[0] == self.node_marker:
-                return cursor
-
-        raise AutomataError("Could not find a beginning state")
-
-    def __get_cursor_begin_events(self) -> int:
-        state = 0
-        cursor = 0 # make pyright happy
-
-        for cursor, line in enumerate(self.__dot_lines):
-            line = line.split()
-            if not line:
-                continue
-
-            if state == 0:
-                if line[0] == self.node_marker:
-                    state = 1
-            elif line[0] != self.node_marker:
-                break
-        else:
-            raise AutomataError("Could not find beginning event")
-
-        cursor += 1 # skip initial state transition
-        if cursor == len(self.__dot_lines):
-            raise AutomataError("Dot file ended after event beginning")
-
-        return cursor
-
     def __parse_transitions(self):
         transitions = []
 
@@ -542,51 +457,6 @@ class Automata:
         states.insert(0, initial_state)
         return states, initial_state, final_states
 
-    def __get_state_variables(self) -> tuple[list[str], str, list[str]]:
-        # wait for node declaration
-        states = []
-        final_states = []
-        initial_state = ""
-
-        has_final_states = False
-        cursor = self.__get_cursor_begin_states()
-
-        # process nodes
-        for line in islice(self.__dot_lines, cursor, None):
-            split_line = line.split()
-            if not split_line or split_line[0] != self.node_marker:
-                break
-
-            raw_state = split_line[-1]
-
-            #  "enabled_fired"}; -> enabled_fired
-            state = raw_state.replace('"', '').replace('};', '').replace(',', '_')
-            if state.startswith(self.init_marker):
-                initial_state = state[len(self.init_marker):]
-            else:
-                states.append(state)
-                if "doublecircle" in line:
-                    final_states.append(state)
-                    has_final_states = True
-
-                if "ellipse" in line:
-                    final_states.append(state)
-                    has_final_states = True
-
-        if not initial_state:
-            raise AutomataError("The automaton doesn't have an initial state")
-
-        states = sorted(set(states))
-        states.remove(initial_state)
-
-        # Insert the initial state at the beginning of the states
-        states.insert(0, initial_state)
-
-        if not has_final_states:
-            final_states.append(initial_state)
-
-        return states, initial_state, final_states
-
     def __get_event_variables(self) -> tuple[list[str], list[str]]:
         events: list[str] = []
         envs: list[str] = []
@@ -609,26 +479,6 @@ class Automata:
 
         return sorted(set(events)), sorted(set(envs))
 
-    def _split_constraint_expr(self, constr: list[str]) -> Iterator[tuple[str,
-                                                                          str | None]]:
-        """
-        Get a list of strings of the type constr1 && constr2 and returns a list of
-        constraints and separators: [[constr1,"&&"],[constr2,None]]
-        """
-        exprs = []
-        seps = []
-        for c in constr:
-            while "&&" in c or "||" in c:
-                a = c.find("&&")
-                o = c.find("||")
-                pos = a if o < 0 or 0 < a < o else o
-                exprs.append(c[:pos].replace(" ", ""))
-                seps.append(c[pos:pos + 2].replace(" ", ""))
-                c = c[pos + 2:].replace(" ", "")
-            exprs.append(c)
-            seps.append(None)
-        return zip(exprs, seps)
-
     def __extract_env_var(self, constraint: ConstraintCondition):
         if constraint.unit:
             self.env_types[constraint.env] = constraint.unit
@@ -697,10 +547,3 @@ class Automata:
 
     def is_hybrid_automata(self) -> bool:
         return bool(self.envs)
-
-    def is_event_constraint(self, key: _ConstraintKey) -> bool:
-        """
-        Given the key in self.constraints return true if it is an event
-        constraint, false if it is a state constraint
-        """
-        return isinstance(key, _EventConstraintKey)
diff --git a/tools/verification/rvgen/rvgen/dot2k.py b/tools/verification/rvgen/rvgen/dot2k.py
index b93d0ccc9bdf..4d39f229c970 100644
--- a/tools/verification/rvgen/rvgen/dot2k.py
+++ b/tools/verification/rvgen/rvgen/dot2k.py
@@ -8,12 +8,9 @@
 # For further information, see:
 #   Documentation/trace/rv/da_monitor_synthesis.rst
 
-from collections import deque
 from .dot2c import Dot2c
 from .generator import Monitor
-from .automata import _EventConstraintKey, _StateConstraintKey, AutomataError
-from .automata import ConstraintCondition
-
+from .automata import ConstraintCondition, AutomataError
 
 class dot2k(Monitor, Dot2c):
     template_dir = "dot2k"
@@ -217,9 +214,6 @@ class ha2k(dot2k):
                 value *= 10**9
         return str(value) + "ull"
 
-    def __parse_single_constraint(self, rule: dict, value: str) -> str:
-        return f"ha_get_env(ha_mon, {rule["env"]}{self.enum_suffix}, time_ns) {rule["op"]} {value}"
-
     def __parse_guard_rule(self, rule) -> list[str]:
         buff = []
         for c, sep in rule.rules:
@@ -233,12 +227,6 @@ class ha2k(dot2k):
             buff.append(cond)
         return buff
 
-    def __get_constraint_env(self, constr: str) -> str:
-        """Extract the second argument from an ha_ function"""
-        env = constr.split("(")[1].split()[1].rstrip(")").rstrip(",")
-        assert env.removesuffix(f"_{self.name}") in self.envs
-        return env
-
     def __start_to_invariant_check(self, inv: ConstraintCondition) -> str:
         # by default assume the timer has ns expiration
         clock_type = "ns"
@@ -249,21 +237,6 @@ class ha2k(dot2k):
 
         return f"return ha_check_invariant_{clock_type}(ha_mon, {inv.env}_{self.name}, time_ns, {value})"
 
-    def __start_to_conv(self, constr: str) -> str:
-        """
-        Undo the storage conversion done by ha_start_timer_
-        """
-        return "ha_inv_to_guard" + constr[constr.find("("):]
-
-    def __parse_timer_constraint(self, rule: dict, value: str) -> str:
-        # by default assume the timer has ns expiration
-        clock_type = "ns"
-        if self.env_types.get(rule["env"]) == "j":
-            clock_type = "jiffy"
-
-        return (f"ha_start_timer_{clock_type}(ha_mon, {rule["env"]}{self.enum_suffix},"
-                f" {value}, time_ns)")
-
     def __parse_invariant(self, inv):
         # by default assume the timer has ns expiration
         clock_type = "ns"
-- 
2.47.3


^ permalink raw reply related

* [PATCH v4 11/13] verification/rvgen: Switch __create_matrix() to Lark
From: Nam Cao @ 2026-06-19  5:52 UTC (permalink / raw)
  To: Steven Rostedt, Gabriele Monaco, Wander Lairson Costa,
	linux-trace-kernel, linux-kernel
  Cc: Nam Cao
In-Reply-To: <cover.1781847583.git.namcao@linutronix.de>

Switch __create_matrix() to use the transitions parsed by Lark to avoid all
the raw text parsing.

Also stop parsing constraints in __create_matrix(), that is not used
anymore.

Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Nam Cao <namcao@linutronix.de>
---
 tools/verification/rvgen/rvgen/automata.py | 47 ++++++----------------
 tools/verification/rvgen/rvgen/dot2k.py    |  2 +-
 2 files changed, 13 insertions(+), 36 deletions(-)

diff --git a/tools/verification/rvgen/rvgen/automata.py b/tools/verification/rvgen/rvgen/automata.py
index c34e916516ba..1726e82546a7 100644
--- a/tools/verification/rvgen/rvgen/automata.py
+++ b/tools/verification/rvgen/rvgen/automata.py
@@ -418,7 +418,7 @@ class Automata:
         self.constraint_vars = set()
         self.self_loop_reset_events = set()
         self.events, self.envs = self.__get_event_variables()
-        self.function, self.constraints = self.__create_matrix()
+        self.function = self.__create_matrix()
         self.events_start, self.events_start_run = self.__store_init_events()
         self.env_stored = sorted(self.env_stored)
         self.constraint_vars = sorted(self.constraint_vars)
@@ -636,10 +636,10 @@ class Automata:
         if constraint.val[0].isalpha():
             self.constraint_vars.add(constraint.val)
 
-    def __create_matrix(self) -> tuple[list[list[str]], dict[_ConstraintKey, list[str]]]:
+    def __create_matrix(self) -> list[list[str]]:
         # transform the array into a dictionary
         events = self.events
-        states = self.states
+        states = [s.name for s in self._states]
         events_dict = {}
         states_dict = {}
         nr_event = 0
@@ -654,39 +654,16 @@ class Automata:
 
         # declare the matrix....
         matrix = [[self.invalid_state_str for _ in range(nr_event)] for _ in range(nr_state)]
-        constraints: dict[_ConstraintKey, list[str]] = {}
 
-        # and we are back! Let's fill the matrix
-        cursor = self.__get_cursor_begin_events()
-
-        for line in map(str.lstrip,
-                        islice(self.__dot_lines, cursor, None)):
-
-            if not line or line[0] != '"':
-                break
-
-            split_line = line.split()
-
-            if len(split_line) > 2 and split_line[1] == "->":
-                origin_state = split_line[0].replace('"', '').replace(',', '_')
-                dest_state = split_line[2].replace('"', '').replace(',', '_')
-                possible_events = "".join(split_line[split_line.index("label") + 2:-1]).replace('"', '')
-                for event in possible_events.split("\\n"):
-                    event, *constr = event.split(";")
-                    if constr:
-                        key = _EventConstraintKey(states_dict[origin_state], events_dict[event])
-                        constraints[key] = constr
-                        # those events reset also on self loops
-                        if origin_state == dest_state and "reset" in "".join(constr):
-                            self.self_loop_reset_events.add(event)
-                    matrix[states_dict[origin_state]][events_dict[event]] = dest_state
-            else:
-                state = line.split("label")[1].split('"')[1]
-                state, *constr = state.replace(" ", "").split("\\n")
-                if constr:
-                    constraints[_StateConstraintKey(states_dict[state])] = constr
-
-        return matrix, constraints
+        for transition in self.transitions:
+            src, dst = transition.src, transition.dst
+            event = transition.event
+            if src == dst and transition.reset:
+                # those events reset also on self loops
+                self.self_loop_reset_events.add(event)
+            matrix[states_dict[src]][events_dict[event]] = dst
+
+        return matrix
 
     def __store_init_events(self) -> tuple[list[bool], list[bool]]:
         events_start = [False] * len(self.events)
diff --git a/tools/verification/rvgen/rvgen/dot2k.py b/tools/verification/rvgen/rvgen/dot2k.py
index f1f5fa297adb..03141e9ef45d 100644
--- a/tools/verification/rvgen/rvgen/dot2k.py
+++ b/tools/verification/rvgen/rvgen/dot2k.py
@@ -407,7 +407,7 @@ f"""static inline void ha_setup_invariants(struct ha_monitor *ha_mon,
 
     def __fill_constr_func(self) -> list[str]:
         buff = []
-        if not self.constraints:
+        if not self.has_invariant and not self.has_guard:
             return []
 
         buff.append(
-- 
2.47.3


^ permalink raw reply related

* [PATCH v4 12/13] verification/rvgen: Remove the old state variables
From: Nam Cao @ 2026-06-19  5:52 UTC (permalink / raw)
  To: Steven Rostedt, Gabriele Monaco, Wander Lairson Costa,
	linux-trace-kernel, linux-kernel
  Cc: Nam Cao
In-Reply-To: <cover.1781847583.git.namcao@linutronix.de>

The state variables (states, initial_state, final_states) only capture the
states' names and have less information than their Lark-based counterparts.

Switch to use the new state variables and delete these old ones.

Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Nam Cao <namcao@linutronix.de>
---
 tools/verification/rvgen/rvgen/automata.py |  9 ++++-----
 tools/verification/rvgen/rvgen/dot2c.py    | 10 +++++-----
 tools/verification/rvgen/rvgen/dot2k.py    |  8 ++++----
 3 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/tools/verification/rvgen/rvgen/automata.py b/tools/verification/rvgen/rvgen/automata.py
index 1726e82546a7..2cb7443ea00f 100644
--- a/tools/verification/rvgen/rvgen/automata.py
+++ b/tools/verification/rvgen/rvgen/automata.py
@@ -411,8 +411,7 @@ class Automata:
         self.__dot_lines = self.__open_dot()
         self.__parse_tree = ParseTree(file_path)
         self.transitions = self.__parse_transitions()
-        self._states, self._initial_state, self._final_states = self.__parse_states()
-        self.states, self.initial_state, self.final_states = self.__get_state_variables()
+        self.states, self.initial_state, self.final_states = self.__parse_states()
         self.env_types = {}
         self.env_stored = set()
         self.constraint_vars = set()
@@ -603,7 +602,7 @@ class Automata:
                     envs.append(c.env)
                     self.__extract_env_var(c)
 
-        for state in self._states:
+        for state in self.states:
             if state.inv:
                 envs.append(state.inv.env)
                 self.__extract_env_var(state.inv)
@@ -639,7 +638,7 @@ class Automata:
     def __create_matrix(self) -> list[list[str]]:
         # transform the array into a dictionary
         events = self.events
-        states = [s.name for s in self._states]
+        states = [s.name for s in self.states]
         events_dict = {}
         states_dict = {}
         nr_event = 0
@@ -675,7 +674,7 @@ class Automata:
             for j in range(len(self.states)):
                 if self.function[j][i] != self.invalid_state_str:
                     curr_event_used += 1
-                if self.function[j][i] == self.initial_state:
+                if self.function[j][i] == self.initial_state.name:
                     curr_event_will_init += 1
             if self.function[0][i] != self.invalid_state_str:
                 curr_event_from_init = True
diff --git a/tools/verification/rvgen/rvgen/dot2c.py b/tools/verification/rvgen/rvgen/dot2c.py
index fc85ba1f649e..22938ce1bf6c 100644
--- a/tools/verification/rvgen/rvgen/dot2c.py
+++ b/tools/verification/rvgen/rvgen/dot2c.py
@@ -29,10 +29,10 @@ class Dot2c(Automata):
 
     def __get_enum_states_content(self) -> list[str]:
         buff = []
-        buff.append(f"\t{self.initial_state}{self.enum_suffix},")
+        buff.append(f"\t{self.initial_state.name}{self.enum_suffix},")
         for state in self.states:
             if state != self.initial_state:
-                buff.append(f"\t{state}{self.enum_suffix},")
+                buff.append(f"\t{state.name}{self.enum_suffix},")
         buff.append(f"\tstate_max{self.enum_suffix},")
 
         return buff
@@ -142,7 +142,7 @@ class Dot2c(Automata):
     def format_aut_init_states_string(self) -> list[str]:
         buff = []
         buff.append("\t.state_names = {")
-        buff.append(self.__get_string_vector_per_line_content(self.states))
+        buff.append(self.__get_string_vector_per_line_content([s.name for s in self.states]))
         buff.append("\t},")
 
         return buff
@@ -159,7 +159,7 @@ class Dot2c(Automata):
         return buff
 
     def __get_max_strlen_of_states(self) -> int:
-        max_state_name = len(max(self.states, key=len))
+        max_state_name = max((len(s.name) for s in self.states))
         return max(max_state_name, len(self.invalid_state_str))
 
     def get_aut_init_function(self) -> str:
@@ -199,7 +199,7 @@ class Dot2c(Automata):
         return buff
 
     def get_aut_init_initial_state(self) -> str:
-        return self.initial_state
+        return self.initial_state.name
 
     def format_aut_init_initial_state(self) -> list[str]:
         buff = []
diff --git a/tools/verification/rvgen/rvgen/dot2k.py b/tools/verification/rvgen/rvgen/dot2k.py
index 03141e9ef45d..b93d0ccc9bdf 100644
--- a/tools/verification/rvgen/rvgen/dot2k.py
+++ b/tools/verification/rvgen/rvgen/dot2k.py
@@ -179,7 +179,7 @@ class ha2k(dot2k):
         self.trace_h = self._read_template_file("trace_hybrid.h")
         self.has_invariant = False
         self.has_guard = False
-        for state in self._states:
+        for state in self.states:
             if state.inv:
                 self.has_invariant = True
         for transition in self.transitions:
@@ -318,7 +318,7 @@ f"""static inline bool ha_verify_invariants(struct ha_monitor *ha_mon,
 {{"""]
 
         _else = ""
-        for state in self._states:
+        for state in self.states:
             if not state.inv:
                 continue
 
@@ -386,7 +386,7 @@ f"""static inline void ha_setup_invariants(struct ha_monitor *ha_mon,
         buff.append(f"\tif ({condition_str})\n\t\treturn;")
 
         _else = ""
-        for state in self._states:
+        for state in self.states:
             inv = state.inv
             if not inv:
                 continue
@@ -395,7 +395,7 @@ f"""static inline void ha_setup_invariants(struct ha_monitor *ha_mon,
             buff.append(f"\t\t{inv};")
             _else = "else "
 
-        for state in self._states:
+        for state in self.states:
             inv = state.inv
             if not inv:
                 continue
-- 
2.47.3


^ permalink raw reply related

* [PATCH v4 10/13] verification/rvgen: Switch __get_event_variables() to Lark
From: Nam Cao @ 2026-06-19  5:52 UTC (permalink / raw)
  To: Steven Rostedt, Gabriele Monaco, Wander Lairson Costa,
	linux-trace-kernel, linux-kernel
  Cc: Nam Cao
In-Reply-To: <cover.1781847583.git.namcao@linutronix.de>

Switch __get_event_variables() to use the parsed results from Lark, instead
of raw text processing.

Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Nam Cao <namcao@linutronix.de>
---
 tools/verification/rvgen/rvgen/automata.py | 78 ++++++----------------
 1 file changed, 19 insertions(+), 59 deletions(-)

diff --git a/tools/verification/rvgen/rvgen/automata.py b/tools/verification/rvgen/rvgen/automata.py
index ea7eabd4c173..c34e916516ba 100644
--- a/tools/verification/rvgen/rvgen/automata.py
+++ b/tools/verification/rvgen/rvgen/automata.py
@@ -591,45 +591,22 @@ class Automata:
     def __get_event_variables(self) -> tuple[list[str], list[str]]:
         events: list[str] = []
         envs: list[str] = []
-        # here we are at the begin of transitions, take a note, we will return later.
-        cursor = self.__get_cursor_begin_events()
 
-        for line in map(str.lstrip, islice(self.__dot_lines, cursor, None)):
-            if not line.startswith('"'):
-                break
+        for transition in self.transitions:
+            events.append(transition.event)
 
-            # transitions have the format:
-            # "all_fired" -> "both_fired" [ label = "disable_irq" ];
-            #  ------------ event is here ------------^^^^^
-            split_line = line.split()
-            if len(split_line) > 1 and split_line[1] == "->":
-                event = "".join(split_line[split_line.index("label") + 2:-1]).replace('"', '')
-
-                # when a transition has more than one label, they are like this
-                # "local_irq_enable\nhw_local_irq_enable_n"
-                # so split them.
-
-                for i in event.split("\\n"):
-                    # if the event contains a constraint (hybrid automata),
-                    # it will be separated by a ";":
-                    # "sched_switch;x<1000;reset(x)"
-                    ev, *constr = i.split(";")
-                    if constr:
-                        if len(constr) > 2:
-                            raise AutomataError("Only 1 constraint and 1 reset are supported")
-                        envs += self.__extract_env_var(constr)
-                    events.append(ev)
-            else:
-                # state labels have the format:
-                # "enable_fired" [label = "enable_fired\ncondition"];
-                #  ----- label is here -----^^^^^
-                # label and node name must be the same, condition is optional
-                state = line.split("label")[1].split('"')[1]
-                _, *constr = state.split("\\n")
-                if constr:
-                    if len(constr) > 1:
-                        raise AutomataError("Only 1 constraint is supported in the state")
-                    envs += self.__extract_env_var([constr[0].replace(" ", "")])
+            if transition.reset:
+                envs.append(transition.reset.env)
+                self.env_stored.add(transition.reset.env)
+            if transition.rule:
+                for c, _ in transition.rule.rules:
+                    envs.append(c.env)
+                    self.__extract_env_var(c)
+
+        for state in self._states:
+            if state.inv:
+                envs.append(state.inv.env)
+                self.__extract_env_var(state.inv)
 
         return sorted(set(events)), sorted(set(envs))
 
@@ -653,28 +630,11 @@ class Automata:
             seps.append(None)
         return zip(exprs, seps)
 
-    def __extract_env_var(self, constraint: list[str]) -> list[str]:
-        env = []
-        for c, _ in self._split_constraint_expr(constraint):
-            rule = self.constraint_rule.search(c)
-            reset = self.constraint_reset.search(c)
-            if rule:
-                env.append(rule["env"])
-                if rule.groupdict().get("unit"):
-                    self.env_types[rule["env"]] = rule["unit"]
-                if rule["val"][0].isalpha():
-                    self.constraint_vars.add(rule["val"])
-                # try to infer unit from constants or parameters
-                val_for_unit = rule["val"].lower().replace("()", "")
-                if val_for_unit.endswith("_ns"):
-                    self.env_types[rule["env"]] = "ns"
-                if val_for_unit.endswith("_jiffies"):
-                    self.env_types[rule["env"]] = "j"
-            if reset:
-                env.append(reset["env"])
-                # environment variables that are reset need a storage
-                self.env_stored.add(reset["env"])
-        return env
+    def __extract_env_var(self, constraint: ConstraintCondition):
+        if constraint.unit:
+            self.env_types[constraint.env] = constraint.unit
+        if constraint.val[0].isalpha():
+            self.constraint_vars.add(constraint.val)
 
     def __create_matrix(self) -> tuple[list[list[str]], dict[_ConstraintKey, list[str]]]:
         # transform the array into a dictionary
-- 
2.47.3


^ permalink raw reply related

* [PATCH v4 07/13] rv: Simplify hybrid automata monitors's clock variables
From: Nam Cao @ 2026-06-19  5:52 UTC (permalink / raw)
  To: Steven Rostedt, Gabriele Monaco, Wander Lairson Costa,
	linux-trace-kernel, linux-kernel
  Cc: Nam Cao
In-Reply-To: <cover.1781847583.git.namcao@linutronix.de>

Hybrid automata monitors's clock variables have two different
representations:

  - The invariant representation, which is the timestamp when the invariant
    expires

  - The guard representation, which is the timestamp when the clock is last
    reset

This dual representation makes the logic quite difficult to follow (well,
at least for me). It also complicates the monitors and the generation tool,
as it requires conversion back and forth between the representation.

Simplify by using the clock variables for a single purpose: storing the
time stamp since the clock is last reset.

This also allows simplifying rvgen, which will be done in a follow-up
commit.

Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Nam Cao <namcao@linutronix.de>
---
 include/rv/ha_monitor.h                  | 64 ++++++------------------
 kernel/trace/rv/monitors/nomiss/nomiss.c | 18 +------
 kernel/trace/rv/monitors/stall/stall.c   |  2 +-
 3 files changed, 19 insertions(+), 65 deletions(-)

diff --git a/include/rv/ha_monitor.h b/include/rv/ha_monitor.h
index 28d3c74cabfc..9144b4c06f3f 100644
--- a/include/rv/ha_monitor.h
+++ b/include/rv/ha_monitor.h
@@ -327,19 +327,8 @@ static inline void __ha_monitor_timer_callback(struct ha_monitor *ha_mon)
 }
 
 /*
- * The clock variables have 2 different representations in the env_store:
- * - The guard representation is the timestamp of the last reset
- * - The invariant representation is the timestamp when the invariant expires
- * As the representations are incompatible, care must be taken when switching
- * between them: the invariant representation can only be used when starting a
- * timer when the previous representation was guard (e.g. no other invariant
- * started since the last reset operation).
- * Likewise, switching from invariant to guard representation without a reset
- * can be done only by subtracting the exact value used to start the invariant.
- *
- * Reading the environment variable (ha_get_clk) also reflects this difference
- * any reads in states that have an invariant return the (possibly negative)
- * time since expiration, other reads return the time since last reset.
+ * The clock variables store the time epoch - the timestamp when the clock was last reset.
+ * They are read by subtracting the time epoch from the current time.
  */
 
 /*
@@ -353,31 +342,21 @@ static inline void ha_reset_clk_ns(struct ha_monitor *ha_mon, enum envs env, u64
 {
 	WRITE_ONCE(ha_mon->env_store[env], time_ns);
 }
-static inline void ha_set_invariant_ns(struct ha_monitor *ha_mon, enum envs env,
-				       u64 value, u64 time_ns)
-{
-	WRITE_ONCE(ha_mon->env_store[env], time_ns + value);
-}
-static inline bool ha_check_invariant_ns(struct ha_monitor *ha_mon,
-					 enum envs env, u64 time_ns)
+static inline bool ha_check_invariant_ns(struct ha_monitor *ha_mon, enum envs env,
+					 u64 time_ns, u64 expire_ns)
 {
-	return READ_ONCE(ha_mon->env_store[env]) >= time_ns;
+	return READ_ONCE(ha_mon->env_store[env]) >= time_ns - expire_ns;
 }
 /*
  * ha_invariant_passed_ns - prepare the invariant and return the time since reset
  */
-static inline u64 ha_invariant_passed_ns(struct ha_monitor *ha_mon, enum envs env,
-				   u64 expire, u64 time_ns)
+static inline u64 ha_invariant_passed_ns(struct ha_monitor *ha_mon, enum envs env, u64 time_ns)
 {
-	u64 passed = 0;
-
 	if (env < 0 || env >= ENV_MAX_STORED)
 		return 0;
 	if (ha_monitor_env_invalid(ha_mon, env))
 		return 0;
-	passed = ha_get_env(ha_mon, env, time_ns);
-	ha_set_invariant_ns(ha_mon, env, expire - passed, time_ns);
-	return passed;
+	return ha_get_env(ha_mon, env, time_ns);
 }
 
 /*
@@ -391,32 +370,21 @@ static inline void ha_reset_clk_jiffy(struct ha_monitor *ha_mon, enum envs env)
 {
 	WRITE_ONCE(ha_mon->env_store[env], get_jiffies_64());
 }
-static inline void ha_set_invariant_jiffy(struct ha_monitor *ha_mon,
-					  enum envs env, u64 value)
+static inline bool ha_check_invariant_jiffy(struct ha_monitor *ha_mon, enum envs env,
+					    u64 time_ns, u64 expire_jiffy)
 {
-	WRITE_ONCE(ha_mon->env_store[env], get_jiffies_64() + value);
-}
-static inline bool ha_check_invariant_jiffy(struct ha_monitor *ha_mon,
-					    enum envs env, u64 time_ns)
-{
-	return time_after64(READ_ONCE(ha_mon->env_store[env]), get_jiffies_64());
-
+	return time_after64(READ_ONCE(ha_mon->env_store[env]), get_jiffies_64() - expire_jiffy);
 }
 /*
  * ha_invariant_passed_jiffy - prepare the invariant and return the time since reset
  */
-static inline u64 ha_invariant_passed_jiffy(struct ha_monitor *ha_mon, enum envs env,
-				      u64 expire, u64 time_ns)
+static inline u64 ha_invariant_passed_jiffy(struct ha_monitor *ha_mon, enum envs env, u64 time_ns)
 {
-	u64 passed = 0;
-
 	if (env < 0 || env >= ENV_MAX_STORED)
 		return 0;
 	if (ha_monitor_env_invalid(ha_mon, env))
 		return 0;
-	passed = ha_get_env(ha_mon, env, time_ns);
-	ha_set_invariant_jiffy(ha_mon, env, expire - passed);
-	return passed;
+	return ha_get_env(ha_mon, env, time_ns);
 }
 
 /*
@@ -463,14 +431,14 @@ static inline void ha_setup_timer(struct ha_monitor *ha_mon)
 static inline void ha_start_timer_jiffy(struct ha_monitor *ha_mon, enum envs env,
 					u64 expire, u64 time_ns)
 {
-	u64 passed = ha_invariant_passed_jiffy(ha_mon, env, expire, time_ns);
+	u64 passed = ha_invariant_passed_jiffy(ha_mon, env, time_ns);
 
 	mod_timer(&ha_mon->timer, get_jiffies_64() + expire - passed);
 }
 static inline void ha_start_timer_ns(struct ha_monitor *ha_mon, enum envs env,
 				     u64 expire, u64 time_ns)
 {
-	u64 passed = ha_invariant_passed_ns(ha_mon, env, expire, time_ns);
+	u64 passed = ha_invariant_passed_ns(ha_mon, env, time_ns);
 
 	ha_start_timer_jiffy(ha_mon, ENV_MAX_STORED,
 			     nsecs_to_jiffies(expire - passed + TICK_NSEC - 1), time_ns);
@@ -516,7 +484,7 @@ static inline void ha_start_timer_ns(struct ha_monitor *ha_mon, enum envs env,
 				     u64 expire, u64 time_ns)
 {
 	int mode = HRTIMER_MODE_REL_HARD;
-	u64 passed = ha_invariant_passed_ns(ha_mon, env, expire, time_ns);
+	u64 passed = ha_invariant_passed_ns(ha_mon, env, time_ns);
 
 	if (RV_MON_TYPE == RV_MON_PER_CPU)
 		mode |= HRTIMER_MODE_PINNED;
@@ -525,7 +493,7 @@ static inline void ha_start_timer_ns(struct ha_monitor *ha_mon, enum envs env,
 static inline void ha_start_timer_jiffy(struct ha_monitor *ha_mon, enum envs env,
 					u64 expire, u64 time_ns)
 {
-	u64 passed = ha_invariant_passed_jiffy(ha_mon, env, expire, time_ns);
+	u64 passed = ha_invariant_passed_jiffy(ha_mon, env, time_ns);
 
 	ha_start_timer_ns(ha_mon, ENV_MAX_STORED,
 			  jiffies_to_nsecs(expire - passed), time_ns);
diff --git a/kernel/trace/rv/monitors/nomiss/nomiss.c b/kernel/trace/rv/monitors/nomiss/nomiss.c
index 8ead8783c29f..515ece5ce0ca 100644
--- a/kernel/trace/rv/monitors/nomiss/nomiss.c
+++ b/kernel/trace/rv/monitors/nomiss/nomiss.c
@@ -57,24 +57,12 @@ static inline bool ha_verify_invariants(struct ha_monitor *ha_mon,
 					enum states next_state, u64 time_ns)
 {
 	if (curr_state == ready_nomiss)
-		return ha_check_invariant_ns(ha_mon, clk_nomiss, time_ns);
+		return ha_check_invariant_ns(ha_mon, clk_nomiss, time_ns, DEADLINE_NS(ha_mon));
 	else if (curr_state == running_nomiss)
-		return ha_check_invariant_ns(ha_mon, clk_nomiss, time_ns);
+		return ha_check_invariant_ns(ha_mon, clk_nomiss, time_ns, DEADLINE_NS(ha_mon));
 	return true;
 }
 
-static inline void ha_convert_inv_guard(struct ha_monitor *ha_mon,
-					enum states curr_state, enum events event,
-					enum states next_state, u64 time_ns)
-{
-	if (curr_state == next_state)
-		return;
-	if (curr_state == ready_nomiss)
-		ha_inv_to_guard(ha_mon, clk_nomiss, DEADLINE_NS(ha_mon), time_ns);
-	else if (curr_state == running_nomiss)
-		ha_inv_to_guard(ha_mon, clk_nomiss, DEADLINE_NS(ha_mon), time_ns);
-}
-
 static inline bool ha_verify_guards(struct ha_monitor *ha_mon,
 				    enum states curr_state, enum events event,
 				    enum states next_state, u64 time_ns)
@@ -122,8 +110,6 @@ static bool ha_verify_constraint(struct ha_monitor *ha_mon,
 	if (!ha_verify_invariants(ha_mon, curr_state, event, next_state, time_ns))
 		return false;
 
-	ha_convert_inv_guard(ha_mon, curr_state, event, next_state, time_ns);
-
 	if (!ha_verify_guards(ha_mon, curr_state, event, next_state, time_ns))
 		return false;
 
diff --git a/kernel/trace/rv/monitors/stall/stall.c b/kernel/trace/rv/monitors/stall/stall.c
index 3c38fb1a0159..b265578f845c 100644
--- a/kernel/trace/rv/monitors/stall/stall.c
+++ b/kernel/trace/rv/monitors/stall/stall.c
@@ -38,7 +38,7 @@ static inline bool ha_verify_invariants(struct ha_monitor *ha_mon,
 					enum states next_state, u64 time_ns)
 {
 	if (curr_state == enqueued_stall)
-		return ha_check_invariant_jiffy(ha_mon, clk_stall, time_ns);
+		return ha_check_invariant_jiffy(ha_mon, clk_stall, time_ns, threshold_jiffies);
 	return true;
 }
 
-- 
2.47.3


^ permalink raw reply related

* [PATCH v4 09/13] verification/rvgen: Delete __parse_constraint()
From: Nam Cao @ 2026-06-19  5:52 UTC (permalink / raw)
  To: Steven Rostedt, Gabriele Monaco, Wander Lairson Costa,
	linux-trace-kernel, linux-kernel
  Cc: Nam Cao
In-Reply-To: <cover.1781847583.git.namcao@linutronix.de>

All previous users of self.invariants and self.guards have been converted
to the Lark parser, delete __parse_constraints() and its associates.

Signed-off-by: Nam Cao <namcao@linutronix.de>
---
 tools/verification/rvgen/rvgen/dot2k.py | 67 ++-----------------------
 1 file changed, 4 insertions(+), 63 deletions(-)

diff --git a/tools/verification/rvgen/rvgen/dot2k.py b/tools/verification/rvgen/rvgen/dot2k.py
index 4ea1ecc55c80..f1f5fa297adb 100644
--- a/tools/verification/rvgen/rvgen/dot2k.py
+++ b/tools/verification/rvgen/rvgen/dot2k.py
@@ -177,7 +177,6 @@ class ha2k(dot2k):
         if not self.is_hybrid_automata():
             raise AutomataError("Detected deterministic automaton, use the 'da' class")
         self.trace_h = self._read_template_file("trace_hybrid.h")
-        self.__parse_constraints()
         self.has_invariant = False
         self.has_guard = False
         for state in self._states:
@@ -308,64 +307,6 @@ class ha2k(dot2k):
         separator = "\n\t\t      " if sum(len(r) for r in rules) > 80 else " "
         return ["res = " + separator.join(rules) + ";"]
 
-    def __validate_constraint(self, key: tuple[int, int] | int, constr: str,
-                              rule, reset) -> None:
-        # event constrains are tuples and allow both rules and reset
-        # state constraints are only used for expirations (e.g. clk<N)
-        if self.is_event_constraint(key):
-            if not rule and not reset:
-                raise AutomataError("Unrecognised event constraint "
-                                    f"({self.states[key[0]]}/{self.events[key[1]]}: {constr})")
-            if rule and (rule["env"] in self.env_types and
-                         rule["env"] not in self.env_stored):
-                raise AutomataError("Clocks in hybrid automata always require a storage"
-                                    f" ({rule["env"]})")
-        else:
-            if not rule:
-                raise AutomataError("Unrecognised state constraint "
-                                    f"({self.states[key]}: {constr})")
-            if rule["env"] not in self.env_stored:
-                raise AutomataError("State constraints always require a storage "
-                                    f"({rule["env"]})")
-            if rule["op"] not in ["<", "<="]:
-                raise AutomataError("State constraints must be clock expirations like"
-                                    f" clk<N ({rule.string})")
-
-    def __parse_constraints(self) -> None:
-        self.guards: dict[_EventConstraintKey, str] = {}
-        self.invariants: dict[_StateConstraintKey, str] = {}
-        for key, constraint in self.constraints.items():
-            rules = []
-            resets = []
-            for c, sep in self._split_constraint_expr(constraint):
-                rule = self.constraint_rule.search(c)
-                reset = self.constraint_reset.search(c)
-                self.__validate_constraint(key, c, rule, reset)
-                if rule:
-                    value = rule["val"]
-                    value_len = len(rule["val"])
-                    unit = None
-                    if rule.groupdict().get("unit"):
-                        value_len += len(rule["unit"])
-                        unit = rule["unit"]
-                    c = c[:-(value_len)]
-                    value = self.__adjust_value(value, unit)
-                    if self.is_event_constraint(key):
-                        c = self.__parse_single_constraint(rule, value)
-                        if sep:
-                            c += f" {sep}"
-                    else:
-                        c = self.__parse_timer_constraint(rule, value)
-                    rules.append(c)
-                if reset:
-                    c = f"ha_reset_env(ha_mon, {reset["env"]}{self.enum_suffix}, time_ns)"
-                    resets.append(c)
-            if self.is_event_constraint(key):
-                res = self.__format_guard_rules(rules) + resets
-                self.guards[key] = ";".join(res)
-            else:
-                self.invariants[key] = rules[0]
-
     def __fill_verify_invariants_func(self) -> list[str]:
         if not self.has_invariant:
             return []
@@ -490,15 +431,15 @@ f"""static bool ha_verify_constraint(struct ha_monitor *ha_mon,
 \t\t\t\t enum {self.enum_states_def} next_state, u64 time_ns)
 {{""")
 
-        if self.invariants:
+        if self.has_invariant:
             buff.append("\tif (!ha_verify_invariants(ha_mon, curr_state, "
                         "event, next_state, time_ns))\n\t\treturn false;\n")
 
-        if self.guards:
+        if self.has_guard:
             buff.append("\tif (!ha_verify_guards(ha_mon, curr_state, event, "
                         "next_state, time_ns))\n\t\treturn false;\n")
 
-        if self.invariants:
+        if self.has_invariant:
             buff.append("\tha_setup_invariants(ha_mon, curr_state, event, next_state, time_ns);\n")
 
         buff.append("\treturn true;\n}\n")
@@ -575,7 +516,7 @@ f"""static bool ha_verify_constraint(struct ha_monitor *ha_mon,
         return self.__fill_hybrid_get_reset_functions() + self.__fill_constr_func()
 
     def _fill_timer_type(self) -> list:
-        if self.invariants:
+        if self.has_invariant:
             return [
                     "/* XXX: If the monitor has several instances, consider HA_TIMER_WHEEL */",
                     "#define HA_TIMER_TYPE HA_TIMER_HRTIMER"
-- 
2.47.3


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox