* [RFC v8 2/7] ext4: lockdep: handle i_data_sem subclassing for special inodes
From: Li Chen @ 2026-05-15 9:18 UTC (permalink / raw)
To: Zhang Yi, Theodore Ts'o, Andreas Dilger, Baokun Li, Jan Kara,
Ojaswin Mujoo, Ritesh Harjani (IBM), Zhang Yi, linux-ext4,
linux-kernel
Cc: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
linux-trace-kernel
In-Reply-To: <20260515091829.194810-1-me@linux.beauty>
Fast commit can hold s_fc_lock while writing journal blocks. Mapping the
journal inode can take its i_data_sem. Normal inode update paths can take a
data inode i_data_sem and then s_fc_lock, which makes lockdep report a
circular dependency.
lockdep treats all i_data_sem instances as one lock class and cannot
distinguish the journal inode i_data_sem from a regular inode i_data_sem.
The journal inode is not tracked by fast commit and no FC waiters ever
depend on it, so this is not a real ABBA deadlock. Assign the journal inode
a dedicated i_data_sem lockdep subclass to avoid the false positive.
Inode cache objects can be recycled, so also reset i_data_sem to
I_DATA_SEM_NORMAL when allocating an ext4 inode. Otherwise a new inode may
inherit an old subclass (journal/quota/ea) and trigger lockdep warnings.
Signed-off-by: Li Chen <chenl311@chinatelecom.cn>
---
Changes in v6:
- Rebase onto linux-next master as of 2026-04-08.
- Refresh the patch context around upstream ext4_alloc_inode() changes,
without changing the subclassing logic.
fs/ext4/ext4.h | 4 +++-
fs/ext4/super.c | 8 ++++++++
2 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index e337a37bb6fb..115a3c94db16 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1015,12 +1015,14 @@ do { \
* than the first
* I_DATA_SEM_QUOTA - Used for quota inodes only
* I_DATA_SEM_EA - Used for ea_inodes only
+ * I_DATA_SEM_JOURNAL - Used for journal inode only
*/
enum {
I_DATA_SEM_NORMAL = 0,
I_DATA_SEM_OTHER,
I_DATA_SEM_QUOTA,
- I_DATA_SEM_EA
+ I_DATA_SEM_EA,
+ I_DATA_SEM_JOURNAL
};
struct ext4_fc_inode_snap;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 6a77db4d3124..3c869f0001c5 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1431,6 +1431,9 @@ static struct inode *ext4_alloc_inode(struct super_block *sb)
ext4_fc_init_inode(&ei->vfs_inode);
spin_lock_init(&ei->i_fc_lock);
mmb_init(&ei->i_metadata_bhs, &ei->vfs_inode.i_data);
+#ifdef CONFIG_LOCKDEP
+ lockdep_set_subclass(&ei->i_data_sem, I_DATA_SEM_NORMAL);
+#endif
return &ei->vfs_inode;
}
@@ -5910,6 +5913,11 @@ static struct inode *ext4_get_journal_inode(struct super_block *sb,
return ERR_PTR(-EFSCORRUPTED);
}
+#ifdef CONFIG_LOCKDEP
+ lockdep_set_subclass(&EXT4_I(journal_inode)->i_data_sem,
+ I_DATA_SEM_JOURNAL);
+#endif
+
ext4_debug("Journal inode found at %p: %lld bytes\n",
journal_inode, journal_inode->i_size);
return journal_inode;
--
2.53.0
^ permalink raw reply related
* [RFC v8 1/7] ext4: fast commit: snapshot inode state before writing log
From: Li Chen @ 2026-05-15 9:18 UTC (permalink / raw)
To: Zhang Yi, Theodore Ts'o, Andreas Dilger, Baokun Li, Jan Kara,
Ojaswin Mujoo, Ritesh Harjani (IBM), Zhang Yi, linux-ext4,
linux-kernel
Cc: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
linux-trace-kernel
In-Reply-To: <20260515091829.194810-1-me@linux.beauty>
Fast commit writes inode metadata and data range updates after unlocking
journal updates. New handles can start at that point, so the log writing
path must not look at live inode state.
Add a commit-time per-inode snapshot and populate it while journal updates
are locked and existing handles are drained. Store the snapshot behind
ext4_inode_info->i_fc_snap so ext4_inode_info only grows by one pointer.
The snapshot contains a copy of the on-disk inode plus the data range
records needed for fast commit TLVs.
Snapshotting runs under jbd2_journal_lock_updates(). Avoid triggering I/O
there by using ext4_get_inode_loc_noio() and falling back to full commit
if the inode table block is not present or not uptodate.
Log writing then only serializes the snapshot, so it no longer needs to
call ext4_map_blocks() and take i_data_sem under s_fc_lock. The snapshot
is installed and freed under s_fc_lock and is released from fast commit
cleanup and inode eviction.
Signed-off-by: Li Chen <chenl311@chinatelecom.cn>
---
Changes in v7:
- Drop the stale i_fc_wait initialization after rebasing onto the new
linux-next base.
Changes in v6:
- Rebase onto linux-next master as of 2026-04-08.
- Fix the inode debug print format after rebasing.
fs/ext4/ext4.h | 22 ++-
fs/ext4/fast_commit.c | 331 +++++++++++++++++++++++++++++++++++-------
fs/ext4/inode.c | 51 +++++++
3 files changed, 352 insertions(+), 52 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 6569d1d575a0..e337a37bb6fb 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1023,6 +1023,7 @@ enum {
I_DATA_SEM_EA
};
+struct ext4_fc_inode_snap;
/*
* fourth extended file system inode data in memory
@@ -1079,6 +1080,22 @@ struct ext4_inode_info {
/* End of lblk range that needs to be committed in this fast commit */
ext4_lblk_t i_fc_lblk_len;
+ /*
+ * Commit-time fast commit snapshots.
+ *
+ * i_fc_snap is installed and freed under sbi->s_fc_lock. The fast
+ * commit log writing path reads the snapshot under sbi->s_fc_lock while
+ * serializing fast commit TLVs.
+ *
+ * The snapshot lifetime is bounded by EXT4_STATE_FC_COMMITTING and the
+ * corresponding cleanup / eviction paths.
+ *
+ * i_fc_snap points to per-inode snapshot data for fast commit:
+ * - a raw inode snapshot for EXT4_FC_TAG_INODE
+ * - data range records for EXT4_FC_TAG_{ADD,DEL}_RANGE
+ */
+ struct ext4_fc_inode_snap *i_fc_snap;
+
spinlock_t i_raw_lock; /* protects updates to the raw inode */
/*
@@ -3100,8 +3117,9 @@ extern int ext4_file_getattr(struct mnt_idmap *, const struct path *,
struct kstat *, u32, unsigned int);
extern void ext4_dirty_inode(struct inode *, int);
extern int ext4_change_inode_journal_flag(struct inode *, int);
-extern int ext4_get_inode_loc(struct inode *, struct ext4_iloc *);
-extern int ext4_get_fc_inode_loc(struct super_block *sb, unsigned long ino,
+int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc);
+int ext4_get_inode_loc_noio(struct inode *inode, struct ext4_iloc *iloc);
+int ext4_get_fc_inode_loc(struct super_block *sb, unsigned long ino,
struct ext4_iloc *iloc);
extern int ext4_inode_attach_jinode(struct inode *inode);
extern int ext4_can_truncate(struct inode *inode);
diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
index 1775bce9649a..0c49144e8ca2 100644
--- a/fs/ext4/fast_commit.c
+++ b/fs/ext4/fast_commit.c
@@ -56,21 +56,23 @@
* deleted while it is being flushed.
* [2] Flush data buffers to disk and clear "EXT4_STATE_FC_FLUSHING_DATA"
* state.
- * [3] Lock the journal by calling jbd2_journal_lock_updates. This ensures that
- * all the exsiting handles finish and no new handles can start.
- * [4] Mark all the fast commit eligible inodes as undergoing fast commit
- * by setting "EXT4_STATE_FC_COMMITTING" state.
- * [5] Unlock the journal by calling jbd2_journal_unlock_updates. This allows
+ * [3] Lock the journal by calling jbd2_journal_lock_updates(). This ensures
+ * that all the existing handles finish and no new handles can start.
+ * [4] Mark all the fast commit eligible inodes as undergoing fast commit by
+ * setting "EXT4_STATE_FC_COMMITTING" state, and snapshot the inode state
+ * needed for log writing.
+ * [5] Unlock the journal by calling jbd2_journal_unlock_updates(). This allows
* starting of new handles. If new handles try to start an update on
* any of the inodes that are being committed, ext4_fc_track_inode()
* will block until those inodes have finished the fast commit.
* [6] Commit all the directory entry updates in the fast commit space.
- * [7] Commit all the changed inodes in the fast commit space and clear
- * "EXT4_STATE_FC_COMMITTING" for these inodes.
+ * [7] Commit all the changed inodes in the fast commit space.
* [8] Write tail tag (this tag ensures the atomicity, please read the following
* section for more details).
+ * [9] Clear "EXT4_STATE_FC_COMMITTING" and wake up waiters in
+ * ext4_fc_cleanup().
*
- * All the inode updates must be enclosed within jbd2_jounrnal_start()
+ * All the inode updates must be enclosed within jbd2_journal_start()
* and jbd2_journal_stop() similar to JBD2 journaling.
*
* Fast Commit Ineligibility
@@ -200,6 +202,8 @@ static void ext4_end_buffer_io_sync(struct buffer_head *bh, int uptodate)
unlock_buffer(bh);
}
+static void ext4_fc_free_inode_snap(struct inode *inode);
+
static inline void ext4_fc_reset_inode(struct inode *inode)
{
struct ext4_inode_info *ei = EXT4_I(inode);
@@ -216,6 +220,7 @@ void ext4_fc_init_inode(struct inode *inode)
ext4_clear_inode_state(inode, EXT4_STATE_FC_COMMITTING);
INIT_LIST_HEAD(&ei->i_fc_list);
INIT_LIST_HEAD(&ei->i_fc_dilist);
+ ei->i_fc_snap = NULL;
}
static bool ext4_fc_disabled(struct super_block *sb)
@@ -248,6 +253,7 @@ void ext4_fc_del(struct inode *inode)
alloc_ctx = ext4_fc_lock(inode->i_sb);
if (list_empty(&ei->i_fc_list) && list_empty(&ei->i_fc_dilist)) {
+ ext4_fc_free_inode_snap(inode);
ext4_fc_unlock(inode->i_sb, alloc_ctx);
return;
}
@@ -281,6 +287,7 @@ void ext4_fc_del(struct inode *inode)
}
finish_wait(wq, &wait.wq_entry);
}
+ ext4_fc_free_inode_snap(inode);
list_del_init(&ei->i_fc_list);
/*
@@ -817,6 +824,21 @@ static bool ext4_fc_add_dentry_tlv(struct super_block *sb, u32 *crc,
return true;
}
+struct ext4_fc_range {
+ struct list_head list;
+ u16 tag;
+ ext4_lblk_t lblk;
+ ext4_lblk_t len;
+ ext4_fsblk_t pblk;
+ bool unwritten;
+};
+
+struct ext4_fc_inode_snap {
+ struct list_head data_list;
+ unsigned int inode_len;
+ u8 inode_buf[];
+};
+
/*
* Writes inode in the fast commit space under TLV with tag @tag.
* Returns 0 on success, error on failure.
@@ -824,21 +846,21 @@ static bool ext4_fc_add_dentry_tlv(struct super_block *sb, u32 *crc,
static int ext4_fc_write_inode(struct inode *inode, u32 *crc)
{
struct ext4_inode_info *ei = EXT4_I(inode);
- int inode_len = EXT4_GOOD_OLD_INODE_SIZE;
- int ret;
- struct ext4_iloc iloc;
+ struct ext4_fc_inode_snap *snap = ei->i_fc_snap;
struct ext4_fc_inode fc_inode;
struct ext4_fc_tl tl;
u8 *dst;
+ u8 *src;
+ int inode_len;
+ int ret;
- ret = ext4_get_inode_loc(inode, &iloc);
- if (ret)
- return ret;
+ if (!snap)
+ return -ECANCELED;
- if (ext4_test_inode_flag(inode, EXT4_INODE_INLINE_DATA))
- inode_len = EXT4_INODE_SIZE(inode->i_sb);
- else if (EXT4_INODE_SIZE(inode->i_sb) > EXT4_GOOD_OLD_INODE_SIZE)
- inode_len += ei->i_extra_isize;
+ src = snap->inode_buf;
+ inode_len = snap->inode_len;
+ if (!src || inode_len == 0)
+ return -ECANCELED;
fc_inode.fc_ino = cpu_to_le32(inode->i_ino);
tl.fc_tag = cpu_to_le16(EXT4_FC_TAG_INODE);
@@ -854,10 +876,9 @@ static int ext4_fc_write_inode(struct inode *inode, u32 *crc)
dst += EXT4_FC_TAG_BASE_LEN;
memcpy(dst, &fc_inode, sizeof(fc_inode));
dst += sizeof(fc_inode);
- memcpy(dst, (u8 *)ext4_raw_inode(&iloc), inode_len);
+ memcpy(dst, src, inode_len);
ret = 0;
err:
- brelse(iloc.bh);
return ret;
}
@@ -867,12 +888,74 @@ static int ext4_fc_write_inode(struct inode *inode, u32 *crc)
*/
static int ext4_fc_write_inode_data(struct inode *inode, u32 *crc)
{
- ext4_lblk_t old_blk_size, cur_lblk_off, new_blk_size;
struct ext4_inode_info *ei = EXT4_I(inode);
- struct ext4_map_blocks map;
+ struct ext4_fc_inode_snap *snap = ei->i_fc_snap;
struct ext4_fc_add_range fc_ext;
struct ext4_fc_del_range lrange;
struct ext4_extent *ex;
+ struct ext4_fc_range *range;
+
+ if (!snap)
+ return -ECANCELED;
+
+ list_for_each_entry(range, &snap->data_list, list) {
+ if (range->tag == EXT4_FC_TAG_DEL_RANGE) {
+ lrange.fc_ino = cpu_to_le32(inode->i_ino);
+ lrange.fc_lblk = cpu_to_le32(range->lblk);
+ lrange.fc_len = cpu_to_le32(range->len);
+ if (!ext4_fc_add_tlv(inode->i_sb, EXT4_FC_TAG_DEL_RANGE,
+ sizeof(lrange), (u8 *)&lrange, crc))
+ return -ENOSPC;
+ continue;
+ }
+
+ fc_ext.fc_ino = cpu_to_le32(inode->i_ino);
+ ex = (struct ext4_extent *)&fc_ext.fc_ex;
+ ex->ee_block = cpu_to_le32(range->lblk);
+ ex->ee_len = cpu_to_le16(range->len);
+ ext4_ext_store_pblock(ex, range->pblk);
+ if (range->unwritten)
+ ext4_ext_mark_unwritten(ex);
+ else
+ ext4_ext_mark_initialized(ex);
+
+ if (!ext4_fc_add_tlv(inode->i_sb, EXT4_FC_TAG_ADD_RANGE,
+ sizeof(fc_ext), (u8 *)&fc_ext, crc))
+ return -ENOSPC;
+ }
+
+ return 0;
+}
+
+static void ext4_fc_free_ranges(struct list_head *head)
+{
+ struct ext4_fc_range *range, *range_n;
+
+ list_for_each_entry_safe(range, range_n, head, list) {
+ list_del(&range->list);
+ kfree(range);
+ }
+}
+
+static void ext4_fc_free_inode_snap(struct inode *inode)
+{
+ struct ext4_inode_info *ei = EXT4_I(inode);
+ struct ext4_fc_inode_snap *snap = ei->i_fc_snap;
+
+ if (!snap)
+ return;
+
+ ext4_fc_free_ranges(&snap->data_list);
+ kfree(snap);
+ ei->i_fc_snap = NULL;
+}
+
+static int ext4_fc_snapshot_inode_data(struct inode *inode,
+ struct list_head *ranges)
+{
+ struct ext4_inode_info *ei = EXT4_I(inode);
+ ext4_lblk_t start_lblk, end_lblk, cur_lblk;
+ struct ext4_map_blocks map;
int ret;
spin_lock(&ei->i_fc_lock);
@@ -880,18 +963,21 @@ static int ext4_fc_write_inode_data(struct inode *inode, u32 *crc)
spin_unlock(&ei->i_fc_lock);
return 0;
}
- old_blk_size = ei->i_fc_lblk_start;
- new_blk_size = ei->i_fc_lblk_start + ei->i_fc_lblk_len - 1;
+ start_lblk = ei->i_fc_lblk_start;
+ end_lblk = ei->i_fc_lblk_start + ei->i_fc_lblk_len - 1;
ei->i_fc_lblk_len = 0;
spin_unlock(&ei->i_fc_lock);
- cur_lblk_off = old_blk_size;
- ext4_debug("will try writing %d to %d for inode %llu\n",
- cur_lblk_off, new_blk_size, inode->i_ino);
+ cur_lblk = start_lblk;
+ ext4_debug("snapshot data ranges %u-%u for inode %llu\n",
+ start_lblk, end_lblk,
+ (unsigned long long)inode->i_ino);
+
+ while (cur_lblk <= end_lblk) {
+ struct ext4_fc_range *range;
- while (cur_lblk_off <= new_blk_size) {
- map.m_lblk = cur_lblk_off;
- map.m_len = new_blk_size - cur_lblk_off + 1;
+ map.m_lblk = cur_lblk;
+ map.m_len = end_lblk - cur_lblk + 1;
ret = ext4_map_blocks(NULL, inode, &map,
EXT4_GET_BLOCKS_IO_SUBMIT |
EXT4_EX_NOCACHE);
@@ -899,17 +985,21 @@ static int ext4_fc_write_inode_data(struct inode *inode, u32 *crc)
return -ECANCELED;
if (map.m_len == 0) {
- cur_lblk_off++;
+ cur_lblk++;
continue;
}
+ range = kmalloc(sizeof(*range), GFP_NOFS);
+ if (!range)
+ return -ENOMEM;
+
+ range->lblk = map.m_lblk;
+ range->len = map.m_len;
+ range->pblk = 0;
+ range->unwritten = false;
+
if (ret == 0) {
- lrange.fc_ino = cpu_to_le32(inode->i_ino);
- lrange.fc_lblk = cpu_to_le32(map.m_lblk);
- lrange.fc_len = cpu_to_le32(map.m_len);
- if (!ext4_fc_add_tlv(inode->i_sb, EXT4_FC_TAG_DEL_RANGE,
- sizeof(lrange), (u8 *)&lrange, crc))
- return -ENOSPC;
+ range->tag = EXT4_FC_TAG_DEL_RANGE;
} else {
unsigned int max = (map.m_flags & EXT4_MAP_UNWRITTEN) ?
EXT_UNWRITTEN_MAX_LEN : EXT_INIT_MAX_LEN;
@@ -917,26 +1007,67 @@ static int ext4_fc_write_inode_data(struct inode *inode, u32 *crc)
/* Limit the number of blocks in one extent */
map.m_len = min(max, map.m_len);
- fc_ext.fc_ino = cpu_to_le32(inode->i_ino);
- ex = (struct ext4_extent *)&fc_ext.fc_ex;
- ex->ee_block = cpu_to_le32(map.m_lblk);
- ex->ee_len = cpu_to_le16(map.m_len);
- ext4_ext_store_pblock(ex, map.m_pblk);
- if (map.m_flags & EXT4_MAP_UNWRITTEN)
- ext4_ext_mark_unwritten(ex);
- else
- ext4_ext_mark_initialized(ex);
- if (!ext4_fc_add_tlv(inode->i_sb, EXT4_FC_TAG_ADD_RANGE,
- sizeof(fc_ext), (u8 *)&fc_ext, crc))
- return -ENOSPC;
+ range->tag = EXT4_FC_TAG_ADD_RANGE;
+ range->len = map.m_len;
+ range->pblk = map.m_pblk;
+ range->unwritten = !!(map.m_flags & EXT4_MAP_UNWRITTEN);
}
- cur_lblk_off += map.m_len;
+ INIT_LIST_HEAD(&range->list);
+ list_add_tail(&range->list, ranges);
+
+ cur_lblk += map.m_len;
}
return 0;
}
+static int ext4_fc_snapshot_inode(struct inode *inode)
+{
+ struct ext4_inode_info *ei = EXT4_I(inode);
+ struct ext4_fc_inode_snap *snap;
+ int inode_len = EXT4_GOOD_OLD_INODE_SIZE;
+ struct ext4_iloc iloc;
+ LIST_HEAD(ranges);
+ int ret;
+ int alloc_ctx;
+
+ ret = ext4_get_inode_loc_noio(inode, &iloc);
+ if (ret)
+ return ret;
+
+ if (ext4_test_inode_flag(inode, EXT4_INODE_INLINE_DATA))
+ inode_len = EXT4_INODE_SIZE(inode->i_sb);
+ else if (EXT4_INODE_SIZE(inode->i_sb) > EXT4_GOOD_OLD_INODE_SIZE)
+ inode_len += ei->i_extra_isize;
+
+ snap = kmalloc(struct_size(snap, inode_buf, inode_len), GFP_NOFS);
+ if (!snap) {
+ brelse(iloc.bh);
+ return -ENOMEM;
+ }
+ INIT_LIST_HEAD(&snap->data_list);
+ snap->inode_len = inode_len;
+
+ memcpy(snap->inode_buf, (u8 *)ext4_raw_inode(&iloc), inode_len);
+ brelse(iloc.bh);
+
+ ret = ext4_fc_snapshot_inode_data(inode, &ranges);
+ if (ret) {
+ kfree(snap);
+ ext4_fc_free_ranges(&ranges);
+ return ret;
+ }
+
+ alloc_ctx = ext4_fc_lock(inode->i_sb);
+ ext4_fc_free_inode_snap(inode);
+ ei->i_fc_snap = snap;
+ list_splice_tail_init(&ranges, &snap->data_list);
+ ext4_fc_unlock(inode->i_sb, alloc_ctx);
+
+ return 0;
+}
+
/* Flushes data of all the inodes in the commit queue. */
static int ext4_fc_flush_data(journal_t *journal)
@@ -987,6 +1118,11 @@ static int ext4_fc_commit_dentry_updates(journal_t *journal, u32 *crc)
*/
if (list_empty(&fc_dentry->fcd_dilist))
continue;
+ /*
+ * For EXT4_FC_TAG_CREAT, fcd_dilist is linked on the created
+ * inode's i_fc_dilist list (kept singular), so we can recover the
+ * inode through it.
+ */
ei = list_first_entry(&fc_dentry->fcd_dilist,
struct ext4_inode_info, i_fc_dilist);
inode = &ei->vfs_inode;
@@ -1011,6 +1147,88 @@ static int ext4_fc_commit_dentry_updates(journal_t *journal, u32 *crc)
return 0;
}
+static int ext4_fc_snapshot_inodes(journal_t *journal)
+{
+ struct super_block *sb = journal->j_private;
+ struct ext4_sb_info *sbi = EXT4_SB(sb);
+ struct ext4_inode_info *iter;
+ struct ext4_fc_dentry_update *fc_dentry;
+ struct inode **inodes;
+ unsigned int nr_inodes = 0;
+ unsigned int i = 0;
+ int ret = 0;
+ int alloc_ctx;
+
+ alloc_ctx = ext4_fc_lock(sb);
+ list_for_each_entry(iter, &sbi->s_fc_q[FC_Q_MAIN], i_fc_list)
+ nr_inodes++;
+
+ list_for_each_entry(fc_dentry, &sbi->s_fc_dentry_q[FC_Q_MAIN], fcd_list) {
+ struct ext4_inode_info *ei;
+
+ if (fc_dentry->fcd_op != EXT4_FC_TAG_CREAT)
+ continue;
+ if (list_empty(&fc_dentry->fcd_dilist))
+ continue;
+
+ /* See the comment in ext4_fc_commit_dentry_updates(). */
+ ei = list_first_entry(&fc_dentry->fcd_dilist,
+ struct ext4_inode_info, i_fc_dilist);
+ if (!list_empty(&ei->i_fc_list))
+ continue;
+
+ nr_inodes++;
+ }
+ ext4_fc_unlock(sb, alloc_ctx);
+
+ if (!nr_inodes)
+ return 0;
+
+ inodes = kvcalloc(nr_inodes, sizeof(*inodes), GFP_NOFS);
+ if (!inodes)
+ return -ENOMEM;
+
+ alloc_ctx = ext4_fc_lock(sb);
+ list_for_each_entry(iter, &sbi->s_fc_q[FC_Q_MAIN], i_fc_list) {
+ inodes[i] = igrab(&iter->vfs_inode);
+ if (inodes[i])
+ i++;
+ }
+
+ list_for_each_entry(fc_dentry, &sbi->s_fc_dentry_q[FC_Q_MAIN], fcd_list) {
+ struct ext4_inode_info *ei;
+
+ if (fc_dentry->fcd_op != EXT4_FC_TAG_CREAT)
+ continue;
+ if (list_empty(&fc_dentry->fcd_dilist))
+ continue;
+
+ /* See the comment in ext4_fc_commit_dentry_updates(). */
+ ei = list_first_entry(&fc_dentry->fcd_dilist,
+ struct ext4_inode_info, i_fc_dilist);
+ if (!list_empty(&ei->i_fc_list))
+ continue;
+
+ inodes[i] = igrab(&ei->vfs_inode);
+ if (inodes[i])
+ i++;
+ }
+ ext4_fc_unlock(sb, alloc_ctx);
+
+ for (nr_inodes = 0; nr_inodes < i; nr_inodes++) {
+ ret = ext4_fc_snapshot_inode(inodes[nr_inodes]);
+ if (ret)
+ break;
+ }
+
+ for (nr_inodes = 0; nr_inodes < i; nr_inodes++) {
+ if (inodes[nr_inodes])
+ iput(inodes[nr_inodes]);
+ }
+ kvfree(inodes);
+ return ret;
+}
+
static int ext4_fc_perform_commit(journal_t *journal)
{
struct super_block *sb = journal->j_private;
@@ -1082,7 +1300,11 @@ static int ext4_fc_perform_commit(journal_t *journal)
EXT4_STATE_FC_COMMITTING);
}
ext4_fc_unlock(sb, alloc_ctx);
+
+ ret = ext4_fc_snapshot_inodes(journal);
jbd2_journal_unlock_updates(journal);
+ if (ret)
+ return ret;
/*
* Step 5: If file system device is different from journal device,
@@ -1281,6 +1503,7 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
struct ext4_inode_info,
i_fc_list);
list_del_init(&ei->i_fc_list);
+ ext4_fc_free_inode_snap(&ei->vfs_inode);
ext4_clear_inode_state(&ei->vfs_inode,
EXT4_STATE_FC_COMMITTING);
if (tid_geq(tid, ei->i_sync_tid)) {
@@ -1313,6 +1536,14 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
struct ext4_fc_dentry_update,
fcd_list);
list_del_init(&fc_dentry->fcd_list);
+ if (fc_dentry->fcd_op == EXT4_FC_TAG_CREAT &&
+ !list_empty(&fc_dentry->fcd_dilist)) {
+ /* See the comment in ext4_fc_commit_dentry_updates(). */
+ ei = list_first_entry(&fc_dentry->fcd_dilist,
+ struct ext4_inode_info,
+ i_fc_dilist);
+ ext4_fc_free_inode_snap(&ei->vfs_inode);
+ }
list_del_init(&fc_dentry->fcd_dilist);
release_dentry_name_snapshot(&fc_dentry->fcd_name);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index c2c2d6ac7f3d..4678612f82e8 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5025,6 +5025,57 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
return ret;
}
+/*
+ * ext4_get_inode_loc_noio() is a best-effort variant of ext4_get_inode_loc().
+ * It looks up the inode table block in the buffer cache and returns -EAGAIN if
+ * the block is not present or not uptodate, without starting any I/O.
+ */
+int ext4_get_inode_loc_noio(struct inode *inode, struct ext4_iloc *iloc)
+{
+ struct super_block *sb = inode->i_sb;
+ struct ext4_group_desc *gdp;
+ struct buffer_head *bh;
+ ext4_fsblk_t block;
+ int inodes_per_block, inode_offset;
+ unsigned long ino = inode->i_ino;
+
+ iloc->bh = NULL;
+ if (ino < EXT4_ROOT_INO ||
+ ino > le32_to_cpu(EXT4_SB(sb)->s_es->s_inodes_count))
+ return -EFSCORRUPTED;
+
+ iloc->block_group = (ino - 1) / EXT4_INODES_PER_GROUP(sb);
+ gdp = ext4_get_group_desc(sb, iloc->block_group, NULL);
+ if (!gdp)
+ return -EIO;
+
+ /* Figure out the offset within the block group inode table. */
+ inodes_per_block = EXT4_SB(sb)->s_inodes_per_block;
+ inode_offset = ((ino - 1) % EXT4_INODES_PER_GROUP(sb));
+ iloc->offset = (inode_offset % inodes_per_block) * EXT4_INODE_SIZE(sb);
+
+ block = ext4_inode_table(sb, gdp);
+ if (block <= le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block) ||
+ block >= ext4_blocks_count(EXT4_SB(sb)->s_es)) {
+ ext4_error(sb,
+ "Invalid inode table block %llu in block_group %u",
+ block, iloc->block_group);
+ return -EFSCORRUPTED;
+ }
+ block += inode_offset / inodes_per_block;
+
+ bh = sb_find_get_block(sb, block);
+ if (!bh)
+ return -EAGAIN;
+ if (!ext4_buffer_uptodate(bh)) {
+ brelse(bh);
+ return -EAGAIN;
+ }
+
+ iloc->bh = bh;
+ return 0;
+}
+
int ext4_get_fc_inode_loc(struct super_block *sb, unsigned long ino,
struct ext4_iloc *iloc)
--
2.53.0
^ permalink raw reply related
* [RFC v8 0/7] ext4: fast commit: snapshot inode state for FC log
From: Li Chen @ 2026-05-15 9:18 UTC (permalink / raw)
To: Zhang Yi, Theodore Ts'o, Andreas Dilger
Cc: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, linux-ext4,
linux-trace-kernel, linux-kernel
Hi,
(This RFC v8 series is rebased onto linux-next master as of 2026-05-09,
commit e98d21c170b0 ("Add linux-next specific files for 20260508"), and
depends on patch "ext4: fix fast commit wait/wake bit mapping on
64-bit" [0]).
In the RFC v3 review, Zhang Yi pointed out that postponing lockdep assertions only
masks the issue, and that sleeping in ext4_fc_track_inode() while holding
i_data_sem can form a real ABBA deadlock if the fast commit writer also needs
i_data_sem while the inode is in FC_COMMITTING.
Zhang Yi suggested two possible directions to address the root cause:
1. "Ha, the solution seems to have already been listed in the TODOs in
fast_commit.c.
Change ext4_fc_commit() to lookup logical to physical mapping using extent
status tree. This would get rid of the need to call ext4_fc_track_inode()
before acquiring i_data_sem. To do that we would need to ensure that
modified extents from the extent status tree are not evicted from memory."
2. "Alternatively, recording the mapped range of tracking might also be
feasible."
This series takes a hybrid approach. It implements approach 2 by
snapshotting the inode image and mapped ranges at commit time, so that log
writing consumes only snapshots.
Approach 2 still needs a mapping source while building the snapshot
(logical-to-physical and unwritten/hole semantics). Calling ext4_map_blocks()
there would take i_data_sem and can block inside the
jbd2_journal_lock_updates() window, which risks deadlocks or unbounded stalls.
So the snapshot path uses approach 1's extent status lookups as a best-effort
mapping source to avoid ext4_map_blocks().
I did not fully implement approach 1 (making extent status lookups
authoritative by preventing reclaim of needed entries) because that would need
additional pinning/integration under memory pressure and a larger correctness
surface. Instead, the extent status tree is treated as a cache and the
snapshot path falls back to full commit on cache misses or unstable mappings
(e.g. delayed allocation).
Lock inversion / deadlock model (before):
CPU0 (metadata update) CPU1 (fast commit)
-------------------- -----------------
... hold i_data_sem (A) mutex_lock(s_fc_lock) (B)
ext4_fc_track_inode() ext4_fc_write_inode_data()
mutex_lock(s_fc_lock) (B) ext4_map_blocks()
wait FC_COMMITTING (sleep) down_read(i_data_sem) (A)
This creates i_data_sem (A) -> s_fc_lock (B) on update paths, and
s_fc_lock (B) -> i_data_sem (A) on commit paths. Once CPU0 sleeps while
holding (A), CPU1 can block on (A) while holding (B), completing the ABBA
cycle.
New model (this series):
CPU0 (metadata update) CPU1 (fast commit)
-------------------- -----------------
... maybe hold i_data_sem (A) jbd2_journal_lock_updates()
ext4_fc_track_*() snapshot inode + ranges (no map_blocks)
mutex_lock(s_fc_lock) (B) jbd2_journal_unlock_updates()
if FC_COMMITTING: set FC_REQUEUE s_fc_lock (B)
no sleep write FC log from snapshots only
cleanup: clear COMMITTING, requeue if set
The commit path no longer takes i_data_sem while holding s_fc_lock, and
tracking no longer sleeps waiting for FC_COMMITTING. If an inode is updated
during a fast commit, EXT4_STATE_FC_REQUEUE records that fact and the inode
is moved to FC_Q_STAGING for the next commit.
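To make the requeue idea concrete, here is a minimal userspace sketch of the
state transitions described above. It is not the code from this series: the
toy_* names and the flag values are hypothetical, and the real paths run
under s_fc_lock and operate on EXT4_STATE_* inode state bits.

```c
#include <stdbool.h>

/* Hypothetical flag bits standing in for the EXT4_STATE_FC_* inode states. */
enum {
	TOY_FC_COMMITTING = 1 << 0,
	TOY_FC_REQUEUE    = 1 << 1,
};

struct toy_inode {
	unsigned int fc_state;
	bool queued_main;	/* on the queue consumed by the current commit */
	bool queued_staging;	/* queued for the next fast commit */
};

/* Update path: never sleeps, only notes an update that raced with a commit. */
void toy_fc_track(struct toy_inode *ti)
{
	if (ti->fc_state & TOY_FC_COMMITTING)
		ti->fc_state |= TOY_FC_REQUEUE;
	else
		ti->queued_main = true;
}

/* Commit path: mark the inode as being committed. */
void toy_fc_begin_commit(struct toy_inode *ti)
{
	ti->fc_state |= TOY_FC_COMMITTING;
}

/* Cleanup: leave the main queue, requeue if an update raced, clear both flags. */
void toy_fc_cleanup(struct toy_inode *ti)
{
	ti->queued_main = false;
	if (ti->fc_state & TOY_FC_REQUEUE)
		ti->queued_staging = true;
	ti->fc_state &= ~(unsigned int)(TOY_FC_COMMITTING | TOY_FC_REQUEUE);
}
```

The point of the sketch is that toy_fc_track() has no wait in it, so an
update path that already holds i_data_sem cannot end up sleeping on the
commit thread.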
The only remaining FC_COMMITTING waiter is ext4_fc_del(). It checks
FC_COMMITTING and FC_FLUSHING_DATA while holding s_fc_lock, drops s_fc_lock
around the sleep, and rechecks FC_COMMITTING after a FC_FLUSHING_DATA wait
before deleting the inode from the fast commit lists. This keeps inode
lifetime/deletion synchronized with the commit thread's transition from
FLUSHING_DATA to COMMITTING.
This series snapshots the on-disk inode and tracked data ranges while journal
updates are locked and existing handles are drained. The log writing phase then
serializes only snapshots, so it no longer needs to call ext4_map_blocks() and
take i_data_sem under s_fc_lock. This is done in two steps: patch 1 drops
ext4_map_blocks() from log writing by introducing commit-time snapshots, and
patch 5 drops ext4_map_blocks() from the snapshot path by using the extent
status cache. The snapshot also records whether a mapped extent is unwritten,
so the ADD_RANGE records (and replay) preserve unwritten semantics.
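As an illustration of what the range snapshot holds, the sketch below models
the walk in userspace. It is only a sketch under stated assumptions: the
toy_* types, the flat extent array, and TOY_MAX_EXTENT_LEN are hypothetical
stand-ins, and the mapping source in the series is ext4_map_blocks() (patch
1) or the extent status cache (patch 5), not an array scan.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum toy_tag { TOY_ADD_RANGE, TOY_DEL_RANGE };

struct toy_range {
	enum toy_tag tag;
	uint32_t lblk, len;
	uint64_t pblk;
	bool unwritten;
};

/* One mapped piece of a toy file; pieces are assumed not to overlap. */
struct toy_extent {
	uint32_t lblk, len;
	uint64_t pblk;
	bool unwritten;
};

#define TOY_MAX_EXTENT_LEN 32768u	/* hypothetical per-record length cap */

/*
 * Walk [start, end] and emit one record per mapped piece (ADD_RANGE) or
 * hole (DEL_RANGE). Returns the record count, or -1 if @out is too small,
 * in which case a caller would fall back to a full commit.
 */
int toy_snapshot_ranges(const struct toy_extent *map, size_t nr_map,
			uint32_t start, uint32_t end,
			struct toy_range *out, size_t cap)
{
	uint64_t cur = start;	/* 64-bit so cur cannot wrap past end */
	size_t n = 0;

	while (cur <= end) {
		uint64_t stop = (uint64_t)end + 1;	/* exclusive end */
		const struct toy_extent *hit = NULL;
		struct toy_range r = { .tag = TOY_DEL_RANGE, .lblk = (uint32_t)cur };

		for (size_t i = 0; i < nr_map; i++) {
			uint64_t lo = map[i].lblk;
			uint64_t hi = lo + map[i].len;

			if (cur >= lo && cur < hi) {
				hit = &map[i];
				if (hi < stop)
					stop = hi;
			} else if (lo > cur && lo < stop) {
				stop = lo;	/* a hole ends at the next extent */
			}
		}
		if (hit) {
			if (stop - cur > TOY_MAX_EXTENT_LEN)
				stop = cur + TOY_MAX_EXTENT_LEN;
			r.tag = TOY_ADD_RANGE;
			r.pblk = hit->pblk + (cur - hit->lblk);
			r.unwritten = hit->unwritten;
		}
		if (stop - cur > UINT32_MAX)
			stop = cur + UINT32_MAX;	/* keep len representable */
		r.len = (uint32_t)(stop - cur);

		if (n == cap)
			return -1;
		out[n++] = r;
		cur = stop;
	}
	return (int)n;
}
```

Because each record carries its own unwritten flag, the log writer can build
the ADD_RANGE extent from the record alone, without consulting the live
mapping again.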
Snapshotting runs under jbd2_journal_lock_updates(). Since a cache miss in
ext4_get_inode_loc() can start synchronous inode table I/O and stall handle
starts for milliseconds, patch 1 uses ext4_get_inode_loc_noio() and falls back
to full commit if the inode table block is not present or not uptodate.
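The no-I/O lookup and its fallback can be pictured with the following
userspace sketch. The toy_* names and the array-backed cache are
hypothetical; the helper in patch 1 looks the block up in the buffer cache
and returns -EAGAIN when it is missing or not uptodate.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_buf {
	unsigned long block;
	bool uptodate;
};

struct toy_cache {
	const struct toy_buf *bufs;
	size_t nr;
};

/* Cache-only lookup: never starts I/O, reports -EAGAIN on any miss. */
int toy_get_block_noio(const struct toy_cache *c, unsigned long block,
		       const struct toy_buf **out)
{
	*out = NULL;
	for (size_t i = 0; i < c->nr; i++) {
		if (c->bufs[i].block != block)
			continue;
		if (!c->bufs[i].uptodate)
			return -EAGAIN;	/* present, but contents not valid */
		*out = &c->bufs[i];
		return 0;
	}
	return -EAGAIN;			/* not cached at all */
}

enum toy_commit { TOY_FAST_COMMIT, TOY_FULL_COMMIT };

/* Caller policy: any lookup failure means "use a full commit instead". */
enum toy_commit toy_choose_commit(const struct toy_cache *c,
				  unsigned long block)
{
	const struct toy_buf *bh;

	if (toy_get_block_noio(c, block, &bh))
		return TOY_FULL_COMMIT;
	return TOY_FAST_COMMIT;
}
```

The trade-off is deliberate: a miss costs one slower full commit, while a
blocking read would stall every handle start for the duration of the I/O.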
ext4_fc_track_inode() also stops waiting for FC_COMMITTING. Updates during an
ongoing fast commit are marked with EXT4_STATE_FC_REQUEUE and are picked up by
the next fast commit, while ext4_fc_del() waits for FC_COMMITTING so an inode
cannot be removed while the commit thread is still using it.
The extent status tree is a cache, not an authoritative source, so the snapshot
path falls back to full commit on cache misses or unstable mappings (e.g.
delayed allocation). This includes cases where extent status entries are not
present (or have been reclaimed) under memory pressure. The snapshot path does
not try to rebuild mappings by calling ext4_map_blocks(); instead it simply
marks the transaction fast commit ineligible.
To keep the updates-locked window bounded, the snapshot path caps the number of
snapshotted inodes and ranges per fast commit (currently 1024 inodes and 2048
ranges) and falls back to full commit when the cap is exceeded. The series also
handles the journal inode i_data_sem lockdep false positive via subclassing;
journal inode mapping may still take i_data_sem even when data inode mapping is
avoided.
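To show why a dedicated subclass silences the report, here is a toy
lock-order checker. It is a rough model, not how lockdep is implemented: the
toy_* names are hypothetical, and it only detects direct two-lock inversions,
whereas lockdep tracks dependency chains of any length.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_KEYS 8

/* Locks are identified by (class, subclass); instances sharing a key look identical. */
enum toy_class { TOY_I_DATA_SEM, TOY_S_FC_LOCK };

struct toy_lockdep {
	int nr_keys;
	int cls[TOY_MAX_KEYS];
	int sub[TOY_MAX_KEYS];
	/* before[a][b] is set once key a was held while key b was taken. */
	bool before[TOY_MAX_KEYS][TOY_MAX_KEYS];
};

/* Find the index of (cls, sub), registering it on first use. */
static int toy_key(struct toy_lockdep *ld, int cls, int sub)
{
	for (int i = 0; i < ld->nr_keys; i++)
		if (ld->cls[i] == cls && ld->sub[i] == sub)
			return i;

	assert(ld->nr_keys < TOY_MAX_KEYS);
	ld->cls[ld->nr_keys] = cls;
	ld->sub[ld->nr_keys] = sub;
	return ld->nr_keys++;
}

/*
 * Record "held, then taken". Returns true if the opposite order between the
 * same two keys was recorded earlier, or if both keys are the same.
 */
bool toy_acquire(struct toy_lockdep *ld, int held_cls, int held_sub,
		 int cls, int sub)
{
	int a = toy_key(ld, held_cls, held_sub);
	int b = toy_key(ld, cls, sub);

	ld->before[a][b] = true;
	return ld->before[b][a];
}
```

With the journal inode in its own subclass (I_DATA_SEM_JOURNAL, value 4 in
the enum from patch 2), taking its i_data_sem under s_fc_lock is recorded
against a different key than the data inode i_data_sem, so the checker sees
no inversion. With a shared key the same sequence is reported.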
Patch 6 adds the ext4_fc_lock_updates tracepoint to quantify the updates-locked
window and snapshot fallback reasons. Patch 7 extends
/proc/fs/ext4/<sb_id>/fc_info with best-effort snapshot counters. If the /proc
interface is undesirable, I can drop patch 7 and keep the tracepoint only, or
even drop both.
Testing and measurement were done on a QEMU/KVM guest with virtio-pmem + dax
(ext4 -O fast_commit, mounted dax,noatime). The workload does python3 500x
{4K write + fsync}, fallocate 256M, and python3 500x {creat + fsync(dir)}.
Over 3 cold boots, ext4_fc_lock_updates reported locked_ns p50 2.88-2.92 us,
p99 <= 6.71 us, and max <= 102.71 us, with snap_err always 0. Under stress-ng
memory pressure (stress-ng --vm 4 --vm-bytes 75% --timeout 60s), locked_ns p50
2.94 us, p99 <= 4.97 us, and max <= 20.07 us. The fc_info snapshot failure
counters stayed at 0.
These hold times are in the low microseconds range, and the caps keep the
worst case bounded.
Comments and guidance are very welcome. Please let me know if there are any
concerns about correctness, corner cases, or better approaches.
RFC v7 -> RFC v8:
- Base the series on "ext4: fix fast commit wait/wake bit
mapping on 64-bit", which the fast commit wait/wake paths now depend on.
- Factor out small ext4_fc_wait_inode_state()/ext4_fc_wake_inode_state()
helpers so the repeated FC state wait/wake mapping stays in one place.
- Use trace_call__ext4_fc_lock_updates() at the guarded tracepoint call site
so the static branch is not checked twice.
- Address the Sashiko feedback around the commit-time snapshot lifecycle and
snapshot stats accounting, including the FC_COMMITTING /
FC_FLUSHING_DATA transition and stale snapshot sizing fallback. [3][4]
RFC v6 -> RFC v7:
- Rebase onto linux-next master as of 2026-05-09, commit e98d21c170b0
("Add linux-next specific files for 20260508").
- Address Sashiko review feedback for RFC v6. [2]
- Fix the reported snapshot range arithmetic issue near EXT_MAX_BLOCKS to
avoid cur_lblk / range wraparound in the snapshot walk.
- Report successfully snapshotted inode counts in ext4_fc_lock_updates when
snapshotting stops early, as reported by Sashiko.
- Use READ_ONCE() + div64_u64() for the fc_info lock_updates average, as
reported by Sashiko.
RFC v5 -> RFC v6:
- Rebase onto linux-next master as of 2026-04-08.
- Address tracepoint review feedback by relying on enum auto-increment for
snap_err values and by switching the guarded ext4_fc_lock_updates call site
to trace_call__ext4_fc_lock_updates() to avoid the double static_branch. [1]
- Keep lock window accounting unconditional for fc_info while using the guarded
direct tracepoint call.
- Fix the inode debug print format exposed by the rebase.
RFC v4 -> RFC v5:
- Patch 6: Make ext4_fc_lock_updates snap_err human readable via
TRACE_DEFINE_ENUM() + __print_symbolic(), using a single TRACE_SNAP_ERR
mapping while keeping the enum values stable for tooling.
RFC v3 -> RFC v4:
- Replace lockdep_assert movement with removing the wait in
ext4_fc_track_inode() and using EXT4_STATE_FC_REQUEUE to capture updates
during an ongoing fast commit.
- Replace dropping s_fc_lock around log writing with commit-time snapshots of
inode image and mapped ranges (recording the mapped range of tracking as
suggested by Zhang Yi) so log writing consumes only snapshots.
- Avoid inode table I/O under jbd2_journal_lock_updates() via
ext4_get_inode_loc_noio() and fallback to full commit on cache misses.
- Use the extent status cache for snapshot mappings and fall back to full
commit on cache misses or unstable mappings (e.g. delayed allocation).
- Add tracepoint and /proc snapshot stats to quantify the updates-locked window
and snapshot fallback reasons.
RFC v2 -> RFC v3:
- rebase on top of
https://lore.kernel.org/linux-ext4/20251223131342.287864-1-me@linux.beauty/T/#u
RFC v1 -> RFC v2:
- patch 1: move comments to correct place
- patch 2: add it to patchset.
- add missing RFC prefix
RFC v1: https://lore.kernel.org/linux-ext4/20251222032655.87056-1-me@linux.beauty/T/#u
RFC v2: https://lore.kernel.org/linux-ext4/20251222151906.24607-1-me@linux.beauty/T/#t
RFC v3: https://lore.kernel.org/linux-ext4/20251224032943.134063-1-me@linux.beauty/
RFC v4: https://lore.kernel.org/all/20260120112538.132774-1-me@linux.beauty/
RFC v5: https://lore.kernel.org/all/20260317084624.457185-1-me@linux.beauty/t/#u
RFC v6: https://lore.kernel.org/all/20260408112020.716706-1-me@linux.beauty/
RFC v7: https://lore.kernel.org/all/20260511084304.1559557-1-me@linux.beauty/
[0]: https://lore.kernel.org/all/20260513085818.552432-1-me@linux.beauty/
[1]: https://lore.kernel.org/all/acZJl8QUYEq8voqQ@BLRRASHENOY1.amd.com/T/#u
[2]: https://sashiko.dev/#/patchset/20260408112020.716706-1-me%40linux.beauty
[3]: https://sashiko.dev/#/patchset/20260511084304.1559557-1-me%40linux.beauty?part=4
[4]: https://sashiko.dev/#/patchset/20260511084304.1559557-1-me%40linux.beauty?part=7
Thanks,
Li Chen (7):
ext4: fast commit: snapshot inode state before writing log
ext4: lockdep: handle i_data_sem subclassing for special inodes
ext4: fast commit: avoid waiting for FC_COMMITTING
ext4: fast commit: avoid self-deadlock in inode snapshotting
ext4: fast commit: avoid i_data_sem by dropping ext4_map_blocks() in
snapshots
ext4: fast commit: add lock_updates tracepoint
ext4: fast commit: export snapshot stats in fc_info
fs/ext4/ext4.h | 73 ++++-
fs/ext4/fast_commit.c | 768 +++++++++++++++++++++++++++++++++++---------
fs/ext4/inode.c | 51 +++
fs/ext4/super.c | 9 +
include/trace/events/ext4.h | 53 ++++
5 files changed, 805 insertions(+), 149 deletions(-)
--
2.53.0
* [PATCH] Re: Re: [RFC PATCH v2 04/10] rv/da: add pre-allocated storage pool for per-object monitors
From: Gabriele Monaco @ 2026-05-15 8:30 UTC (permalink / raw)
To: wen.yang; +Cc: linux-kernel, linux-trace-kernel, rostedt, Gabriele Monaco
In-Reply-To: <668f83581c58644a84cab5e6736864a439bb8e28.camel@redhat.com>
So this is what I meant. It's quick and dirty but seems to work as far
as I could test it.
I didn't move too much around, to avoid adding confusion, but it probably
needs a refactor of the function positions and names. Some AI can do
that later after we agree on how it should look.
The main idea is (using current function names):
da_handle_start_[run_]event() calls da_prepare_storage(), which
makes sure the storage is there and usable based on the strategy:
1. da_create_storage() plain allocation with kmalloc_nolock
2. da_create_or_get_pool() get a slot from the pool
3. da_fill_empty_storage() only set the target in a storage manually
allocated before
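To make the dispatch concrete, here is a minimal userspace sketch of the
same compile-time selection. It is not the kernel code: every toy_* name
is hypothetical, calloc() stands in for kmalloc_nolock(), and a static
array stands in for the mempool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DA_ALLOC_AUTO   0
#define DA_ALLOC_POOL   1
#define DA_ALLOC_MANUAL 2

/* A monitor would define this before including the shared header. */
#ifndef DA_MON_ALLOCATION_STRATEGY
#define DA_MON_ALLOCATION_STRATEGY DA_ALLOC_POOL
#endif
#define TOY_POOL_SIZE 2	/* hypothetical stand-in for DA_MON_POOL_SIZE */

struct toy_storage { int id; int in_use; };

static struct toy_storage toy_pool[TOY_POOL_SIZE];

/* 1. Plain allocation on first use. */
static inline struct toy_storage *toy_create(int id, struct toy_storage *cur)
{
	if (cur)
		return cur;
	cur = calloc(1, sizeof(*cur));
	if (cur) {
		cur->id = id;
		cur->in_use = 1;
	}
	return cur;
}

/* 2. Take a pre-allocated slot; NULL when the pool is exhausted. */
static inline struct toy_storage *toy_pool_get(int id, struct toy_storage *cur)
{
	if (cur)
		return cur;
	for (size_t i = 0; i < TOY_POOL_SIZE; i++) {
		if (!toy_pool[i].in_use) {
			toy_pool[i].in_use = 1;
			toy_pool[i].id = id;
			return &toy_pool[i];
		}
	}
	return NULL;
}

/* 3. Never allocate; only fill storage the caller created earlier. */
static inline struct toy_storage *toy_fill_empty(int id, struct toy_storage *cur)
{
	if (cur && !cur->in_use) {
		cur->id = id;
		cur->in_use = 1;
	}
	return cur;
}

#if DA_MON_ALLOCATION_STRATEGY == DA_ALLOC_MANUAL
#define toy_prepare_storage toy_fill_empty
#elif DA_MON_ALLOCATION_STRATEGY == DA_ALLOC_POOL
#define toy_prepare_storage toy_pool_get
#else
#define toy_prepare_storage toy_create
#endif
```

The start-event handler only ever calls toy_prepare_storage(), so the
strategy is fixed per monitor at build time and costs nothing at runtime.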
The reason why you'd need 3. is that da_handle_start_event() is
called from a tracepoint, where you may have no way to allocate. In that
case you allocate manually somewhere else, with
da_create_empty_storage() if you don't have the target and
da_create_or_get() if you do (this one is misleading, we should probably
simplify further).
The newly created 2. might be useful if you aren't on preempt-rt and
cannot sleep but also don't want a manual allocation (beware of lock
dependencies, it doesn't always work).
Now, I left your da_create_or_get_kmalloc() unwired because I don't
really see the use case (you use kmalloc_nolock because you cannot lock,
so if it fails you don't try a kmalloc). But if we really want to offer
a possibility to allocate with GFP_KERNEL, we can make 1. more
configurable.
Does this make sense to you?
Thanks,
Gabriele
---
include/rv/da_monitor.h | 160 ++++++++++-------------
kernel/trace/rv/monitors/nomiss/nomiss.c | 2 +-
kernel/trace/rv/monitors/tlob/tlob.c | 15 +--
3 files changed, 74 insertions(+), 103 deletions(-)
diff --git a/include/rv/da_monitor.h b/include/rv/da_monitor.h
index 74aa95d9a284..3b4a36245531 100644
--- a/include/rv/da_monitor.h
+++ b/include/rv/da_monitor.h
@@ -21,6 +21,7 @@
#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/hashtable.h>
+#include <linux/mempool.h>
/*
* Per-cpu variables require a unique name although static in some
@@ -67,6 +68,35 @@ static struct rv_monitor rv_this;
#define da_id_type int
#endif
+#define DA_ALLOC_AUTO 0
+#define DA_ALLOC_POOL 1
+#define DA_ALLOC_MANUAL 2
+
+/*
+ * Allow the per-object monitors to run allocation manually, necessary if the
+ * start condition is in a context problematic for allocation (e.g. scheduling).
+ * In such case, if the storage was pre-allocated without a target, set it now.
+ */
+#ifndef DA_MON_ALLOCATION_STRATEGY
+#define DA_MON_ALLOCATION_STRATEGY DA_ALLOC_AUTO
+#endif
+#ifndef DA_MON_POOL_SIZE
+#define DA_MON_POOL_SIZE 0
+#endif
+#if DA_MON_ALLOCATION_STRATEGY == DA_ALLOC_MANUAL
+#define da_prepare_storage da_fill_empty_storage
+
+#elif DA_MON_ALLOCATION_STRATEGY == DA_ALLOC_POOL
+#define da_prepare_storage da_create_or_get_pool
+#if DA_MON_POOL_SIZE == 0
+#error "DA_ALLOC_POOL requires DA_MON_POOL_SIZE to be non-zero"
+#endif
+
+#else
+#define da_prepare_storage da_create_storage
+#endif /* DA_MON_ALLOCATION_STRATEGY */
+
+
static void react(enum states curr_state, enum events event)
{
rv_react(&rv_this,
@@ -448,62 +478,38 @@ static inline monitor_target da_get_target_by_id(da_id_type id)
}
/*
- * Per-object pool state.
- *
- * Zero-initialised by default (storage == NULL ⟹ kmalloc mode). A monitor
- * opts into pool mode by calling da_monitor_init_prealloc(N) instead of
- * da_monitor_init(), which sets storage to a non-NULL kcalloc'd array.
- *
- * Because every field is wrapped in this struct and the struct itself is a
- * per-TU static, each monitor that includes this header gets a completely
- * independent pool. A kmalloc monitor (e.g. nomiss) and a pool monitor
- * (e.g. tlob) therefore coexist without any interference.
- *
- * da_pool_return_cb runs from softirq on non-PREEMPT_RT, so irqsave is
- * required to prevent deadlock with task-context callers. On PREEMPT_RT
- * it runs from an rcuc kthread where spinlock_t is a sleeping lock.
- */
-struct da_per_obj_pool {
- struct da_monitor_storage *storage; /* non-NULL ⟹ pool mode */
- struct da_monitor_storage **free; /* kmalloc'd pointer stack */
- unsigned int free_top;
- spinlock_t lock;
-};
-
-static struct da_per_obj_pool da_pool = {
- .lock = __SPIN_LOCK_UNLOCKED(da_pool.lock),
-};
+ * Per-object pool state using kmem_cache and mempool.
+ */
+static struct kmem_cache *da_mon_cache;
+static mempool_t *da_mon_pool;
static void da_pool_return_cb(struct rcu_head *head)
{
struct da_monitor_storage *ms =
container_of(head, struct da_monitor_storage, rcu);
- unsigned long flags;
-
- spin_lock_irqsave(&da_pool.lock, flags);
- da_pool.free[da_pool.free_top++] = ms;
- spin_unlock_irqrestore(&da_pool.lock, flags);
+ mempool_free(ms, da_mon_pool);
}
-/* Pops a slot from the pre-allocated pool; returns -ENOSPC if exhausted. */
-static inline int da_create_or_get_pool(da_id_type id, monitor_target target)
+/* Pops a slot from the pre-allocated pool; returns NULL if exhausted. */
+static inline struct da_monitor *da_create_or_get_pool(da_id_type id,
+ monitor_target target,
+ struct da_monitor *da_mon)
{
struct da_monitor_storage *mon_storage;
- unsigned long flags;
- spin_lock_irqsave(&da_pool.lock, flags);
- if (!da_pool.free_top) {
- spin_unlock_irqrestore(&da_pool.lock, flags);
- return -ENOSPC;
- }
- mon_storage = da_pool.free[--da_pool.free_top];
- spin_unlock_irqrestore(&da_pool.lock, flags);
+ if (da_mon)
+ return da_mon;
+ mon_storage = mempool_alloc_preallocated(da_mon_pool);
+ if (!mon_storage)
+ return NULL;
+
+ memset(mon_storage, 0, sizeof(*mon_storage));
mon_storage->id = id;
mon_storage->target = target;
guard(rcu)();
hash_add_rcu(da_monitor_ht, &mon_storage->node, id);
- return 0;
+ return &mon_storage->rv.da_mon;
}
/*
@@ -547,11 +553,12 @@ static inline int da_create_or_get_kmalloc(da_id_type id, monitor_target target)
}
/* Create the per-object storage if not already there. */
-static inline int da_create_or_get(da_id_type id, monitor_target target)
+// NOTE: this is only needed for manual allocation!
+// we can refactor to have it only defined there, leaving it for now
+static inline void da_create_or_get(da_id_type id, monitor_target target)
{
- if (da_pool.storage)
- return da_create_or_get_pool(id, target);
- return da_create_or_get_kmalloc(id, target);
+ guard(rcu)();
+ da_create_storage(id, target, da_get_monitor(id, target));
}
/*
@@ -573,7 +580,7 @@ static inline void da_destroy_storage(da_id_type id)
return;
da_monitor_reset_hook(&mon_storage->rv.da_mon);
hash_del_rcu(&mon_storage->node);
- if (da_pool.storage)
+ if (DA_MON_ALLOCATION_STRATEGY == DA_ALLOC_POOL)
call_rcu(&mon_storage->rcu, da_pool_return_cb);
else
kfree_rcu(mon_storage, rcu);
@@ -591,41 +598,26 @@ static __maybe_unused void da_monitor_reset_all(void)
}
/*
- * da_monitor_init_prealloc - initialise with a pre-allocated storage pool
- *
- * Allocates @prealloc_count storage slots up-front so that da_create_or_get()
- * and da_destroy_storage() never call kmalloc/kfree. Must be called instead
- * of da_monitor_init() for monitors that require pool mode.
+ * da_monitor_init - initialise in kmalloc mode (no pre-allocation)
*/
-static inline int da_monitor_init_prealloc(unsigned int prealloc_count)
+static inline int da_monitor_init(void)
{
hash_init(da_monitor_ht);
+ if (DA_MON_ALLOCATION_STRATEGY != DA_ALLOC_POOL)
+ return 0;
- da_pool.storage = kcalloc(prealloc_count, sizeof(*da_pool.storage),
- GFP_KERNEL);
- if (!da_pool.storage)
+ da_mon_cache = kmem_cache_create(__stringify(DA_MON_NAME) "_cache",
+ sizeof(struct da_monitor_storage),
+ 0, 0, NULL);
+ if (!da_mon_cache)
return -ENOMEM;
- da_pool.free = kmalloc_array(prealloc_count, sizeof(*da_pool.free),
- GFP_KERNEL);
- if (!da_pool.free) {
- kfree(da_pool.storage);
- da_pool.storage = NULL;
+ da_mon_pool = mempool_create_slab_pool(DA_MON_POOL_SIZE, da_mon_cache);
+ if (!da_mon_pool) {
+ kmem_cache_destroy(da_mon_cache);
+ da_mon_cache = NULL;
return -ENOMEM;
}
-
- da_pool.free_top = 0;
- for (unsigned int i = 0; i < prealloc_count; i++)
- da_pool.free[da_pool.free_top++] = &da_pool.storage[i];
- return 0;
-}
-
-/*
- * da_monitor_init - initialise in kmalloc mode (no pre-allocation)
- */
-static inline int da_monitor_init(void)
-{
- hash_init(da_monitor_ht);
return 0;
}
@@ -641,11 +633,10 @@ static inline void da_monitor_destroy_pool(void)
* pending callback.
*/
rcu_barrier();
- kfree(da_pool.storage);
- da_pool.storage = NULL;
- kfree(da_pool.free);
- da_pool.free = NULL;
- da_pool.free_top = 0;
+ mempool_destroy(da_mon_pool);
+ da_mon_pool = NULL;
+ kmem_cache_destroy(da_mon_cache);
+ da_mon_cache = NULL;
}
static inline void da_monitor_destroy_kmalloc(void)
@@ -676,23 +667,12 @@ static inline void da_monitor_destroy_kmalloc(void)
*/
static inline void da_monitor_destroy(void)
{
- if (da_pool.storage)
+ if (DA_MON_ALLOCATION_STRATEGY == DA_ALLOC_POOL)
da_monitor_destroy_pool();
else
da_monitor_destroy_kmalloc();
}
-/*
- * Allow the per-object monitors to run allocation manually, necessary if the
- * start condition is in a context problematic for allocation (e.g. scheduling).
- * In such case, if the storage was pre-allocated without a target, set it now.
- */
-#ifdef DA_SKIP_AUTO_ALLOC
-#define da_prepare_storage da_fill_empty_storage
-#else
-#define da_prepare_storage da_create_storage
-#endif /* DA_SKIP_AUTO_ALLOC */
-
#endif /* RV_MON_TYPE */
#if RV_MON_TYPE == RV_MON_GLOBAL || RV_MON_TYPE == RV_MON_PER_CPU
diff --git a/kernel/trace/rv/monitors/nomiss/nomiss.c b/kernel/trace/rv/monitors/nomiss/nomiss.c
index 31f90f3638d8..f089cc0e2f10 100644
--- a/kernel/trace/rv/monitors/nomiss/nomiss.c
+++ b/kernel/trace/rv/monitors/nomiss/nomiss.c
@@ -18,7 +18,7 @@
#define RV_MON_TYPE RV_MON_PER_OBJ
#define HA_TIMER_TYPE HA_TIMER_WHEEL
/* The start condition is on sched_switch, it's dangerous to allocate there */
-#define DA_SKIP_AUTO_ALLOC
+#define DA_MON_ALLOCATION_STRATEGY DA_ALLOC_MANUAL
typedef struct sched_dl_entity *monitor_target;
#include "nomiss.h"
#include <rv/ha_monitor.h>
diff --git a/kernel/trace/rv/monitors/tlob/tlob.c b/kernel/trace/rv/monitors/tlob/tlob.c
index 90e7035a0b55..486a6bac5cf9 100644
--- a/kernel/trace/rv/monitors/tlob/tlob.c
+++ b/kernel/trace/rv/monitors/tlob/tlob.c
@@ -88,8 +88,8 @@ struct tlob_task_state {
#define RV_MON_TYPE RV_MON_PER_OBJ
#define HA_TIMER_TYPE HA_TIMER_HRTIMER
-/* Pool mode: da_handle_start_event uses da_fill_empty_storage, not kmalloc. */
-#define DA_SKIP_AUTO_ALLOC
+#define DA_MON_ALLOCATION_STRATEGY DA_ALLOC_POOL
+#define DA_MON_POOL_SIZE TLOB_MAX_MONITORED
/* Type for da_monitor_storage.target; must be defined before the includes. */
typedef struct tlob_task_state *monitor_target;
@@ -428,7 +428,6 @@ int tlob_start_task(struct task_struct *task, u64 threshold_us)
struct da_monitor *da_mon;
struct ha_monitor *ha_mon;
u64 now_ns;
- int ret;
if (!da_monitor_enabled())
return -ENODEV;
@@ -457,14 +456,6 @@ int tlob_start_task(struct task_struct *task, u64 threshold_us)
ws->last_ts = ktime_get();
raw_spin_lock_init(&ws->entry_lock);
- /* Claim a pool slot (no kmalloc; DA_SKIP_AUTO_ALLOC + prealloc). */
- ret = da_create_or_get(task->pid, ws);
- if (ret) {
- put_task_struct(task);
- kmem_cache_free(tlob_state_cache, ws);
- return ret;
- }
-
atomic_inc(&tlob_num_monitored);
/* Hold RCU across handle + timer setup to keep da_mon valid. */
@@ -955,7 +946,7 @@ static int __tlob_init_monitor(void)
atomic_set(&tlob_num_monitored, 0);
- retval = da_monitor_init_prealloc(TLOB_MAX_MONITORED);
+ retval = da_monitor_init();
if (retval) {
kmem_cache_destroy(tlob_state_cache);
tlob_state_cache = NULL;
--
2.54.0
* Re: [PATCH v7 2/6] mm/memory-failure: surface unhandlable kernel pages as -ENOTRECOVERABLE
From: Lance Yang @ 2026-05-15 7:03 UTC (permalink / raw)
To: leitao
Cc: linmiaohe, akpm, david, ljs, vbabka, rppt, surenb, mhocko, shuah,
nao.horiguchi, rostedt, mhiramat, mathieu.desnoyers, corbet,
skhan, liam, linux-mm, linux-kernel, linux-doc, linux-kselftest,
linux-trace-kernel, kernel-team, Lance Yang
In-Reply-To: <agXcPleVC9LGVCmj@gmail.com>
On Thu, May 14, 2026 at 07:37:14AM -0700, Breno Leitao wrote:
>On Thu, May 14, 2026 at 09:28:30PM +0800, Lance Yang wrote:
>>
>> On Wed, May 13, 2026 at 08:39:33AM -0700, Breno Leitao wrote:
>> >get_any_page() collapses three different failure modes into a single
>> >-EIO return:
>> >
>> > * the put_page race in the !count_increased path;
>> > * the HWPoisonHandlable() rejection that bounces out of
>> > __get_hwpoison_page() with -EBUSY and exhausts shake_page() retries;
>> > * the HWPoisonHandlable() rejection that goes through the
>> > count_increased / put_page / shake_page retry loop.
>> >
>> >The first is transient (the page is racing with the allocator). The
>> >second can be either transient (a userspace folio briefly off LRU
>> >during migration/compaction) or stable (slab/vmalloc/page-table/
>> >kernel-stack pages). The third describes a stable kernel-owned page
>> >that the count_increased=true caller already held a reference on.
>> >
>> >Distinguish them on the return path: keep -EIO for both the put_page
>> >race and the -EBUSY-after-retries branch (shake_page() cannot drag a
>> >folio back from active migration, so we cannot prove the page is
>> >permanently kernel-owned from there), keep -EBUSY for the allocation
>> >race (unchanged), and return -ENOTRECOVERABLE only from the
>> >count_increased-true HWPoisonHandlable() rejection that exhausts its
>> >retries -- the caller's reference is structural evidence that the
>> >page is owned by the kernel.
>> >
>> >Extend the unhandlable-page pr_err() to fire for either errno and
>> >update the get_hwpoison_page() kerneldoc.
>> >
>> >memory_failure() still folds every negative return into
>> >MF_MSG_GET_HWPOISON via its existing "else if (res < 0)" branch, so
>> >this patch is a no-op for users of memory_failure() and only changes
>> >the errno that soft_offline_page() can propagate to its callers. A
>> >follow-up wires the new return code through memory_failure() and
>> >reports MF_MSG_KERNEL for the unrecoverable cases.
>> >
>> >Suggested-by: David Hildenbrand <david@kernel.org>
>> >Signed-off-by: Breno Leitao <leitao@debian.org>
>> >---
>> > mm/memory-failure.c | 18 +++++++++++++++---
>> > 1 file changed, 15 insertions(+), 3 deletions(-)
>> >
>> >diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>> >index 49bcfbd04d213..bae883df3ccb2 100644
>> >--- a/mm/memory-failure.c
>> >+++ b/mm/memory-failure.c
>> >@@ -1408,6 +1408,15 @@ static int get_any_page(struct page *p, unsigned long flags)
>> > shake_page(p);
>> > goto try_again;
>> > }
>> >+ /*
>> >+ * Return -EIO rather than -ENOTRECOVERABLE: this
>> >+ * branch is also reached for pages that are merely
>> >+ * off-LRU transiently (e.g. a folio in the middle
>> >+ * of migration or compaction), which shake_page()
>> >+ * cannot drag back. The caller cannot prove the
>> >+ * page is permanently kernel-owned from here, so
>> >+ * keep it on the recoverable errno.
>> >+ */
>> > ret = -EIO;
>> > goto out;
>> > }
>> >@@ -1427,10 +1436,10 @@ static int get_any_page(struct page *p, unsigned long flags)
>> > goto try_again;
>> > }
>> > put_page(p);
>> >- ret = -EIO;
>> >+ ret = -ENOTRECOVERABLE;
>> > }
>> > out:
>> >- if (ret == -EIO)
>> >+ if (ret == -EIO || ret == -ENOTRECOVERABLE)
>> > pr_err("%#lx: unhandlable page.\n", page_to_pfn(p));
>> >
>> > return ret;
>> >@@ -1487,7 +1496,10 @@ static int __get_unpoison_page(struct page *page)
>> > * -EIO for pages on which we can not handle memory errors,
>> > * -EBUSY when get_hwpoison_page() has raced with page lifecycle
>> > * operations like allocation and free,
>> >- * -EHWPOISON when the page is hwpoisoned and taken off from buddy.
>> >+ * -EHWPOISON when the page is hwpoisoned and taken off from buddy,
>> >+ * -ENOTRECOVERABLE for stable kernel-owned pages the handler
>> >+ * cannot recover (PG_reserved, slab, vmalloc, page tables,
>> >+ * kernel stacks, and similar non-LRU/non-buddy pages).
>>
>> Did you test this patch series? I don't see how we ever get to
>> -ENOTRECOVERABLE there ...
>
>Yes, I did. I am using the following test case:
Okay.
>https://github.com/leitao/linux/commit/cfebe84ddeab5ac34ed456331db980d57e7025dc
>
> # RUN_DESTRUCTIVE=1 tools/testing/selftests/mm/hwpoison-panic.sh
> # enabling /proc/sys/vm/panic_on_unrecoverable_memory_failure
> # injecting hwpoison at phys 0x2a00000 (Kernel rodata)
> # expecting kernel panic: 'Memory failure: <pfn>: unrecoverable page'
> [ 501.113256] Memory failure: 0x2a00: recovery action for reserved kernel page: Ignored
> [ 501.113956] Kernel panic - not syncing: Memory failure: 0x2a00: unrecoverable page
>
>
>> Even with MF_COUNT_INCREASED, the first pass does:
>>
>> if (flags & MF_COUNT_INCREASED)
>> count_increased = true;
>>
>> [...]
>>
>> if (PageHuge(p) || HWPoisonHandlable(p, flags)) {
>> ret = 1;
>> } else {
>> if (pass++ < GET_PAGE_MAX_RETRY_NUM) { <-
>> put_page(p);
>> shake_page(p);
>> count_increased = false;
>> goto try_again; <-
>> }
>> put_page(p);
>> ret = -ENOTRECOVERABLE;
>> }
>>
>> Then we come back with count_increased=false:
>>
>> try_again:
>> if (!count_increased) {
>> ret = __get_hwpoison_page(p, flags); <-
>> if (!ret) {
>> [...]
>> } else if (ret == -EBUSY) { <-
>> [...]
>> ret = -EIO;
>> goto out; <-
>> }
>> }
>>
>> For slab/vmalloc/page-table pages, __get_hwpoison_page() returns -EBUSY:
>>
>> if (!HWPoisonHandlable(&folio->page, flags))
>> return -EBUSY;
>>
>> so they still seem to end up as -EIO ... Am I missing something?
>
>You are not, and thanks for catching this. I traced it again and the
>-ENOTRECOVERABLE branch is unreachable for slab/vmalloc/page-table pages
>exactly as you described. The __get_hwpoison_page() → -EBUSY → shake → retry
>loop catches them first and they exit as -EIO.
Wonder if it would be simpler to just do a positive check near the top
of get_any_page() instead. Something like:
static bool hwpoison_unrecoverable_kernel_page(struct page *page,
unsigned long flags)
{
if ((flags & MF_SOFT_OFFLINE) && page_has_movable_ops(page))
return false;
return PageReserved(page) || PageSlab(page) ||
PageTable(page) || PageLargeKmalloc(page);
}
static int get_any_page(struct page *p, unsigned long flags)
{
int ret = 0, pass = 0;
bool count_increased = false;
if (flags & MF_COUNT_INCREASED)
count_increased = true;
if (hwpoison_unrecoverable_kernel_page(p, flags)) {
if (count_increased)
put_page(p);
ret = -ENOTRECOVERABLE;
goto out;
}
[...]
}
Then get_any_page() could return -ENOTRECOVERABLE only for page types we
can positively identify as kernel-owned.
These types always fail HWPoisonHandlable(), so retrying does not really
buy us anything for them.
Won't cover everything (vmalloc, kernel stacks, etc. have no page_type
to key off), but that's fine - best effort, right?
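For what it's worth, the control flow traced above can be modeled in
userspace. This is only a sketch under assumptions: two booleans stand in
for the real page-type checks, the retry budget of 3 is assumed, the
MF_SOFT_OFFLINE / movable_ops exception is left out, and all toy_* names
are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Assumption: retry budget; the thread only names GET_PAGE_MAX_RETRY_NUM. */
#define TOY_MAX_RETRY 3

struct toy_page {
	bool handlable;		/* would HWPoisonHandlable() accept it? */
	bool kernel_owned;	/* reserved/slab/page-table style page */
};

/* Models __get_hwpoison_page(): -EBUSY for unhandlable pages. */
static int toy_get_hwpoison_page(const struct toy_page *p)
{
	return p->handlable ? 1 : -EBUSY;
}

/*
 * Models the retry flow of get_any_page(). With early_check set, the
 * proposed positive check runs before any retry.
 */
static int toy_get_any_page(const struct toy_page *p, bool count_increased,
			    bool early_check)
{
	int pass = 0;

	if (early_check && p->kernel_owned)
		return -ENOTRECOVERABLE;
try_again:
	if (!count_increased && toy_get_hwpoison_page(p) == -EBUSY) {
		if (pass++ < TOY_MAX_RETRY)
			goto try_again;		/* shake_page() elided */
		return -EIO;
	}
	if (p->handlable)
		return 1;
	if (pass++ < TOY_MAX_RETRY) {
		count_increased = false;	/* put_page() + shake_page() elided */
		goto try_again;
	}
	return -ENOTRECOVERABLE;	/* not reached while page state is stable */
}
```

In this model a stable unhandlable page always leaves through the -EBUSY
branch as -EIO, whatever count_increased was on entry. Only the early
check turns a kernel-owned page into -ENOTRECOVERABLE, and a page that is
merely off-LRU still gets -EIO.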
Cheers, Lance
>
>The selftest I am using (link above) only validated the PageReserved
>short-circuit added in patch 3, which lives in memory_failure() and never
>reaches get_any_page().
>
>I even thought about this code path, and I was not convinced we should return
>-ENOTRECOVERABLE, thus I documented the following (as in this current patch)
>
> @@ -1408,6 +1408,15 @@ static int get_any_page(struct page *p, unsigned long flags)
> shake_page(p);
> goto try_again;
> }
> + /*
> + * Return -EIO rather than -ENOTRECOVERABLE: this
> + * branch is also reached for pages that are merely
> + * off-LRU transiently (e.g. a folio in the middle
> + * of migration or compaction), which shake_page()
> + * cannot drag back. The caller cannot prove the
> + * page is permanently kernel-owned from here, so
> + * keep it on the recoverable errno.
> + */
> ret = -EIO;
>
* Re: [PATCH v7 2/6] mm/memory-failure: surface unhandlable kernel pages as -ENOTRECOVERABLE
From: Miaohe Lin @ 2026-05-15 3:04 UTC (permalink / raw)
To: Breno Leitao
Cc: linux-mm, linux-kernel, linux-doc, linux-kselftest,
linux-trace-kernel, kernel-team, Andrew Morton, David Hildenbrand,
Lorenzo Stoakes, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Shuah Khan, Naoya Horiguchi,
Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Jonathan Corbet, Shuah Khan, Liam R. Howlett
In-Reply-To: <20260513-ecc_panic-v7-2-be2e578e61da@debian.org>
On 2026/5/13 23:39, Breno Leitao wrote:
> get_any_page() collapses three different failure modes into a single
> -EIO return:
>
> * the put_page race in the !count_increased path;
> * the HWPoisonHandlable() rejection that bounces out of
> __get_hwpoison_page() with -EBUSY and exhausts shake_page() retries;
> * the HWPoisonHandlable() rejection that goes through the
> count_increased / put_page / shake_page retry loop.
>
> The first is transient (the page is racing with the allocator). The
> second can be either transient (a userspace folio briefly off LRU
> during migration/compaction) or stable (slab/vmalloc/page-table/
> kernel-stack pages). The third describes a stable kernel-owned page
> that the count_increased=true caller already held a reference on.
>
> Distinguish them on the return path: keep -EIO for both the put_page
> race and the -EBUSY-after-retries branch (shake_page() cannot drag a
> folio back from active migration, so we cannot prove the page is
> permanently kernel-owned from there), keep -EBUSY for the allocation
> race (unchanged), and return -ENOTRECOVERABLE only from the
> count_increased-true HWPoisonHandlable() rejection that exhausts its
> retries -- the caller's reference is structural evidence that the
> page is owned by the kernel.
>
> Extend the unhandlable-page pr_err() to fire for either errno and
> update the get_hwpoison_page() kerneldoc.
>
> memory_failure() still folds every negative return into
> MF_MSG_GET_HWPOISON via its existing "else if (res < 0)" branch, so
> this patch is a no-op for users of memory_failure() and only changes
> the errno that soft_offline_page() can propagate to its callers. A
> follow-up wires the new return code through memory_failure() and
> reports MF_MSG_KERNEL for the unrecoverable cases.
>
> Suggested-by: David Hildenbrand <david@kernel.org>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
> mm/memory-failure.c | 18 +++++++++++++++---
> 1 file changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 49bcfbd04d213..bae883df3ccb2 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1408,6 +1408,15 @@ static int get_any_page(struct page *p, unsigned long flags)
> shake_page(p);
> goto try_again;
> }
> + /*
> + * Return -EIO rather than -ENOTRECOVERABLE: this
> + * branch is also reached for pages that are merely
> + * off-LRU transiently (e.g. a folio in the middle
> + * of migration or compaction), which shake_page()
> + * cannot drag back. The caller cannot prove the
> + * page is permanently kernel-owned from here, so
> + * keep it on the recoverable errno.
> + */
> ret = -EIO;
> goto out;
> }
> @@ -1427,10 +1436,10 @@ static int get_any_page(struct page *p, unsigned long flags)
> goto try_again;
> }
> put_page(p);
> - ret = -EIO;
> + ret = -ENOTRECOVERABLE;
Theoretically, pages that are merely off-LRU transiently, as you commented above,
could reach here too? Or am I missing something?
Thanks.
.
> }
> out:
> - if (ret == -EIO)
> + if (ret == -EIO || ret == -ENOTRECOVERABLE)
> pr_err("%#lx: unhandlable page.\n", page_to_pfn(p));
>
> return ret;
> @@ -1487,7 +1496,10 @@ static int __get_unpoison_page(struct page *page)
> * -EIO for pages on which we can not handle memory errors,
> * -EBUSY when get_hwpoison_page() has raced with page lifecycle
> * operations like allocation and free,
> - * -EHWPOISON when the page is hwpoisoned and taken off from buddy.
> + * -EHWPOISON when the page is hwpoisoned and taken off from buddy,
> + * -ENOTRECOVERABLE for stable kernel-owned pages the handler
> + * cannot recover (PG_reserved, slab, vmalloc, page tables,
> + * kernel stacks, and similar non-LRU/non-buddy pages).
> */
> static int get_hwpoison_page(struct page *p, unsigned long flags)
> {
>
* Re: [PATCH v7 1/6] mm/memory-failure: drop dead error_states[] entry for reserved pages
From: Miaohe Lin @ 2026-05-15 2:48 UTC (permalink / raw)
To: Breno Leitao
Cc: linux-mm, linux-kernel, linux-doc, linux-kselftest,
linux-trace-kernel, kernel-team, Andrew Morton, David Hildenbrand,
Lorenzo Stoakes, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Shuah Khan, Naoya Horiguchi,
Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Jonathan Corbet, Shuah Khan, Liam R. Howlett
In-Reply-To: <20260513-ecc_panic-v7-1-be2e578e61da@debian.org>
On 2026/5/13 23:39, Breno Leitao wrote:
> The first entry of error_states[],
>
> { reserved, reserved, MF_MSG_KERNEL, me_kernel },
>
> is unreachable. identify_page_state() has two callers, and neither
> one can dispatch a PG_reserved page to me_kernel():
>
> * memory_failure() reaches identify_page_state() only after
> get_hwpoison_page() returned 1. get_any_page() reaches that
> return only via __get_hwpoison_page(), which gates the refcount
> on HWPoisonHandlable(). HWPoisonHandlable() rejects PG_reserved
> pages, so they fail with -EBUSY/-EIO long before
> identify_page_state() runs.
>
> * try_memory_failure_hugetlb() reaches identify_page_state() on
> the MF_HUGETLB_IN_USED branch, but the page is necessarily a
> hugetlb folio there. The first table entry that matches a
> hugetlb folio is { head, head, MF_MSG_HUGE, me_huge_page }, so
> they dispatch to me_huge_page() before the (now-removed)
> reserved entry would have matched, regardless of whether
> PG_reserved happens to be set on the head page.
>
> me_kernel() never executes and the entry exists only to be matched
> against by code that cannot see it.
>
> Drop the entry, the me_kernel() helper, and the now-unused
> "reserved" macro. Leave the MF_MSG_KERNEL enum value in place: it
> remains part of the tracepoint and pr_err() string tables, and
> follow-on work to classify unrecoverable kernel pages can reuse it
> without churning the user-visible enum.
>
> No functional change.
As the code evolves, this entry is no longer needed. Thanks for cleanup.
>
> Suggested-by: David Hildenbrand <david@kernel.org>
> Signed-off-by: Breno Leitao <leitao@debian.org>
With David's comments addressed, this patch looks good to me:
Acked-by: Miaohe Lin <linmiaohe@huawei.com>
Thanks.
.
* [RFC PATCH v2.2 18/28] mm/damon: trace probe_hits
From: SeongJae Park @ 2026-05-15 0:44 UTC (permalink / raw)
Cc: SeongJae Park, Andrew Morton, Masami Hiramatsu, Mathieu Desnoyers,
Steven Rostedt, damon, linux-kernel, linux-mm, linux-trace-kernel
In-Reply-To: <20260515004433.128933-1-sj@kernel.org>
Introduce a new tracepoint for exposing the per-region per-probe
positive sample count via tracefs.
Signed-off-by: SeongJae Park <sj@kernel.org>
---
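A note for consumers of the new event: TP_printk() below formats
probe_hits with __print_hex(), which should print each byte as two hex
digits separated by spaces. A minimal userspace parsing sketch under that
assumption follows; toy_parse_probe_hits() is a hypothetical helper, not
part of this patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse the "probe_hits=" field of one trace line into out[].
 * Returns the number of bytes parsed, or -1 if the field is missing.
 */
static int toy_parse_probe_hits(const char *line, unsigned char *out,
				size_t max)
{
	static const char key[] = "probe_hits=";
	const char *p = strstr(line, key);
	size_t n = 0;

	if (!p)
		return -1;
	p += sizeof(key) - 1;
	while (n < max) {
		char *end;
		unsigned long v = strtoul(p, &end, 16);

		/* Stop at the first token that is not a hex byte. */
		if (end == p || v > 0xff)
			break;
		out[n++] = (unsigned char)v;
		p = end;
	}
	return (int)n;
}
```

The returned count is the number of probes recorded for the region, so a
reader does not need nr_probes as a separate field.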
include/trace/events/damon.h | 38 ++++++++++++++++++++++++++++++++++++
mm/damon/core.c | 9 +++++++++
2 files changed, 47 insertions(+)
diff --git a/include/trace/events/damon.h b/include/trace/events/damon.h
index 24fc402ab3c85..ec1e317923fd3 100644
--- a/include/trace/events/damon.h
+++ b/include/trace/events/damon.h
@@ -130,6 +130,44 @@ TRACE_EVENT(damon_monitor_intervals_tune,
TP_printk("sample_us=%lu", __entry->sample_us)
);
+TRACE_EVENT_CONDITION(damon_aggregated_v2,
+
+ TP_PROTO(unsigned int target_id, struct damon_region *r,
+ unsigned int nr_regions, unsigned int nr_probes),
+
+ TP_ARGS(target_id, r, nr_regions, nr_probes),
+
+ TP_CONDITION(nr_probes > 0),
+
+ TP_STRUCT__entry(
+ __field(unsigned long, target_id)
+ __field(unsigned long, start)
+ __field(unsigned long, end)
+ __field(unsigned int, nr_regions)
+ __field(unsigned int, nr_accesses)
+ __field(unsigned int, age)
+ __dynamic_array(unsigned char, probe_hits, nr_probes)
+ ),
+
+ TP_fast_assign(
+ __entry->target_id = target_id;
+ __entry->start = r->ar.start;
+ __entry->end = r->ar.end;
+ __entry->nr_regions = nr_regions;
+ __entry->nr_accesses = r->nr_accesses;
+ __entry->age = r->age;
+ memcpy(__get_dynamic_array(probe_hits), r->probe_hits,
+ sizeof(*r->probe_hits) * nr_probes);
+ ),
+
+ TP_printk("target_id=%lu nr_regions=%u %lu-%lu: %u %u probe_hits=%s",
+ __entry->target_id, __entry->nr_regions,
+ __entry->start, __entry->end,
+ __entry->nr_accesses, __entry->age,
+ __print_hex(__get_dynamic_array(probe_hits),
+ __get_dynamic_array_len(probe_hits)))
+);
+
TRACE_EVENT(damon_aggregated,
TP_PROTO(unsigned int target_id, struct damon_region *r,
diff --git a/mm/damon/core.c b/mm/damon/core.c
index 1c9d2fb69f98d..ab8ac9ec8450d 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -1881,6 +1881,13 @@ static void kdamond_reset_aggregated(struct damon_ctx *c)
{
struct damon_target *t;
unsigned int ti = 0; /* target's index */
+ unsigned int nr_probes = 0;
+ struct damon_probe *probe;
+
+ if (trace_damon_aggregated_v2_enabled()) {
+ damon_for_each_probe(probe, c)
+ nr_probes++;
+ }
damon_for_each_target(t, c) {
struct damon_region *r;
@@ -1889,6 +1896,8 @@ static void kdamond_reset_aggregated(struct damon_ctx *c)
int i;
trace_damon_aggregated(ti, r, damon_nr_regions(t));
+ trace_damon_aggregated_v2(ti, r, damon_nr_regions(t),
+ nr_probes);
damon_warn_fix_nr_accesses_corruption(r);
r->last_nr_accesses = r->nr_accesses;
r->nr_accesses = 0;
--
2.47.3
* [RFC PATCH v2.2 00/28] mm/damon: introduce data attributes monitoring
From: SeongJae Park @ 2026-05-15 0:44 UTC (permalink / raw)
Cc: SeongJae Park, Liam R. Howlett, Andrew Morton, David Hildenbrand,
Jonathan Corbet, Lorenzo Stoakes, Masami Hiramatsu,
Mathieu Desnoyers, Michal Hocko, Mike Rapoport, Shuah Khan,
Shuah Khan, Steven Rostedt, Suren Baghdasaryan, Vlastimil Babka,
damon, linux-doc, linux-kernel, linux-kselftest, linux-mm,
linux-trace-kernel
TL; DR
======
Extend DAMON for monitoring general data attributes other than accesses.
The short term motivation is lightweight page type (e.g., belonging
cgroup) aware monitoring. In the long term, this will help extend DAMON
to multiple access event capture primitives (e.g., page faults and
PMU) and eventually pivot DAMON to a "Data Attributes Monitoring and
Operations eNgine".
Background: High Cost of Page Level Properties Monitoring
=========================================================
DAMON was initially introduced as a Data Access MONitor. It has been
extended for not only access monitoring but also data access-aware
system operations (DAMOS). But still the monitoring part is only for
data accesses.
Data access patterns are good information, but some users need more
holistic views. Particularly, users want to show the access pattern
information together with the types of the memory. For example, users
who work for making huge pages efficiently want to know how much of
DAMON-found hot/cold regions are backed by huge pages. Users who run
multiple workloads with different cgroups want to know how much of
DAMON-found hot/cold regions belong to specific cgroups.
For the user demand, we developed a DAMOS extension for page level
properties based monitoring [1], which has landed in 6.14. Using the
feature, users can inform the page level data properties that they are
interested in, in a flexible format that uses DAMOS filters. Then,
DAMON applies the filters to each folio of the entire DAMON region and
lets users know how many bytes of memory in each DAMON region passed the
given filters.
This gives page level detailed and deterministic information to users.
But, because the operation is done at page level, the overhead is
proportional to the memory size. It was useful for test or debugging
purposes on a small number of machines. But it was obviously too heavy
to be enabled always on all machines running the real user workloads.
For real world workloads, it was recommended to use the feature with
user-space controlled sampling approaches. For example, users could do
the page level monitoring only once per hour, on a randomly selected one
percent of the machines in their fleet. If the runtime is long enough
and the fleet is big enough, it should provide statistically meaningful
data.
But users are too busy to implement such controls on their own.
Data Attributes Monitoring
==========================
Extend DAMON to monitor not only data accesses, but also general data
attributes. Do the extension while keeping the main promise of DAMON,
the bounded and best-effort minimum overhead.
Allow users to specify what data attributes in addition to the data
access they want to monitor. Users can install one 'data probe' per
data attribute of their interest for this purpose. The 'data probe'
should be able to be applied to any memory, and determine if the given
memory has the appropriate data attribute. E.g., if memory of physical
address 42 belongs to cgroup A. Each 'data probe' is configured with
filters that are very similar to the DAMOS filters.
When DAMON checks if the sampling address of each region has been
accessed since the last check, it also applies the data probes, if any
are registered. In the same way that it accounts the number of access
check-positive samples (nr_accesses), it accounts the number of
probe-positive samples for each data probe in another per-region
counter array, namely 'probe_hits'. When DAMON resets nr_accesses every
aggregation interval, it resets 'probe_hits' together.
Users can read 'probe_hits' just before the values are reset. In this
way, users can know how many hot/cold memory regions have data
attributes of their interest. E.g., 30 percent of this system's hot
memory belongs to cgroup A, and 80 percent of the cgroup A-belonging
hot memory is backed by huge pages.
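
The accounting described above can be sketched in user space as below.
This is only an illustration under stated assumptions, not the code of
this series: 'struct region_sketch', 'probe_fn', 'probe_below_4k',
'account_sample()' and 'reset_aggregated()' are hypothetical names, and
MAX_PROBES reflects the limit of four probes mentioned in this cover
letter.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PROBES 4	/* assumed from the four-probe limit of this RFC */

/* Hypothetical stand-in for a DAMON region; not the kernel's struct. */
struct region_sketch {
	unsigned int nr_accesses;
	unsigned char probe_hits[MAX_PROBES];
};

/* Hypothetical probe: returns nonzero if the sampled address matches. */
typedef int (*probe_fn)(unsigned long addr);

/* Example probe, for illustration only: matches addresses below 4 KiB. */
static int probe_below_4k(unsigned long addr)
{
	return addr < 0x1000;
}

/* Account one sample: the access check result plus each installed probe. */
static void account_sample(struct region_sketch *r, unsigned long addr,
			   int accessed, const probe_fn *probes,
			   size_t nr_probes)
{
	assert(nr_probes <= MAX_PROBES);
	if (accessed)
		r->nr_accesses++;
	for (size_t i = 0; i < nr_probes; i++)
		if (probes[i](addr))
			r->probe_hits[i]++;
}

/* Reset both counters together at the end of an aggregation interval. */
static void reset_aggregated(struct region_sketch *r)
{
	r->nr_accesses = 0;
	for (size_t i = 0; i < MAX_PROBES; i++)
		r->probe_hits[i] = 0;
}
```

Because the probes run on the same sample that the access check already
picked, the extra work per sample is bounded by the number of installed
probes rather than by the memory size.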
Patches Sequence
================
First eight patches implement the core feature, interface and the
working support. Patch 1 introduces data probe data structure, namely
damon_probe. Patch 2 extends damon_ctx for installing data probes.
Patch 3 introduces another data structure for filters of each data
probe, namely damon_filter. Patch 4 updates damon_ctx commit function
to handle the probes. Patch 5 extends damon_region for the per-region
per-probe positive samples counter, namely probe_hits. Patch 6 extends
damon_operations for applying probes on the underlying DAMON operations
implementation. Patch 7 updates kdamond_fn() to invoke the probes
applying callback. Patch 8 finally implements the probes support on
paddr ops.
Ten changes for user interface (patches 9-18) come next. Patches 9-13
implement sysfs directories and files for setting data probes, namely
probes directory, probe directory, filters directory, filter directory
and filter directory internal files, respectively. Patch 14 connects
the user inputs that are made via the sysfs files to DAMON core.
Following three patches (patches 15-17) implement sysfs directories and
files for showing the probe_hits to users, namely probes directory,
probe directory and hits files, respectively. Patch 18 introduces a new
tracepoint for showing the probe_hits via tracefs.
Patch 19 adds a selftest for the sysfs files.
Patches 20 and 21 document the design and usage of the new feature,
respectively.
Seven additional patches (patches 22-28) for monitoring belonging memory
cgroup follow. Depending on the feedback, this part might be separated
to another series in future. Patch 22 defines the DAMON filter type for
the new attribute, namely DAMON_FILTER_TYPE_MEMCG. Patch 23 adds the
support on paddr ops. Patch 24 updates the sysfs interface for setup of
the target memcg. Patch 25 moves code for easy reuse of the filter
target memcg setup. Patch 26 connects the user input to the core layer.
Finally, patches 27 and 28 update the design and usage documents for the
memcg attribute monitoring support.
Discussions
===========
This allows the page properties monitoring with overhead that is low
enough to be enabled always on real world workloads. Because the
sampling time for access check is reused for data attributes check, the
upper-bounded and best-effort minimum overhead of DAMON is kept.
Because the sampling memory for access check is reused for data
attributes check, additional overhead is minimal.
Still DAMOS-based page level properties monitoring should be useful,
because it provides deterministic page level information. When in
doubt of the sampling based information, running DAMOS-based one
together and comparing the results would be useful, for debugging and
tuning.
Plan for Dropping RFC tag
=========================
I'm considering renaming the tracepoint for exposing probe_hits
(damon_aggregated_v2).
Making changes for feedback from myself, humans and Sashiko should be
the major remaining work.
I'm currently hoping to drop the RFC tag by 7.2-rc1.
Future Works: Mid Term
======================
This version of the implementation limits the maximum number of data
probes to four. I will try to find a way to remove the limit in future.
I personally think it should be enough for common use cases, though, and
therefore I am not giving it high priority at the moment.
Future Works: Long Term
=======================
There are user requests for extending DAMON with detailed access
information, for example, per-CPUs/threads/read/writes monitoring. For
that, I was working [2] on extending DAMON to use page fault events as
another access check primitive, and making the infrastructure flexible
for future use of yet another access check primitive. Actually there is
another ongoing work [3] for extending DAMON with PMU events. The
motivation of the work is reducing the overhead, though.
In my work [2], I was introducing a new interface for access sampling
primitives control. Now I think this data probe interface can be used
for that, too. That is, data access becomes just one type of data
attribute. Also, pg_idle-confirmed access, page fault-confirmed access,
and PMU event-confirmed access will be different types of data
attributes.
The regions adjustment mechanism is currently working based on the
access information. That's because DAMON is designed for data access
monitoring. That is, data access information is the primary interest,
and therefore DAMON adjusts regions in a way that can best-present the
information.
Once data access becomes just one of the data attributes, there is no
reason to treat data access as that special. There might be some users
who are not interested in access at all but want to know the location
of memory of a specific type. The data probes interface will allow
doing that. Further,
we could extend the interface to let users set any data attribute as the
'primary' attribute. Then, DAMON will split and merge regions in a way
that can best-present the 'primary' attributes.
DAMOS will also be extended, to specify targets based on not only the
data access pattern, but all user-registered data attributes. From this
stage, we may be able to call DAMON a "Data Attributes Monitoring and
Operations eNgine".
[1] https://lore.kernel.org/20250106193401.109161-1-sj@kernel.org
[2] https://lore.kernel.org/20251208062943.68824-1-sj@kernel.org/
[3] https://lore.kernel.org/20260423004211.7037-1-akinobu.mita@gmail.com
Changes from RFC v2.1
- rfc v2.1: https://lore.kernel.org/20260514140904.119781-1-sj@kernel.org
- Rebase to mm-stable (7.1-rc3) to avoid Sashiko patch apply failure.
Changes from RFC v2
- rfc v2: https://lore.kernel.org/20260512143645.113201-1-sj@kernel.org
- Optimize nr_probes calculation for probe_hits tracepoint.
- Use TRACE_EVENT_CONDITION() for probe_hits tracepoint.
- Rebase to latest mm-new.
Changes from RFC
- rfc: https://lore.kernel.org/all/20260426205222.93895-1-sj@kernel.org/
- Support memcg DAMON filter.
- Use per-probe probe_hits sysfs file.
- Use dynamic_array for probe_hits tracing.
- Fix filter matching field.
- Fix folio leaking in damon_pa_filter_pass().
- Move nr_regions of damon_aggregated_v2 tracepoint after end.
- Rename DAMON_TEST_TYPE_ANON to DAMON_FILTER_TYPE_ANON.
SeongJae Park (28):
mm/damon/core: introduce struct damon_probe
mm/damon/core: embed damon_probe objects in damon_ctx
mm/damon/core: introduce damon_filter
mm/damon/core: commit probes
mm/damon/core: introduce damon_region->probe_hits
mm/damon/core: introduce damon_ops->apply_probes
mm/damon/core: do data attributes monitoring
mm/damon/paddr: support data attributes monitoring
mm/damon/sysfs: implement probes dir
mm/damon/sysfs: implement probe dir
mm/damon/sysfs: implement filters directory
mm/damon/sysfs: implement filter dir
mm/damon/sysfs: implement filter dir files
mm/damon/sysfs: setup probes on DAMON core API parameters
mm/damon/sysfs-schemes: implement tried_regions/<r>/probes/
mm/damon/sysfs-schemes: implement probe dir
mm/damon/sysfs-schemes: implement probe/hits file
mm/damon: trace probe_hits
selftests/damon/sysfs.sh: test probes dir
Docs/mm/damon/design: document data attributes monitoring
Docs/admin-guide/mm/damon/usage: document data attributes monitoring
mm/damon/core: introduce DAMON_FILTER_TYPE_MEMCG
mm/damon/paddr: support DAMON_FILTER_TYPE_MEMCG
mm/damon/sysfs: add filters/<F>/path file
mm/damon/sysfs-schemes: move memcg_path_to_id() to sysfs-common
mm/damon/sysfs: setup damon_filter->memcg_id from path
Docs/mm/damon/design: update for memcg damon filter
Docs/admin-guide/mm/damon/usage: update for memcg damon filter
Documentation/admin-guide/mm/damon/usage.rst | 48 +-
Documentation/mm/damon/design.rst | 39 ++
include/linux/damon.h | 67 +++
include/trace/events/damon.h | 38 ++
mm/damon/core.c | 197 +++++++
mm/damon/paddr.c | 76 +++
mm/damon/sysfs-common.c | 41 ++
mm/damon/sysfs-common.h | 2 +
mm/damon/sysfs-schemes.c | 221 ++++++--
mm/damon/sysfs.c | 557 +++++++++++++++++++
tools/testing/selftests/damon/sysfs.sh | 48 ++
11 files changed, 1284 insertions(+), 50 deletions(-)
base-commit: 5d6919055dec134de3c40167a490f33c74c12581
--
2.47.3
^ permalink raw reply
* Re: [RFC PATCH v2.1 00/28] mm/damon: introduce data attributes monitoring
From: SeongJae Park @ 2026-05-15 0:41 UTC (permalink / raw)
To: SeongJae Park
Cc: Liam R. Howlett, Andrew Morton, David Hildenbrand,
Jonathan Corbet, Lorenzo Stoakes, Masami Hiramatsu,
Mathieu Desnoyers, Michal Hocko, Mike Rapoport, Shuah Khan,
Shuah Khan, Steven Rostedt, Suren Baghdasaryan, Vlastimil Babka,
damon, linux-doc, linux-kernel, linux-kselftest, linux-mm,
linux-trace-kernel
In-Reply-To: <20260514140904.119781-1-sj@kernel.org>
On Thu, 14 May 2026 07:08:33 -0700 SeongJae Park <sj@kernel.org> wrote:
> TL; DR
> ======
>
> Extend DAMON for monitoring general data attributes other than accesses.
> The short term motivation is lightweight page type (e.g., belonging
> cgroup) aware monitoring. In long term, this will help extending DAMON
> for multiple access events capture primitives (e.g., page faults and
> PMU) and eventually pivotting DAMON to a "Data Attributes Monitoring and
> Operations eNgine" in long term.
Sashiko failed [1] to review this due to a problem in finding a fresh baseline
commit. I will shortly post the next version (RFC v2.2) after rebasing to
mm-stable (7.1-rc3) to avoid the issue.
[1] https://lore.kernel.org/20260514205555.51653-1-sj@kernel.org
Thanks,
SJ
[...]
^ permalink raw reply
* Re: [PATCH] sched/clock: Provide !HAVE_UNSTABLE_SCHED_CLOCK stub for sched_clock_stable()
From: Steven Rostedt @ 2026-05-14 19:36 UTC (permalink / raw)
To: Yiyang Chen
Cc: peterz, mingo, vincent.guittot, mhiramat, mathieu.desnoyers,
linux-kernel, linux-trace-kernel
In-Reply-To: <56e45338858946cd9581b75c8bd45dd37dba52c5.1778773587.git.cyyzero16@gmail.com>
On Fri, 15 May 2026 00:05:05 +0800
Yiyang Chen <cyyzero16@gmail.com> wrote:
> When CONFIG_HAVE_UNSTABLE_SCHED_CLOCK is disabled, sched_clock() is
> already assumed to provide stable semantics, but the public header
> doesn't provide a sched_clock_stable() stub for that case.
>
> Add a header stub that always returns true and clean up the duplicate
> local stub in ring_buffer.c, so callers can use sched_clock_stable()
> unconditionally.
>
> Signed-off-by: Yiyang Chen <cyyzero16@gmail.com>
> ---
> include/linux/sched/clock.h | 5 +++++
> kernel/trace/ring_buffer.c | 7 -------
> 2 files changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/sched/clock.h b/include/linux/sched/clock.h
> index 196f0ca351a2..39f0a7f94bfc 100644
> --- a/include/linux/sched/clock.h
> +++ b/include/linux/sched/clock.h
> @@ -33,6 +33,11 @@ extern u64 sched_clock_cpu(int cpu);
> extern void sched_clock_init(void);
>
> #ifndef CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
> +static inline int sched_clock_stable(void)
> +{
> + return 1;
> +}
> +
> static inline void sched_clock_tick(void)
> {
> }
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 5326924615a4..02691c3c6dd6 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -3769,13 +3769,6 @@ rb_add_time_stamp(struct ring_buffer_per_cpu *cpu_buffer,
> return skip_time_extend(event);
> }
>
> -#ifndef CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
> -static inline bool sched_clock_stable(void)
> -{
> - return true;
> -}
> -#endif
> -
> static void
> rb_check_timestamp(struct ring_buffer_per_cpu *cpu_buffer,
> struct rb_event_info *info)
Thanks for moving this out of the ring buffer code.
Acked-by: Steven Rostedt <rostedt@goodmis.org>
-- Steve
^ permalink raw reply
* Re: [PATCH v2] tracing: Allow perf to read synthetic events
From: Steven Rostedt @ 2026-05-14 18:47 UTC (permalink / raw)
To: LKML, Linux Trace Kernel
Cc: Masami Hiramatsu, Mathieu Desnoyers, Arnaldo Carvalho de Melo,
Jiri Olsa, Namhyung Kim, Peter Zijlstra, Ian Rogers, sashiko
In-Reply-To: <20260513150007.3b280e87@gandalf.local.home>
Replying to Sashiko's comments:
https://sashiko.dev/?list=org.kernel.vger.linux-trace-kernel#/patchset/20260513150007.3b280e87%40gandalf.local.home
> commit b3956023bf5f1ccf5838187c1c14f3dcf5e37c3d
> Author: Steven Rostedt <rostedt@goodmis.org>
> tracing: Allow perf to read synthetic events
> This commit adds support for perf to read synthetic events. It extracts the
> field sizing and entry writing logic into separate helper functions, and
> implements a new probe function to attach perf events.
> > diff --git a/kernel/trace/trace_events_synth.c b/kernel/trace/trace_events_synth.c
> > index 39ac4eba07024..e6871230bde96 100644
> > --- a/kernel/trace/trace_events_synth.c
> > +++ b/kernel/trace/trace_events_synth.c
> > @@ -499,28 +499,19 @@ static unsigned int trace_stack(struct synth_trace_event *entry,
> [ ... ]
> > +static __always_inline void write_synth_entry(struct synth_event *event,
> > + struct synth_trace_event *entry,
> > + u64 *var_ref_vals,
> > + unsigned int *var_ref_idx)
> > +{
> > + int data_size = 0;
> Since get_field_size() and write_synth_entry() are being refactored here, is
> there a risk of a time-of-check to time-of-use buffer overflow with
> dynamic string fields?
> get_field_size() computes the string length for allocation by calling
> fetch_store_strlen(). The ring buffer event slot is then allocated based on
> this length.
> write_synth_entry() populates the buffer by calling trace_string(), which
> calls fetch_store_strlen() a second time to recalculate the string's length,
> assigning it to data->as_dynamic.len.
> If a string from user space grows between the two fetch_store_strlen() calls,
> fetch_store_string() will blindly copy more bytes than the allocated buffer
> space. Can this cause out-of-bounds writes and corrupt kernel memory?
> This regression wasn't introduced by this patch, but pre-existed in the
> ftrace implementation.
The string cannot grow between the two calls. When a dynamic event is
enabled (as it is required to be here) deleting or modifying it will be
denied via an -EBUSY return. That's the only way I'm aware of that can have
user space change the size of the string.
> [ ... ]
> > +#ifdef CONFIG_PERF_EVENTS
> > +static void perf_event_raw_event_synth(void *__data,
> > + u64 *var_ref_vals,
> > + unsigned int *var_ref_idx)
> > +{
> [ ... ]
> > + size = ALIGN(sizeof(*entry) + fields_size, 8);
> > +
> > + entry = perf_trace_buf_alloc(size, &regs, &context);
> > +
> > + if (unlikely(!entry))
> > + return;
> > +
> > + write_synth_entry(event, entry, var_ref_vals, var_ref_idx);
> > +
> > + perf_fetch_caller_regs(regs);
> > +
> > + perf_trace_buf_submit(entry, size, context,
> > + call->event.type, 1, regs,
> > + perf_head, NULL);
> > +}
> > +#endif
> Could this leak uninitialized per-CPU kernel memory to userspace?
> perf_trace_buf_alloc() and trace_event_buffer_reserve() allocate memory from
> per-CPU buffers without zero-initializing it (except for trailing alignment
> padding).
Both of these per-CPU memory locations are zeroed when created. They are only
populated with trace data. Yeah, they may leak previous trace data, but that
data should also be visible for whoever created it in the first place.
-- Steve
> write_synth_entry() populates an array of 8-byte union trace_synth_field.
> When a field is smaller than 8 bytes (e.g., as_u8), only those specific bytes
> are written, leaving the remaining padding bytes uninitialized. Similarly, for
> empty stacktraces, get_field_size() reserves 8 bytes, but trace_stack()
> terminates immediately and writes 0 bytes, leaving the 8-byte gap completely
> uninitialized.
> perf_trace_buf_submit() copies the entire requested buffer size to the perf
> ring buffer. Can userspace read this buffer, leaking uninitialized kernel
> memory from previous events or kernel operations?
> This regression also wasn't introduced by this patch, but pre-existed in the
> ftrace implementation.
^ permalink raw reply
* Re: [PATCH v2] tracing: Allow perf to read synthetic events
From: Namhyung Kim @ 2026-05-14 18:28 UTC (permalink / raw)
To: Steven Rostedt
Cc: LKML, Linux Trace Kernel, Masami Hiramatsu, Mathieu Desnoyers,
Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra, Ian Rogers
In-Reply-To: <20260513150007.3b280e87@gandalf.local.home>
On Wed, May 13, 2026 at 03:00:07PM -0400, Steven Rostedt wrote:
> From: Steven Rostedt <rostedt@goodmis.org>
>
> Currently, perf can not enable synthetic events. When it does, it either
> causes a warning in the kernel or errors with "no such device".
>
> Add the necessary code to allow perf to also attach to synthetic events.
>
> Reported-by: Ian Rogers <irogers@google.com>
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Thanks,
Namhyung
> ---
> Changes since v1: https://patch.msgid.link/20251217113920.50b56246@gandalf.local.home
>
> - Forward ported to v7.1-rc2
>
> kernel/trace/trace_events_synth.c | 121 +++++++++++++++++++++++-------
> 1 file changed, 94 insertions(+), 27 deletions(-)
>
> diff --git a/kernel/trace/trace_events_synth.c b/kernel/trace/trace_events_synth.c
> index 39ac4eba0702..e6871230bde9 100644
> --- a/kernel/trace/trace_events_synth.c
> +++ b/kernel/trace/trace_events_synth.c
> @@ -499,28 +499,19 @@ static unsigned int trace_stack(struct synth_trace_event *entry,
> return len;
> }
>
> -static void trace_event_raw_event_synth(void *__data,
> - u64 *var_ref_vals,
> - unsigned int *var_ref_idx)
> +static __always_inline int get_field_size(struct synth_event *event,
> + u64 *var_ref_vals,
> + unsigned int *var_ref_idx)
> {
> - unsigned int i, n_u64, val_idx, len, data_size = 0;
> - struct trace_event_file *trace_file = __data;
> - struct synth_trace_event *entry;
> - struct trace_event_buffer fbuffer;
> - struct trace_buffer *buffer;
> - struct synth_event *event;
> - int fields_size = 0;
> -
> - event = trace_file->event_call->data;
> -
> - if (trace_trigger_soft_disabled(trace_file))
> - return;
> + int fields_size;
>
> fields_size = event->n_u64 * sizeof(u64);
>
> - for (i = 0; i < event->n_dynamic_fields; i++) {
> + for (int i = 0; i < event->n_dynamic_fields; i++) {
> unsigned int field_pos = event->dynamic_fields[i]->field_pos;
> char *str_val;
> + int val_idx;
> + int len;
>
> val_idx = var_ref_idx[field_pos];
> str_val = (char *)(long)var_ref_vals[val_idx];
> @@ -535,18 +526,18 @@ static void trace_event_raw_event_synth(void *__data,
>
> fields_size += len;
> }
> + return fields_size;
> +}
>
> - /*
> - * Avoid ring buffer recursion detection, as this event
> - * is being performed within another event.
> - */
> - buffer = trace_file->tr->array_buffer.buffer;
> - guard(ring_buffer_nest)(buffer);
> -
> - entry = trace_event_buffer_reserve(&fbuffer, trace_file,
> - sizeof(*entry) + fields_size);
> - if (!entry)
> - return;
> +static __always_inline void write_synth_entry(struct synth_event *event,
> + struct synth_trace_event *entry,
> + u64 *var_ref_vals,
> + unsigned int *var_ref_idx)
> +{
> + int data_size = 0;
> + int i, n_u64;
> + int val_idx;
> + int len;
>
> for (i = 0, n_u64 = 0; i < event->n_fields; i++) {
> val_idx = var_ref_idx[i];
> @@ -587,10 +578,83 @@ static void trace_event_raw_event_synth(void *__data,
> n_u64++;
> }
> }
> +}
> +
> +static void trace_event_raw_event_synth(void *__data,
> + u64 *var_ref_vals,
> + unsigned int *var_ref_idx)
> +{
> + struct trace_event_file *trace_file = __data;
> + struct synth_trace_event *entry;
> + struct trace_event_buffer fbuffer;
> + struct trace_buffer *buffer;
> + struct synth_event *event;
> + int fields_size;
> +
> + event = trace_file->event_call->data;
> +
> + if (trace_trigger_soft_disabled(trace_file))
> + return;
> +
> + fields_size = get_field_size(event, var_ref_vals, var_ref_idx);
> +
> + /*
> + * Avoid ring buffer recursion detection, as this event
> + * is being performed within another event.
> + */
> + buffer = trace_file->tr->array_buffer.buffer;
> + guard(ring_buffer_nest)(buffer);
> +
> + entry = trace_event_buffer_reserve(&fbuffer, trace_file,
> + sizeof(*entry) + fields_size);
> + if (!entry)
> + return;
> +
> + write_synth_entry(event, entry, var_ref_vals, var_ref_idx);
>
> trace_event_buffer_commit(&fbuffer);
> }
>
> +#ifdef CONFIG_PERF_EVENTS
> +static void perf_event_raw_event_synth(void *__data,
> + u64 *var_ref_vals,
> + unsigned int *var_ref_idx)
> +{
> + struct trace_event_call *call = __data;
> + struct synth_trace_event *entry;
> + struct hlist_head *perf_head;
> + struct synth_event *event;
> + struct pt_regs *regs;
> + int fields_size;
> + size_t size;
> + int context;
> +
> + event = call->data;
> +
> + perf_head = this_cpu_ptr(call->perf_events);
> +
> + if (!perf_head || hlist_empty(perf_head))
> + return;
> +
> + fields_size = get_field_size(event, var_ref_vals, var_ref_idx);
> +
> + size = ALIGN(sizeof(*entry) + fields_size, 8);
> +
> + entry = perf_trace_buf_alloc(size, &regs, &context);
> +
> + if (unlikely(!entry))
> + return;
> +
> + write_synth_entry(event, entry, var_ref_vals, var_ref_idx);
> +
> + perf_fetch_caller_regs(regs);
> +
> + perf_trace_buf_submit(entry, size, context,
> + call->event.type, 1, regs,
> + perf_head, NULL);
> +}
> +#endif
> +
> static void free_synth_event_print_fmt(struct trace_event_call *call)
> {
> if (call) {
> @@ -917,6 +981,9 @@ static int register_synth_event(struct synth_event *event)
> call->flags = TRACE_EVENT_FL_TRACEPOINT;
> call->class->reg = synth_event_reg;
> call->class->probe = trace_event_raw_event_synth;
> +#ifdef CONFIG_PERF_EVENTS
> + call->class->perf_probe = perf_event_raw_event_synth;
> +#endif
> call->data = event;
> call->tp = event->tp;
>
> --
> 2.53.0
>
^ permalink raw reply
* Re: [PATCH v2] fprobe: Fix unregister_fprobe() to wait for RCU grace period
From: patchwork-bot+netdevbpf @ 2026-05-14 17:12 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: rostedt, ast, daniel, andrii, jolsa, mathieu.desnoyers,
linux-kernel, linux-trace-kernel, bpf
In-Reply-To: <177813998919.256460.2809243930741138224.stgit@mhiramat.tok.corp.google.com>
Hello:
This patch was applied to netdev/net.git (main)
by Masami Hiramatsu (Google) <mhiramat@kernel.org>:
On Thu, 7 May 2026 16:46:29 +0900 you wrote:
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> Commit 4346ba1604093 ("fprobe: Rewrite fprobe on function-graph tracer")
> changed fprobe to register struct fprobe to an rcu-hlist, but it forgot
> to wait for RCU GP. Thus there can be use-after-free if the fprobe is
> released right after unregistering. This can be happened on fprobe
> event and sample module code.
>
> [...]
Here is the summary with links:
- [v2] fprobe: Fix unregister_fprobe() to wait for RCU grace period
https://git.kernel.org/netdev/net/c/657b594b2084
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply
* Re: [PATCH 1/7] uprobes/x86: Move optimized uprobe from nop5 to nop10
From: Jakub Sitnicki @ 2026-05-14 16:54 UTC (permalink / raw)
To: Jiri Olsa
Cc: Oleg Nesterov, Peter Zijlstra, Ingo Molnar, Masami Hiramatsu,
Andrii Nakryiko, bpf, linux-trace-kernel
In-Reply-To: <20260514135342.22130-2-jolsa@kernel.org>
On Thu, May 14, 2026 at 03:53:36PM +0200, Jiri Olsa wrote:
> Andrii reported an issue with optimized uprobes [1] that can clobber
> redzone area with call instruction storing return address on stack
> where user code may keep temporary data without adjusting rsp.
>
> Fixing this by moving the optimized uprobes on top of 10-bytes nop
> instruction, so we can squeeze another instruction to escape the
> redzone area before doing the call, like:
>
> lea -0x80(%rsp), %rsp
> call tramp
>
> Note the lea instruction is used to adjust the rsp register without
> changing the flags.
>
> The optimized uprobe performance stays the same:
>
> uprobe-nop : 3.129 ± 0.013M/s
> uprobe-push : 3.045 ± 0.006M/s
> uprobe-ret : 1.095 ± 0.004M/s
> --> uprobe-nop10 : 7.170 ± 0.020M/s
> uretprobe-nop : 2.143 ± 0.021M/s
> uretprobe-push : 2.090 ± 0.000M/s
> uretprobe-ret : 0.942 ± 0.000M/s
> --> uretprobe-nop10: 3.381 ± 0.003M/s
> usdt-nop : 3.245 ± 0.004M/s
> --> usdt-nop10 : 7.256 ± 0.023M/s
>
> [1] https://lore.kernel.org/bpf/20260509003146.976844-1-andrii@kernel.org/
> Reported-by: Andrii Nakryiko <andrii@kernel.org>
> Closes: https://lore.kernel.org/bpf/20260509003146.976844-1-andrii@kernel.org/
> Fixes: ba2bfc97b462 ("uprobes/x86: Add support to optimize uprobes")
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
> arch/x86/kernel/uprobes.c | 121 +++++++++++++++++++++++++++-----------
> 1 file changed, 86 insertions(+), 35 deletions(-)
>
> diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
> index ebb1baf1eb1d..f7c4101a4039 100644
> --- a/arch/x86/kernel/uprobes.c
> +++ b/arch/x86/kernel/uprobes.c
> @@ -636,9 +636,21 @@ struct uprobe_trampoline {
> unsigned long vaddr;
> };
>
> +#define LEA_INSN_SIZE 5
> +#define OPT_INSN_SIZE (LEA_INSN_SIZE + CALL_INSN_SIZE)
> +#define OPT_JMP8_OFFSET (OPT_INSN_SIZE - JMP8_INSN_SIZE)
> +#define REDZONE_SIZE 0x80
> +
> +static const u8 lea_rsp[] = { 0x48, 0x8d, 0x64, 0x24, 0x80 };
> +
> +static bool is_lea_insn(const uprobe_opcode_t *insn)
> +{
> + return !memcmp(insn, lea_rsp, LEA_INSN_SIZE);
> +}
> +
Just a thought. See if below maybe reads better when plugged in.
is_call_insn can then be removed, I think.
static bool is_call_past_redzone_insns(const uprobe_opcode_t *insn)
{
static const u8 lea_rsp_call[] = {
0x48, 0x8d, 0x64, 0x24, REDZONE_SIZE, /* lea -0x80(%rsp), %rsp */
CALL_INSN_OPCODE
};
return !memcmp(insn, lea_rsp_call, ARRAY_SIZE(lea_rsp_call));
}
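
For anyone who wants to poke at the byte pattern outside the kernel, here
is a minimal user-space sketch of the same check. It assumes
CALL_INSN_OPCODE is 0xe8 (the x86 near call opcode) and substitutes
uint8_t for uprobe_opcode_t and sizeof() for ARRAY_SIZE(); those
substitutions are only for building it standalone.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CALL_INSN_OPCODE 0xe8	/* x86 near call, rel32 */

/* lea -0x80(%rsp), %rsp followed by the opcode byte of the call */
static const uint8_t lea_rsp_call[] = {
	0x48, 0x8d, 0x64, 0x24, 0x80,
	CALL_INSN_OPCODE
};

/* True if insn starts with the redzone-escaping lea and a call opcode. */
static bool is_call_past_redzone_insns(const uint8_t *insn)
{
	return !memcmp(insn, lea_rsp_call, sizeof(lea_rsp_call));
}
```

The check only covers the first six bytes, so the four-byte call
displacement that follows is deliberately left unconstrained.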
[...]
^ permalink raw reply
* [PATCH] sched/clock: Provide !HAVE_UNSTABLE_SCHED_CLOCK stub for sched_clock_stable()
From: Yiyang Chen @ 2026-05-14 16:05 UTC (permalink / raw)
To: peterz, rostedt
Cc: mingo, vincent.guittot, mhiramat, mathieu.desnoyers, linux-kernel,
linux-trace-kernel, Yiyang Chen
When CONFIG_HAVE_UNSTABLE_SCHED_CLOCK is disabled, sched_clock() is
already assumed to provide stable semantics, but the public header
doesn't provide a sched_clock_stable() stub for that case.
Add a header stub that always returns true and clean up the duplicate
local stub in ring_buffer.c, so callers can use sched_clock_stable()
unconditionally.
Signed-off-by: Yiyang Chen <cyyzero16@gmail.com>
---
include/linux/sched/clock.h | 5 +++++
kernel/trace/ring_buffer.c | 7 -------
2 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/include/linux/sched/clock.h b/include/linux/sched/clock.h
index 196f0ca351a2..39f0a7f94bfc 100644
--- a/include/linux/sched/clock.h
+++ b/include/linux/sched/clock.h
@@ -33,6 +33,11 @@ extern u64 sched_clock_cpu(int cpu);
extern void sched_clock_init(void);
#ifndef CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
+static inline int sched_clock_stable(void)
+{
+ return 1;
+}
+
static inline void sched_clock_tick(void)
{
}
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 5326924615a4..02691c3c6dd6 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -3769,13 +3769,6 @@ rb_add_time_stamp(struct ring_buffer_per_cpu *cpu_buffer,
return skip_time_extend(event);
}
-#ifndef CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
-static inline bool sched_clock_stable(void)
-{
- return true;
-}
-#endif
-
static void
rb_check_timestamp(struct ring_buffer_per_cpu *cpu_buffer,
struct rb_event_info *info)
--
2.43.0
^ permalink raw reply related
* Re: [PATCH v6 5/7] locking: Add contended_release tracepoint to qspinlock
From: Steven Rostedt @ 2026-05-14 16:03 UTC (permalink / raw)
To: Dmitry Ilvokhin
Cc: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Waiman Long,
Thomas Bogendoerfer, Juergen Gross, Ajay Kaher, Alexey Makhalov,
Broadcom internal kernel review list, Thomas Gleixner,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Arnd Bergmann,
Dennis Zhou, Tejun Heo, Christoph Lameter, Masami Hiramatsu,
Mathieu Desnoyers, linux-kernel, linux-mips, virtualization,
linux-arch, linux-mm, linux-trace-kernel, kernel-team,
Paul E. McKenney
In-Reply-To: <agXYjw3GM15WtC-H@shell.ilvokhin.com>
On Thu, 14 May 2026 14:13:35 +0000
Dmitry Ilvokhin <d@ilvokhin.com> wrote:
> > > +void __lockfunc queued_spin_release_traced(struct qspinlock *lock)
> > > +{
> > > + if (queued_spin_is_contended(lock))
> > > + trace_call__contended_release(lock);
> > > + queued_spin_release(lock);
> >
> > And then remove the duplicate call of "queued_spin_release()" here.
>
> This is the scenario the comment above the static branch describes.
> Here's what it looks like in practice on x86_64 (defconfig, compiled
> with GCC 11).
>
> Current design (trace + unlock combined, with return):
>
> endbr64
> xchg %ax,%ax ; NOP (static branch)
> movb $0x0,(%rdi) ; unlock
> decl %gs:__preempt_count
> je preempt
> jmp __x86_return_thunk
> call queued_spin_release_traced ; cold
> jmp preempt_handling ; cold
> call __SCT__preempt_schedule
> jmp __x86_return_thunk
>
> With the trace-only function (no return, unlock after the call):
>
> endbr64
> push %rbx ; saves callee-saved rbx (!)
> mov %rdi,%rbx ; preserve lock across call (!)
> xchg %ax,%ax ; NOP (static branch)
> movb $0x0,(%rbx) ; unlock
> decl %gs:__preempt_count
> je preempt
> pop %rbx ; callee-saved restore (!)
> jmp __x86_return_thunk
> call queued_spin_release_traced ; cold
> jmp unlock ; cold
> call __SCT__preempt_schedule
> pop %rbx
> jmp __x86_return_thunk
>
> Three extra instructions marked by "!" on the hot path (push, mov, pop),
> all wasted when the tracepoint is off. That's the main reason for
> combining trace and unlock in the same out-of-line function.
Ah, because the return makes it into two tail calls.
I still don't like the duplication. Perhaps add some more comments about
needing to update the other location if anything changes here? And perhaps
comment that this duplicate code helps the assembly.
-- Steve
^ permalink raw reply
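A rough userspace sketch of the shape discussed above, where the cold
function both traces and releases so the caller can return straight after
calling it. All demo_* names are hypothetical, a plain bool stands in for
the static branch, and the attributes assume GCC or Clang. It makes no claim
about the code the compiler generates.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_lock {
	unsigned char locked;
	unsigned int waiters;
};

static bool demo_trace_enabled;		/* stands in for the static branch */
static unsigned int demo_trace_hits;	/* stands in for the tracepoint */

static inline void demo_release(struct demo_lock *lock)
{
	lock->locked = 0;
}

/*
 * Cold path: trace and release together. The release is duplicated from
 * demo_unlock() on purpose; keep both copies in sync if either changes.
 */
static __attribute__((noinline, cold))
void demo_release_traced(struct demo_lock *lock)
{
	if (lock->waiters)
		demo_trace_hits++;
	demo_release(lock);
}

static void demo_unlock(struct demo_lock *lock)
{
	if (__builtin_expect(demo_trace_enabled, 0)) {
		demo_release_traced(lock);	/* releases the lock itself */
		return;
	}
	demo_release(lock);			/* hot path */
}
```

Because the traced variant performs the release, the hot path never has to
keep the lock pointer live across a call.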
* [PATCH v2 14/14] rv: Add KUnit tests for some LTL monitors
From: Gabriele Monaco @ 2026-05-14 15:20 UTC (permalink / raw)
To: linux-kernel, linux-trace-kernel, Steven Rostedt, Gabriele Monaco,
Masami Hiramatsu
Cc: Nam Cao, Thomas Weissschuh, Tomas Glozar, John Kacur, Wen Yang
In-Reply-To: <20260514152055.229162-1-gmonaco@redhat.com>
Validate the functionality of LTL monitors by injecting events in a
controlled environment (KUnit) and expecting reactions, just as is done
for DA monitors.
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
---
include/rv/ltl_monitor.h | 37 +++++++++++++
.../trace/rv/monitors/pagefault/pagefault.c | 25 +++++++++
kernel/trace/rv/monitors/sleep/sleep.c | 52 +++++++++++++++++++
kernel/trace/rv/rv_monitors_test.c | 4 ++
4 files changed, 118 insertions(+)
diff --git a/include/rv/ltl_monitor.h b/include/rv/ltl_monitor.h
index 35bc870d808a..75c9c6230dae 100644
--- a/include/rv/ltl_monitor.h
+++ b/include/rv/ltl_monitor.h
@@ -172,3 +172,40 @@ static void __maybe_unused ltl_atom_pulse(struct task_struct *task, enum ltl_ato
ltl_atom_set(mon, atom, !value);
ltl_validate(task, mon);
}
+
+#ifdef CONFIG_RV_MONITORS_KUNIT_TEST
+
+/*
+ * ltl_teardown_test - Disable the monitor for a kunit test
+ */
+static inline void ltl_teardown_test(void *arg)
+{
+ struct rv_monitor *rv_this = arg;
+ struct kunit *test = kunit_get_current_test();
+
+ if (test) {
+ struct rv_kunit_ctx *ctx = test->priv;
+
+ RV_KUNIT_EXPECT_NO_REACTION(test, ctx);
+ }
+
+ rv_this->enabled = 0;
+ ltl_monitor_destroy();
+}
+
+/*
+ * ltl_prepare_test - Enable the monitor for a kunit test
+ *
+ * Do the bare minimum to set up the monitor, make sure it is not active and
+ * real tracepoint handlers are NOT attached.
+ */
+static inline void ltl_prepare_test(struct kunit *test, struct rv_monitor *rv_this)
+{
+ KUNIT_ASSERT_FALSE(test, rv_this->enabled);
+ ltl_monitor_init();
+ rv_this->enabled = 1;
+
+ KUNIT_ASSERT_EQ(test, 0,
+ kunit_add_action_or_reset(test, ltl_teardown_test, rv_this));
+}
+#endif /* CONFIG_RV_MONITORS_KUNIT_TEST */
diff --git a/kernel/trace/rv/monitors/pagefault/pagefault.c b/kernel/trace/rv/monitors/pagefault/pagefault.c
index 56abe5079676..28c3382eabb0 100644
--- a/kernel/trace/rv/monitors/pagefault/pagefault.c
+++ b/kernel/trace/rv/monitors/pagefault/pagefault.c
@@ -86,3 +86,28 @@ module_exit(unregister_pagefault);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Nam Cao <namcao@linutronix.de>");
MODULE_DESCRIPTION("pagefault: Monitor that RT tasks do not raise page faults");
+
+#ifdef CONFIG_RV_MONITORS_KUNIT_TEST
+void rv_test_pagefault(struct kunit *test);
+
+void rv_test_pagefault(struct kunit *test)
+{
+ struct task_struct *target;
+ struct rv_kunit_ctx *ctx = test->priv;
+
+ ltl_prepare_test(test, &rv_pagefault);
+ target = kunit_kzalloc(test, sizeof(struct task_struct), GFP_KERNEL);
+ KUNIT_ASSERT_NOT_NULL(test, target);
+ target->policy = SCHED_FIFO;
+ target->prio = MAX_RT_PRIO - 1;
+ handle_task_newtask(NULL, target, 0);
+
+ ltl_attempt_start(target, ltl_get_monitor(target));
+
+ /* RT task has a page fault */
+ rv_mock_current(ctx, target);
+ RV_KUNIT_EXPECT_REACTION_HERE(test, ctx)
+ handle_page_fault(NULL, 0, NULL, 0);
+}
+EXPORT_SYMBOL_GPL(rv_test_pagefault);
+#endif
diff --git a/kernel/trace/rv/monitors/sleep/sleep.c b/kernel/trace/rv/monitors/sleep/sleep.c
index 8b44161d47d3..76e70e2db992 100644
--- a/kernel/trace/rv/monitors/sleep/sleep.c
+++ b/kernel/trace/rv/monitors/sleep/sleep.c
@@ -247,3 +247,55 @@ module_exit(unregister_sleep);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Nam Cao <namcao@linutronix.de>");
MODULE_DESCRIPTION("sleep: Monitor that RT tasks do not undesirably sleep");
+
+#ifdef CONFIG_RV_MONITORS_KUNIT_TEST
+void rv_test_sleep(struct kunit *test);
+
+void rv_test_sleep(struct kunit *test)
+{
+ struct task_struct *target, *other;
+ struct rv_kunit_ctx *ctx = test->priv;
+ unsigned long args[6] = {0};
+ struct pt_regs regs;
+
+ ltl_prepare_test(test, &rv_sleep);
+ target = kunit_kzalloc(test, sizeof(struct task_struct), GFP_KERNEL);
+ KUNIT_ASSERT_NOT_NULL(test, target);
+ target->policy = SCHED_FIFO;
+ target->prio = MAX_RT_PRIO - 2;
+ other = kunit_kzalloc(test, sizeof(struct task_struct), GFP_KERNEL);
+ KUNIT_ASSERT_NOT_NULL(test, other);
+ other->policy = SCHED_FIFO;
+ other->prio = MAX_RT_PRIO - 1;
+ handle_task_newtask(NULL, target, 0);
+
+ /* RT task sleeps on a non RT-friendly nanosleep */
+ rv_mock_current(ctx, target);
+ args[0] = CLOCK_REALTIME;
+ syscall_set_arguments(target, &regs, args);
+#ifdef __NR_clock_nanosleep
+ handle_sys_enter(NULL, &regs, __NR_clock_nanosleep);
+#elif defined(__NR_clock_nanosleep_time64)
+ handle_sys_enter(NULL, &regs, __NR_clock_nanosleep_time64);
+#endif
+ RV_KUNIT_EXPECT_REACTION_HERE(test, ctx)
+ handle_sched_set_state(NULL, target, TASK_INTERRUPTIBLE);
+ handle_sys_exit(NULL, NULL, 0);
+
+ /* RT task woken up by lower priority task */
+ args[1] = FUTEX_WAIT;
+ syscall_set_arguments(target, &regs, args);
+ rv_mock_current(ctx, target);
+#ifdef __NR_futex
+ handle_sys_enter(NULL, &regs, __NR_futex);
+#elif defined(__NR_futex_time64)
+ handle_sys_enter(NULL, &regs, __NR_futex_time64);
+#endif
+ handle_sched_set_state(NULL, target, TASK_INTERRUPTIBLE);
+ rv_mock_current(ctx, other);
+ handle_sched_waking(NULL, target);
+ RV_KUNIT_EXPECT_REACTION_HERE(test, ctx)
+ handle_sched_wakeup(NULL, target);
+}
+EXPORT_SYMBOL_GPL(rv_test_sleep);
+#endif
diff --git a/kernel/trace/rv/rv_monitors_test.c b/kernel/trace/rv/rv_monitors_test.c
index 01cbee9ac6c0..d3e3aa1ac4ec 100644
--- a/kernel/trace/rv/rv_monitors_test.c
+++ b/kernel/trace/rv/rv_monitors_test.c
@@ -137,6 +137,8 @@ DECLARE_RV_TEST(rv_test_sssw);
DECLARE_RV_TEST(rv_test_sts);
DECLARE_RV_TEST(rv_test_opid);
DECLARE_RV_TEST(rv_test_nomiss);
+DECLARE_RV_TEST(rv_test_pagefault);
+DECLARE_RV_TEST(rv_test_sleep);
static struct kunit_case rv_mon_test_cases[] = {
KUNIT_CASE(rv_test_sco),
@@ -144,6 +146,8 @@ static struct kunit_case rv_mon_test_cases[] = {
KUNIT_CASE(rv_test_sts),
KUNIT_CASE(rv_test_opid),
KUNIT_CASE(rv_test_nomiss),
+ KUNIT_CASE(rv_test_pagefault),
+ KUNIT_CASE(rv_test_sleep),
{}
};
--
2.54.0
^ permalink raw reply related
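The tests above wrap a single statement in RV_KUNIT_EXPECT_REACTION_HERE.
A simplified userspace sketch of that for-loop idiom follows, assuming
hypothetical demo_* helpers in place of the KUnit expectations and plain
functions instead of GNU statement expressions. It records mismatches in a
counter rather than reporting them through KUnit.

```c
#include <assert.h>

struct demo_ctx {
	int reactions;	/* reactions observed so far */
	int expected;	/* reactions the test has accounted for */
	int failures;	/* mismatches found by the checks */
};

static void demo_check(struct demo_ctx *ctx)
{
	if (ctx->reactions != ctx->expected) {
		ctx->failures++;
		ctx->expected = ctx->reactions;	/* resync after a mismatch */
	}
}

/* Runs before the wrapped statement: nothing unexpected so far. */
static int demo_expect_none(struct demo_ctx *ctx)
{
	demo_check(ctx);
	return 0;
}

/* Runs after the wrapped statement: exactly one new reaction. */
static int demo_expect_one(struct demo_ctx *ctx)
{
	ctx->expected++;
	demo_check(ctx);
	return 1;
}

/* The loop body executes once, between the two checks. */
#define DEMO_EXPECT_REACTION_HERE(ctx)				\
	for (int demo_done_ = demo_expect_none(ctx);		\
	     !demo_done_;					\
	     demo_done_ = demo_expect_one(ctx))

static void demo_react(struct demo_ctx *ctx) { ctx->reactions++; }
static void demo_quiet(struct demo_ctx *ctx) { (void)ctx; }
```

The for-loop form lets the macro take either a single statement or a braced
block as its body, which is how the tests above use it.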
* [PATCH v2 12/14] rv: Add KUnit stubs for current and smp_processor_id()
From: Gabriele Monaco @ 2026-05-14 15:20 UTC (permalink / raw)
To: linux-kernel, linux-trace-kernel, Steven Rostedt, Gabriele Monaco,
Masami Hiramatsu
Cc: Nam Cao, Thomas Weissschuh, Tomas Glozar, John Kacur, Wen Yang
In-Reply-To: <20260514152055.229162-1-gmonaco@redhat.com>
Some monitors rely not only on tracepoint arguments but also on the
currently executing task and the current processor.
This makes it more challenging to mock events in KUnit.
Define wrapper functions around current and smp_processor_id(). The
wrappers are stubbed only while a KUnit test runs, but the additional
function call is incurred whenever the KUnit tests are built in.
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
---
include/rv/kunit.h | 18 ++++++++++++-
include/rv/ltl_monitor.h | 1 +
kernel/trace/rv/Kconfig | 3 +++
.../trace/rv/monitors/pagefault/pagefault.c | 2 +-
kernel/trace/rv/monitors/sleep/sleep.c | 24 ++++++++---------
kernel/trace/rv/rv_monitors_test.c | 26 +++++++++++++++++++
6 files changed, 60 insertions(+), 14 deletions(-)
diff --git a/include/rv/kunit.h b/include/rv/kunit.h
index 67f6057bd5b1..3ef83d337880 100644
--- a/include/rv/kunit.h
+++ b/include/rv/kunit.h
@@ -2,7 +2,10 @@
/*
* Copyright (C) 2026-2029 Red Hat, Inc. Gabriele Monaco <gmonaco@redhat.com>
*
- * Declaration of utilities to run KUnit tests.
+ * Declaration of wrappers to allow stubbing core functionality, like current
+ * and smp_processor_id(), and other testing utilities.
+ * Necessary only when mocking may be needed. If the RV KUnit test is
+ * enabled, the wrappers incur an additional function call overhead.
*/
#ifndef _RV_KUNIT_H
@@ -14,8 +17,13 @@
#include <kunit/test-bug.h>
#include <linux/delay.h>
+struct task_struct *rv_get_current(void);
+int rv_current_cpu(void);
+
struct rv_kunit_ctx {
int reactions, expected;
+ int cpu;
+ struct task_struct *curr;
};
#define RV_KUNIT_EXPECT_REACTION(test, ctx) \
@@ -37,5 +45,13 @@ struct rv_kunit_ctx {
!__done; \
__done = ({ RV_KUNIT_EXPECT_REACTION(test, ctx); 1; }))
+#define rv_mock_current(ctx, task) (ctx->curr = task)
+#define rv_mock_cpu(ctx, cpu) (ctx->cpu = cpu)
+
+#else /* !CONFIG_RV_MONITORS_KUNIT_TEST */
+
+#define rv_get_current() current
+#define rv_current_cpu() smp_processor_id()
+
#endif /* CONFIG_RV_MONITORS_KUNIT_TEST */
#endif /* _RV_KUNIT_H */
diff --git a/include/rv/ltl_monitor.h b/include/rv/ltl_monitor.h
index eff60cd61106..35bc870d808a 100644
--- a/include/rv/ltl_monitor.h
+++ b/include/rv/ltl_monitor.h
@@ -9,6 +9,7 @@
#include <linux/stringify.h>
#include <linux/seq_buf.h>
#include <rv/instrumentation.h>
+#include <rv/kunit.h>
#include <trace/events/task.h>
#include <trace/events/sched.h>
diff --git a/kernel/trace/rv/Kconfig b/kernel/trace/rv/Kconfig
index d7dba4453bd3..702349e1ddd4 100644
--- a/kernel/trace/rv/Kconfig
+++ b/kernel/trace/rv/Kconfig
@@ -121,4 +121,7 @@ config RV_MONITORS_KUNIT_TEST
These tests verify that monitors correctly detect violations by
triggering fake events and validating the expected reactions.
+ Enabling this may slightly increase overhead of some monitors even
+ when the KUnit test is not running.
+
If unsure, say N.
diff --git a/kernel/trace/rv/monitors/pagefault/pagefault.c b/kernel/trace/rv/monitors/pagefault/pagefault.c
index 9fe6123b2200..56abe5079676 100644
--- a/kernel/trace/rv/monitors/pagefault/pagefault.c
+++ b/kernel/trace/rv/monitors/pagefault/pagefault.c
@@ -38,7 +38,7 @@ static void ltl_atoms_init(struct task_struct *task, struct ltl_monitor *mon, bo
static void handle_page_fault(void *data, unsigned long address, struct pt_regs *regs,
unsigned long error_code)
{
- ltl_atom_pulse(current, LTL_PAGEFAULT, true);
+ ltl_atom_pulse(rv_get_current(), LTL_PAGEFAULT, true);
}
static int enable_pagefault(void)
diff --git a/kernel/trace/rv/monitors/sleep/sleep.c b/kernel/trace/rv/monitors/sleep/sleep.c
index 8dfe5ec13e19..8b44161d47d3 100644
--- a/kernel/trace/rv/monitors/sleep/sleep.c
+++ b/kernel/trace/rv/monitors/sleep/sleep.c
@@ -102,7 +102,7 @@ static void handle_sched_waking(void *data, struct task_struct *task)
if (this_cpu_read(hardirq_context)) {
ltl_atom_pulse(task, LTL_WOKEN_BY_HARDIRQ, true);
} else if (in_task()) {
- if (current->prio <= task->prio)
+ if (rv_get_current()->prio <= task->prio)
ltl_atom_pulse(task, LTL_WOKEN_BY_EQUAL_OR_HIGHER_PRIO, true);
} else if (in_nmi()) {
ltl_atom_pulse(task, LTL_WOKEN_BY_NMI, true);
@@ -112,12 +112,12 @@ static void handle_sched_waking(void *data, struct task_struct *task)
static void handle_contention_begin(void *data, void *lock, unsigned int flags)
{
if (flags & LCB_F_RT)
- ltl_atom_update(current, LTL_BLOCK_ON_RT_MUTEX, true);
+ ltl_atom_update(rv_get_current(), LTL_BLOCK_ON_RT_MUTEX, true);
}
static void handle_contention_end(void *data, void *lock, int ret)
{
- ltl_atom_update(current, LTL_BLOCK_ON_RT_MUTEX, false);
+ ltl_atom_update(rv_get_current(), LTL_BLOCK_ON_RT_MUTEX, false);
}
static void handle_sys_enter(void *data, struct pt_regs *regs, long id)
@@ -126,7 +126,7 @@ static void handle_sys_enter(void *data, struct pt_regs *regs, long id)
unsigned long args[6];
int op, cmd;
- mon = ltl_get_monitor(current);
+ mon = ltl_get_monitor(rv_get_current());
switch (id) {
#ifdef __NR_clock_nanosleep
@@ -135,11 +135,11 @@ static void handle_sys_enter(void *data, struct pt_regs *regs, long id)
#ifdef __NR_clock_nanosleep_time64
case __NR_clock_nanosleep_time64:
#endif
- syscall_get_arguments(current, regs, args);
+ syscall_get_arguments(rv_get_current(), regs, args);
ltl_atom_set(mon, LTL_NANOSLEEP_CLOCK_MONOTONIC, args[0] == CLOCK_MONOTONIC);
ltl_atom_set(mon, LTL_NANOSLEEP_CLOCK_TAI, args[0] == CLOCK_TAI);
ltl_atom_set(mon, LTL_NANOSLEEP_TIMER_ABSTIME, args[1] == TIMER_ABSTIME);
- ltl_atom_update(current, LTL_CLOCK_NANOSLEEP, true);
+ ltl_atom_update(rv_get_current(), LTL_CLOCK_NANOSLEEP, true);
break;
#ifdef __NR_futex
@@ -148,25 +148,25 @@ static void handle_sys_enter(void *data, struct pt_regs *regs, long id)
#ifdef __NR_futex_time64
case __NR_futex_time64:
#endif
- syscall_get_arguments(current, regs, args);
+ syscall_get_arguments(rv_get_current(), regs, args);
op = args[1];
cmd = op & FUTEX_CMD_MASK;
switch (cmd) {
case FUTEX_LOCK_PI:
case FUTEX_LOCK_PI2:
- ltl_atom_update(current, LTL_FUTEX_LOCK_PI, true);
+ ltl_atom_update(rv_get_current(), LTL_FUTEX_LOCK_PI, true);
break;
case FUTEX_WAIT:
case FUTEX_WAIT_BITSET:
case FUTEX_WAIT_REQUEUE_PI:
- ltl_atom_update(current, LTL_FUTEX_WAIT, true);
+ ltl_atom_update(rv_get_current(), LTL_FUTEX_WAIT, true);
break;
}
break;
#ifdef __NR_epoll_wait
case __NR_epoll_wait:
- ltl_atom_update(current, LTL_EPOLL_WAIT, true);
+ ltl_atom_update(rv_get_current(), LTL_EPOLL_WAIT, true);
break;
#endif
}
@@ -174,7 +174,7 @@ static void handle_sys_enter(void *data, struct pt_regs *regs, long id)
static void handle_sys_exit(void *data, struct pt_regs *regs, long ret)
{
- struct ltl_monitor *mon = ltl_get_monitor(current);
+ struct ltl_monitor *mon = ltl_get_monitor(rv_get_current());
ltl_atom_set(mon, LTL_FUTEX_LOCK_PI, false);
ltl_atom_set(mon, LTL_FUTEX_WAIT, false);
@@ -182,7 +182,7 @@ static void handle_sys_exit(void *data, struct pt_regs *regs, long ret)
ltl_atom_set(mon, LTL_NANOSLEEP_CLOCK_TAI, false);
ltl_atom_set(mon, LTL_NANOSLEEP_TIMER_ABSTIME, false);
ltl_atom_set(mon, LTL_EPOLL_WAIT, false);
- ltl_atom_update(current, LTL_CLOCK_NANOSLEEP, false);
+ ltl_atom_update(rv_get_current(), LTL_CLOCK_NANOSLEEP, false);
}
static void handle_kthread_stop(void *data, struct task_struct *task)
diff --git a/kernel/trace/rv/rv_monitors_test.c b/kernel/trace/rv/rv_monitors_test.c
index 5a12a109c1ed..3dbe562f00c1 100644
--- a/kernel/trace/rv/rv_monitors_test.c
+++ b/kernel/trace/rv/rv_monitors_test.c
@@ -31,6 +31,30 @@ static void stub_rv_put_task_monitor_slot(int slot)
{
}
+struct task_struct *rv_get_current(void)
+{
+ KUNIT_STATIC_STUB_REDIRECT(rv_get_current);
+ return current;
+}
+int rv_current_cpu(void)
+{
+ KUNIT_STATIC_STUB_REDIRECT(rv_current_cpu);
+ return smp_processor_id();
+}
+
+static struct task_struct *stub_rv_get_current(void)
+{
+ struct rv_kunit_ctx *ctx = kunit_get_current_test()->priv;
+
+ return ctx->curr ?: current;
+}
+static int stub_rv_current_cpu(void)
+{
+ struct rv_kunit_ctx *ctx = kunit_get_current_test()->priv;
+
+ return ctx->cpu;
+}
+
static int rv_mon_test_init(struct kunit *test)
{
struct rv_kunit_ctx *ctx;
@@ -49,6 +73,8 @@ static int rv_mon_test_init(struct kunit *test)
stub_rv_get_task_monitor_slot);
kunit_activate_static_stub(test, rv_put_task_monitor_slot,
stub_rv_put_task_monitor_slot);
+ kunit_activate_static_stub(test, rv_get_current, stub_rv_get_current);
+ kunit_activate_static_stub(test, rv_current_cpu, stub_rv_current_cpu);
return 0;
}
--
2.54.0
^ permalink raw reply related
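A simplified userspace sketch of the wrapper idea in this patch, assuming a
hypothetical global context pointer in place of the KUnit static stub
machinery (KUNIT_STATIC_STUB_REDIRECT works differently and is not modelled
here). The demo_* names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

struct demo_task {
	int prio;
};

static struct demo_task demo_real_task = { .prio = 120 };

/* Per-test context, loosely modelled on struct rv_kunit_ctx. */
struct demo_ctx {
	struct demo_task *curr;
};

static struct demo_ctx *demo_active_ctx;	/* NULL when no test runs */

/* Wrapper: return the mocked task only while a test has set one. */
static struct demo_task *demo_get_current(void)
{
	if (demo_active_ctx && demo_active_ctx->curr)
		return demo_active_ctx->curr;
	return &demo_real_task;
}

/* Handler under test: depends on "current", not only on its argument. */
static int demo_woken_by_equal_or_higher_prio(const struct demo_task *task)
{
	return demo_get_current()->prio <= task->prio;
}
```

Routing the lookup through a wrapper is what lets a test decide which task
counts as the waker, at the price of one extra call in the handler.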
* [PATCH v2 13/14] rv: Prevent unintentional tracepoints during KUnit tests
From: Gabriele Monaco @ 2026-05-14 15:20 UTC (permalink / raw)
To: linux-kernel, linux-trace-kernel, Steven Rostedt, Gabriele Monaco,
Masami Hiramatsu
Cc: Nam Cao, Thomas Weissschuh, Tomas Glozar, John Kacur, Wen Yang
In-Reply-To: <20260514152055.229162-1-gmonaco@redhat.com>
Monitor initialisation, which also runs during KUnit tests, may register
some tracepoints. This can lead to issues, since we don't expect real
monitor events to fire during KUnit tests.
Prevent tracepoint registration if an RV KUnit test is running.
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
---
include/rv/instrumentation.h | 5 +++++
include/rv/kunit.h | 2 ++
kernel/trace/rv/rv_monitors_test.c | 10 ++++++++++
3 files changed, 17 insertions(+)
diff --git a/include/rv/instrumentation.h b/include/rv/instrumentation.h
index d4e7a02ede1a..761f8f147dac 100644
--- a/include/rv/instrumentation.h
+++ b/include/rv/instrumentation.h
@@ -9,12 +9,15 @@
*/
#include <linux/ftrace.h>
+#include <rv/kunit.h>
/*
* rv_attach_trace_probe - check and attach a handler function to a tracepoint
*/
#define rv_attach_trace_probe(monitor, tp, rv_handler) \
do { \
+ if (rv_mon_test_is_running()) \
+ break; \
check_trace_callback_type_##tp(rv_handler); \
WARN_ONCE(register_trace_##tp(rv_handler, NULL), \
"fail attaching " #monitor " " #tp "handler"); \
@@ -25,5 +28,7 @@
*/
#define rv_detach_trace_probe(monitor, tp, rv_handler) \
do { \
+ if (rv_mon_test_is_running()) \
+ break; \
unregister_trace_##tp(rv_handler, NULL); \
} while (0)
diff --git a/include/rv/kunit.h b/include/rv/kunit.h
index 3ef83d337880..9daaebacfc7e 100644
--- a/include/rv/kunit.h
+++ b/include/rv/kunit.h
@@ -19,6 +19,7 @@
struct task_struct *rv_get_current(void);
int rv_current_cpu(void);
+bool rv_mon_test_is_running(void);
struct rv_kunit_ctx {
int reactions, expected;
@@ -52,6 +53,7 @@ struct rv_kunit_ctx {
#define rv_get_current() current
#define rv_current_cpu() smp_processor_id()
+#define rv_mon_test_is_running() false
#endif /* CONFIG_RV_MONITORS_KUNIT_TEST */
#endif /* _RV_KUNIT_H */
diff --git a/kernel/trace/rv/rv_monitors_test.c b/kernel/trace/rv/rv_monitors_test.c
index 3dbe562f00c1..01cbee9ac6c0 100644
--- a/kernel/trace/rv/rv_monitors_test.c
+++ b/kernel/trace/rv/rv_monitors_test.c
@@ -14,6 +14,8 @@
#include <linux/rv.h>
#include "rv.h"
+static bool rv_mon_test_running;
+
__printf(2, 3)
static void stub_rv_react(struct rv_monitor *monitor, const char *msg, ...)
{
@@ -55,6 +57,11 @@ static int stub_rv_current_cpu(void)
return ctx->cpu;
}
+bool rv_mon_test_is_running(void)
+{
+ return rv_mon_test_running;
+}
+
static int rv_mon_test_init(struct kunit *test)
{
struct rv_kunit_ctx *ctx;
@@ -94,6 +101,8 @@ static int rv_set_testing(struct kunit_suite *suite)
mutex_lock(&rv_interface_lock);
+ rv_mon_test_running = true;
+
list_for_each_entry(mon, &rv_monitors_list, list) {
if (mon->enabled) {
mutex_unlock(&rv_interface_lock);
@@ -111,6 +120,7 @@ static int rv_set_testing(struct kunit_suite *suite)
*/
static void rv_clear_testing(struct kunit_suite *suite)
{
+ rv_mon_test_running = false;
mutex_unlock(&rv_interface_lock);
}
--
2.54.0
^ permalink raw reply related
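The guard added to rv_attach_trace_probe() relies on break leaving the
enclosing do { } while (0). A small userspace sketch of that idiom, with
hypothetical demo_* names standing in for the tracepoint registration:

```c
#include <assert.h>
#include <stdbool.h>

static bool demo_test_running;	/* set for the duration of a test suite */
static int demo_registered;	/* counts probe registrations */

static bool demo_test_is_running(void)
{
	return demo_test_running;
}

static int demo_register_probe(void)
{
	demo_registered++;
	return 0;
}

/* break exits the do/while, so the registration below is skipped. */
#define demo_attach_probe()				\
	do {						\
		if (demo_test_is_running())		\
			break;				\
		demo_register_probe();			\
	} while (0)
```

The same early exit has to guard the detach side too, otherwise a probe
that was never attached would be unregistered.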
* [PATCH v2 11/14] rv: Add KUnit tests for some DA/HA monitors
From: Gabriele Monaco @ 2026-05-14 15:20 UTC (permalink / raw)
To: linux-kernel, linux-trace-kernel, Steven Rostedt, Gabriele Monaco,
Masami Hiramatsu
Cc: Nam Cao, Thomas Weissschuh, Tomas Glozar, John Kacur, Wen Yang
In-Reply-To: <20260514152055.229162-1-gmonaco@redhat.com>
Validate the functionality of DA monitors by injecting events in a
controlled environment (KUnit) and expecting reactions.
Event handlers are called directly from the monitor source files
without using system events and with dummy arguments (e.g. no real
tasks). If the provided sequence of events incurs a violation, the test
expects the stub version of rv_react() to be called.
This testing method can validate the entire monitor implementation since
it sits between the monitor and the system (in place of the
tracepoints). All sorts of system and timing events can be emulated
without affecting the running kernel.
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
---
include/rv/da_monitor.h | 38 +++++++
include/rv/kunit.h | 41 ++++++++
kernel/trace/rv/Kconfig | 11 ++
kernel/trace/rv/Makefile | 1 +
kernel/trace/rv/monitors/nomiss/nomiss.c | 44 ++++++++
kernel/trace/rv/monitors/opid/opid.c | 26 +++++
kernel/trace/rv/monitors/sco/sco.c | 24 +++++
kernel/trace/rv/monitors/sssw/sssw.c | 29 ++++++
kernel/trace/rv/monitors/sts/sts.c | 35 +++++++
kernel/trace/rv/rv_monitors_test.c | 126 +++++++++++++++++++++++
10 files changed, 375 insertions(+)
create mode 100644 include/rv/kunit.h
create mode 100644 kernel/trace/rv/rv_monitors_test.c
diff --git a/include/rv/da_monitor.h b/include/rv/da_monitor.h
index 39765ff6f098..d16a55292f3f 100644
--- a/include/rv/da_monitor.h
+++ b/include/rv/da_monitor.h
@@ -15,6 +15,7 @@
#define _RV_DA_MONITOR_H
#include <rv/automata.h>
+#include <rv/kunit.h>
#include <linux/rv.h>
#include <linux/stringify.h>
#include <linux/bug.h>
@@ -817,4 +818,41 @@ static inline void da_reset(da_id_type id, monitor_target target)
}
#endif /* RV_MON_TYPE */
+#ifdef CONFIG_RV_MONITORS_KUNIT_TEST
+
+/*
+ * da_teardown_test - Disable the monitor for a kunit test
+ */
+static inline void da_teardown_test(void *arg)
+{
+ struct rv_monitor *rv_this = arg;
+ struct kunit *test = kunit_get_current_test();
+
+ if (test) {
+ struct rv_kunit_ctx *ctx = test->priv;
+
+ RV_KUNIT_EXPECT_NO_REACTION(test, ctx);
+ }
+
+ rv_this->enabled = 0;
+ da_monitor_destroy();
+}
+
+/*
+ * da_prepare_test - Enable the monitor for a kunit test
+ *
+ * Do the bare minimum to set up the monitor, make sure it is not active and
+ * real tracepoint handlers are NOT attached.
+ */
+static inline void da_prepare_test(struct kunit *test, struct rv_monitor *rv_this)
+{
+ KUNIT_ASSERT_FALSE(test, rv_this->enabled);
+ da_monitor_init();
+ rv_this->enabled = 1;
+
+ KUNIT_ASSERT_EQ(test, 0,
+ kunit_add_action_or_reset(test, da_teardown_test, rv_this));
+}
+#endif /* CONFIG_RV_MONITORS_KUNIT_TEST */
+
#endif
diff --git a/include/rv/kunit.h b/include/rv/kunit.h
new file mode 100644
index 000000000000..67f6057bd5b1
--- /dev/null
+++ b/include/rv/kunit.h
@@ -0,0 +1,41 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2026-2029 Red Hat, Inc. Gabriele Monaco <gmonaco@redhat.com>
+ *
+ * Declaration of utilities to run KUnit tests.
+ */
+
+#ifndef _RV_KUNIT_H
+#define _RV_KUNIT_H
+
+#ifdef CONFIG_RV_MONITORS_KUNIT_TEST
+
+#include <kunit/test.h>
+#include <kunit/test-bug.h>
+#include <linux/delay.h>
+
+struct rv_kunit_ctx {
+ int reactions, expected;
+};
+
+#define RV_KUNIT_EXPECT_REACTION(test, ctx) \
+ do { \
+ KUNIT_EXPECT_EQ(test, ctx->reactions, ++ctx->expected); \
+ if (ctx->reactions != ctx->expected) \
+ ctx->expected = ctx->reactions; \
+ } while (0)
+
+#define RV_KUNIT_EXPECT_NO_REACTION(test, ctx) \
+ do { \
+ KUNIT_EXPECT_EQ(test, ctx->reactions, ctx->expected); \
+ if (ctx->reactions != ctx->expected) \
+ ctx->expected = ctx->reactions; \
+ } while (0)
+
+#define RV_KUNIT_EXPECT_REACTION_HERE(test, ctx) \
+ for (int __done = ({ RV_KUNIT_EXPECT_NO_REACTION(test, ctx); 0; }); \
+ !__done; \
+ __done = ({ RV_KUNIT_EXPECT_REACTION(test, ctx); 1; }))
+
+#endif /* CONFIG_RV_MONITORS_KUNIT_TEST */
+#endif /* _RV_KUNIT_H */
diff --git a/kernel/trace/rv/Kconfig b/kernel/trace/rv/Kconfig
index 3884b14df375..d7dba4453bd3 100644
--- a/kernel/trace/rv/Kconfig
+++ b/kernel/trace/rv/Kconfig
@@ -111,3 +111,14 @@ config RV_REACT_PANIC
help
Enables the panic reactor. The panic reactor emits a printk()
message if an exception is found and panic()s the system.
+
+config RV_MONITORS_KUNIT_TEST
+ bool "KUnit tests for RV monitors" if !KUNIT_ALL_TESTS
+ depends on KUNIT=y && RV
+ default KUNIT_ALL_TESTS
+ help
+ Enable KUnit tests for the RV (Runtime Verification) monitors.
+ These tests verify that monitors correctly detect violations by
+ triggering fake events and validating the expected reactions.
+
+ If unsure, say N.
diff --git a/kernel/trace/rv/Makefile b/kernel/trace/rv/Makefile
index 94498da35b37..a3502b7fe7f2 100644
--- a/kernel/trace/rv/Makefile
+++ b/kernel/trace/rv/Makefile
@@ -24,3 +24,4 @@ obj-$(CONFIG_RV_MON_NOMISS) += monitors/nomiss/nomiss.o
obj-$(CONFIG_RV_REACTORS) += rv_reactors.o
obj-$(CONFIG_RV_REACT_PRINTK) += reactor_printk.o
obj-$(CONFIG_RV_REACT_PANIC) += reactor_panic.o
+obj-$(CONFIG_RV_MONITORS_KUNIT_TEST) += rv_monitors_test.o
diff --git a/kernel/trace/rv/monitors/nomiss/nomiss.c b/kernel/trace/rv/monitors/nomiss/nomiss.c
index 31f90f3638d8..a0b5641a1858 100644
--- a/kernel/trace/rv/monitors/nomiss/nomiss.c
+++ b/kernel/trace/rv/monitors/nomiss/nomiss.c
@@ -291,3 +291,47 @@ module_exit(unregister_nomiss);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Gabriele Monaco <gmonaco@redhat.com>");
MODULE_DESCRIPTION("nomiss: dl entities run to completion before their deadline.");
+
+#ifdef CONFIG_RV_MONITORS_KUNIT_TEST
+void rv_test_nomiss(struct kunit *test);
+
+void rv_test_nomiss(struct kunit *test)
+{
+ struct task_struct *target, *other;
+ struct rv_kunit_ctx *ctx = test->priv;
+
+ da_prepare_test(test, &rv_this);
+ target = kunit_kzalloc(test, sizeof(struct task_struct), GFP_KERNEL);
+ KUNIT_ASSERT_NOT_NULL(test, target);
+ other = kunit_kzalloc(test, sizeof(struct task_struct), GFP_KERNEL);
+ KUNIT_ASSERT_NOT_NULL(test, other);
+
+ if (IS_ENABLED(CONFIG_SMP)) {
+ if (!IS_ENABLED(CONFIG_THREAD_INFO_IN_TASK)) {
+ target->stack = kunit_kzalloc(test, sizeof(struct thread_info), GFP_KERNEL);
+ KUNIT_ASSERT_NOT_NULL(test, target->stack);
+ other->stack = kunit_kzalloc(test, sizeof(struct thread_info), GFP_KERNEL);
+ KUNIT_ASSERT_NOT_NULL(test, other->stack);
+ }
+ task_thread_info(target)->cpu = 0;
+ task_thread_info(other)->cpu = 0;
+ }
+
+ target->pid = 99;
+ target->policy = SCHED_DEADLINE;
+ target->dl.runtime = 10000;
+ target->dl.dl_deadline = 20000;
+
+ handle_newtask(NULL, target, 0);
+
+ /* Task gets preempted and can't terminate before deadline */
+ handle_sched_switch(NULL, 0, other, target, TASK_RUNNING);
+ handle_dl_replenish(NULL, &target->dl, 0, DL_TASK);
+ udelay(10);
+ handle_sched_switch(NULL, 0, target, other, TASK_RUNNING);
+ udelay(10 + deadline_thresh / 1000);
+ RV_KUNIT_EXPECT_REACTION_HERE(test, ctx)
+ handle_sched_switch(NULL, 0, other, target, TASK_RUNNING);
+}
+EXPORT_SYMBOL_GPL(rv_test_nomiss);
+#endif
diff --git a/kernel/trace/rv/monitors/opid/opid.c b/kernel/trace/rv/monitors/opid/opid.c
index 4594c7c46601..124dd043999f 100644
--- a/kernel/trace/rv/monitors/opid/opid.c
+++ b/kernel/trace/rv/monitors/opid/opid.c
@@ -121,3 +121,29 @@ module_exit(unregister_opid);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Gabriele Monaco <gmonaco@redhat.com>");
MODULE_DESCRIPTION("opid: operations with preemption and irq disabled.");
+
+#ifdef CONFIG_RV_MONITORS_KUNIT_TEST
+void rv_test_opid(struct kunit *test);
+
+void rv_test_opid(struct kunit *test)
+{
+ struct rv_kunit_ctx *ctx = test->priv;
+
+ da_prepare_test(test, &rv_this);
+
+ /* Ensure we keep the same per-cpu monitor */
+ guard(migrate)();
+ KUNIT_EXPECT_TRUE(test, preemptible());
+
+ /* Wakeup with preemption and interrupts enabled */
+ RV_KUNIT_EXPECT_REACTION_HERE(test, ctx)
+ handle_sched_waking(NULL, NULL);
+
+ /* Need resched with interrupts enabled */
+ RV_KUNIT_EXPECT_REACTION_HERE(test, ctx) {
+ scoped_guard(preempt)
+ handle_sched_need_resched(NULL, NULL, 0, TIF_NEED_RESCHED);
+ }
+}
+EXPORT_SYMBOL_GPL(rv_test_opid);
+#endif
diff --git a/kernel/trace/rv/monitors/sco/sco.c b/kernel/trace/rv/monitors/sco/sco.c
index 5a3bd5e16e62..40eab946574b 100644
--- a/kernel/trace/rv/monitors/sco/sco.c
+++ b/kernel/trace/rv/monitors/sco/sco.c
@@ -83,3 +83,27 @@ module_exit(unregister_sco);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Gabriele Monaco <gmonaco@redhat.com>");
MODULE_DESCRIPTION("sco: scheduling context operations.");
+
+#ifdef CONFIG_RV_MONITORS_KUNIT_TEST
+void rv_test_sco(struct kunit *test);
+
+void rv_test_sco(struct kunit *test)
+{
+ struct task_struct *target;
+ struct rv_kunit_ctx *ctx = test->priv;
+
+ da_prepare_test(test, &rv_this);
+ target = kunit_kzalloc(test, sizeof(struct task_struct), GFP_KERNEL);
+ KUNIT_ASSERT_NOT_NULL(test, target);
+
+ /* Ensure we keep the same per-cpu monitor */
+ guard(migrate)();
+
+ /* Set state while scheduling */
+ handle_sched_set_state(NULL, target, TASK_INTERRUPTIBLE);
+ handle_schedule_entry(NULL, false);
+ RV_KUNIT_EXPECT_REACTION_HERE(test, ctx)
+ handle_sched_set_state(NULL, target, TASK_INTERRUPTIBLE);
+}
+EXPORT_SYMBOL_GPL(rv_test_sco);
+#endif
diff --git a/kernel/trace/rv/monitors/sssw/sssw.c b/kernel/trace/rv/monitors/sssw/sssw.c
index a91321c890cd..6d33b740474c 100644
--- a/kernel/trace/rv/monitors/sssw/sssw.c
+++ b/kernel/trace/rv/monitors/sssw/sssw.c
@@ -112,3 +112,32 @@ module_exit(unregister_sssw);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Gabriele Monaco <gmonaco@redhat.com>");
MODULE_DESCRIPTION("sssw: set state sleep and wakeup.");
+
+#ifdef CONFIG_RV_MONITORS_KUNIT_TEST
+void rv_test_sssw(struct kunit *test);
+
+void rv_test_sssw(struct kunit *test)
+{
+ struct task_struct *target, *other;
+ struct rv_kunit_ctx *ctx = test->priv;
+
+ da_prepare_test(test, &rv_this);
+ target = kunit_kzalloc(test, sizeof(struct task_struct), GFP_KERNEL);
+ KUNIT_ASSERT_NOT_NULL(test, target);
+ other = kunit_kzalloc(test, sizeof(struct task_struct), GFP_KERNEL);
+ KUNIT_ASSERT_NOT_NULL(test, other);
+
+ /* Suspend without setting to sleepable */
+ handle_sched_set_state(NULL, target, TASK_RUNNING);
+ RV_KUNIT_EXPECT_REACTION_HERE(test, ctx)
+ handle_sched_switch(NULL, 0, target, other, TASK_INTERRUPTIBLE);
+
+ /* Switch in after suspension without wakeup */
+ handle_sched_wakeup(NULL, target);
+ handle_sched_set_state(NULL, target, TASK_INTERRUPTIBLE);
+ handle_sched_switch(NULL, 0, target, other, TASK_INTERRUPTIBLE);
+ RV_KUNIT_EXPECT_REACTION_HERE(test, ctx)
+ handle_sched_switch(NULL, 0, other, target, TASK_RUNNING);
+}
+EXPORT_SYMBOL_GPL(rv_test_sssw);
+#endif
diff --git a/kernel/trace/rv/monitors/sts/sts.c b/kernel/trace/rv/monitors/sts/sts.c
index ce031cbf202a..587ec44fb509 100644
--- a/kernel/trace/rv/monitors/sts/sts.c
+++ b/kernel/trace/rv/monitors/sts/sts.c
@@ -152,3 +152,38 @@ module_exit(unregister_sts);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Gabriele Monaco <gmonaco@redhat.com>");
MODULE_DESCRIPTION("sts: schedule implies task switch.");
+
+#ifdef CONFIG_RV_MONITORS_KUNIT_TEST
+void rv_test_sts(struct kunit *test);
+
+void rv_test_sts(struct kunit *test)
+{
+ struct task_struct *target, *other;
+ struct rv_kunit_ctx *ctx = test->priv;
+
+ da_prepare_test(test, &rv_this);
+ target = kunit_kzalloc(test, sizeof(struct task_struct), GFP_KERNEL);
+ KUNIT_ASSERT_NOT_NULL(test, target);
+ other = kunit_kzalloc(test, sizeof(struct task_struct), GFP_KERNEL);
+ KUNIT_ASSERT_NOT_NULL(test, other);
+ /* Per-CPU monitor, make sure we don't change CPU mid-test */
+ guard(migrate)();
+
+ /* Switch without disabling interrupts */
+ handle_schedule_exit(NULL, false);
+ handle_schedule_entry(NULL, false);
+ RV_KUNIT_EXPECT_REACTION_HERE(test, ctx)
+ handle_sched_switch(NULL, 0, target, other, TASK_RUNNING);
+
+ handle_schedule_exit(NULL, false);
+
+ /* Schedule from interrupt context */
+ handle_schedule_entry(NULL, false);
+ handle_irq_disable(NULL, 0, 0);
+ handle_irq_entry(NULL, 0, NULL);
+ RV_KUNIT_EXPECT_REACTION_HERE(test, ctx)
+ handle_sched_switch(NULL, 0, target, other, TASK_RUNNING);
+ handle_irq_enable(NULL, 0, 0);
+}
+EXPORT_SYMBOL_GPL(rv_test_sts);
+#endif
diff --git a/kernel/trace/rv/rv_monitors_test.c b/kernel/trace/rv/rv_monitors_test.c
new file mode 100644
index 000000000000..5a12a109c1ed
--- /dev/null
+++ b/kernel/trace/rv/rv_monitors_test.c
@@ -0,0 +1,126 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2026-2029 Red Hat, Inc. Gabriele Monaco <gmonaco@redhat.com>
+ *
+ * RV monitor kunit tests:
+ * Tests the RV monitors by triggering fake events to verify monitor
+ * behavior and reactions. Tests start from the first defined event and
+ * trigger events in order to verify error detection.
+ */
+#include <rv/kunit.h>
+#include <kunit/static_stub.h>
+#include <kunit/test-bug.h>
+#include <linux/kernel.h>
+#include <linux/rv.h>
+#include "rv.h"
+
+__printf(2, 3)
+static void stub_rv_react(struct rv_monitor *monitor, const char *msg, ...)
+{
+ struct rv_kunit_ctx *ctx = kunit_get_current_test()->priv;
+
+ ++ctx->reactions;
+}
+
+static int stub_rv_get_task_monitor_slot(void)
+{
+ return 0;
+}
+
+static void stub_rv_put_task_monitor_slot(int slot)
+{
+}
+
+static int rv_mon_test_init(struct kunit *test)
+{
+ struct rv_kunit_ctx *ctx;
+
+ ctx = kunit_kzalloc(test, sizeof(*ctx), GFP_KERNEL);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, ctx);
+
+ test->priv = ctx;
+
+ __diag_push();
+ __diag_ignore(GCC, all, "-Wsuggest-attribute=format",
+ "Not a valid __printf() conversion candidate.");
+ kunit_activate_static_stub(test, rv_react, stub_rv_react);
+ __diag_pop();
+ kunit_activate_static_stub(test, rv_get_task_monitor_slot,
+ stub_rv_get_task_monitor_slot);
+ kunit_activate_static_stub(test, rv_put_task_monitor_slot,
+ stub_rv_put_task_monitor_slot);
+
+ return 0;
+}
+
+/*
+ * rv_set_testing - ensure mutual exclusion between KUnit tests and real monitors
+ *
+ * KUnit tests for RV monitors rely on stubs that are incompatible with
+ * the execution of real monitors. Ensure mutual exclusion by acquiring
+ * the rv_interface_lock for the duration of the suite.
+ *
+ * Returns 0 on success, -EBUSY if any real monitor is already enabled.
+ */
+static int rv_set_testing(struct kunit_suite *suite)
+{
+ struct rv_monitor *mon;
+
+ mutex_lock(&rv_interface_lock);
+
+ list_for_each_entry(mon, &rv_monitors_list, list) {
+ if (mon->enabled) {
+ mutex_unlock(&rv_interface_lock);
+ return -EBUSY;
+ }
+ }
+
+ rv_mon_test_running = true;
+
+ return 0;
+}
+
+/*
+ * rv_clear_testing - allow real monitors to run again after KUnit tests
+ */
+static void rv_clear_testing(struct kunit_suite *suite)
+{
+ mutex_unlock(&rv_interface_lock);
+}
+
+static void rv_test_stub(struct kunit *test)
+{
+ kunit_skip(test, "Monitor not enabled\n");
+}
+
+#define DECLARE_RV_TEST(name) \
+ void name(struct kunit *test) __weak __alias(rv_test_stub)
+
+DECLARE_RV_TEST(rv_test_sco);
+DECLARE_RV_TEST(rv_test_sssw);
+DECLARE_RV_TEST(rv_test_sts);
+DECLARE_RV_TEST(rv_test_opid);
+DECLARE_RV_TEST(rv_test_nomiss);
+
+static struct kunit_case rv_mon_test_cases[] = {
+ KUNIT_CASE(rv_test_sco),
+ KUNIT_CASE(rv_test_sssw),
+ KUNIT_CASE(rv_test_sts),
+ KUNIT_CASE(rv_test_opid),
+ KUNIT_CASE(rv_test_nomiss),
+ {}
+};
+
+static struct kunit_suite rv_mon_test_suite = {
+ .name = "rv_mon",
+ .suite_init = rv_set_testing,
+ .suite_exit = rv_clear_testing,
+ .init = rv_mon_test_init,
+ .test_cases = rv_mon_test_cases,
+};
+
+kunit_test_suites(&rv_mon_test_suite);
+
+MODULE_AUTHOR("Gabriele Monaco <gmonaco@redhat.com>");
+MODULE_DESCRIPTION("RV monitor kunit tests: test monitors by triggering reactions");
+MODULE_LICENSE("GPL");
--
2.54.0
* [PATCH v2 10/14] rv: Add KUnit stub to rv_react() and rv_*_task_monitor_slot()
From: Gabriele Monaco @ 2026-05-14 15:20 UTC
To: linux-kernel, linux-trace-kernel, Steven Rostedt, Gabriele Monaco,
Masami Hiramatsu
Cc: Nam Cao, Thomas Weissschuh, Tomas Glozar, John Kacur, Wen Yang
In-Reply-To: <20260514152055.229162-1-gmonaco@redhat.com>
Add KUNIT_STATIC_STUB_REDIRECT to allow those functions to be stubbed in
a KUnit test. This is useful to catch reactions without creating a custom
reactor and going through the effort of setting it from a test.
rv_{get/put}_task_monitor_slot() rely on a lock, but this isn't
necessary during a unit test, so simply skip the calls.
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
---
kernel/trace/rv/rv.c | 5 +++++
kernel/trace/rv/rv_reactors.c | 7 +++++++
2 files changed, 12 insertions(+)
diff --git a/kernel/trace/rv/rv.c b/kernel/trace/rv/rv.c
index ee4e68102f17..f59385a24fa1 100644
--- a/kernel/trace/rv/rv.c
+++ b/kernel/trace/rv/rv.c
@@ -142,6 +142,7 @@
#include <linux/module.h>
#include <linux/init.h>
#include <linux/slab.h>
+#include <kunit/static_stub.h>
#ifdef CONFIG_RV_MON_EVENTS
#define CREATE_TRACE_POINTS
@@ -171,6 +172,8 @@ int rv_get_task_monitor_slot(void)
{
int i;
+ KUNIT_STATIC_STUB_REDIRECT(rv_get_task_monitor_slot);
+
lockdep_assert_held(&rv_interface_lock);
if (task_monitor_count == CONFIG_RV_PER_TASK_MONITORS)
@@ -192,6 +195,8 @@ int rv_get_task_monitor_slot(void)
void rv_put_task_monitor_slot(int slot)
{
+ KUNIT_STATIC_STUB_REDIRECT(rv_put_task_monitor_slot, slot);
+
lockdep_assert_held(&rv_interface_lock);
if (slot < 0 || slot >= CONFIG_RV_PER_TASK_MONITORS) {
diff --git a/kernel/trace/rv/rv_reactors.c b/kernel/trace/rv/rv_reactors.c
index 460af07f7aba..3435dcedc7ee 100644
--- a/kernel/trace/rv/rv_reactors.c
+++ b/kernel/trace/rv/rv_reactors.c
@@ -63,6 +63,7 @@
#include <linux/lockdep.h>
#include <linux/slab.h>
+#include <kunit/static_stub.h>
#include "rv.h"
@@ -468,6 +469,12 @@ void rv_react(struct rv_monitor *monitor, const char *msg, ...)
static DEFINE_WAIT_OVERRIDE_MAP(rv_react_map, LD_WAIT_FREE);
va_list args;
+ __diag_push();
+ __diag_ignore(GCC, all, "-Wsuggest-attribute=format",
+ "Not a valid __printf() conversion candidate.");
+ KUNIT_STATIC_STUB_REDIRECT(rv_react, monitor, msg);
+ __diag_pop();
+
if (!rv_reacting_on() || !monitor->react)
return;
--
2.54.0
* [PATCH v2 09/14] verification/rvgen: Add selftests
From: Gabriele Monaco @ 2026-05-14 15:20 UTC
To: linux-kernel, linux-trace-kernel, Steven Rostedt, Gabriele Monaco
Cc: Nam Cao, Thomas Weissschuh, Tomas Glozar, John Kacur, Wen Yang
In-Reply-To: <20260514152055.229162-1-gmonaco@redhat.com>
The rvgen code generator needs validation to ensure it produces correct
monitor implementations from input specifications.
Add selftests with golden reference outputs covering all monitor classes
(DA, HA, LTL) and types (global, per_cpu, per_task, per_obj), including
optional features like descriptions and parent monitors. Container
generation and error handling (missing files, invalid specifications,
missing arguments) are also validated against expected output.
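The golden comparison described above can be sketched as a standalone
helper. This is only a minimal illustration, not the code added to
engine.sh: the function name compare_to_golden and its TAP-like output
are assumptions made for the example.

```bash
#!/bin/bash
# Hypothetical helper (not part of engine.sh): compare a generated
# directory against its golden reference and print a TAP-style result.
compare_to_golden() {
	local desc=$1
	local generated_dir=$2
	local golden_dir=$3
	local differences

	if [ ! -d "$generated_dir" ]; then
		echo "not ok - $desc"
		echo "# Generated directory not found: $generated_dir"
		return 1
	fi

	# diff -r exits non-zero when a file differs or exists on one side only
	if ! differences=$(diff -r "$generated_dir" "$golden_dir" 2>&1); then
		echo "not ok - $desc"
		# Prefix every diff line with "# " so it reads as a diagnostic
		echo "$differences" | sed 's/^/# /'
		return 1
	fi

	echo "ok - $desc"
}
```

A caller would run the generator first and then invoke, for instance,
compare_to_golden "DA global type" da_global tests/golden/da_global.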
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
---
tools/verification/rvgen/Makefile | 4 +
.../rvgen/tests/rvgen_container.t | 20 +++++
.../verification/rvgen/tests/rvgen_monitor.t | 87 +++++++++++++++++++
tools/verification/tests/engine.sh | 34 ++++++++
4 files changed, 145 insertions(+)
create mode 100644 tools/verification/rvgen/tests/rvgen_container.t
create mode 100644 tools/verification/rvgen/tests/rvgen_monitor.t
diff --git a/tools/verification/rvgen/Makefile b/tools/verification/rvgen/Makefile
index cfc4056c1e87..2a2b9e64ea42 100644
--- a/tools/verification/rvgen/Makefile
+++ b/tools/verification/rvgen/Makefile
@@ -13,6 +13,10 @@ all:
.PHONY: clean
clean:
+.PHONY: check
+check:
+ prove -o --directives -f tests/
+
.PHONY: install
install:
$(INSTALL) rvgen/automata.py -D -m 644 $(DESTDIR)$(PYLIB)/rvgen/automata.py
diff --git a/tools/verification/rvgen/tests/rvgen_container.t b/tools/verification/rvgen/tests/rvgen_container.t
new file mode 100644
index 000000000000..fa4fb3db8288
--- /dev/null
+++ b/tools/verification/rvgen/tests/rvgen_container.t
@@ -0,0 +1,20 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+source ../tests/engine.sh
+test_begin
+
+set_timeout 30s
+
+# Help tests
+check "verify container subcommand help" \
+ "$RVGEN container -h" 0 "model_name" "class"
+
+check_and_compare_folder "container with description" \
+ "$RVGEN container -n test_container -D 'Test container for grouping monitors'" \
+ "test_container" "Writing the monitor into the directory test_container"
+
+# Error handling tests
+check "missing required model_name" \
+ "$RVGEN container" 2 "the following arguments are required: -n/--model_name"
+
+test_end
diff --git a/tools/verification/rvgen/tests/rvgen_monitor.t b/tools/verification/rvgen/tests/rvgen_monitor.t
new file mode 100644
index 000000000000..261476504eee
--- /dev/null
+++ b/tools/verification/rvgen/tests/rvgen_monitor.t
@@ -0,0 +1,87 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+source ../tests/engine.sh
+test_begin
+
+set_timeout 30s
+
+# Help and basic tests
+check "verify help page" \
+ "$RVGEN --help" 0 "Generate kernel rv monitor"
+
+check "verify monitor subcommand help" \
+ "$RVGEN monitor --help" 0 "Monitor class"
+
+# DA monitor tests - test all monitor types
+check_and_compare_folder "DA per_cpu (default name)" \
+ "$RVGEN monitor -c da -s tests/specs/test_da.dot -t per_cpu" \
+ "test_da" "obj-\$(CONFIG_RV_MON_TEST_DA) += monitors/test_da/test_da.o"
+
+check_and_compare_folder "DA global type" \
+ "$RVGEN monitor -c da -s tests/specs/test_da.dot -t global -n da_global" \
+ "da_global" "DA_MON_EVENTS_IMPLICIT"
+
+check_and_compare_folder "DA per_task with description" \
+ "$RVGEN monitor -c da -s tests/specs/test_da2.dot -t per_task -n da_pertask_desc -D 'Custom description for testing'" \
+ "da_pertask_desc" "#include <monitors/da_pertask_desc/da_pertask_desc_trace.h>"
+
+check_and_compare_folder "DA per_obj with parent" \
+ "$RVGEN monitor -c da -s tests/specs/test_da2.dot -t per_obj -n da_perobj_parent -p parent_mon" \
+ "da_perobj_parent" "DA_MON_EVENTS_ID"
+
+# HA monitor tests
+check_and_compare_folder "HA per_task (default name)" \
+ "$RVGEN monitor -c ha -s tests/specs/test_ha.dot -t per_task" \
+ "test_ha" "HA_MON_EVENTS_ID"
+
+check_and_compare_folder "HA per_cpu type" \
+ "$RVGEN monitor -c ha -s tests/specs/test_ha.dot -t per_cpu -n ha_percpu" \
+ "ha_percpu" "HA_MON_EVENTS_IMPLICIT"
+
+# LTL monitor test
+check_and_compare_folder "LTL per_task" \
+ "$RVGEN monitor -c ltl -s tests/specs/test_ltl.ltl -t per_task -n ltl_pertask" \
+ "ltl_pertask" "source \"kernel/trace/rv/monitors/ltl_pertask/Kconfig\""
+
+check_and_compare_folder "LTL per_task with parent and description (default name)" \
+ "$RVGEN monitor -c ltl -s tests/specs/test_ltl.ltl -t per_task -p ltl_parent -D 'Simple description'" \
+ "test_ltl" "LTL_MON_EVENTS_ID"
+
+# Error handling tests
+check "missing required spec argument" \
+ "$RVGEN monitor -c da -t per_cpu" 2 \
+ "the following arguments are required: -s/--spec" "Traceback (most recent call last)"
+
+check "missing required monitor type" \
+ "$RVGEN monitor -c da -s tests/specs/test_da.dot" 2 \
+ "the following arguments are required: -t/--monitor_type" "Traceback (most recent call last)"
+
+check "missing required monitor class" \
+ "$RVGEN monitor -s tests/specs/test_da.dot -t per_cpu" 2 \
+ "the following arguments are required: -c/--class" "Traceback (most recent call last)"
+
+check "invalid monitor class" \
+ "$RVGEN monitor -c invalid -s tests/specs/test_da.dot -t per_cpu" 1 \
+ "Unknown monitor class" "Traceback (most recent call last)"
+
+check "missing dot file" \
+ "$RVGEN monitor -c da -s tests/specs/nonexistent.dot -t per_cpu" 1 \
+ "No such file or directory" "Traceback (most recent call last)"
+
+check "missing ltl file" \
+ "$RVGEN monitor -c ltl -s tests/specs/nonexistent.ltl -t per_task" 1 \
+ "No such file or directory" "Traceback (most recent call last)"
+
+check "invalid dot file syntax" \
+ "$RVGEN monitor -c da -s tests/specs/test_invalid.dot -t per_cpu" 1 \
+ "Not a valid .dot format" "Traceback (most recent call last)"
+
+check "invalid ha file syntax" \
+ "$RVGEN monitor -c ha -s tests/specs/test_invalid_ha.dot -t per_obj" 1 \
+ "Unrecognised event constraint" "Traceback (most recent call last)"
+
+check "invalid ltl file syntax" \
+ "$RVGEN monitor -c ltl -s tests/specs/test_invalid.ltl -t per_task" 1 \
+ "Illegal character 'i'" "Traceback (most recent call last)"
+
+test_end
diff --git a/tools/verification/tests/engine.sh b/tools/verification/tests/engine.sh
index 76cc254ff94c..f86d44460895 100644
--- a/tools/verification/tests/engine.sh
+++ b/tools/verification/tests/engine.sh
@@ -5,6 +5,8 @@ test_begin() {
# included correctly.
ctr=0
[ -z "$RV" ] && RV="../rv/rv"
+ [ -z "$RVGEN" ] && RVGEN="python3 ../rvgen"
+ [ -z "$GOLDEN_DIR" ] && GOLDEN_DIR="tests/golden"
[ -n "$TEST_COUNT" ] && echo "1..$TEST_COUNT"
}
@@ -109,6 +111,38 @@ check_if_exists() {
fi
}
+check_and_compare_folder() {
+ # Run command, compare generated folder to golden, and cleanup
+ local desc=$1
+ local command=$2
+ local generated_dir=$3
+ local expected_output=$4
+ local unexpected_output=$5
+ local golden_dir="$GOLDEN_DIR/$generated_dir"
+
+ ctr=$((ctr + 1))
+ if [ -n "$TEST_COUNT" ]; then
+ rm -rf "$generated_dir"
+ _check "$desc" "$command" 0 "$expected_output" "$unexpected_output"
+
+ if [ "$fail" -eq 0 ] && [ ! -d "$generated_dir" ]; then
+ failure "# Generated directory not found: $generated_dir"
+ fi
+
+ if [ "$fail" -ne 0 ]; then
+ :
+ elif ! diff -r "$generated_dir" "$golden_dir" &> /dev/null; then
+ failure "# Directories differ:"
+ failbuf+=$(diff -r "$generated_dir" "$golden_dir" 2>&1 | sed 's/^/# /')
+ failbuf+=$'\n'
+ fi
+
+ report "$1"
+
+ rm -rf "$generated_dir"
+ fi
+}
+
set_timeout() {
TIMEOUT="timeout -v -k 15s $1"
}
--
2.54.0
* [PATCH v2 08/14] verification/rvgen: Add golden and spec folders for tests
From: Gabriele Monaco @ 2026-05-14 15:20 UTC
To: linux-kernel, linux-trace-kernel, Steven Rostedt, Gabriele Monaco
Cc: Nam Cao, Thomas Weissschuh, Tomas Glozar, John Kacur, Wen Yang
In-Reply-To: <20260514152055.229162-1-gmonaco@redhat.com>
Add reference model specifications and the corresponding generated files
in the golden folder. Automated tests can use them as a reference to
validate that rvgen still generates the expected files.
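As a rough illustration of how such a reference folder might be
sanity-checked, the sketch below verifies that a golden directory holds
the files common to every entry in the diffstat (Kconfig, <name>.c and
<name>.h). The helper check_golden_layout is hypothetical and not part
of this series.

```bash
#!/bin/bash
# Hypothetical sanity check (not part of the series): verify that a golden
# directory contains Kconfig plus the .c and .h files named after it.
check_golden_layout() {
	local dir=$1
	local name
	local file
	local missing=0

	# The directory name doubles as the monitor (or container) name
	name=$(basename "$dir")
	for file in Kconfig "$name.c" "$name.h"; do
		if [ ! -f "$dir/$file" ]; then
			echo "missing: $dir/$file"
			missing=1
		fi
	done
	return "$missing"
}
```

Monitor folders also carry a <name>_trace.h file, which the container
folder does not, so the sketch leaves that file out of the common set.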
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
---
.../rvgen/tests/golden/da_global/Kconfig | 9 +
.../rvgen/tests/golden/da_global/da_global.c | 95 +++++++
.../rvgen/tests/golden/da_global/da_global.h | 47 ++++
.../tests/golden/da_global/da_global_trace.h | 15 ++
.../tests/golden/da_perobj_parent/Kconfig | 11 +
.../da_perobj_parent/da_perobj_parent.c | 110 ++++++++
.../da_perobj_parent/da_perobj_parent.h | 64 +++++
.../da_perobj_parent/da_perobj_parent_trace.h | 15 ++
.../tests/golden/da_pertask_desc/Kconfig | 9 +
.../golden/da_pertask_desc/da_pertask_desc.c | 105 ++++++++
.../golden/da_pertask_desc/da_pertask_desc.h | 64 +++++
.../da_pertask_desc/da_pertask_desc_trace.h | 15 ++
.../rvgen/tests/golden/ha_percpu/Kconfig | 9 +
.../rvgen/tests/golden/ha_percpu/ha_percpu.c | 244 +++++++++++++++++
.../rvgen/tests/golden/ha_percpu/ha_percpu.h | 72 +++++
.../tests/golden/ha_percpu/ha_percpu_trace.h | 19 ++
.../rvgen/tests/golden/ltl_pertask/Kconfig | 9 +
.../tests/golden/ltl_pertask/ltl_pertask.c | 107 ++++++++
.../tests/golden/ltl_pertask/ltl_pertask.h | 108 ++++++++
.../golden/ltl_pertask/ltl_pertask_trace.h | 14 +
.../rvgen/tests/golden/test_container/Kconfig | 5 +
.../golden/test_container/test_container.c | 35 +++
.../golden/test_container/test_container.h | 3 +
.../rvgen/tests/golden/test_da/Kconfig | 9 +
.../rvgen/tests/golden/test_da/test_da.c | 95 +++++++
.../rvgen/tests/golden/test_da/test_da.h | 47 ++++
.../tests/golden/test_da/test_da_trace.h | 15 ++
.../rvgen/tests/golden/test_ha/Kconfig | 9 +
.../rvgen/tests/golden/test_ha/test_ha.c | 247 ++++++++++++++++++
.../rvgen/tests/golden/test_ha/test_ha.h | 72 +++++
.../tests/golden/test_ha/test_ha_trace.h | 19 ++
.../rvgen/tests/golden/test_ltl/Kconfig | 11 +
.../rvgen/tests/golden/test_ltl/test_ltl.c | 108 ++++++++
.../rvgen/tests/golden/test_ltl/test_ltl.h | 108 ++++++++
.../tests/golden/test_ltl/test_ltl_trace.h | 14 +
.../rvgen/tests/specs/test_da.dot | 16 ++
.../rvgen/tests/specs/test_da2.dot | 19 ++
.../rvgen/tests/specs/test_ha.dot | 27 ++
.../rvgen/tests/specs/test_invalid.dot | 8 +
.../rvgen/tests/specs/test_invalid.ltl | 1 +
.../rvgen/tests/specs/test_invalid_ha.dot | 16 ++
.../rvgen/tests/specs/test_ltl.ltl | 1 +
42 files changed, 2026 insertions(+)
create mode 100644 tools/verification/rvgen/tests/golden/da_global/Kconfig
create mode 100644 tools/verification/rvgen/tests/golden/da_global/da_global.c
create mode 100644 tools/verification/rvgen/tests/golden/da_global/da_global.h
create mode 100644 tools/verification/rvgen/tests/golden/da_global/da_global_trace.h
create mode 100644 tools/verification/rvgen/tests/golden/da_perobj_parent/Kconfig
create mode 100644 tools/verification/rvgen/tests/golden/da_perobj_parent/da_perobj_parent.c
create mode 100644 tools/verification/rvgen/tests/golden/da_perobj_parent/da_perobj_parent.h
create mode 100644 tools/verification/rvgen/tests/golden/da_perobj_parent/da_perobj_parent_trace.h
create mode 100644 tools/verification/rvgen/tests/golden/da_pertask_desc/Kconfig
create mode 100644 tools/verification/rvgen/tests/golden/da_pertask_desc/da_pertask_desc.c
create mode 100644 tools/verification/rvgen/tests/golden/da_pertask_desc/da_pertask_desc.h
create mode 100644 tools/verification/rvgen/tests/golden/da_pertask_desc/da_pertask_desc_trace.h
create mode 100644 tools/verification/rvgen/tests/golden/ha_percpu/Kconfig
create mode 100644 tools/verification/rvgen/tests/golden/ha_percpu/ha_percpu.c
create mode 100644 tools/verification/rvgen/tests/golden/ha_percpu/ha_percpu.h
create mode 100644 tools/verification/rvgen/tests/golden/ha_percpu/ha_percpu_trace.h
create mode 100644 tools/verification/rvgen/tests/golden/ltl_pertask/Kconfig
create mode 100644 tools/verification/rvgen/tests/golden/ltl_pertask/ltl_pertask.c
create mode 100644 tools/verification/rvgen/tests/golden/ltl_pertask/ltl_pertask.h
create mode 100644 tools/verification/rvgen/tests/golden/ltl_pertask/ltl_pertask_trace.h
create mode 100644 tools/verification/rvgen/tests/golden/test_container/Kconfig
create mode 100644 tools/verification/rvgen/tests/golden/test_container/test_container.c
create mode 100644 tools/verification/rvgen/tests/golden/test_container/test_container.h
create mode 100644 tools/verification/rvgen/tests/golden/test_da/Kconfig
create mode 100644 tools/verification/rvgen/tests/golden/test_da/test_da.c
create mode 100644 tools/verification/rvgen/tests/golden/test_da/test_da.h
create mode 100644 tools/verification/rvgen/tests/golden/test_da/test_da_trace.h
create mode 100644 tools/verification/rvgen/tests/golden/test_ha/Kconfig
create mode 100644 tools/verification/rvgen/tests/golden/test_ha/test_ha.c
create mode 100644 tools/verification/rvgen/tests/golden/test_ha/test_ha.h
create mode 100644 tools/verification/rvgen/tests/golden/test_ha/test_ha_trace.h
create mode 100644 tools/verification/rvgen/tests/golden/test_ltl/Kconfig
create mode 100644 tools/verification/rvgen/tests/golden/test_ltl/test_ltl.c
create mode 100644 tools/verification/rvgen/tests/golden/test_ltl/test_ltl.h
create mode 100644 tools/verification/rvgen/tests/golden/test_ltl/test_ltl_trace.h
create mode 100644 tools/verification/rvgen/tests/specs/test_da.dot
create mode 100644 tools/verification/rvgen/tests/specs/test_da2.dot
create mode 100644 tools/verification/rvgen/tests/specs/test_ha.dot
create mode 100644 tools/verification/rvgen/tests/specs/test_invalid.dot
create mode 100644 tools/verification/rvgen/tests/specs/test_invalid.ltl
create mode 100644 tools/verification/rvgen/tests/specs/test_invalid_ha.dot
create mode 100644 tools/verification/rvgen/tests/specs/test_ltl.ltl
diff --git a/tools/verification/rvgen/tests/golden/da_global/Kconfig b/tools/verification/rvgen/tests/golden/da_global/Kconfig
new file mode 100644
index 000000000000..799fbf11c3ac
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/da_global/Kconfig
@@ -0,0 +1,9 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+config RV_MON_DA_GLOBAL
+ depends on RV
+ # XXX: add dependencies if there
+ select DA_MON_EVENTS_IMPLICIT
+ bool "da_global monitor"
+ help
+ auto-generated
diff --git a/tools/verification/rvgen/tests/golden/da_global/da_global.c b/tools/verification/rvgen/tests/golden/da_global/da_global.c
new file mode 100644
index 000000000000..ad4b939d2323
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/da_global/da_global.c
@@ -0,0 +1,95 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/ftrace.h>
+#include <linux/tracepoint.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/rv.h>
+#include <rv/instrumentation.h>
+
+#define MODULE_NAME "da_global"
+
+/*
+ * XXX: include required tracepoint headers, e.g.,
+ * #include <trace/events/sched.h>
+ */
+#include <rv_trace.h>
+
+/*
+ * This is the self-generated part of the monitor. Generally, there is no need
+ * to touch this section.
+ */
+#define RV_MON_TYPE RV_MON_GLOBAL
+#include "da_global.h"
+#include <rv/da_monitor.h>
+
+/*
+ * This is the instrumentation part of the monitor.
+ *
+ * This is the section where manual work is required. Here the kernel events
+ * are translated into model's event.
+ *
+ */
+static void handle_event_1(void *data, /* XXX: fill header */)
+{
+ da_handle_event(event_1_da_global);
+}
+
+static void handle_event_2(void *data, /* XXX: fill header */)
+{
+ /* XXX: validate that this event always leads to the initial state */
+ da_handle_start_event(event_2_da_global);
+}
+
+static int enable_da_global(void)
+{
+ int retval;
+
+ retval = da_monitor_init();
+ if (retval)
+ return retval;
+
+ rv_attach_trace_probe("da_global", /* XXX: tracepoint */, handle_event_1);
+ rv_attach_trace_probe("da_global", /* XXX: tracepoint */, handle_event_2);
+
+ return 0;
+}
+
+static void disable_da_global(void)
+{
+ rv_this.enabled = 0;
+
+ rv_detach_trace_probe("da_global", /* XXX: tracepoint */, handle_event_1);
+ rv_detach_trace_probe("da_global", /* XXX: tracepoint */, handle_event_2);
+
+ da_monitor_destroy();
+}
+
+/*
+ * This is the monitor register section.
+ */
+static struct rv_monitor rv_this = {
+ .name = "da_global",
+ .description = "auto-generated",
+ .enable = enable_da_global,
+ .disable = disable_da_global,
+ .reset = da_monitor_reset_all,
+ .enabled = 0,
+};
+
+static int __init register_da_global(void)
+{
+ return rv_register_monitor(&rv_this, NULL);
+}
+
+static void __exit unregister_da_global(void)
+{
+ rv_unregister_monitor(&rv_this);
+}
+
+module_init(register_da_global);
+module_exit(unregister_da_global);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("dot2k: auto-generated");
+MODULE_DESCRIPTION("da_global: auto-generated");
diff --git a/tools/verification/rvgen/tests/golden/da_global/da_global.h b/tools/verification/rvgen/tests/golden/da_global/da_global.h
new file mode 100644
index 000000000000..40b1f1c0c681
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/da_global/da_global.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Automatically generated C representation of da_global automaton
+ * For further information about this format, see kernel documentation:
+ * Documentation/trace/rv/deterministic_automata.rst
+ */
+
+#define MONITOR_NAME da_global
+
+enum states_da_global {
+ state_a_da_global,
+ state_b_da_global,
+ state_max_da_global,
+};
+
+#define INVALID_STATE state_max_da_global
+
+enum events_da_global {
+ event_1_da_global,
+ event_2_da_global,
+ event_max_da_global,
+};
+
+struct automaton_da_global {
+ char *state_names[state_max_da_global];
+ char *event_names[event_max_da_global];
+ unsigned char function[state_max_da_global][event_max_da_global];
+ unsigned char initial_state;
+ bool final_states[state_max_da_global];
+};
+
+static const struct automaton_da_global automaton_da_global = {
+ .state_names = {
+ "state_a",
+ "state_b",
+ },
+ .event_names = {
+ "event_1",
+ "event_2",
+ },
+ .function = {
+ { state_b_da_global, state_a_da_global },
+ { INVALID_STATE, state_a_da_global },
+ },
+ .initial_state = state_a_da_global,
+ .final_states = { 1, 0 },
+};
diff --git a/tools/verification/rvgen/tests/golden/da_global/da_global_trace.h b/tools/verification/rvgen/tests/golden/da_global/da_global_trace.h
new file mode 100644
index 000000000000..4d2730b71dd0
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/da_global/da_global_trace.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Snippet to be included in rv_trace.h
+ */
+
+#ifdef CONFIG_RV_MON_DA_GLOBAL
+DEFINE_EVENT(event_da_monitor, event_da_global,
+ TP_PROTO(char *state, char *event, char *next_state, bool final_state),
+ TP_ARGS(state, event, next_state, final_state));
+
+DEFINE_EVENT(error_da_monitor, error_da_global,
+ TP_PROTO(char *state, char *event),
+ TP_ARGS(state, event));
+#endif /* CONFIG_RV_MON_DA_GLOBAL */
diff --git a/tools/verification/rvgen/tests/golden/da_perobj_parent/Kconfig b/tools/verification/rvgen/tests/golden/da_perobj_parent/Kconfig
new file mode 100644
index 000000000000..249ba3aee8d7
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/da_perobj_parent/Kconfig
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+config RV_MON_DA_PEROBJ_PARENT
+ depends on RV
+ # XXX: add dependencies if there
+ depends on RV_MON_PARENT_MON
+ default y
+ select DA_MON_EVENTS_ID
+ bool "da_perobj_parent monitor"
+ help
+ auto-generated
diff --git a/tools/verification/rvgen/tests/golden/da_perobj_parent/da_perobj_parent.c b/tools/verification/rvgen/tests/golden/da_perobj_parent/da_perobj_parent.c
new file mode 100644
index 000000000000..66f3a010876a
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/da_perobj_parent/da_perobj_parent.c
@@ -0,0 +1,110 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/ftrace.h>
+#include <linux/tracepoint.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/rv.h>
+#include <rv/instrumentation.h>
+
+#define MODULE_NAME "da_perobj_parent"
+
+/*
+ * XXX: include required tracepoint headers, e.g.,
+ * #include <trace/events/sched.h>
+ */
+#include <rv_trace.h>
+#include <monitors/parent_mon/parent_mon.h>
+
+/*
+ * This is the self-generated part of the monitor. Generally, there is no need
+ * to touch this section.
+ */
+#define RV_MON_TYPE RV_MON_PER_OBJ
+typedef /* XXX: define the target type */ *monitor_target;
+#include "da_perobj_parent.h"
+#include <rv/da_monitor.h>
+
+/*
+ * This is the instrumentation part of the monitor.
+ *
+ * This is the section where manual work is required. Here the kernel events
+ * are translated into model's event.
+ *
+ */
+static void handle_event_1(void *data, /* XXX: fill header */)
+{
+ /* XXX: validate that this event is only valid in the initial state */
+ int id = /* XXX: how do I get the id? */;
+ monitor_target t = /* XXX: how do I get t? */;
+ da_handle_start_run_event(id, t, event_1_da_perobj_parent);
+}
+
+static void handle_event_2(void *data, /* XXX: fill header */)
+{
+ int id = /* XXX: how do I get the id? */;
+ monitor_target t = /* XXX: how do I get t? */;
+ da_handle_event(id, t, event_2_da_perobj_parent);
+}
+
+static void handle_event_3(void *data, /* XXX: fill header */)
+{
+ int id = /* XXX: how do I get the id? */;
+ monitor_target t = /* XXX: how do I get t? */;
+ da_handle_event(id, t, event_3_da_perobj_parent);
+}
+
+static int enable_da_perobj_parent(void)
+{
+ int retval;
+
+ retval = da_monitor_init();
+ if (retval)
+ return retval;
+
+ rv_attach_trace_probe("da_perobj_parent", /* XXX: tracepoint */, handle_event_1);
+ rv_attach_trace_probe("da_perobj_parent", /* XXX: tracepoint */, handle_event_2);
+ rv_attach_trace_probe("da_perobj_parent", /* XXX: tracepoint */, handle_event_3);
+
+ return 0;
+}
+
+static void disable_da_perobj_parent(void)
+{
+ rv_this.enabled = 0;
+
+ rv_detach_trace_probe("da_perobj_parent", /* XXX: tracepoint */, handle_event_1);
+ rv_detach_trace_probe("da_perobj_parent", /* XXX: tracepoint */, handle_event_2);
+ rv_detach_trace_probe("da_perobj_parent", /* XXX: tracepoint */, handle_event_3);
+
+ da_monitor_destroy();
+}
+
+/*
+ * This is the monitor register section.
+ */
+static struct rv_monitor rv_this = {
+ .name = "da_perobj_parent",
+ .description = "auto-generated",
+ .enable = enable_da_perobj_parent,
+ .disable = disable_da_perobj_parent,
+ .reset = da_monitor_reset_all,
+ .enabled = 0,
+};
+
+static int __init register_da_perobj_parent(void)
+{
+ return rv_register_monitor(&rv_this, &rv_parent_mon);
+}
+
+static void __exit unregister_da_perobj_parent(void)
+{
+ rv_unregister_monitor(&rv_this);
+}
+
+module_init(register_da_perobj_parent);
+module_exit(unregister_da_perobj_parent);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("dot2k: auto-generated");
+MODULE_DESCRIPTION("da_perobj_parent: auto-generated");
diff --git a/tools/verification/rvgen/tests/golden/da_perobj_parent/da_perobj_parent.h b/tools/verification/rvgen/tests/golden/da_perobj_parent/da_perobj_parent.h
new file mode 100644
index 000000000000..3c8dc3b22443
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/da_perobj_parent/da_perobj_parent.h
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Automatically generated C representation of da_perobj_parent automaton
+ * For further information about this format, see kernel documentation:
+ * Documentation/trace/rv/deterministic_automata.rst
+ */
+
+#define MONITOR_NAME da_perobj_parent
+
+enum states_da_perobj_parent {
+ state_a_da_perobj_parent,
+ state_b_da_perobj_parent,
+ state_c_da_perobj_parent,
+ state_max_da_perobj_parent,
+};
+
+#define INVALID_STATE state_max_da_perobj_parent
+
+enum events_da_perobj_parent {
+ event_1_da_perobj_parent,
+ event_2_da_perobj_parent,
+ event_3_da_perobj_parent,
+ event_max_da_perobj_parent,
+};
+
+struct automaton_da_perobj_parent {
+ char *state_names[state_max_da_perobj_parent];
+ char *event_names[event_max_da_perobj_parent];
+ unsigned char function[state_max_da_perobj_parent][event_max_da_perobj_parent];
+ unsigned char initial_state;
+ bool final_states[state_max_da_perobj_parent];
+};
+
+static const struct automaton_da_perobj_parent automaton_da_perobj_parent = {
+ .state_names = {
+ "state_a",
+ "state_b",
+ "state_c",
+ },
+ .event_names = {
+ "event_1",
+ "event_2",
+ "event_3",
+ },
+ .function = {
+ {
+ state_b_da_perobj_parent,
+ state_c_da_perobj_parent,
+ INVALID_STATE,
+ },
+ {
+ INVALID_STATE,
+ state_a_da_perobj_parent,
+ state_c_da_perobj_parent,
+ },
+ {
+ INVALID_STATE,
+ INVALID_STATE,
+ INVALID_STATE,
+ },
+ },
+ .initial_state = state_a_da_perobj_parent,
+ .final_states = { 1, 0, 0 },
+};
diff --git a/tools/verification/rvgen/tests/golden/da_perobj_parent/da_perobj_parent_trace.h b/tools/verification/rvgen/tests/golden/da_perobj_parent/da_perobj_parent_trace.h
new file mode 100644
index 000000000000..59bfca8f73d2
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/da_perobj_parent/da_perobj_parent_trace.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Snippet to be included in rv_trace.h
+ */
+
+#ifdef CONFIG_RV_MON_DA_PEROBJ_PARENT
+DEFINE_EVENT(event_da_monitor_id, event_da_perobj_parent,
+ TP_PROTO(int id, char *state, char *event, char *next_state, bool final_state),
+ TP_ARGS(id, state, event, next_state, final_state));
+
+DEFINE_EVENT(error_da_monitor_id, error_da_perobj_parent,
+ TP_PROTO(int id, char *state, char *event),
+ TP_ARGS(id, state, event));
+#endif /* CONFIG_RV_MON_DA_PEROBJ_PARENT */
diff --git a/tools/verification/rvgen/tests/golden/da_pertask_desc/Kconfig b/tools/verification/rvgen/tests/golden/da_pertask_desc/Kconfig
new file mode 100644
index 000000000000..c6f350179098
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/da_pertask_desc/Kconfig
@@ -0,0 +1,9 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+config RV_MON_DA_PERTASK_DESC
+ depends on RV
+ # XXX: add dependencies if there
+ select DA_MON_EVENTS_ID
+ bool "da_pertask_desc monitor"
+ help
+ Custom description for testing
diff --git a/tools/verification/rvgen/tests/golden/da_pertask_desc/da_pertask_desc.c b/tools/verification/rvgen/tests/golden/da_pertask_desc/da_pertask_desc.c
new file mode 100644
index 000000000000..bd76ecc3a998
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/da_pertask_desc/da_pertask_desc.c
@@ -0,0 +1,105 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/ftrace.h>
+#include <linux/tracepoint.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/rv.h>
+#include <rv/instrumentation.h>
+
+#define MODULE_NAME "da_pertask_desc"
+
+/*
+ * XXX: include required tracepoint headers, e.g.,
+ * #include <trace/events/sched.h>
+ */
+#include <rv_trace.h>
+
+/*
+ * This is the self-generated part of the monitor. Generally, there is no need
+ * to touch this section.
+ */
+#define RV_MON_TYPE RV_MON_PER_TASK
+#include "da_pertask_desc.h"
+#include <rv/da_monitor.h>
+
+/*
+ * This is the instrumentation part of the monitor.
+ *
+ * This is the section where manual work is required. Here the kernel events
+ * are translated into model's event.
+ *
+ */
+static void handle_event_1(void *data, /* XXX: fill header */)
+{
+ /* XXX: validate that this event is only valid in the initial state */
+ struct task_struct *p = /* XXX: how do I get p? */;
+ da_handle_start_run_event(p, event_1_da_pertask_desc);
+}
+
+static void handle_event_2(void *data, /* XXX: fill header */)
+{
+ struct task_struct *p = /* XXX: how do I get p? */;
+ da_handle_event(p, event_2_da_pertask_desc);
+}
+
+static void handle_event_3(void *data, /* XXX: fill header */)
+{
+ struct task_struct *p = /* XXX: how do I get p? */;
+ da_handle_event(p, event_3_da_pertask_desc);
+}
+
+static int enable_da_pertask_desc(void)
+{
+ int retval;
+
+ retval = da_monitor_init();
+ if (retval)
+ return retval;
+
+ rv_attach_trace_probe("da_pertask_desc", /* XXX: tracepoint */, handle_event_1);
+ rv_attach_trace_probe("da_pertask_desc", /* XXX: tracepoint */, handle_event_2);
+ rv_attach_trace_probe("da_pertask_desc", /* XXX: tracepoint */, handle_event_3);
+
+ return 0;
+}
+
+static void disable_da_pertask_desc(void)
+{
+ rv_this.enabled = 0;
+
+ rv_detach_trace_probe("da_pertask_desc", /* XXX: tracepoint */, handle_event_1);
+ rv_detach_trace_probe("da_pertask_desc", /* XXX: tracepoint */, handle_event_2);
+ rv_detach_trace_probe("da_pertask_desc", /* XXX: tracepoint */, handle_event_3);
+
+ da_monitor_destroy();
+}
+
+/*
+ * This is the monitor register section.
+ */
+static struct rv_monitor rv_this = {
+ .name = "da_pertask_desc",
+ .description = "Custom description for testing",
+ .enable = enable_da_pertask_desc,
+ .disable = disable_da_pertask_desc,
+ .reset = da_monitor_reset_all,
+ .enabled = 0,
+};
+
+static int __init register_da_pertask_desc(void)
+{
+ return rv_register_monitor(&rv_this, NULL);
+}
+
+static void __exit unregister_da_pertask_desc(void)
+{
+ rv_unregister_monitor(&rv_this);
+}
+
+module_init(register_da_pertask_desc);
+module_exit(unregister_da_pertask_desc);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("dot2k: auto-generated");
+MODULE_DESCRIPTION("da_pertask_desc: Custom description for testing");
diff --git a/tools/verification/rvgen/tests/golden/da_pertask_desc/da_pertask_desc.h b/tools/verification/rvgen/tests/golden/da_pertask_desc/da_pertask_desc.h
new file mode 100644
index 000000000000..837b238754b0
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/da_pertask_desc/da_pertask_desc.h
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Automatically generated C representation of da_pertask_desc automaton
+ * For further information about this format, see kernel documentation:
+ * Documentation/trace/rv/deterministic_automata.rst
+ */
+
+#define MONITOR_NAME da_pertask_desc
+
+enum states_da_pertask_desc {
+ state_a_da_pertask_desc,
+ state_b_da_pertask_desc,
+ state_c_da_pertask_desc,
+ state_max_da_pertask_desc,
+};
+
+#define INVALID_STATE state_max_da_pertask_desc
+
+enum events_da_pertask_desc {
+ event_1_da_pertask_desc,
+ event_2_da_pertask_desc,
+ event_3_da_pertask_desc,
+ event_max_da_pertask_desc,
+};
+
+struct automaton_da_pertask_desc {
+ char *state_names[state_max_da_pertask_desc];
+ char *event_names[event_max_da_pertask_desc];
+ unsigned char function[state_max_da_pertask_desc][event_max_da_pertask_desc];
+ unsigned char initial_state;
+ bool final_states[state_max_da_pertask_desc];
+};
+
+static const struct automaton_da_pertask_desc automaton_da_pertask_desc = {
+ .state_names = {
+ "state_a",
+ "state_b",
+ "state_c",
+ },
+ .event_names = {
+ "event_1",
+ "event_2",
+ "event_3",
+ },
+ .function = {
+ {
+ state_b_da_pertask_desc,
+ state_c_da_pertask_desc,
+ INVALID_STATE,
+ },
+ {
+ INVALID_STATE,
+ state_a_da_pertask_desc,
+ state_c_da_pertask_desc,
+ },
+ {
+ INVALID_STATE,
+ INVALID_STATE,
+ INVALID_STATE,
+ },
+ },
+ .initial_state = state_a_da_pertask_desc,
+ .final_states = { 1, 0, 0 },
+};
diff --git a/tools/verification/rvgen/tests/golden/da_pertask_desc/da_pertask_desc_trace.h b/tools/verification/rvgen/tests/golden/da_pertask_desc/da_pertask_desc_trace.h
new file mode 100644
index 000000000000..4e6086c4d86e
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/da_pertask_desc/da_pertask_desc_trace.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Snippet to be included in rv_trace.h
+ */
+
+#ifdef CONFIG_RV_MON_DA_PERTASK_DESC
+DEFINE_EVENT(event_da_monitor_id, event_da_pertask_desc,
+ TP_PROTO(int id, char *state, char *event, char *next_state, bool final_state),
+ TP_ARGS(id, state, event, next_state, final_state));
+
+DEFINE_EVENT(error_da_monitor_id, error_da_pertask_desc,
+ TP_PROTO(int id, char *state, char *event),
+ TP_ARGS(id, state, event));
+#endif /* CONFIG_RV_MON_DA_PERTASK_DESC */
diff --git a/tools/verification/rvgen/tests/golden/ha_percpu/Kconfig b/tools/verification/rvgen/tests/golden/ha_percpu/Kconfig
new file mode 100644
index 000000000000..0cc185ccfddf
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/ha_percpu/Kconfig
@@ -0,0 +1,9 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+config RV_MON_HA_PERCPU
+ depends on RV
+ # XXX: add dependencies if there
+ select HA_MON_EVENTS_IMPLICIT
+ bool "ha_percpu monitor"
+ help
+ auto-generated
diff --git a/tools/verification/rvgen/tests/golden/ha_percpu/ha_percpu.c b/tools/verification/rvgen/tests/golden/ha_percpu/ha_percpu.c
new file mode 100644
index 000000000000..ba7a02a18f81
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/ha_percpu/ha_percpu.c
@@ -0,0 +1,244 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/ftrace.h>
+#include <linux/tracepoint.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/rv.h>
+#include <rv/instrumentation.h>
+
+#define MODULE_NAME "ha_percpu"
+
+/*
+ * XXX: include required tracepoint headers, e.g.,
+ * #include <trace/events/sched.h>
+ */
+#include <rv_trace.h>
+
+/*
+ * This is the self-generated part of the monitor. Generally, there is no need
+ * to touch this section.
+ */
+#define RV_MON_TYPE RV_MON_PER_CPU
+/* XXX: If the monitor has several instances, consider HA_TIMER_WHEEL */
+#define HA_TIMER_TYPE HA_TIMER_HRTIMER
+#include "ha_percpu.h"
+#include <rv/ha_monitor.h>
+
+/*
+ * This is the instrumentation part of the monitor.
+ *
+ * This is the section where manual work is required. Here the kernel events
+ * are translated into model's event.
+ *
+ */
+#define BAR_NS(ha_mon) /* XXX: what is BAR_NS(ha_mon)? */
+
+#define FOO_NS /* XXX: what is FOO_NS? */
+
+static inline u64 bar_ns(struct ha_monitor *ha_mon)
+{
+ return /* XXX: what is bar_ns(ha_mon)? */;
+}
+
+static u64 foo_ns = /* XXX: default value */;
+module_param(foo_ns, ullong, 0644);
+
+/*
+ * These functions define how to read and reset the environment variable.
+ *
+ * Common environment variables like ns-based and jiffy-based clocks have
+ * pre-define getters and resetters you can use. The parser can infer the type
+ * of the environment variable if you supply a measure unit in the constraint.
+ * If you define your own functions, make sure to add appropriate memory
+ * barriers if required.
+ * Some environment variables don't require a storage as they read a system
+ * state (e.g. preemption count). Those variables are never reset, so we don't
+ * define a reset function on monitors only relying on this type of variables.
+ */
+static u64 ha_get_env(struct ha_monitor *ha_mon, enum envs_ha_percpu env, u64 time_ns)
+{
+ if (env == clk_ha_percpu)
+ return ha_get_clk_ns(ha_mon, env, time_ns);
+ else if (env == env1_ha_percpu)
+ return /* XXX: how do I read env1? */
+ else if (env == env2_ha_percpu)
+ return /* XXX: how do I read env2? */
+ return ENV_INVALID_VALUE;
+}
+
+static void ha_reset_env(struct ha_monitor *ha_mon, enum envs_ha_percpu env, u64 time_ns)
+{
+ if (env == clk_ha_percpu)
+ ha_reset_clk_ns(ha_mon, env, time_ns);
+}
+
+/*
+ * These functions are used to validate state transitions.
+ *
+ * They are generated by parsing the model, there is usually no need to change them.
+ * If the monitor requires a timer, there are functions responsible to arm it when
+ * the next state has a constraint, cancel it in any other case and to check
+ * that it didn't expire before the callback run. Transitions to the same state
+ * without a reset never affect timers.
+ * Due to the different representations between invariants and guards, there is
+ * a function to convert it in case invariants or guards are reachable from
+ * another invariant without reset. Those are not present if not required in
+ * the model. This is all automatic but is worth checking because it may show
+ * errors in the model (e.g. missing resets).
+ */
+static inline bool ha_verify_invariants(struct ha_monitor *ha_mon,
+ enum states curr_state, enum events event,
+ enum states next_state, u64 time_ns)
+{
+ if (curr_state == S0_ha_percpu)
+ return ha_check_invariant_ns(ha_mon, clk_ha_percpu, time_ns);
+ else if (curr_state == S2_ha_percpu)
+ return ha_check_invariant_ns(ha_mon, clk_ha_percpu, time_ns);
+ return true;
+}
+
+static inline void ha_convert_inv_guard(struct ha_monitor *ha_mon,
+ enum states curr_state, enum events event,
+ enum states next_state, u64 time_ns)
+{
+ if (curr_state == next_state)
+ return;
+ if (curr_state == S2_ha_percpu)
+ ha_inv_to_guard(ha_mon, clk_ha_percpu, BAR_NS(ha_mon), time_ns);
+}
+
+static inline bool ha_verify_guards(struct ha_monitor *ha_mon,
+ enum states curr_state, enum events event,
+ enum states next_state, u64 time_ns)
+{
+ bool res = true;
+
+ if (curr_state == S0_ha_percpu && event == event0_ha_percpu)
+ ha_reset_env(ha_mon, clk_ha_percpu, time_ns);
+ else if (curr_state == S0_ha_percpu && event == event1_ha_percpu)
+ ha_reset_env(ha_mon, clk_ha_percpu, time_ns);
+ else if (curr_state == S1_ha_percpu && event == event0_ha_percpu)
+ ha_reset_env(ha_mon, clk_ha_percpu, time_ns);
+ else if (curr_state == S1_ha_percpu && event == event2_ha_percpu) {
+ res = ha_get_env(ha_mon, env1_ha_percpu, time_ns) == 0ull;
+ ha_reset_env(ha_mon, clk_ha_percpu, time_ns);
+ } else if (curr_state == S2_ha_percpu && event == event1_ha_percpu)
+ res = ha_monitor_env_invalid(ha_mon, clk_ha_percpu) ||
+ ha_get_env(ha_mon, clk_ha_percpu, time_ns) < foo_ns;
+ else if (curr_state == S3_ha_percpu && event == event0_ha_percpu)
+ res = ha_monitor_env_invalid(ha_mon, clk_ha_percpu) ||
+ (ha_get_env(ha_mon, clk_ha_percpu, time_ns) < FOO_NS &&
+ ha_get_env(ha_mon, env2_ha_percpu, time_ns) == 0ull);
+ else if (curr_state == S3_ha_percpu && event == event1_ha_percpu) {
+ res = ha_monitor_env_invalid(ha_mon, clk_ha_percpu) ||
+ (ha_get_env(ha_mon, clk_ha_percpu, time_ns) < foo_ns &&
+ ha_get_env(ha_mon, env1_ha_percpu, time_ns) == 1ull);
+ ha_reset_env(ha_mon, clk_ha_percpu, time_ns);
+ }
+ return res;
+}
+
+static inline void ha_setup_invariants(struct ha_monitor *ha_mon,
+ enum states curr_state, enum events event,
+ enum states next_state, u64 time_ns)
+{
+ if (next_state == curr_state && event != event0_ha_percpu)
+ return;
+ if (next_state == S0_ha_percpu)
+ ha_start_timer_ns(ha_mon, clk_ha_percpu, bar_ns(ha_mon), time_ns);
+ else if (next_state == S2_ha_percpu)
+ ha_start_timer_ns(ha_mon, clk_ha_percpu, BAR_NS(ha_mon), time_ns);
+ else if (curr_state == S0_ha_percpu)
+ ha_cancel_timer(ha_mon);
+ else if (curr_state == S2_ha_percpu)
+ ha_cancel_timer(ha_mon);
+}
+
+static bool ha_verify_constraint(struct ha_monitor *ha_mon,
+ enum states curr_state, enum events event,
+ enum states next_state, u64 time_ns)
+{
+ if (!ha_verify_invariants(ha_mon, curr_state, event, next_state, time_ns))
+ return false;
+
+ ha_convert_inv_guard(ha_mon, curr_state, event, next_state, time_ns);
+
+ if (!ha_verify_guards(ha_mon, curr_state, event, next_state, time_ns))
+ return false;
+
+ ha_setup_invariants(ha_mon, curr_state, event, next_state, time_ns);
+
+ return true;
+}
+
+static void handle_event0(void *data, /* XXX: fill header */)
+{
+ /* XXX: validate that this event always leads to the initial state */
+ da_handle_start_event(event0_ha_percpu);
+}
+
+static void handle_event1(void *data, /* XXX: fill header */)
+{
+ da_handle_event(event1_ha_percpu);
+}
+
+static void handle_event2(void *data, /* XXX: fill header */)
+{
+ da_handle_event(event2_ha_percpu);
+}
+
+static int enable_ha_percpu(void)
+{
+ int retval;
+
+ retval = da_monitor_init();
+ if (retval)
+ return retval;
+
+ rv_attach_trace_probe("ha_percpu", /* XXX: tracepoint */, handle_event0);
+ rv_attach_trace_probe("ha_percpu", /* XXX: tracepoint */, handle_event1);
+ rv_attach_trace_probe("ha_percpu", /* XXX: tracepoint */, handle_event2);
+
+ return 0;
+}
+
+static void disable_ha_percpu(void)
+{
+ rv_this.enabled = 0;
+
+ rv_detach_trace_probe("ha_percpu", /* XXX: tracepoint */, handle_event0);
+ rv_detach_trace_probe("ha_percpu", /* XXX: tracepoint */, handle_event1);
+ rv_detach_trace_probe("ha_percpu", /* XXX: tracepoint */, handle_event2);
+
+ da_monitor_destroy();
+}
+
+/*
+ * This is the monitor register section.
+ */
+static struct rv_monitor rv_this = {
+ .name = "ha_percpu",
+ .description = "auto-generated",
+ .enable = enable_ha_percpu,
+ .disable = disable_ha_percpu,
+ .reset = da_monitor_reset_all,
+ .enabled = 0,
+};
+
+static int __init register_ha_percpu(void)
+{
+ return rv_register_monitor(&rv_this, NULL);
+}
+
+static void __exit unregister_ha_percpu(void)
+{
+ rv_unregister_monitor(&rv_this);
+}
+
+module_init(register_ha_percpu);
+module_exit(unregister_ha_percpu);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("dot2k: auto-generated");
+MODULE_DESCRIPTION("ha_percpu: auto-generated");
diff --git a/tools/verification/rvgen/tests/golden/ha_percpu/ha_percpu.h b/tools/verification/rvgen/tests/golden/ha_percpu/ha_percpu.h
new file mode 100644
index 000000000000..2538db4f6a26
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/ha_percpu/ha_percpu.h
@@ -0,0 +1,72 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Automatically generated C representation of ha_percpu automaton
+ * For further information about this format, see kernel documentation:
+ * Documentation/trace/rv/deterministic_automata.rst
+ */
+
+#define MONITOR_NAME ha_percpu
+
+enum states_ha_percpu {
+ S0_ha_percpu,
+ S1_ha_percpu,
+ S2_ha_percpu,
+ S3_ha_percpu,
+ state_max_ha_percpu,
+};
+
+#define INVALID_STATE state_max_ha_percpu
+
+enum events_ha_percpu {
+ event0_ha_percpu,
+ event1_ha_percpu,
+ event2_ha_percpu,
+ event_max_ha_percpu,
+};
+
+enum envs_ha_percpu {
+ clk_ha_percpu,
+ env1_ha_percpu,
+ env2_ha_percpu,
+ env_max_ha_percpu,
+ env_max_stored_ha_percpu = env1_ha_percpu,
+};
+
+_Static_assert(env_max_stored_ha_percpu <= MAX_HA_ENV_LEN, "Not enough slots");
+#define HA_CLK_NS
+
+struct automaton_ha_percpu {
+ char *state_names[state_max_ha_percpu];
+ char *event_names[event_max_ha_percpu];
+ char *env_names[env_max_ha_percpu];
+ unsigned char function[state_max_ha_percpu][event_max_ha_percpu];
+ unsigned char initial_state;
+ bool final_states[state_max_ha_percpu];
+};
+
+static const struct automaton_ha_percpu automaton_ha_percpu = {
+ .state_names = {
+ "S0",
+ "S1",
+ "S2",
+ "S3",
+ },
+ .event_names = {
+ "event0",
+ "event1",
+ "event2",
+ },
+ .env_names = {
+ "clk",
+ "env1",
+ "env2",
+ },
+ .function = {
+ { S0_ha_percpu, S1_ha_percpu, INVALID_STATE },
+ { S0_ha_percpu, INVALID_STATE, S2_ha_percpu },
+ { INVALID_STATE, S2_ha_percpu, S3_ha_percpu },
+ { S0_ha_percpu, S1_ha_percpu, INVALID_STATE },
+ },
+ .initial_state = S0_ha_percpu,
+ .final_states = { 1, 0, 0, 0 },
+};
diff --git a/tools/verification/rvgen/tests/golden/ha_percpu/ha_percpu_trace.h b/tools/verification/rvgen/tests/golden/ha_percpu/ha_percpu_trace.h
new file mode 100644
index 000000000000..074ddff6a60d
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/ha_percpu/ha_percpu_trace.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Snippet to be included in rv_trace.h
+ */
+
+#ifdef CONFIG_RV_MON_HA_PERCPU
+DEFINE_EVENT(event_da_monitor, event_ha_percpu,
+ TP_PROTO(char *state, char *event, char *next_state, bool final_state),
+ TP_ARGS(state, event, next_state, final_state));
+
+DEFINE_EVENT(error_da_monitor, error_ha_percpu,
+ TP_PROTO(char *state, char *event),
+ TP_ARGS(state, event));
+
+DEFINE_EVENT(error_env_da_monitor, error_env_ha_percpu,
+ TP_PROTO(char *state, char *event, char *env),
+ TP_ARGS(state, event, env));
+#endif /* CONFIG_RV_MON_HA_PERCPU */
diff --git a/tools/verification/rvgen/tests/golden/ltl_pertask/Kconfig b/tools/verification/rvgen/tests/golden/ltl_pertask/Kconfig
new file mode 100644
index 000000000000..b37f46670bfd
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/ltl_pertask/Kconfig
@@ -0,0 +1,9 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+config RV_MON_LTL_PERTASK
+ depends on RV
+ # XXX: add dependencies if there
+ select LTL_MON_EVENTS_ID
+ bool "ltl_pertask monitor"
+ help
+ auto-generated
diff --git a/tools/verification/rvgen/tests/golden/ltl_pertask/ltl_pertask.c b/tools/verification/rvgen/tests/golden/ltl_pertask/ltl_pertask.c
new file mode 100644
index 000000000000..1b6897200e4b
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/ltl_pertask/ltl_pertask.c
@@ -0,0 +1,107 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/ftrace.h>
+#include <linux/tracepoint.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/rv.h>
+#include <rv/instrumentation.h>
+
+#define MODULE_NAME "ltl_pertask"
+
+/*
+ * XXX: include required tracepoint headers, e.g.,
+ * #include <trace/events/sched.h>
+ */
+#include <rv_trace.h>
+
+
+/*
+ * This is the self-generated part of the monitor. Generally, there is no need
+ * to touch this section.
+ */
+#include "ltl_pertask.h"
+#include <rv/ltl_monitor.h>
+
+static void ltl_atoms_fetch(struct task_struct *task, struct ltl_monitor *mon)
+{
+ /*
+ * This is called everytime the Buchi automaton is triggered.
+ *
+ * This function could be used to fetch the atomic propositions which
+ * are expensive to trace. It is possible only if the atomic proposition
+ * does not need to be updated at precise time.
+ *
+ * It is recommended to use tracepoints and ltl_atom_update() instead.
+ */
+}
+
+static void ltl_atoms_init(struct task_struct *task, struct ltl_monitor *mon, bool task_creation)
+{
+ /*
+ * This should initialize as many atomic propositions as possible.
+ *
+ * @task_creation indicates whether the task is being created. This is
+ * false if the task is already running before the monitor is enabled.
+ */
+ ltl_atom_set(mon, LTL_EVENT_A, true/false);
+ ltl_atom_set(mon, LTL_EVENT_B, true/false);
+}
+
+/*
+ * This is the instrumentation part of the monitor.
+ *
+ * This is the section where manual work is required. Here the kernel events
+ * are translated into model's event.
+ */
+static void handle_example_event(void *data, /* XXX: fill header */)
+{
+ ltl_atom_update(task, LTL_EVENT_A, true/false);
+}
+
+static int enable_ltl_pertask(void)
+{
+ int retval;
+
+ retval = ltl_monitor_init();
+ if (retval)
+ return retval;
+
+ rv_attach_trace_probe("ltl_pertask", /* XXX: tracepoint */, handle_example_event);
+
+ return 0;
+}
+
+static void disable_ltl_pertask(void)
+{
+ rv_detach_trace_probe("ltl_pertask", /* XXX: tracepoint */, handle_sample_event);
+
+ ltl_monitor_destroy();
+}
+
+/*
+ * This is the monitor register section.
+ */
+static struct rv_monitor rv_ltl_pertask = {
+ .name = "ltl_pertask",
+ .description = "auto-generated",
+ .enable = enable_ltl_pertask,
+ .disable = disable_ltl_pertask,
+};
+
+static int __init register_ltl_pertask(void)
+{
+ return rv_register_monitor(&rv_ltl_pertask, NULL);
+}
+
+static void __exit unregister_ltl_pertask(void)
+{
+ rv_unregister_monitor(&rv_ltl_pertask);
+}
+
+module_init(register_ltl_pertask);
+module_exit(unregister_ltl_pertask);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR(/* TODO */);
+MODULE_DESCRIPTION("ltl_pertask: auto-generated");
diff --git a/tools/verification/rvgen/tests/golden/ltl_pertask/ltl_pertask.h b/tools/verification/rvgen/tests/golden/ltl_pertask/ltl_pertask.h
new file mode 100644
index 000000000000..7e5de351b8fa
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/ltl_pertask/ltl_pertask.h
@@ -0,0 +1,108 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * C implementation of Buchi automaton, automatically generated by
+ * tools/verification/rvgen from the linear temporal logic specification.
+ * For further information, see kernel documentation:
+ * Documentation/trace/rv/linear_temporal_logic.rst
+ */
+
+#include <linux/rv.h>
+
+#define MONITOR_NAME ltl_pertask
+
+enum ltl_atom {
+ LTL_EVENT_A,
+ LTL_EVENT_B,
+ LTL_NUM_ATOM
+};
+static_assert(LTL_NUM_ATOM <= RV_MAX_LTL_ATOM);
+
+static const char *ltl_atom_str(enum ltl_atom atom)
+{
+ static const char *const names[] = {
+ "ev_a",
+ "ev_b",
+ };
+
+ return names[atom];
+}
+
+enum ltl_buchi_state {
+ S0,
+ S1,
+ S2,
+ S3,
+ S4,
+ RV_NUM_BA_STATES
+};
+static_assert(RV_NUM_BA_STATES <= RV_MAX_BA_STATES);
+
+static void ltl_start(struct task_struct *task, struct ltl_monitor *mon)
+{
+ bool event_b = test_bit(LTL_EVENT_B, mon->atoms);
+ bool event_a = test_bit(LTL_EVENT_A, mon->atoms);
+ bool val1 = !event_a;
+
+ if (val1)
+ __set_bit(S0, mon->states);
+ if (true)
+ __set_bit(S1, mon->states);
+ if (event_b)
+ __set_bit(S4, mon->states);
+}
+
+static void
+ltl_possible_next_states(struct ltl_monitor *mon, unsigned int state, unsigned long *next)
+{
+ bool event_b = test_bit(LTL_EVENT_B, mon->atoms);
+ bool event_a = test_bit(LTL_EVENT_A, mon->atoms);
+ bool val1 = !event_a;
+
+ switch (state) {
+ case S0:
+ if (val1)
+ __set_bit(S0, next);
+ if (true)
+ __set_bit(S1, next);
+ if (event_b)
+ __set_bit(S4, next);
+ break;
+ case S1:
+ if (true)
+ __set_bit(S1, next);
+ if (true && val1)
+ __set_bit(S2, next);
+ if (event_b && val1)
+ __set_bit(S3, next);
+ if (event_b)
+ __set_bit(S4, next);
+ break;
+ case S2:
+ if (true)
+ __set_bit(S1, next);
+ if (true && val1)
+ __set_bit(S2, next);
+ if (event_b && val1)
+ __set_bit(S3, next);
+ if (event_b)
+ __set_bit(S4, next);
+ break;
+ case S3:
+ if (val1)
+ __set_bit(S0, next);
+ if (true)
+ __set_bit(S1, next);
+ if (event_b)
+ __set_bit(S4, next);
+ break;
+ case S4:
+ if (val1)
+ __set_bit(S0, next);
+ if (true)
+ __set_bit(S1, next);
+ if (event_b)
+ __set_bit(S4, next);
+ break;
+ }
+}
diff --git a/tools/verification/rvgen/tests/golden/ltl_pertask/ltl_pertask_trace.h b/tools/verification/rvgen/tests/golden/ltl_pertask/ltl_pertask_trace.h
new file mode 100644
index 000000000000..ebd53621a5b1
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/ltl_pertask/ltl_pertask_trace.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Snippet to be included in rv_trace.h
+ */
+
+#ifdef CONFIG_RV_MON_LTL_PERTASK
+DEFINE_EVENT(event_ltl_monitor_id, event_ltl_pertask,
+ TP_PROTO(struct task_struct *task, char *states, char *atoms, char *next),
+ TP_ARGS(task, states, atoms, next));
+DEFINE_EVENT(error_ltl_monitor_id, error_ltl_pertask,
+ TP_PROTO(struct task_struct *task),
+ TP_ARGS(task));
+#endif /* CONFIG_RV_MON_LTL_PERTASK */
diff --git a/tools/verification/rvgen/tests/golden/test_container/Kconfig b/tools/verification/rvgen/tests/golden/test_container/Kconfig
new file mode 100644
index 000000000000..2becb65dddad
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/test_container/Kconfig
@@ -0,0 +1,5 @@
+config RV_MON_TEST_CONTAINER
+ depends on RV
+ bool "test_container monitor"
+ help
+ Test container for grouping monitors
diff --git a/tools/verification/rvgen/tests/golden/test_container/test_container.c b/tools/verification/rvgen/tests/golden/test_container/test_container.c
new file mode 100644
index 000000000000..984e2eac7196
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/test_container/test_container.c
@@ -0,0 +1,35 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/rv.h>
+
+#define MODULE_NAME "test_container"
+
+#include "test_container.h"
+
+struct rv_monitor rv_test_container = {
+ .name = "test_container",
+ .description = "Test container for grouping monitors",
+ .enable = NULL,
+ .disable = NULL,
+ .reset = NULL,
+ .enabled = 0,
+};
+
+static int __init register_test_container(void)
+{
+ return rv_register_monitor(&rv_test_container, NULL);
+}
+
+static void __exit unregister_test_container(void)
+{
+ rv_unregister_monitor(&rv_test_container);
+}
+
+module_init(register_test_container);
+module_exit(unregister_test_container);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("dot2k: auto-generated");
+MODULE_DESCRIPTION("test_container: Test container for grouping monitors");
diff --git a/tools/verification/rvgen/tests/golden/test_container/test_container.h b/tools/verification/rvgen/tests/golden/test_container/test_container.h
new file mode 100644
index 000000000000..83e434432650
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/test_container/test_container.h
@@ -0,0 +1,3 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+extern struct rv_monitor rv_test_container;
diff --git a/tools/verification/rvgen/tests/golden/test_da/Kconfig b/tools/verification/rvgen/tests/golden/test_da/Kconfig
new file mode 100644
index 000000000000..0143a148ef34
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/test_da/Kconfig
@@ -0,0 +1,9 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+config RV_MON_TEST_DA
+ depends on RV
+ # XXX: add dependencies if there
+ select DA_MON_EVENTS_IMPLICIT
+ bool "test_da monitor"
+ help
+ auto-generated
diff --git a/tools/verification/rvgen/tests/golden/test_da/test_da.c b/tools/verification/rvgen/tests/golden/test_da/test_da.c
new file mode 100644
index 000000000000..b63bbf4e35c5
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/test_da/test_da.c
@@ -0,0 +1,95 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/ftrace.h>
+#include <linux/tracepoint.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/rv.h>
+#include <rv/instrumentation.h>
+
+#define MODULE_NAME "test_da"
+
+/*
+ * XXX: include required tracepoint headers, e.g.,
+ * #include <trace/events/sched.h>
+ */
+#include <rv_trace.h>
+
+/*
+ * This is the self-generated part of the monitor. Generally, there is no need
+ * to touch this section.
+ */
+#define RV_MON_TYPE RV_MON_PER_CPU
+#include "test_da.h"
+#include <rv/da_monitor.h>
+
+/*
+ * This is the instrumentation part of the monitor.
+ *
+ * This is the section where manual work is required. Here the kernel events
+ * are translated into model's event.
+ *
+ */
+static void handle_event_1(void *data, /* XXX: fill header */)
+{
+ da_handle_event(event_1_test_da);
+}
+
+static void handle_event_2(void *data, /* XXX: fill header */)
+{
+ /* XXX: validate that this event always leads to the initial state */
+ da_handle_start_event(event_2_test_da);
+}
+
+static int enable_test_da(void)
+{
+ int retval;
+
+ retval = da_monitor_init();
+ if (retval)
+ return retval;
+
+ rv_attach_trace_probe("test_da", /* XXX: tracepoint */, handle_event_1);
+ rv_attach_trace_probe("test_da", /* XXX: tracepoint */, handle_event_2);
+
+ return 0;
+}
+
+static void disable_test_da(void)
+{
+ rv_this.enabled = 0;
+
+ rv_detach_trace_probe("test_da", /* XXX: tracepoint */, handle_event_1);
+ rv_detach_trace_probe("test_da", /* XXX: tracepoint */, handle_event_2);
+
+ da_monitor_destroy();
+}
+
+/*
+ * This is the monitor register section.
+ */
+static struct rv_monitor rv_this = {
+ .name = "test_da",
+ .description = "auto-generated",
+ .enable = enable_test_da,
+ .disable = disable_test_da,
+ .reset = da_monitor_reset_all,
+ .enabled = 0,
+};
+
+static int __init register_test_da(void)
+{
+ return rv_register_monitor(&rv_this, NULL);
+}
+
+static void __exit unregister_test_da(void)
+{
+ rv_unregister_monitor(&rv_this);
+}
+
+module_init(register_test_da);
+module_exit(unregister_test_da);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("dot2k: auto-generated");
+MODULE_DESCRIPTION("test_da: auto-generated");
diff --git a/tools/verification/rvgen/tests/golden/test_da/test_da.h b/tools/verification/rvgen/tests/golden/test_da/test_da.h
new file mode 100644
index 000000000000..d55795efbb61
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/test_da/test_da.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Automatically generated C representation of test_da automaton
+ * For further information about this format, see kernel documentation:
+ * Documentation/trace/rv/deterministic_automata.rst
+ */
+
+#define MONITOR_NAME test_da
+
+enum states_test_da {
+ state_a_test_da,
+ state_b_test_da,
+ state_max_test_da,
+};
+
+#define INVALID_STATE state_max_test_da
+
+enum events_test_da {
+ event_1_test_da,
+ event_2_test_da,
+ event_max_test_da,
+};
+
+struct automaton_test_da {
+ char *state_names[state_max_test_da];
+ char *event_names[event_max_test_da];
+ unsigned char function[state_max_test_da][event_max_test_da];
+ unsigned char initial_state;
+ bool final_states[state_max_test_da];
+};
+
+static const struct automaton_test_da automaton_test_da = {
+ .state_names = {
+ "state_a",
+ "state_b",
+ },
+ .event_names = {
+ "event_1",
+ "event_2",
+ },
+ .function = {
+ { state_b_test_da, state_a_test_da },
+ { INVALID_STATE, state_a_test_da },
+ },
+ .initial_state = state_a_test_da,
+ .final_states = { 1, 0 },
+};
diff --git a/tools/verification/rvgen/tests/golden/test_da/test_da_trace.h b/tools/verification/rvgen/tests/golden/test_da/test_da_trace.h
new file mode 100644
index 000000000000..8bd67115d244
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/test_da/test_da_trace.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Snippet to be included in rv_trace.h
+ */
+
+#ifdef CONFIG_RV_MON_TEST_DA
+DEFINE_EVENT(event_da_monitor, event_test_da,
+ TP_PROTO(char *state, char *event, char *next_state, bool final_state),
+ TP_ARGS(state, event, next_state, final_state));
+
+DEFINE_EVENT(error_da_monitor, error_test_da,
+ TP_PROTO(char *state, char *event),
+ TP_ARGS(state, event));
+#endif /* CONFIG_RV_MON_TEST_DA */
diff --git a/tools/verification/rvgen/tests/golden/test_ha/Kconfig b/tools/verification/rvgen/tests/golden/test_ha/Kconfig
new file mode 100644
index 000000000000..f4048290c774
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/test_ha/Kconfig
@@ -0,0 +1,9 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+config RV_MON_TEST_HA
+ depends on RV
+ # XXX: add dependencies if there
+ select HA_MON_EVENTS_ID
+ bool "test_ha monitor"
+ help
+ auto-generated
diff --git a/tools/verification/rvgen/tests/golden/test_ha/test_ha.c b/tools/verification/rvgen/tests/golden/test_ha/test_ha.c
new file mode 100644
index 000000000000..485fcd0259b6
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/test_ha/test_ha.c
@@ -0,0 +1,247 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/ftrace.h>
+#include <linux/tracepoint.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/rv.h>
+#include <rv/instrumentation.h>
+
+#define MODULE_NAME "test_ha"
+
+/*
+ * XXX: include required tracepoint headers, e.g.,
+ * #include <trace/events/sched.h>
+ */
+#include <rv_trace.h>
+
+/*
+ * This is the self-generated part of the monitor. Generally, there is no need
+ * to touch this section.
+ */
+#define RV_MON_TYPE RV_MON_PER_TASK
+/* XXX: If the monitor has several instances, consider HA_TIMER_WHEEL */
+#define HA_TIMER_TYPE HA_TIMER_HRTIMER
+#include "test_ha.h"
+#include <rv/ha_monitor.h>
+
+/*
+ * This is the instrumentation part of the monitor.
+ *
+ * This is the section where manual work is required. Here the kernel events
+ * are translated into model's event.
+ *
+ */
+#define BAR_NS(ha_mon) /* XXX: what is BAR_NS(ha_mon)? */
+
+#define FOO_NS /* XXX: what is FOO_NS? */
+
+static inline u64 bar_ns(struct ha_monitor *ha_mon)
+{
+ return /* XXX: what is bar_ns(ha_mon)? */;
+}
+
+static u64 foo_ns = /* XXX: default value */;
+module_param(foo_ns, ullong, 0644);
+
+/*
+ * These functions define how to read and reset the environment variable.
+ *
+ * Common environment variables like ns-based and jiffy-based clocks have
+ * pre-define getters and resetters you can use. The parser can infer the type
+ * of the environment variable if you supply a measure unit in the constraint.
+ * If you define your own functions, make sure to add appropriate memory
+ * barriers if required.
+ * Some environment variables don't require a storage as they read a system
+ * state (e.g. preemption count). Those variables are never reset, so we don't
+ * define a reset function on monitors only relying on this type of variables.
+ */
+static u64 ha_get_env(struct ha_monitor *ha_mon, enum envs_test_ha env, u64 time_ns)
+{
+ if (env == clk_test_ha)
+ return ha_get_clk_ns(ha_mon, env, time_ns);
+ else if (env == env1_test_ha)
+ return /* XXX: how do I read env1? */
+ else if (env == env2_test_ha)
+ return /* XXX: how do I read env2? */
+ return ENV_INVALID_VALUE;
+}
+
+static void ha_reset_env(struct ha_monitor *ha_mon, enum envs_test_ha env, u64 time_ns)
+{
+ if (env == clk_test_ha)
+ ha_reset_clk_ns(ha_mon, env, time_ns);
+}
+
+/*
+ * These functions are used to validate state transitions.
+ *
+ * They are generated by parsing the model, there is usually no need to change them.
+ * If the monitor requires a timer, there are functions responsible to arm it when
+ * the next state has a constraint, cancel it in any other case and to check
+ * that it didn't expire before the callback run. Transitions to the same state
+ * without a reset never affect timers.
+ * Due to the different representations between invariants and guards, there is
+ * a function to convert it in case invariants or guards are reachable from
+ * another invariant without reset. Those are not present if not required in
+ * the model. This is all automatic but is worth checking because it may show
+ * errors in the model (e.g. missing resets).
+ */
+static inline bool ha_verify_invariants(struct ha_monitor *ha_mon,
+ enum states curr_state, enum events event,
+ enum states next_state, u64 time_ns)
+{
+ if (curr_state == S0_test_ha)
+ return ha_check_invariant_ns(ha_mon, clk_test_ha, time_ns);
+ else if (curr_state == S2_test_ha)
+ return ha_check_invariant_ns(ha_mon, clk_test_ha, time_ns);
+ return true;
+}
+
+static inline void ha_convert_inv_guard(struct ha_monitor *ha_mon,
+ enum states curr_state, enum events event,
+ enum states next_state, u64 time_ns)
+{
+ if (curr_state == next_state)
+ return;
+ if (curr_state == S2_test_ha)
+ ha_inv_to_guard(ha_mon, clk_test_ha, BAR_NS(ha_mon), time_ns);
+}
+
+static inline bool ha_verify_guards(struct ha_monitor *ha_mon,
+ enum states curr_state, enum events event,
+ enum states next_state, u64 time_ns)
+{
+ bool res = true;
+
+ if (curr_state == S0_test_ha && event == event0_test_ha)
+ ha_reset_env(ha_mon, clk_test_ha, time_ns);
+ else if (curr_state == S0_test_ha && event == event1_test_ha)
+ ha_reset_env(ha_mon, clk_test_ha, time_ns);
+ else if (curr_state == S1_test_ha && event == event0_test_ha)
+ ha_reset_env(ha_mon, clk_test_ha, time_ns);
+ else if (curr_state == S1_test_ha && event == event2_test_ha) {
+ res = ha_get_env(ha_mon, env1_test_ha, time_ns) == 0ull;
+ ha_reset_env(ha_mon, clk_test_ha, time_ns);
+ } else if (curr_state == S2_test_ha && event == event1_test_ha)
+ res = ha_monitor_env_invalid(ha_mon, clk_test_ha) ||
+ ha_get_env(ha_mon, clk_test_ha, time_ns) < foo_ns;
+ else if (curr_state == S3_test_ha && event == event0_test_ha)
+ res = ha_monitor_env_invalid(ha_mon, clk_test_ha) ||
+ (ha_get_env(ha_mon, clk_test_ha, time_ns) < FOO_NS &&
+ ha_get_env(ha_mon, env2_test_ha, time_ns) == 0ull);
+ else if (curr_state == S3_test_ha && event == event1_test_ha) {
+ res = ha_monitor_env_invalid(ha_mon, clk_test_ha) ||
+ (ha_get_env(ha_mon, clk_test_ha, time_ns) < foo_ns &&
+ ha_get_env(ha_mon, env1_test_ha, time_ns) == 1ull);
+ ha_reset_env(ha_mon, clk_test_ha, time_ns);
+ }
+ return res;
+}
+
+static inline void ha_setup_invariants(struct ha_monitor *ha_mon,
+ enum states curr_state, enum events event,
+ enum states next_state, u64 time_ns)
+{
+ if (next_state == curr_state && event != event0_test_ha)
+ return;
+ if (next_state == S0_test_ha)
+ ha_start_timer_ns(ha_mon, clk_test_ha, bar_ns(ha_mon), time_ns);
+ else if (next_state == S2_test_ha)
+ ha_start_timer_ns(ha_mon, clk_test_ha, BAR_NS(ha_mon), time_ns);
+ else if (curr_state == S0_test_ha)
+ ha_cancel_timer(ha_mon);
+ else if (curr_state == S2_test_ha)
+ ha_cancel_timer(ha_mon);
+}
+
+static bool ha_verify_constraint(struct ha_monitor *ha_mon,
+ enum states curr_state, enum events event,
+ enum states next_state, u64 time_ns)
+{
+ if (!ha_verify_invariants(ha_mon, curr_state, event, next_state, time_ns))
+ return false;
+
+ ha_convert_inv_guard(ha_mon, curr_state, event, next_state, time_ns);
+
+ if (!ha_verify_guards(ha_mon, curr_state, event, next_state, time_ns))
+ return false;
+
+ ha_setup_invariants(ha_mon, curr_state, event, next_state, time_ns);
+
+ return true;
+}
+
+static void handle_event0(void *data, /* XXX: fill header */)
+{
+ /* XXX: validate that this event always leads to the initial state */
+ struct task_struct *p = /* XXX: how do I get p? */;
+ da_handle_start_event(p, event0_test_ha);
+}
+
+static void handle_event1(void *data, /* XXX: fill header */)
+{
+ struct task_struct *p = /* XXX: how do I get p? */;
+ da_handle_event(p, event1_test_ha);
+}
+
+static void handle_event2(void *data, /* XXX: fill header */)
+{
+ struct task_struct *p = /* XXX: how do I get p? */;
+ da_handle_event(p, event2_test_ha);
+}
+
+static int enable_test_ha(void)
+{
+ int retval;
+
+ retval = da_monitor_init();
+ if (retval)
+ return retval;
+
+ rv_attach_trace_probe("test_ha", /* XXX: tracepoint */, handle_event0);
+ rv_attach_trace_probe("test_ha", /* XXX: tracepoint */, handle_event1);
+ rv_attach_trace_probe("test_ha", /* XXX: tracepoint */, handle_event2);
+
+ return 0;
+}
+
+static void disable_test_ha(void)
+{
+ rv_this.enabled = 0;
+
+ rv_detach_trace_probe("test_ha", /* XXX: tracepoint */, handle_event0);
+ rv_detach_trace_probe("test_ha", /* XXX: tracepoint */, handle_event1);
+ rv_detach_trace_probe("test_ha", /* XXX: tracepoint */, handle_event2);
+
+ da_monitor_destroy();
+}
+
+/*
+ * This is the monitor register section.
+ */
+static struct rv_monitor rv_this = {
+ .name = "test_ha",
+ .description = "auto-generated",
+ .enable = enable_test_ha,
+ .disable = disable_test_ha,
+ .reset = da_monitor_reset_all,
+ .enabled = 0,
+};
+
+static int __init register_test_ha(void)
+{
+ return rv_register_monitor(&rv_this, NULL);
+}
+
+static void __exit unregister_test_ha(void)
+{
+ rv_unregister_monitor(&rv_this);
+}
+
+module_init(register_test_ha);
+module_exit(unregister_test_ha);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("dot2k: auto-generated");
+MODULE_DESCRIPTION("test_ha: auto-generated");
diff --git a/tools/verification/rvgen/tests/golden/test_ha/test_ha.h b/tools/verification/rvgen/tests/golden/test_ha/test_ha.h
new file mode 100644
index 000000000000..949fa4453403
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/test_ha/test_ha.h
@@ -0,0 +1,72 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Automatically generated C representation of test_ha automaton
+ * For further information about this format, see kernel documentation:
+ * Documentation/trace/rv/deterministic_automata.rst
+ */
+
+#define MONITOR_NAME test_ha
+
+enum states_test_ha {
+ S0_test_ha,
+ S1_test_ha,
+ S2_test_ha,
+ S3_test_ha,
+ state_max_test_ha,
+};
+
+#define INVALID_STATE state_max_test_ha
+
+enum events_test_ha {
+ event0_test_ha,
+ event1_test_ha,
+ event2_test_ha,
+ event_max_test_ha,
+};
+
+enum envs_test_ha {
+ clk_test_ha,
+ env1_test_ha,
+ env2_test_ha,
+ env_max_test_ha,
+ env_max_stored_test_ha = env1_test_ha,
+};
+
+_Static_assert(env_max_stored_test_ha <= MAX_HA_ENV_LEN, "Not enough slots");
+#define HA_CLK_NS
+
+struct automaton_test_ha {
+ char *state_names[state_max_test_ha];
+ char *event_names[event_max_test_ha];
+ char *env_names[env_max_test_ha];
+ unsigned char function[state_max_test_ha][event_max_test_ha];
+ unsigned char initial_state;
+ bool final_states[state_max_test_ha];
+};
+
+static const struct automaton_test_ha automaton_test_ha = {
+ .state_names = {
+ "S0",
+ "S1",
+ "S2",
+ "S3",
+ },
+ .event_names = {
+ "event0",
+ "event1",
+ "event2",
+ },
+ .env_names = {
+ "clk",
+ "env1",
+ "env2",
+ },
+ .function = {
+ { S0_test_ha, S1_test_ha, INVALID_STATE },
+ { S0_test_ha, INVALID_STATE, S2_test_ha },
+ { INVALID_STATE, S2_test_ha, S3_test_ha },
+ { S0_test_ha, S1_test_ha, INVALID_STATE },
+ },
+ .initial_state = S0_test_ha,
+ .final_states = { 1, 0, 0, 0 },
+};
diff --git a/tools/verification/rvgen/tests/golden/test_ha/test_ha_trace.h b/tools/verification/rvgen/tests/golden/test_ha/test_ha_trace.h
new file mode 100644
index 000000000000..381bafcb3322
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/test_ha/test_ha_trace.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Snippet to be included in rv_trace.h
+ */
+
+#ifdef CONFIG_RV_MON_TEST_HA
+DEFINE_EVENT(event_da_monitor_id, event_test_ha,
+ TP_PROTO(int id, char *state, char *event, char *next_state, bool final_state),
+ TP_ARGS(id, state, event, next_state, final_state));
+
+DEFINE_EVENT(error_da_monitor_id, error_test_ha,
+ TP_PROTO(int id, char *state, char *event),
+ TP_ARGS(id, state, event));
+
+DEFINE_EVENT(error_env_da_monitor_id, error_env_test_ha,
+ TP_PROTO(int id, char *state, char *event, char *env),
+ TP_ARGS(id, state, event, env));
+#endif /* CONFIG_RV_MON_TEST_HA */
diff --git a/tools/verification/rvgen/tests/golden/test_ltl/Kconfig b/tools/verification/rvgen/tests/golden/test_ltl/Kconfig
new file mode 100644
index 000000000000..e2d0e721f180
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/test_ltl/Kconfig
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+config RV_MON_TEST_LTL
+ depends on RV
+ # XXX: add dependencies if there
+ depends on RV_MON_LTL_PARENT
+ default y
+ select LTL_MON_EVENTS_ID
+ bool "test_ltl monitor"
+ help
+ Simple description
diff --git a/tools/verification/rvgen/tests/golden/test_ltl/test_ltl.c b/tools/verification/rvgen/tests/golden/test_ltl/test_ltl.c
new file mode 100644
index 000000000000..92c69b9d9a41
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/test_ltl/test_ltl.c
@@ -0,0 +1,108 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/ftrace.h>
+#include <linux/tracepoint.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/rv.h>
+#include <rv/instrumentation.h>
+
+#define MODULE_NAME "test_ltl"
+
+/*
+ * XXX: include required tracepoint headers, e.g.,
+ * #include <trace/events/sched.h>
+ */
+#include <rv_trace.h>
+#include <monitors/ltl_parent/ltl_parent.h>
+
+
+/*
+ * This is the self-generated part of the monitor. Generally, there is no need
+ * to touch this section.
+ */
+#include "test_ltl.h"
+#include <rv/ltl_monitor.h>
+
+static void ltl_atoms_fetch(struct task_struct *task, struct ltl_monitor *mon)
+{
+ /*
+ * This is called everytime the Buchi automaton is triggered.
+ *
+ * This function could be used to fetch the atomic propositions which
+ * are expensive to trace. It is possible only if the atomic proposition
+ * does not need to be updated at precise time.
+ *
+ * It is recommended to use tracepoints and ltl_atom_update() instead.
+ */
+}
+
+static void ltl_atoms_init(struct task_struct *task, struct ltl_monitor *mon, bool task_creation)
+{
+ /*
+ * This should initialize as many atomic propositions as possible.
+ *
+ * @task_creation indicates whether the task is being created. This is
+ * false if the task is already running before the monitor is enabled.
+ */
+ ltl_atom_set(mon, LTL_EVENT_A, true/false);
+ ltl_atom_set(mon, LTL_EVENT_B, true/false);
+}
+
+/*
+ * This is the instrumentation part of the monitor.
+ *
+ * This is the section where manual work is required. Here the kernel events
+ * are translated into model's event.
+ */
+static void handle_example_event(void *data, /* XXX: fill header */)
+{
+ ltl_atom_update(task, LTL_EVENT_A, true/false);
+}
+
+static int enable_test_ltl(void)
+{
+ int retval;
+
+ retval = ltl_monitor_init();
+ if (retval)
+ return retval;
+
+ rv_attach_trace_probe("test_ltl", /* XXX: tracepoint */, handle_example_event);
+
+ return 0;
+}
+
+static void disable_test_ltl(void)
+{
+ rv_detach_trace_probe("test_ltl", /* XXX: tracepoint */, handle_sample_event);
+
+ ltl_monitor_destroy();
+}
+
+/*
+ * This is the monitor register section.
+ */
+static struct rv_monitor rv_test_ltl = {
+ .name = "test_ltl",
+ .description = "Simple description",
+ .enable = enable_test_ltl,
+ .disable = disable_test_ltl,
+};
+
+static int __init register_test_ltl(void)
+{
+ return rv_register_monitor(&rv_test_ltl, &rv_ltl_parent);
+}
+
+static void __exit unregister_test_ltl(void)
+{
+ rv_unregister_monitor(&rv_test_ltl);
+}
+
+module_init(register_test_ltl);
+module_exit(unregister_test_ltl);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR(/* TODO */);
+MODULE_DESCRIPTION("test_ltl: Simple description");
diff --git a/tools/verification/rvgen/tests/golden/test_ltl/test_ltl.h b/tools/verification/rvgen/tests/golden/test_ltl/test_ltl.h
new file mode 100644
index 000000000000..7895f2e233e8
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/test_ltl/test_ltl.h
@@ -0,0 +1,108 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * C implementation of Buchi automaton, automatically generated by
+ * tools/verification/rvgen from the linear temporal logic specification.
+ * For further information, see kernel documentation:
+ * Documentation/trace/rv/linear_temporal_logic.rst
+ */
+
+#include <linux/rv.h>
+
+#define MONITOR_NAME test_ltl
+
+enum ltl_atom {
+ LTL_EVENT_A,
+ LTL_EVENT_B,
+ LTL_NUM_ATOM
+};
+static_assert(LTL_NUM_ATOM <= RV_MAX_LTL_ATOM);
+
+static const char *ltl_atom_str(enum ltl_atom atom)
+{
+ static const char *const names[] = {
+ "ev_a",
+ "ev_b",
+ };
+
+ return names[atom];
+}
+
+enum ltl_buchi_state {
+ S0,
+ S1,
+ S2,
+ S3,
+ S4,
+ RV_NUM_BA_STATES
+};
+static_assert(RV_NUM_BA_STATES <= RV_MAX_BA_STATES);
+
+static void ltl_start(struct task_struct *task, struct ltl_monitor *mon)
+{
+ bool event_b = test_bit(LTL_EVENT_B, mon->atoms);
+ bool event_a = test_bit(LTL_EVENT_A, mon->atoms);
+ bool val1 = !event_a;
+
+ if (val1)
+ __set_bit(S0, mon->states);
+ if (true)
+ __set_bit(S1, mon->states);
+ if (event_b)
+ __set_bit(S4, mon->states);
+}
+
+static void
+ltl_possible_next_states(struct ltl_monitor *mon, unsigned int state, unsigned long *next)
+{
+ bool event_b = test_bit(LTL_EVENT_B, mon->atoms);
+ bool event_a = test_bit(LTL_EVENT_A, mon->atoms);
+ bool val1 = !event_a;
+
+ switch (state) {
+ case S0:
+ if (val1)
+ __set_bit(S0, next);
+ if (true)
+ __set_bit(S1, next);
+ if (event_b)
+ __set_bit(S4, next);
+ break;
+ case S1:
+ if (true)
+ __set_bit(S1, next);
+ if (true && val1)
+ __set_bit(S2, next);
+ if (event_b && val1)
+ __set_bit(S3, next);
+ if (event_b)
+ __set_bit(S4, next);
+ break;
+ case S2:
+ if (true)
+ __set_bit(S1, next);
+ if (true && val1)
+ __set_bit(S2, next);
+ if (event_b && val1)
+ __set_bit(S3, next);
+ if (event_b)
+ __set_bit(S4, next);
+ break;
+ case S3:
+ if (val1)
+ __set_bit(S0, next);
+ if (true)
+ __set_bit(S1, next);
+ if (event_b)
+ __set_bit(S4, next);
+ break;
+ case S4:
+ if (val1)
+ __set_bit(S0, next);
+ if (true)
+ __set_bit(S1, next);
+ if (event_b)
+ __set_bit(S4, next);
+ break;
+ }
+}
diff --git a/tools/verification/rvgen/tests/golden/test_ltl/test_ltl_trace.h b/tools/verification/rvgen/tests/golden/test_ltl/test_ltl_trace.h
new file mode 100644
index 000000000000..3571b004c114
--- /dev/null
+++ b/tools/verification/rvgen/tests/golden/test_ltl/test_ltl_trace.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Snippet to be included in rv_trace.h
+ */
+
+#ifdef CONFIG_RV_MON_TEST_LTL
+DEFINE_EVENT(event_ltl_monitor_id, event_test_ltl,
+ TP_PROTO(struct task_struct *task, char *states, char *atoms, char *next),
+ TP_ARGS(task, states, atoms, next));
+DEFINE_EVENT(error_ltl_monitor_id, error_test_ltl,
+ TP_PROTO(struct task_struct *task),
+ TP_ARGS(task));
+#endif /* CONFIG_RV_MON_TEST_LTL */
diff --git a/tools/verification/rvgen/tests/specs/test_da.dot b/tools/verification/rvgen/tests/specs/test_da.dot
new file mode 100644
index 000000000000..e555c239b221
--- /dev/null
+++ b/tools/verification/rvgen/tests/specs/test_da.dot
@@ -0,0 +1,16 @@
+digraph state_automaton {
+ {node [shape = circle] "state_b"};
+ {node [shape = plaintext, style=invis, label=""] "__init_state_a"};
+ {node [shape = doublecircle] "state_a"};
+ {node [shape = circle] "state_a"};
+ "__init_state_a" -> "state_a";
+ "state_a" [label = "state_a"];
+ "state_a" -> "state_a" [ label = "event_2" ];
+ "state_a" -> "state_b" [ label = "event_1" ];
+ "state_b" [label = "state_b"];
+ "state_b" -> "state_a" [ label = "event_2" ];
+ { rank = min ;
+ "__init_state_a";
+ "state_a";
+ }
+}
diff --git a/tools/verification/rvgen/tests/specs/test_da2.dot b/tools/verification/rvgen/tests/specs/test_da2.dot
new file mode 100644
index 000000000000..cdd4192f58ae
--- /dev/null
+++ b/tools/verification/rvgen/tests/specs/test_da2.dot
@@ -0,0 +1,19 @@
+digraph state_automaton {
+ {node [shape = circle] "state_b"};
+ {node [shape = circle] "state_c"};
+ {node [shape = plaintext, style=invis, label=""] "__init_state_a"};
+ {node [shape = doublecircle] "state_a"};
+ {node [shape = circle] "state_a"};
+ "__init_state_a" -> "state_a";
+ "state_a" [label = "state_a"];
+ "state_a" -> "state_b" [ label = "event_1" ];
+ "state_a" -> "state_c" [ label = "event_2" ];
+ "state_b" [label = "state_b"];
+ "state_b" -> "state_a" [ label = "event_2" ];
+ "state_b" -> "state_c" [ label = "event_3" ];
+ "state_c" [label = "state_c"];
+ { rank = min ;
+ "__init_state_a";
+ "state_a";
+ }
+}
diff --git a/tools/verification/rvgen/tests/specs/test_ha.dot b/tools/verification/rvgen/tests/specs/test_ha.dot
new file mode 100644
index 000000000000..786aa8b22098
--- /dev/null
+++ b/tools/verification/rvgen/tests/specs/test_ha.dot
@@ -0,0 +1,27 @@
+digraph state_automaton {
+ center = true;
+ size = "7,11";
+ {node [shape = circle] "S1"};
+ {node [shape = plaintext, style=invis, label=""] "__init_S0"};
+ {node [shape = doublecircle] "S0"};
+ {node [shape = circle] "S0"};
+ {node [shape = circle] "S2"};
+ {node [shape = circle] "S3"};
+ "__init_S0" -> "S0";
+ "S0" [label = "S0\nclk < bar_ns()", color = green3];
+ "S1" [label = "S1"];
+ "S2" [label = "S2\nclk < BAR_NS()"];
+ "S3" [label = "S3"];
+ "S1" -> "S0" [ label = "event0;reset(clk)" ];
+ "S0" -> "S1" [ label = "event1;reset(clk)" ];
+ "S0" -> "S0" [ label = "event0;reset(clk)" ];
+ "S1" -> "S2" [ label = "event2;env1 == 0;reset(clk)" ];
+ "S2" -> "S3" [ label = "event2" ];
+ "S2" -> "S2" [ label = "event1;clk < foo_ns" ];
+ "S3" -> "S0" [ label = "event0;clk < FOO_NS && env2 == 0" ];
+ "S3" -> "S1" [ label = "event1;clk < foo_ns && env1 == 1;reset(clk)" ];
+ { rank = min ;
+ "__init_S0";
+ "S0";
+ }
+}
diff --git a/tools/verification/rvgen/tests/specs/test_invalid.dot b/tools/verification/rvgen/tests/specs/test_invalid.dot
new file mode 100644
index 000000000000..17c63fc57f17
--- /dev/null
+++ b/tools/verification/rvgen/tests/specs/test_invalid.dot
@@ -0,0 +1,8 @@
+digraph invalid {
+ {node [shape = circle] "init"};
+ {node [shape = circle] "state1"};
+ "init" [label = "init"];
+ "init" -> "state1" [ label = "event_a" ];
+ "state1" [label = "state1"];
+ "state1" -> "init" [ label = "event_b" ];
+}
diff --git a/tools/verification/rvgen/tests/specs/test_invalid.ltl b/tools/verification/rvgen/tests/specs/test_invalid.ltl
new file mode 100644
index 000000000000..cf36307e003c
--- /dev/null
+++ b/tools/verification/rvgen/tests/specs/test_invalid.ltl
@@ -0,0 +1 @@
+RULE = A invalid B
diff --git a/tools/verification/rvgen/tests/specs/test_invalid_ha.dot b/tools/verification/rvgen/tests/specs/test_invalid_ha.dot
new file mode 100644
index 000000000000..06de6aa8709f
--- /dev/null
+++ b/tools/verification/rvgen/tests/specs/test_invalid_ha.dot
@@ -0,0 +1,16 @@
+digraph state_automaton {
+ {node [shape = circle] "state_b"};
+ {node [shape = plaintext, style=invis, label=""] "__init_state_a"};
+ {node [shape = doublecircle] "state_a"};
+ {node [shape = circle] "state_a"};
+ "__init_state_a" -> "state_a";
+ "state_a" [label = "state_a;clk < 1"];
+ "state_a" -> "state_a" [ label = "event_2;reset(clk)" ];
+ "state_a" -> "state_b" [ label = "event_1;wrong_constraint" ];
+ "state_b" [label = "state_b"];
+ "state_b" -> "state_a" [ label = "event_2" ];
+ { rank = min ;
+ "__init_state_a";
+ "state_a";
+ }
+}
diff --git a/tools/verification/rvgen/tests/specs/test_ltl.ltl b/tools/verification/rvgen/tests/specs/test_ltl.ltl
new file mode 100644
index 000000000000..5ed658abd69c
--- /dev/null
+++ b/tools/verification/rvgen/tests/specs/test_ltl.ltl
@@ -0,0 +1 @@
+RULE = always (EVENT_A imply eventually EVENT_B)
--
2.54.0
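
As a reading aid for the test_da golden files in this patch, the sketch below mirrors the generated `.function` table in Python and walks it for a sequence of events. It is not part of rvgen: the names `FUNCTION`, `INITIAL`, `INVALID` and `run` are hypothetical, and only the transitions themselves come from test_da.dot and test_da.h.

```python
# Hypothetical Python mirror of the golden test_da transition table.
# INVALID stands in for INVALID_STATE in the generated C header.
INVALID = None

# FUNCTION[state][event] -> next state, same layout as the C .function matrix.
FUNCTION = {
    "state_a": {"event_1": "state_b", "event_2": "state_a"},
    "state_b": {"event_1": INVALID, "event_2": "state_a"},
}
INITIAL = "state_a"


def run(events):
    """Walk the automaton; return (visited states, True if no invalid transition)."""
    state = INITIAL
    visited = [state]
    for event in events:
        next_state = FUNCTION[state][event]
        if next_state is INVALID:
            # The model does not allow this event in the current state.
            return visited, False
        state = next_state
        visited.append(state)
    return visited, True


if __name__ == "__main__":
    print(run(["event_1", "event_2"]))
    print(run(["event_1", "event_1"]))
```

The second call stops in state_b because the model has no event_1 edge leaving that state, which is the case the `INVALID_STATE` entry in the golden header encodes.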
* [PATCH v2 07/14] verification/rvgen: Fix ltl2k writing True as a literal
From: Gabriele Monaco @ 2026-05-14 15:20 UTC (permalink / raw)
To: linux-kernel, linux-trace-kernel, Steven Rostedt, Gabriele Monaco,
Nam Cao
Cc: Thomas Weissschuh, Tomas Glozar, John Kacur, Wen Yang
In-Reply-To: <20260514152055.229162-1-gmonaco@redhat.com>
The rvgen parser for LTL stores literal true values in their Python
representation (capitalised True), which does not build in C.
The Literal class should already handle this case, but ASTNode skips its
stringification method and converts the value (true/false) directly.
Fix by delegating ASTNode stringification to the Literal and Variable
classes instead of bypassing them.
Fixes: 97ffa4ce6ab32 ("verification/rvgen: Add support for linear temporal logic")
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
---
tools/verification/rvgen/rvgen/ltl2ba.py | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/tools/verification/rvgen/rvgen/ltl2ba.py b/tools/verification/rvgen/rvgen/ltl2ba.py
index 7f538598a868..016e7cf93bbb 100644
--- a/tools/verification/rvgen/rvgen/ltl2ba.py
+++ b/tools/verification/rvgen/rvgen/ltl2ba.py
@@ -122,10 +122,8 @@ class ASTNode:
return self.op.expand(self, node, node_set)
def __str__(self):
- if isinstance(self.op, Literal):
- return str(self.op.value)
- if isinstance(self.op, Variable):
- return self.op.name.lower()
+ if isinstance(self.op, (Literal, Variable)):
+ return str(self.op)
return "val" + str(self.id)
def normalize(self):
@@ -382,6 +380,9 @@ class Variable:
def __iter__(self):
yield from ()
+ def __str__(self):
+ return self.name.lower()
+
def negate(self):
new = ASTNode(self)
return NotOp(new)
--
2.54.0
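
The delegation this patch describes can be illustrated with a stripped-down sketch. The classes below are simplified stand-ins, not the real ltl2ba.py classes: the constructors and the `Literal.__str__` body are assumptions made for illustration, and only the `ASTNode.__str__` and `Variable.__str__` bodies follow the diff.

```python
class Literal:
    """Simplified stand-in; the constructor and __str__ body are assumptions."""

    def __init__(self, value: bool):
        self.value = value

    def __str__(self):
        # Emit the C spelling rather than Python's "True"/"False".
        return "true" if self.value else "false"


class Variable:
    """Simplified stand-in; __str__ follows the method added by the patch."""

    def __init__(self, name: str):
        self.name = name

    def __str__(self):
        return self.name.lower()


class ASTNode:
    """Simplified stand-in; the id bookkeeping here is hypothetical."""

    uid = 1

    def __init__(self, op):
        self.op = op
        self.id = ASTNode.uid
        ASTNode.uid += 1

    def __str__(self):
        # Delegate to the operand's own __str__ instead of reading its fields.
        if isinstance(self.op, (Literal, Variable)):
            return str(self.op)
        return "val" + str(self.id)


if __name__ == "__main__":
    print(ASTNode(Literal(True)))         # true
    print(ASTNode(Variable("EVENT_A")))   # event_a
```

For comparison, `str(True)` in Python yields `True`, which is what reached the generated C code when ASTNode read `self.op.value` directly.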