* [PATCH v3 0/2] tracing: Remove trace_printk.h from kernel.h
From: Steven Rostedt @ 2026-06-24 8:18 UTC (permalink / raw)
To: linux-kernel, linux-trace-kernel
Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
Linus Torvalds, Sebastian Andrzej Siewior, John Ogness,
Thomas Gleixner, Peter Zijlstra, Julia Lawall, Yury Norov
Remove trace_printk.h from kernel.h by creating a trace_controls.h for those
places that need access to tracing prototypes like tracing_off(), and by
including trace_printk.h directly in the places that need trace_printk().
Changes since v2: https://lore.kernel.org/all/20260622130739.375198646@kernel.org/
- Update change log in patch 1
- Remove #ifdef DEBUG and always include trace_printk.h in patch 2.
Steven Rostedt (2):
tracing: Move non-trace_printk prototypes into trace_controls.h
tracing: Remove trace_printk.h from kernel.h
----
arch/powerpc/kvm/book3s_xics.c | 1 +
arch/powerpc/xmon/xmon.c | 1 +
arch/s390/kernel/ipl.c | 1 +
arch/s390/kernel/machine_kexec.c | 1 +
drivers/gpu/drm/i915/gt/intel_gtt.h | 1 +
drivers/gpu/drm/i915/i915_gem.h | 2 ++
drivers/hwtracing/stm/dummy_stm.c | 1 +
drivers/infiniband/hw/hfi1/trace_dbg.h | 1 +
drivers/tty/sysrq.c | 1 +
drivers/usb/early/xhci-dbc.c | 1 +
fs/ext4/inline.c | 1 +
include/linux/ftrace.h | 2 ++
include/linux/kernel.h | 1 -
include/linux/sunrpc/debug.h | 1 +
include/linux/trace_controls.h | 54 ++++++++++++++++++++++++++++++++
include/linux/trace_printk.h | 56 ++--------------------------------
kernel/debug/debug_core.c | 1 +
kernel/panic.c | 1 +
kernel/rcu/rcu.h | 2 ++
kernel/rcu/rcutorture.c | 1 +
kernel/trace/ring_buffer_benchmark.c | 1 +
kernel/trace/trace.h | 1 +
kernel/trace/trace_benchmark.c | 1 +
lib/sys_info.c | 1 +
samples/fprobe/fprobe_example.c | 1 +
samples/ftrace/ftrace-direct-too.c | 1 -
samples/trace_printk/trace-printk.c | 1 +
27 files changed, 83 insertions(+), 55 deletions(-)
create mode 100644 include/linux/trace_controls.h
^ permalink raw reply
* Re: [syzbot] [trace?] general protection fault in mtree_load
From: Jiri Olsa @ 2026-06-24 7:49 UTC (permalink / raw)
To: Oleg Nesterov
Cc: syzbot, bp, dave.hansen, hpa, linux-kernel, linux-trace-kernel,
mhiramat, mingo, peterz, syzkaller-bugs, tglx, x86
In-Reply-To: <ajky0IbEvV_UDj2a@redhat.com>
On Mon, Jun 22, 2026 at 03:04:16PM +0200, Oleg Nesterov wrote:
> On 06/21, syzbot wrote:
> >
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: 6b5a2b7d9bc1 Merge tag 'trace-tools-v7.2' of git://git.ker..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=16d56986580000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=ea6584355d75e0cd
> > dashboard link: https://syzkaller.appspot.com/bug?extid=61ce80689253f42e6d80
> > compiler: gcc (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > Downloadable assets:
> > disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-6b5a2b7d.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/b3cb0499fbe9/vmlinux-6b5a2b7d.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/47cfbe57f6ea/bzImage-6b5a2b7d.xz
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+61ce80689253f42e6d80@syzkaller.appspotmail.com
> >
> > Oops: general protection fault, probably for non-canonical address 0xdffffc0000000011: 0000 [#1] SMP KASAN NOPTI
> > KASAN: null-ptr-deref in range [0x0000000000000088-0x000000000000008f]
> > CPU: 3 UID: 0 PID: 24402 Comm: syz.4.5217 Tainted: G L syzkaller #0 PREEMPT(full)
> > Tainted: [L]=SOFTLOCKUP
> > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> > RIP: 0010:mas_root lib/maple_tree.c:759 [inline]
> > RIP: 0010:mas_start lib/maple_tree.c:1179 [inline]
> > RIP: 0010:mtree_load+0x16d/0xa90 lib/maple_tree.c:5657
> > Code: 00 00 00 00 48 c7 44 24 78 ff ff ff ff e8 6b bd 84 f6 48 8b 5c 24 50 c6 84 24 9c 00 00 00 00 48 8d 7b 48 48 89 f8 48 c1 e8 03 <42> 80 3c 20 00 0f 85 d6 08 00 00 48 8b 5b 48 e8 6f 1a 08 00 31 ff
> > RSP: 0018:ffffc900039c76d8 EFLAGS: 00010206
> > RAX: 0000000000000011 RBX: 0000000000000040 RCX: ffffffff8b848746
> > RDX: ffff888041b6a540 RSI: ffffffff8b848775 RDI: 0000000000000088
> > RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000001
> > R10: 0000000000000001 R11: 000000000000751b R12: dffffc0000000000
> > R13: ffff88802693adc0 R14: 00001fff904365a7 R15: dffffc0000000000
> > FS: 0000000000000000(0000) GS:ffff8880d665f000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007f44aa04f156 CR3: 00000000364d5000 CR4: 0000000000352ef0
> > Call Trace:
> > <TASK>
> > vma_lookup include/linux/mm.h:4204 [inline]
> > __in_uprobe_trampoline arch/x86/kernel/uprobes.c:766 [inline]
> > __is_optimized arch/x86/kernel/uprobes.c:1056 [inline]
> > is_optimized arch/x86/kernel/uprobes.c:1067 [inline]
> > set_orig_insn+0x1ec/0x2a0 arch/x86/kernel/uprobes.c:1098
> > remove_breakpoint kernel/events/uprobes.c:1185 [inline]
> > register_for_each_vma+0xbb7/0xdb0 kernel/events/uprobes.c:1318
> > uprobe_unregister_nosync+0x12a/0x1c0 kernel/events/uprobes.c:1343
> > bpf_uprobe_unregister kernel/trace/bpf_trace.c:2936 [inline]
> > bpf_uprobe_multi_link_release+0xb3/0x1c0 kernel/trace/bpf_trace.c:2947
> > bpf_link_free+0xec/0x4a0 kernel/bpf/syscall.c:3273
> > bpf_link_put_direct kernel/bpf/syscall.c:3326 [inline]
> > bpf_link_release+0x5d/0x80 kernel/bpf/syscall.c:3333
> > __fput+0x3ff/0xb50 fs/file_table.c:512
> > task_work_run+0x150/0x240 kernel/task_work.c:233
> > exit_task_work include/linux/task_work.h:40 [inline]
>
> current->mm is already NULL, the exiting task has already passed exit_mm().
>
> Hopefully
>
> [PATCHv4 01/13] uprobes/x86: Use proper mm_struct in __in_uprobe_trampoline
> https://lore.kernel.org/all/20260526205840.173790-2-jolsa@kernel.org/
>
> should help...
yes, that should fix it
thanks,
jirka
^ permalink raw reply
* Re: [PATCH v3] tracing: Use seq_buf for string concatenation
From: Markus Elfring @ 2026-06-24 7:15 UTC (permalink / raw)
To: Woradorn Laodhanadhaworn, linux-trace-kernel, linux-hardening,
linux-kernel-mentees, Steven Rostedt, Masami Hiramatsu
Cc: LKML, Brigham Campbell, Jori Koolstra, Mathieu Desnoyers,
Shuah Khan, Shuah Khan
In-Reply-To: <20260623145147.12145-1-woradorn.laon@gmail.com>
…
> +++ b/kernel/trace/trace_events.c
…
> @@ -4501,13 +4502,20 @@ extern struct trace_event_call *__start_ftrace_events[];
…
> static __init int setup_trace_event(char *str)
> {
> - if (bootup_event_buf[0] != '\0')
> - strlcat(bootup_event_buf, ",", COMMAND_LINE_SIZE);
> + if (seq_buf_used(&bootup_event_seq) > 0)
> + seq_buf_puts(&bootup_event_seq, ",");
…
I suggest using the function “seq_buf_putc” here instead, since only a single character is appended.
https://elixir.bootlin.com/linux/v7.1.1/source/lib/seq_buf.c#L203-L221
Is there a need for corresponding error detection?
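
To make the two questions above concrete, here is a minimal userspace sketch of a bounded append buffer with a single-character append and an overflow check. It is a toy model only: struct toy_buf and every toy_* function are hypothetical names invented for illustration and are not the kernel's seq_buf implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy userspace stand-in for a bounded append buffer; NOT struct seq_buf. */
struct toy_buf {
	char *buffer;
	size_t size;	/* capacity in bytes, including the NUL terminator */
	size_t len;	/* bytes used; size + 1 marks an overflow */
};

bool toy_buf_has_overflowed(const struct toy_buf *s)
{
	return s->len > s->size;
}

/* Append one character, keeping the buffer NUL-terminated. */
int toy_buf_putc(struct toy_buf *s, char c)
{
	if (toy_buf_has_overflowed(s) || s->len + 2 > s->size) {
		s->len = s->size + 1;	/* sticky overflow marker */
		return -1;
	}
	s->buffer[s->len++] = c;
	s->buffer[s->len] = '\0';
	return 0;
}

/* Append a whole string, or mark the buffer as overflowed. */
int toy_buf_puts(struct toy_buf *s, const char *str)
{
	size_t n = strlen(str);

	if (toy_buf_has_overflowed(s) || s->len + n + 1 > s->size) {
		s->len = s->size + 1;
		return -1;
	}
	memcpy(s->buffer + s->len, str, n + 1);	/* copies the NUL too */
	s->len += n;
	return 0;
}

/* Build a comma-separated list; one overflow check at the end covers both appends. */
int toy_append_event(struct toy_buf *s, const char *event)
{
	if (s->len > 0)
		toy_buf_putc(s, ',');
	toy_buf_puts(s, event);
	return toy_buf_has_overflowed(s) ? -1 : 0;
}
```

Because the overflow marker in this sketch is sticky, a single check after both appends is enough; whether the same holds for the real helper is exactly what the question above asks.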
Regards,
Markus
^ permalink raw reply
* [PATCH] tracing: Fix NULL pointer dereference in func_set_flag()
From: Yuanhe Shu @ 2026-06-24 6:17 UTC (permalink / raw)
To: Steven Rostedt, Masami Hiramatsu
Cc: Mathieu Desnoyers, linux-kernel, linux-trace-kernel, stable,
Yuanhe Shu
func_set_flag() dereferences tr->current_trace_flags before verifying
that the current tracer is actually the function tracer. When the active
tracer has been switched away from "function" (e.g., to "wakeup_rt"),
tr->current_trace_flags can be NULL, leading to a NULL pointer
dereference and kernel crash.
The call chain that triggers this is:
trace_options_write()
-> __set_tracer_option()
-> trace->set_flag() /* func_set_flag */
In func_set_flag(), the first operation is:
if (!!set == !!(tr->current_trace_flags->val & bit))
This dereferences tr->current_trace_flags unconditionally. The safety
check that guards against a non-function tracer:
if (tr->current_trace != &function_trace)
return 0;
is placed *after* the dereference, which is too late.
This was observed with the following crash dump:
BUG: unable to handle page fault at 0000000000000000
RIP: func_set_flag+0xd
Call Trace:
__set_tracer_option+0x27
trace_options_write+0x75
vfs_write+0x12a
ksys_write+0x66
do_syscall_64+0x5b
RIP: ffffffff914c973d RSP: ff67ec88b01dfdf0 RFLAGS: 00010202
RAX: 0000000000000000 RBX: ff3a826e80354580 RCX: 0000000000000001
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffffff93918080
The disassembly confirms the fault:
func_set_flag+0: mov 0x1f08(%rdi), %rax ; RAX = tr->current_trace_flags = NULL
func_set_flag+13: mov (%rax), %eax ; page fault: dereference NULL
At the time of the crash:
tr->current_trace_flags = 0x0 (NULL)
tr->current_trace = wakeup_rt_tracer (not function_trace)
The scenario is that a process opens a function tracer option file (such
as "func_stack_trace"), then the current tracer is switched to another
tracer (e.g., "wakeup_rt"), which sets current_trace_flags to NULL. When
the process subsequently writes to the option file, func_set_flag() is
invoked and crashes on the NULL dereference.
Fix this by moving the current_trace check before the
current_trace_flags dereference, so that func_set_flag() returns early
when the function tracer is not active.
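
The ordering issue can be illustrated with a small userspace sketch. All types and names below (toy_trace_array, toy_set_flag, and so on) are hypothetical stand-ins, and the return convention (1 when a flag changes) is an assumption of the sketch, not the behaviour of func_set_flag().

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; the real struct trace_array and tracer types are far larger. */
struct toy_flags { unsigned int val; };
struct toy_tracer { const char *name; };

struct toy_trace_array {
	const struct toy_tracer *current_trace;
	struct toy_flags *current_trace_flags;	/* NULL when another tracer is active */
};

const struct toy_tracer toy_function_trace = { "function" };
const struct toy_tracer toy_wakeup_rt_trace = { "wakeup_rt" };

/* Returns 1 if the flag was changed, 0 if nothing was done. */
int toy_set_flag(struct toy_trace_array *tr, unsigned int bit, int set)
{
	/* Check the tracer first: only then is current_trace_flags valid. */
	if (tr->current_trace != &toy_function_trace)
		return 0;

	/* Do nothing if already set. */
	if (!!set == !!(tr->current_trace_flags->val & bit))
		return 0;

	tr->current_trace_flags->val =
		(tr->current_trace_flags->val & ~bit) | (set ? bit : 0);
	return 1;
}
```

With the checks in the opposite order, the last case exercised by a caller (tracer switched, flags pointer NULL) would dereference NULL instead of returning early.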
Cc: stable@vger.kernel.org
Fixes: 76680d0d2825 ("tracing: Have function tracer define options per instance")
Signed-off-by: Yuanhe Shu <xiangzao@linux.alibaba.com>
---
kernel/trace/trace_functions.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/kernel/trace/trace_functions.c b/kernel/trace/trace_functions.c
index f283391a4dc8..cd37f2013758 100644
--- a/kernel/trace/trace_functions.c
+++ b/kernel/trace/trace_functions.c
@@ -458,12 +458,12 @@ func_set_flag(struct trace_array *tr, u32 old_flags, u32 bit, int set)
ftrace_func_t func;
u32 new_flags;
- /* Do nothing if already set. */
- if (!!set == !!(tr->current_trace_flags->val & bit))
+ /* We can change this flag only when current tracer is function. */
+ if (tr->current_trace != &function_trace)
return 0;
- /* We can change this flag only when not running. */
- if (tr->current_trace != &function_trace)
+ /* Do nothing if already set. */
+ if (!!set == !!(tr->current_trace_flags->val & bit))
return 0;
new_flags = (tr->current_trace_flags->val & ~bit) | (set ? bit : 0);
--
2.39.3
^ permalink raw reply related
* Re: [PATCH v3] tracing: Use seq_buf for string concatenation
From: Masami Hiramatsu @ 2026-06-24 5:03 UTC (permalink / raw)
To: Woradorn Laodhanadhaworn
Cc: rostedt, mhiramat, mathieu.desnoyers, linux-kernel,
linux-trace-kernel, linux-hardening, linux-kernel-mentees, shuah,
skhan, me, jkoolstra
In-Reply-To: <20260623145147.12145-1-woradorn.laon@gmail.com>
On Tue, 23 Jun 2026 21:51:47 +0700
Woradorn Laodhanadhaworn <woradorn.laon@gmail.com> wrote:
> In preparation for removing the strlcat API[1],
> replace the string concatenation logic with a struct seq_buf,
> which tracks the current position and the remaining space internally.
>
> Use seq_buf_str() to NUL-terminate before passing to early_enable_events().
>
> Link: https://github.com/KSPP/linux/issues/370 [1]
>
Looks good to me.
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Thanks,
> Signed-off-by: Woradorn Laodhanadhaworn <woradorn.laon@gmail.com>
> ---
> v1 -> v2:
> - Fixed WARN_ON when booting with empty trace_event.
> v2 -> v3:
> - Addressed Sashiko's concern about the compound literal backing buffer.
> - Replaced the compound literal with an explicitly declared buffer and pointed
> seq_buf.buffer to it. This guarantees the backing storage is placed in
> `.init.data` and reclaimed after boot.
>
> v1: https://lore.kernel.org/all/20260620175441.223342-1-woradorn.laon@gmail.com
> v2: https://lore.kernel.org/all/20260622094623.18469-1-woradorn.laon@gmail.com
> Sashiko: https://sashiko.dev/#/patchset/20260622094623.18469-1-woradorn.laon%40gmail.com
>
> kernel/trace/trace_events.c | 18 +++++++++++++-----
> 1 file changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> index c46e623e7e0d..5ab630155ab6 100644
> --- a/kernel/trace/trace_events.c
> +++ b/kernel/trace/trace_events.c
> @@ -22,6 +22,7 @@
> #include <linux/sort.h>
> #include <linux/slab.h>
> #include <linux/delay.h>
> +#include <linux/seq_buf.h>
>
> #include <trace/events/sched.h>
> #include <trace/syscall.h>
> @@ -4501,13 +4502,20 @@ extern struct trace_event_call *__start_ftrace_events[];
> extern struct trace_event_call *__stop_ftrace_events[];
>
> static char bootup_event_buf[COMMAND_LINE_SIZE] __initdata;
> +static struct seq_buf bootup_event_seq __initdata = {
> + .buffer = bootup_event_buf,
> + .size = COMMAND_LINE_SIZE,
> +};
>
> static __init int setup_trace_event(char *str)
> {
> - if (bootup_event_buf[0] != '\0')
> - strlcat(bootup_event_buf, ",", COMMAND_LINE_SIZE);
> + if (seq_buf_used(&bootup_event_seq) > 0)
> + seq_buf_puts(&bootup_event_seq, ",");
> +
> + seq_buf_puts(&bootup_event_seq, str);
>
> - strlcat(bootup_event_buf, str, COMMAND_LINE_SIZE);
> + if (seq_buf_has_overflowed(&bootup_event_seq))
> + return -ENOMEM;
>
> trace_set_ring_buffer_expanded(NULL);
> disable_tracing_selftest("running event tracing");
> @@ -4766,7 +4774,7 @@ static __init int event_trace_enable(void)
> */
> __trace_early_add_events(tr);
>
> - early_enable_events(tr, bootup_event_buf, false);
> + early_enable_events(tr, (char *)seq_buf_str(&bootup_event_seq), false);
>
> trace_printk_start_comm();
>
> @@ -4794,7 +4802,7 @@ static __init int event_trace_enable_again(void)
> if (!tr)
> return -ENODEV;
>
> - early_enable_events(tr, bootup_event_buf, true);
> + early_enable_events(tr, (char *)seq_buf_str(&bootup_event_seq), true);
>
> return 0;
> }
> --
> 2.43.0
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply
* [PATCH v2 1/1] rtla: fix missing unistd include
From: Andreas Ziegler @ 2026-06-24 3:33 UTC (permalink / raw)
To: Tomas Glozar
Cc: Steven Rostedt, linux-trace-kernel, linux-kernel, Andreas Ziegler
In-Reply-To: <CAP4=nvRozLS73PooixqWUhDh19eG6oPLTCSV8HvjhibsTLswtw@mail.gmail.com>
Compiling RTLA 7.1.x with GCC 16 and uClibc as standard library fails
with these errors:
src/common.c: In function ‘set_signals’:
src/common.c:40:17: error: implicit declaration of function ‘alarm’ [-Wimplicit-function-declaration]
40 | alarm(params->duration);
| ^~~~~
src/common.c: In function ‘common_apply_config’:
src/common.c:187:44: error: implicit declaration of function ‘getpid’; did you mean ‘getpt’? [-Wimplicit-function-declaration]
187 | retval = sched_setaffinity(getpid(), sizeof(params->hk_cpu_set),
| ^~~~~~
| getpt
In file included from src/common.c:9:
src/common.c: In function ‘run_tool’:
src/common.c:262:19: error: implicit declaration of function ‘sysconf’; did you mean ‘sscanf’? [-Wimplicit-function-declaration]
262 | nr_cpus = get_nprocs_conf();
| ^~~~~~~~~~~~~~~
src/common.c:262:19: error: ‘_SC_NPROCESSORS_CONF’ undeclared (first use in this function)
262 | nr_cpus = get_nprocs_conf();
| ^~~~~~~~~~~~~~~
src/common.c:262:19: note: each undeclared identifier is reported only once for each function it appears in
src/common.c:370:17: error: implicit declaration of function ‘sleep’ [-Wimplicit-function-declaration]
370 | sleep(1);
| ^~~~~
Restore the missing unistd.h include.
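
As a minimal illustration of why the include matters, the sketch below uses two of the functions named in the errors above; both are declared in unistd.h. The helper names configured_cpus() and arm_timeout() are hypothetical and do not appear in rtla. Note that _SC_NPROCESSORS_CONF is a common libc extension rather than a POSIX-mandated name.

```c
#include <assert.h>
#include <unistd.h>	/* declares alarm(), getpid(), sleep(), sysconf() */

/* Hypothetical helper: number of configured CPUs, or -1 on error. */
long configured_cpus(void)
{
	return sysconf(_SC_NPROCESSORS_CONF);
}

/*
 * Hypothetical helper: arm a one-shot SIGALRM and return the seconds
 * that were left on any previously scheduled alarm.
 */
unsigned int arm_timeout(unsigned int seconds)
{
	return alarm(seconds);
}
```

Without the include, a C library that does not pull in unistd.h indirectly leaves these calls undeclared, which newer compilers treat as an error rather than a warning.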
Fixes: 115b06a00875 ("tools/rtla: Consolidate nr_cpus usage across all tools")
Signed-off-by: Andreas Ziegler <br025@umbiko.net>
---
Changes v1 -> v2:
adapt commit message
correct fixes: formatting
rebase on current master (502d801f0ab0)
tools/tracing/rtla/src/common.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/tracing/rtla/src/common.c b/tools/tracing/rtla/src/common.c
index d0a8a6edbf0c..8c7f5e75b2ec 100644
--- a/tools/tracing/rtla/src/common.c
+++ b/tools/tracing/rtla/src/common.c
@@ -5,6 +5,7 @@
#include <signal.h>
#include <stdlib.h>
#include <string.h>
+#include <unistd.h>
#include <sys/sysinfo.h>
#include "common.h"
--
2.53.0
^ permalink raw reply related
* Re: [PATCH] tracing/fprobe: Fix NULL pointer dereference in fprobe_fgraph_entry()
From: Masami Hiramatsu @ 2026-06-24 0:37 UTC (permalink / raw)
To: Sechang Lim
Cc: Steven Rostedt, Mathieu Desnoyers, Heiko Carstens, linux-kernel,
linux-trace-kernel
In-Reply-To: <20260619184425.3824774-1-rhkrqnwk98@gmail.com>
On Fri, 19 Jun 2026 18:44:24 +0000
Sechang Lim <rhkrqnwk98@gmail.com> wrote:
> fprobe_fgraph_entry() sizes a shadow-stack reservation in one walk of
> the per-ip fprobe list and fills it in a second walk, both under
> rcu_read_lock() only. A fprobe registered on an already-live ip can
> become visible between the two walks, so the fill walk processes an
> exit_handler the sizing walk did not count and used runs past
> reserved_words. If the sizing walk counted nothing, fgraph_data is NULL
> and the first write_fprobe_header() faults:
>
> Oops: general protection fault, probably for non-canonical address ...
> KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
> RIP: 0010:fprobe_fgraph_entry+0xa38/0xf10 kernel/trace/fprobe.c:167
> Call Trace:
> <TASK>
> function_graph_enter_regs+0x44c/0xa10 kernel/trace/fgraph.c:677
> ftrace_graph_func+0xc5/0x140 arch/x86/kernel/ftrace.c:671
> __kernel_text_address+0x9/0x40 kernel/extable.c:78
> arch_stack_walk+0x117/0x170 arch/x86/kernel/stacktrace.c:26
> kmem_cache_free+0x188/0x580 mm/slub.c:6378
> tcp_data_queue+0x18d/0x6550 net/ipv4/tcp_input.c:5590
> [...]
> </TASK>
>
> The list cannot be frozen across the two walks, so skip a node that does
> not fit the reservation and count it as missed.
>
Ah, good catch! Yes, if the list is scanned repeatedly, there is a
possibility that it could be updated even when using RCU guards.
This is a rare case, but hmm, I think we need something like
fgraph_increment_reserved_data() to avoid skipping.
Anyway, this looks good to me. Let me pick it.
Thanks!
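
For readers following the discussion, the skip-when-it-does-not-fit approach can be sketched in userspace as follows. This is a toy model under stated assumptions: struct toy_probe, toy_fill() and TOY_HEADER_WORDS are hypothetical, and the sketch only models nodes that have an exit handler.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_HEADER_WORDS 1	/* assumed per-node header size, in longs */

struct toy_probe {
	int has_exit_handler;
	size_t data_words;	/* per-probe payload, in longs */
	unsigned long nmissed;
};

/*
 * Second ("fill") walk over the list. The reservation was sized by an
 * earlier walk, so a node added in between may not fit; such a node is
 * counted as missed instead of overrunning the reservation.
 * Returns the number of words actually used.
 */
size_t toy_fill(struct toy_probe *probes, size_t n, size_t reserved_words)
{
	size_t used = 0;

	for (size_t i = 0; i < n; i++) {
		struct toy_probe *p = &probes[i];
		size_t need = TOY_HEADER_WORDS + p->data_words;

		if (!p->has_exit_handler)
			continue;
		if (used + need > reserved_words) {
			p->nmissed++;
			continue;
		}
		used += need;
	}
	return used;
}
```

The trade-off is the one noted in the reply: the late-registered probe is skipped for this invocation rather than having the reservation grown to accommodate it.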
> Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer")
> Signed-off-by: Sechang Lim <rhkrqnwk98@gmail.com>
> ---
> kernel/trace/fprobe.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
> index f378613ad120..f215990b9061 100644
> --- a/kernel/trace/fprobe.c
> +++ b/kernel/trace/fprobe.c
> @@ -613,6 +613,16 @@ static int fprobe_fgraph_entry(struct ftrace_graph_ent *trace, struct fgraph_ops
> continue;
>
> data_size = fp->entry_data_size;
> + /*
> + * The list may have grown since it was sized, so this node
> + * may not fit. Skip it as missed rather than overrun the
> + * reservation.
> + */
> + if (fp->exit_handler &&
> + used + FPROBE_HEADER_SIZE_IN_LONG + SIZE_IN_LONG(data_size) > reserved_words) {
> + fp->nmissed++;
> + continue;
> + }
> if (data_size && fp->exit_handler)
> data = fgraph_data + used + FPROBE_HEADER_SIZE_IN_LONG;
> else
> --
> 2.43.0
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply
* Re: [PATCH] tracing/probes: ignore id update from btf_type_skip_modifiers
From: Masami Hiramatsu @ 2026-06-24 0:24 UTC (permalink / raw)
To: Martin Kaiser; +Cc: Steven Rostedt, linux-trace-kernel, linux-kernel
In-Reply-To: <20260623132937.3494895-1-martin@kaiser.cx>
On Tue, 23 Jun 2026 15:29:32 +0200
Martin Kaiser <martin@kaiser.cx> wrote:
> We can pass NULL as id pointer to btf_type_skip_modifiers if we do not
> need the id of the returned btf_type.
>
Good catch! Let me pick it to probes/core.
Thanks,
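
The optional out-parameter convention the patch relies on can be sketched as below. This is a userspace toy, not the BTF code: struct toy_type and toy_skip_modifiers() are hypothetical, and the only point shown is that the result id is stored when the caller passes a non-NULL pointer and skipped otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Toy type table entry: a "modifier" simply points at another entry. */
struct toy_type {
	int is_modifier;
	unsigned int next;	/* only meaningful when is_modifier is set */
};

/*
 * Follow modifier links starting at id. The resolved id is written to
 * *res_id only if the caller asked for it by passing a non-NULL pointer.
 */
const struct toy_type *toy_skip_modifiers(const struct toy_type *types,
					  size_t n, unsigned int id,
					  unsigned int *res_id)
{
	for (size_t hops = 0; id < n && types[id].is_modifier; hops++) {
		if (hops == n)
			return NULL;	/* cycle guard for the toy table */
		id = types[id].next;
	}
	if (id >= n)
		return NULL;
	if (res_id)
		*res_id = id;
	return &types[id];
}
```

Callers that never read the id can then drop their local variable entirely, which is the cleanup the patch performs.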
> Signed-off-by: Martin Kaiser <martin@kaiser.cx>
> ---
> kernel/trace/trace_probe.c | 13 +++++--------
> 1 file changed, 5 insertions(+), 8 deletions(-)
>
> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
> index 9b3219e755cb..78bca283763f 100644
> --- a/kernel/trace/trace_probe.c
> +++ b/kernel/trace/trace_probe.c
> @@ -360,9 +360,8 @@ static bool btf_type_is_char_ptr(struct btf *btf, const struct btf_type *type)
> {
> const struct btf_type *real_type;
> u32 intdata;
> - s32 tid;
>
> - real_type = btf_type_skip_modifiers(btf, type->type, &tid);
> + real_type = btf_type_skip_modifiers(btf, type->type, NULL);
> if (!real_type)
> return false;
>
> @@ -379,14 +378,13 @@ static bool btf_type_is_char_array(struct btf *btf, const struct btf_type *type)
> const struct btf_type *real_type;
> const struct btf_array *array;
> u32 intdata;
> - s32 tid;
>
> if (BTF_INFO_KIND(type->info) != BTF_KIND_ARRAY)
> return false;
>
> array = (const struct btf_array *)(type + 1);
>
> - real_type = btf_type_skip_modifiers(btf, array->type, &tid);
> + real_type = btf_type_skip_modifiers(btf, array->type, NULL);
>
> intdata = btf_type_int(real_type);
> return !(BTF_INT_ENCODING(intdata) & BTF_INT_SIGNED)
> @@ -589,7 +587,6 @@ static int parse_btf_field(char *fieldname, const struct btf_type *type,
> struct btf *btf = ctx_btf(ctx);
> char *next;
> int is_ptr;
> - s32 tid;
>
> do {
> if (!is_struct) {
> @@ -600,7 +597,7 @@ static int parse_btf_field(char *fieldname, const struct btf_type *type,
> }
>
> /* Convert a struct pointer type to a struct type */
> - type = btf_type_skip_modifiers(btf, type->type, &tid);
> + type = btf_type_skip_modifiers(btf, type->type, NULL);
> if (!type) {
> trace_probe_log_err(ctx->offset, BAD_BTF_TID);
> return -EINVAL;
> @@ -640,7 +637,7 @@ static int parse_btf_field(char *fieldname, const struct btf_type *type,
> ctx->last_bitsize = 0;
> }
>
> - type = btf_type_skip_modifiers(btf, field->type, &tid);
> + type = btf_type_skip_modifiers(btf, field->type, NULL);
> if (!type) {
> trace_probe_log_err(ctx->offset, BAD_BTF_TID);
> return -EINVAL;
> @@ -759,7 +756,7 @@ static int parse_btf_arg(char *varname,
> return -ENOENT;
>
> found:
> - type = btf_type_skip_modifiers(ctx->btf, tid, &tid);
> + type = btf_type_skip_modifiers(ctx->btf, tid, NULL);
> found_type:
> if (!type) {
> trace_probe_log_err(ctx->offset, BAD_BTF_TID);
> --
> 2.43.7
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply
* Re: [PATCH v7 03/10] tracing/probes: Support dumping fetcharg program for debugging dynamic events
From: Masami Hiramatsu @ 2026-06-24 0:18 UTC (permalink / raw)
To: Julian Braha
Cc: Steven Rostedt, Mathieu Desnoyers, Jonathan Corbet, Shuah Khan,
linux-kernel, linux-trace-kernel, linux-doc, linux-kselftest
In-Reply-To: <96b043ed-c527-4e5d-8eb7-631805da53fd@gmail.com>
On Tue, 23 Jun 2026 19:31:47 +0100
Julian Braha <julianbraha@gmail.com> wrote:
> Hi Masami,
>
> On 6/23/26 02:44, Masami Hiramatsu (Google) wrote:
>
> > +config PROBE_EVENTS_DUMP_FETCHARG
> > + depends on PROBE_EVENTS
> > + bool "Dump of dynamic probe event fetch-arguments"
> > + default n
>
> Sorry, kconfig nitpick: could you match the style used by the rest of
> the config options in this file? E.g. the type and prompt come first in
> the list of attributes?
Ah, good catch! Let me fix it.
Thanks,
>
> - Julian Braha
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply
* Re: [PATCH v8 05/46] KVM: Make CONFIG_KVM_VM_MEMORY_ATTRIBUTES selectable
From: Ackerley Tng @ 2026-06-24 0:14 UTC (permalink / raw)
To: Sean Christopherson, Julian Braha
Cc: aik, andrew.jones, binbin.wu, brauner, chao.p.peng, david,
jmattson, jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
rick.p.edgecombe, rientjes, shivankg, steven.price, tabba, willy,
wyihan, yan.y.zhao, forkloop, pratyush, suzuki.poulose,
aneesh.kumar, liam, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Steven Rostedt,
Masami Hiramatsu, Mathieu Desnoyers, Jonathan Corbet, Shuah Khan,
Shuah Khan, Vishal Annapurve, Andrew Morton, Chris Li,
Kairui Song, Kemeng Shi, Nhat Pham, Barry Song, Axel Rasmussen,
Yuanchu Xie, Wei Xu, Youngjun Park, Qi Zheng, Shakeel Butt,
Kiryl Shutsemau, Baoquan He, Jason Gunthorpe, Vlastimil Babka,
kvm, linux-kernel, linux-trace-kernel, linux-doc, linux-kselftest,
linux-mm, linux-coco
In-Reply-To: <ajnQVkLvFl_lMuGB@google.com>
Sean Christopherson <seanjc@google.com> writes:
> On Fri, Jun 19, 2026, Julian Braha wrote:
>> Hi Ackerley,
>>
>> On 6/19/26 01:31, Ackerley Tng via B4 Relay wrote:
>>
>> > config KVM_VM_MEMORY_ATTRIBUTES
>> > - bool
>> > + depends on KVM_SW_PROTECTED_VM || KVM_INTEL_TDX || KVM_AMD_SEV
>> > + bool "Enable per-VM PRIVATE vs. SHARED attributes (for CoCo VMs)"
>>
>> Sorry for the style nitpick, but could you keep the type and prompt as
>> the first attribute in the Kconfig option definition (like the other
>> options do)?
>
> No need to be sorry, I've no idea why I put the "depends" first. I don't even
> know if that qualifies as a nit :-)
>
> Ackerley, if you can provide your SoB (for Fuad's feedback), I can fixup when
> applying (assuming nothing else necessitates v9).
Thanks, didn't notice this when consolidating this revision.
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
^ permalink raw reply
* Re: [PATCH v8 04/46] KVM: Decouple kvm_has_arch_private_mem from CONFIG_KVM_VM_MEMORY_ATTRIBUTES
From: Ackerley Tng @ 2026-06-24 0:13 UTC (permalink / raw)
To: Binbin Wu
Cc: aik, andrew.jones, brauner, chao.p.peng, david, jmattson,
jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
rick.p.edgecombe, rientjes, shivankg, steven.price, tabba, willy,
wyihan, yan.y.zhao, forkloop, pratyush, suzuki.poulose,
aneesh.kumar, liam, Paolo Bonzini, Sean Christopherson,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, Steven Rostedt, Masami Hiramatsu,
Mathieu Desnoyers, Jonathan Corbet, Shuah Khan, Shuah Khan,
Vishal Annapurve, Andrew Morton, Chris Li, Kairui Song,
Kemeng Shi, Nhat Pham, Barry Song, Axel Rasmussen, Yuanchu Xie,
Wei Xu, Youngjun Park, Qi Zheng, Shakeel Butt, Kiryl Shutsemau,
Baoquan He, Jason Gunthorpe, Vlastimil Babka, kvm, linux-kernel,
linux-trace-kernel, linux-doc, linux-kselftest, linux-mm,
linux-coco
In-Reply-To: <a21bfc05-787e-4cd8-89af-8579357e6a12@linux.intel.com>
Binbin Wu <binbin.wu@linux.intel.com> writes:
> On 6/19/2026 8:31 AM, Ackerley Tng via B4 Relay wrote:
>> From: Sean Christopherson <seanjc@google.com>
>>
>> When memory attributes become trackable in guest_memfd, the concept of
>> having private memory is no longer dependent on
>> CONFIG_KVM_VM_MEMORY_ATTRIBUTES.
>>
>> With this, on x86, kvm_arch_has_private_mem() is defined if some CoCo
>> platform support (or the testing CONFIG_KVM_SW_PROTECTED_VM) is compiled
>> in.
>>
>> Signed-off-by: Sean Christopherson <seanjc@google.com>
>> Co-developed-by: Ackerley Tng <ackerleytng@google.com>
>> Signed-off-by: Ackerley Tng <ackerleytng@google.com>
>
> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
>
> One nit below.
>
>> ---
>> arch/x86/include/asm/kvm_host.h | 4 +++-
>> include/linux/kvm_host.h | 2 +-
>> 2 files changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 8e8eb8a5e8a6b..1bde67cf6eb0e 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -2394,7 +2394,9 @@ void kvm_configure_mmu(bool enable_tdp, int tdp_forced_root_level,
>> int tdp_max_root_level, int tdp_huge_page_level);
>>
>>
>> -#ifdef CONFIG_KVM_VM_MEMORY_ATTRIBUTES
>> +#if defined(CONFIG_KVM_SW_PROTECTED_VM) || \
>> + defined(CONFIG_KVM_INTEL_TDX) || \
>> + defined(CONFIG_KVM_AMD_SEV)
>
> Nit:
> Vertically align the defined(XXX) statements for better readability?
>
Sean had this aligned with spaces, and checkpatch complained about
having no spaces before tabs, so I switched it to tabs instead since I
don't think alignment like that is officially documented either way.
Either way is fine :)
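
For reference, the pattern under discussion (an arch header that defines a hook macro only under certain options, and a generic header that falls back to a stub keyed on the macro name itself) looks roughly like this in isolation. All TOY_* and toy_* names are hypothetical, and the continuation lines here are aligned with spaces purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_vm { bool has_private_mem; };

/* "Arch header": define the hook only when some option is enabled. */
#if defined(TOY_SW_PROTECTED_VM) || \
    defined(TOY_INTEL_TDX)       || \
    defined(TOY_AMD_SEV)
#define toy_arch_has_private_mem(vm) ((vm)->has_private_mem)
#endif

/* "Generic header": provide a stub unless the arch defined the hook. */
#ifndef toy_arch_has_private_mem
static inline bool toy_arch_has_private_mem(struct toy_vm *vm)
{
	(void)vm;	/* unused in the fallback */
	return false;
}
#endif
```

Keying the fallback on the macro name rather than on a config symbol is what lets the generic header stay unaware of which options enable the hook.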
>> #define kvm_arch_has_private_mem(kvm) ((kvm)->arch.has_private_mem)
>> #endif
>>
>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>> index 201d0f2143976..d370e834d619e 100644
>> --- a/include/linux/kvm_host.h
>> +++ b/include/linux/kvm_host.h
>> @@ -722,7 +722,7 @@ static inline int kvm_arch_vcpu_memslots_id(struct kvm_vcpu *vcpu)
>> }
>> #endif
>>
>> -#ifndef CONFIG_KVM_VM_MEMORY_ATTRIBUTES
>> +#ifndef kvm_arch_has_private_mem
>> static inline bool kvm_arch_has_private_mem(struct kvm *kvm)
>> {
>> return false;
>>
^ permalink raw reply
* Re: [PATCH v8 01/46] KVM: guest_memfd: Introduce per-gmem attributes, use to guard user mappings
From: Ackerley Tng @ 2026-06-24 0:09 UTC (permalink / raw)
To: Sean Christopherson, Binbin Wu
Cc: aik, andrew.jones, brauner, chao.p.peng, david, jmattson,
jthoughton, michael.roth, oupton, pankaj.gupta, qperret,
rick.p.edgecombe, rientjes, shivankg, steven.price, tabba, willy,
wyihan, yan.y.zhao, forkloop, pratyush, suzuki.poulose,
aneesh.kumar, liam, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Steven Rostedt,
Masami Hiramatsu, Mathieu Desnoyers, Jonathan Corbet, Shuah Khan,
Shuah Khan, Vishal Annapurve, Andrew Morton, Chris Li,
Kairui Song, Kemeng Shi, Nhat Pham, Barry Song, Axel Rasmussen,
Yuanchu Xie, Wei Xu, Youngjun Park, Qi Zheng, Shakeel Butt,
Kiryl Shutsemau, Baoquan He, Jason Gunthorpe, Vlastimil Babka,
kvm, linux-kernel, linux-trace-kernel, linux-doc, linux-kselftest,
linux-mm, linux-coco
In-Reply-To: <ajnjTJdQKD1Kz3tf@google.com>
Sean Christopherson <seanjc@google.com> writes:
> On Mon, Jun 22, 2026, Binbin Wu wrote:
>> On 6/19/2026 8:31 AM, Ackerley Tng via B4 Relay wrote:
>>
>> [...]
>>
>> >
>> > +static u64 kvm_gmem_get_attributes(struct inode *inode, pgoff_t index)
>> > +{
>> > + struct maple_tree *mt = &GMEM_I(inode)->attributes;
>> > + void *entry = mtree_load(mt, index);
>> > +
>> > + return WARN_ON_ONCE(!entry) ? 0 : xa_to_value(entry);
>>
>> If the entry is unexpectedly missing, returning 0 means the attribute would
>> be treated as shared. And then in kvm_gmem_fault_user_mapping(), it would
>> allow the userspace to fault in the folio.
>>
>> Should gmem deny such edge case?
>
> After several bugs this year where a WARN_ON_ONCE() fired, but was entirely
> insufficient to prevent true badness, I'm definitely sensitive to making the "bad"
> behavior as harmless as possible.
>
I guess both are indeed awkward.
> However, in this case I think we're just hosed. If KVM treats the memory as
> private, KVM will incorrectly do prepare(), incorrectly allow populate(), and
> will cause missed invalidations (though I suppose __kvm_gmem_set_attributes()
> "only" lies to userspace in that case).
>
> That said, assuming SHARED is definitely odd for cases where guest_memfd *can't*
> hold shared memory. Ditto for assuming PRIVATE. What if we instead fall back to
> the "init" state, e.g.?
>
> static u64 kvm_gmem_get_attributes(struct inode *inode, pgoff_t index)
> {
> struct maple_tree *mt = &GMEM_I(inode)->attributes;
> void *entry = mtree_load(mt, index);
>
> if (WARN_ON_ONCE(!entry)) {
> bool shared = GMEM_I(inode)->flags & GUEST_MEMFD_FLAG_INIT_SHARED;
>
> return shared ? 0 : KVM_MEMORY_ATTRIBUTE_PRIVATE;
I was wondering if we should not only return the init state but also set
the init state, but that would involve performing a conversion to the
init state... Too complicated for an edge case.
> }
>
> return xa_to_value(entry);
> }
Thanks Binbin and Sean!
^ permalink raw reply
* Re: [PATCH] Documentation: tracing: fix typo in events documentation
From: Jonathan Corbet @ 2026-06-23 20:34 UTC (permalink / raw)
To: Yudistira Putra, Steven Rostedt, Masami Hiramatsu
Cc: Mathieu Desnoyers, Shuah Khan, linux-trace-kernel, linux-doc,
linux-kernel, Yudistira Putra
In-Reply-To: <20260622143735.71778-1-pyudistira519@gmail.com>
Yudistira Putra <pyudistira519@gmail.com> writes:
> Fix a typo in the tracing events documentation: "can by built up"
> should be "can be built up".
>
> Signed-off-by: Yudistira Putra <pyudistira519@gmail.com>
> ---
> Documentation/trace/events.rst | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/trace/events.rst b/Documentation/trace/events.rst
> index 18d112963dec..581f2260614b 100644
> --- a/Documentation/trace/events.rst
> +++ b/Documentation/trace/events.rst
> @@ -1064,7 +1064,7 @@ correct command type, and a pointer to an event-specific run_command()
> callback that will be called to actually execute the event-specific
> command function.
>
> -Once that's done, the command string can by built up by successive
> +Once that's done, the command string can be built up by successive
> calls to argument-adding functions.
Applied, thanks.
jon
^ permalink raw reply
* Re: [PATCHv4 00/13] uprobes/x86: Fix red zone issue for optimized uprobes
From: Jiri Olsa @ 2026-06-23 19:11 UTC (permalink / raw)
To: Oleg Nesterov, Peter Zijlstra
Cc: Jiri Olsa, Ingo Molnar, Masami Hiramatsu, Andrii Nakryiko, bpf,
linux-trace-kernel
In-Reply-To: <aiEiP54zktDqAZpG@krava>
hi, ping, thanks
jirka
On Thu, Jun 04, 2026 at 08:59:11AM +0200, Jiri Olsa wrote:
> On Tue, May 26, 2026 at 10:58:27PM +0200, Jiri Olsa wrote:
> > hi,
> > Andrii reported an issue with optimized uprobes [1] that can clobber
> > redzone area with call instruction storing return address on stack
> > where user code may keep temporary data without adjusting rsp.
> >
> > Fixing this by moving the optimized uprobes on top of 10-bytes nop
> > instruction, so we can squeeze another instruction to escape the
> > redzone area before doing the call.
> >
> > Note we need upstream update first for patch 3 (github.com/libbpf/usdt),
> > if we decide to take this change.
> >
> > thanks,
> > jirka
> >
> >
> > v1: https://lore.kernel.org/bpf/20260514135342.22130-1-jolsa@kernel.org/
> > v2: https://lore.kernel.org/bpf/20260518105957.123445-1-jolsa@kernel.org/
> > v3: https://lore.kernel.org/bpf/20260521124411.31133-1-jolsa@kernel.org/
> >
> > v4 changes:
> > - do not use 2nd int3 (on +5 offset) because the call instruction
> > is always the same for the given nop10 address [Andrii/Peter]
> > - unmap unused trampoline vma after unsuccessful optimization [sashiko]
> > - small change to patch#2 moved user_64bit_mode earlier in the path
> > and pass/use mm_struct pointer directly from arch_uprobe_optimize
> > instead of getting current->mm
> > Andrii, keeping your ack, please shout otherwise
>
> hi,
> I think bots did not find anything substantial, I have just small
> selftests changes queued for v5
>
> any other feedback/review would be great
>
> thanks,
> jirka
>
>
> >
> > v3 changes:
> > - use nop10 update suggested by Peter in [2]
> > - remove struct uprobe_trampoline object, use vma objects directly instead
> > - selftests fixes [sashiko]
> > - ack from Andrii
> >
> > v2 changes:
> > - several selftest fixes [sashiko]
> > - consolidate is_lea_insn and is_call_insn into single check [Jakub Sitnicki]
> > - use proper mm_struct object in __in_uprobe_trampoline check [sashiko]
> > - allow to copy uprobe trampolines vma objects on fork [sashiko]
> > - change uprobe syscall detection error from -ENXIO to -EPROTO [Andrii]
> > - added fork/clone tests
> > - I kept the selftest changes and nop5->nop10 changes in separate
> > commits for easier review, we can squash them later if we want to keep
> > bisect working properly
> >
> >
> > [1] https://lore.kernel.org/bpf/20260509003146.976844-1-andrii@kernel.org/
> > [2] https://lore.kernel.org/bpf/20260518104306.GU3102624@noisy.programming.kicks-ass.net/#t
> > ---
> > Andrii Nakryiko (1):
> > selftests/bpf: Add tests for uprobe nop10 red zone clobbering
> >
> > Jiri Olsa (12):
> > uprobes/x86: Use proper mm_struct in __in_uprobe_trampoline
> > uprobes/x86: Remove struct uprobe_trampoline object
> > uprobes/x86: Allow to copy uprobe trampolines on fork
> > uprobes/x86: Unmap trampoline vma object in case it's unused
> > uprobes/x86: Move optimized uprobe from nop5 to nop10
> > libbpf: Change has_nop_combo to work on top of nop10
> > libbpf: Detect uprobe syscall with new error
> > selftests/bpf: Emit nop,nop10 instructions combo for x86_64 arch
> > selftests/bpf: Change uprobe syscall tests to use nop10
> > selftests/bpf: Change uprobe/usdt trigger bench code to use nop10
> > selftests/bpf: Add reattach tests for uprobe syscall
> > selftests/bpf: Add tests for forked/cloned optimized uprobes
> >
> > arch/x86/kernel/uprobes.c | 379 +++++++++++++++++++++++++++++++++++++++++++-----------------------------
> > include/linux/uprobes.h | 5 -
> > kernel/events/uprobes.c | 10 --
> > kernel/fork.c | 1 -
> > tools/lib/bpf/features.c | 4 +-
> > tools/lib/bpf/usdt.c | 16 +--
> > tools/testing/selftests/bpf/bench.c | 20 ++--
> > tools/testing/selftests/bpf/benchs/bench_trigger.c | 38 ++++----
> > tools/testing/selftests/bpf/benchs/run_bench_uprobes.sh | 2 +-
> > tools/testing/selftests/bpf/prog_tests/uprobe_syscall.c | 307 +++++++++++++++++++++++++++++++++++++++++++++++++++++-----
> > tools/testing/selftests/bpf/prog_tests/usdt.c | 74 ++++++++++++--
> > tools/testing/selftests/bpf/progs/test_usdt.c | 25 +++++
> > tools/testing/selftests/bpf/usdt.h | 2 +-
> > tools/testing/selftests/bpf/usdt_2.c | 15 ++-
> > 14 files changed, 653 insertions(+), 245 deletions(-)
^ permalink raw reply
* Re: [PATCH v7 03/10] tracing/probes: Support dumping fetcharg program for debugging dynamic events
From: Julian Braha @ 2026-06-23 18:31 UTC (permalink / raw)
To: Masami Hiramatsu (Google), Steven Rostedt, Mathieu Desnoyers
Cc: Jonathan Corbet, Shuah Khan, linux-kernel, linux-trace-kernel,
linux-doc, linux-kselftest
In-Reply-To: <178217907822.643090.14693478306190628970.stgit@devnote2>
Hi Masami,
On 6/23/26 02:44, Masami Hiramatsu (Google) wrote:
> +config PROBE_EVENTS_DUMP_FETCHARG
> + depends on PROBE_EVENTS
> + bool "Dump of dynamic probe event fetch-arguments"
> + default n
Sorry, kconfig nitpick: could you match the style used by the rest of
the config options in this file? E.g. the type and prompt come first in
the list of attributes?
- Julian Braha
^ permalink raw reply
* [PATCH 7/7] i2c: nomadik: add support for I2C_XFER_V2 - detailed fault reporting
From: Dmitry Guzman @ 2026-06-23 16:31 UTC (permalink / raw)
To: Andi Shyti, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Linus Walleij
Cc: linux-i2c, linux-kernel, linux-trace-kernel, linux-arm-kernel,
Dmitry Guzman
In-Reply-To: <20260623-i2c-fault-reporting-v1-0-6db1a8aabf18@mobileye.com>
I2C_XFER_V2 is a new API that lets I2C clients get a detailed report
when a transfer fails. Previously, the only information returned by the
I2C bus controller was the error code; there was no way to find out how
many messages, or how many bytes of a given message, had been sent or
received before the fault condition occurred.
This commit adds support for this feature to the i2c-nomadik driver.
Signed-off-by: Dmitry Guzman <Dmitry.Guzman@mobileye.com>
---
drivers/i2c/busses/i2c-nomadik.c | 37 +++++++++++++++++++++++++++++++------
1 file changed, 31 insertions(+), 6 deletions(-)
diff --git a/drivers/i2c/busses/i2c-nomadik.c b/drivers/i2c/busses/i2c-nomadik.c
index 9cff0c2757fafeaf809395e02a5e754570f65e08..1cf03d634fdc856dc335a58597e0fd31ab077078 100644
--- a/drivers/i2c/busses/i2c-nomadik.c
+++ b/drivers/i2c/busses/i2c-nomadik.c
@@ -197,6 +197,7 @@ struct i2c_nmk_client {
* @stop: stop condition.
* @xfer_wq: xfer done wait queue.
* @result: controller propogated result.
+ * @bytes_cplt: number of bytes completed in the message that caused a fault.
*/
struct nmk_i2c_dev {
struct i2c_vendor_data *vendor;
@@ -216,6 +217,7 @@ struct nmk_i2c_dev {
int stop;
struct wait_queue_head xfer_wq;
int result;
+ int bytes_cplt;
};
/* controller's abort causes */
@@ -529,6 +531,8 @@ static int read_i2c(struct nmk_i2c_dev *priv, u16 flags)
int status = 0;
bool xfer_done;
+ priv->cli.xfer_bytes = 0;
+
mcr = load_i2c_mcr_reg(priv, flags);
writel(mcr, priv->virtbase + I2C_MCR);
@@ -653,6 +657,7 @@ static int nmk_i2c_xfer_one(struct nmk_i2c_dev *priv, u16 flags)
{
int status;
+ priv->bytes_cplt = 0;
if (flags & I2C_M_RD) {
/* read operation */
priv->cli.operation = I2C_READ;
@@ -678,6 +683,16 @@ static int nmk_i2c_xfer_one(struct nmk_i2c_dev *priv, u16 flags)
status = priv->result;
}
+ if (flags & I2C_M_RD) {
+ /* For READ messages, return the number of bytes read from FIFO */
+ priv->bytes_cplt = priv->cli.xfer_bytes;
+ } else {
+ /* For WRITE messages, return the number of bytes sent on bus */
+ priv->bytes_cplt = FIELD_GET(I2C_SR_LENGTH, i2c_sr);
+ /* LENGTH value includes the last byte that has not been sent or ACKed */
+ if (priv->bytes_cplt > 0)
+ priv->bytes_cplt--;
+ }
init_hw(priv);
status = status ? status : priv->result;
@@ -687,10 +702,11 @@ static int nmk_i2c_xfer_one(struct nmk_i2c_dev *priv, u16 flags)
}
/**
- * nmk_i2c_xfer() - I2C transfer function used by kernel framework
+ * nmk_i2c_xfer_v2() - I2C transfer function used by kernel framework
* @i2c_adap: Adapter pointer to the controller
* @msgs: Pointer to data to be written.
* @num_msgs: Number of messages to be executed
+ * @report: Pointer to transfer report to be written.
*
* This is the function called by the generic kernel i2c_transfer()
* or i2c_smbus...() API calls. Note that this code is protected by the
@@ -733,14 +749,16 @@ static int nmk_i2c_xfer_one(struct nmk_i2c_dev *priv, u16 flags)
* please use the i2c_smbus_read_i2c_block_data()
* or i2c_smbus_write_i2c_block_data() API
*/
-static int nmk_i2c_xfer(struct i2c_adapter *i2c_adap,
- struct i2c_msg msgs[], int num_msgs)
+static int nmk_i2c_xfer_v2(struct i2c_adapter *i2c_adap,
+ struct i2c_msg msgs[], int num_msgs,
+ struct i2c_transfer_report *report)
{
int status = 0;
int i;
struct nmk_i2c_dev *priv = i2c_get_adapdata(i2c_adap);
pm_runtime_get_sync(&priv->adev->dev);
+ priv->bytes_cplt = 0;
/* setup the i2c controller */
setup_i2c_controller(priv);
@@ -760,10 +778,17 @@ static int nmk_i2c_xfer(struct i2c_adapter *i2c_adap,
pm_runtime_put_sync(&priv->adev->dev);
/* return the no. messages processed */
- if (status)
+ if (status) {
+ report->msgs_cplt = i;
+ report->bytes_cplt = priv->bytes_cplt;
+ report->fault_msg_idx = i;
return status;
- else
+ } else {
+ report->msgs_cplt = num_msgs;
+ report->bytes_cplt = 0;
+ report->fault_msg_idx = num_msgs;
return num_msgs;
+ }
}
/**
@@ -1014,7 +1039,7 @@ static unsigned int nmk_i2c_functionality(struct i2c_adapter *adap)
}
static const struct i2c_algorithm nmk_i2c_algo = {
- .xfer = nmk_i2c_xfer,
+ .xfer_v2 = nmk_i2c_xfer_v2,
.functionality = nmk_i2c_functionality
};
--
2.43.0
^ permalink raw reply related
* [PATCH 6/7] i2c: nomadik: add quirks max_len=2047 and no_zero_len_read
From: Dmitry Guzman @ 2026-06-23 16:31 UTC (permalink / raw)
To: Andi Shyti, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Linus Walleij
Cc: linux-i2c, linux-kernel, linux-trace-kernel, linux-arm-kernel,
Dmitry Guzman
In-Reply-To: <20260623-i2c-fault-reporting-v1-0-6db1a8aabf18@mobileye.com>
In the Nomadik I2C controller, the I2C_MCR register has an 11-bit wide
LENGTH field. Its maximum value is 2047, so this is the maximum length of
a single message. This is less than the common maximum I2C message length
in the I2C subsystem (8192), so define a quirk so that an unsupported
message is reported without any attempt to transfer it.
Zero-length reads don't work properly on this controller, so add the
`I2C_AQ_NO_ZERO_LEN_READ` quirk flag.
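The 2047 limit follows directly from the field mask. As a rough userspace sketch (the `GENMASK_U32` macro and the `nmk_len_fits()` helper are illustrative stand-ins, not kernel or driver code), the arithmetic looks like this:

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK(); 32-bit values only. */
#define GENMASK_U32(h, l) \
	((uint32_t)((~UINT32_C(0) >> (31 - (h))) & (~UINT32_C(0) << (l))))

/* LENGTH field of I2C_MCR, bits 25..15, as defined in the driver */
#define I2C_MCR_LENGTH GENMASK_U32(25, 15)

/* Largest value the field can hold: shift the mask down to bit 0 */
#define I2C_MAX_MSG_LENGTH (I2C_MCR_LENGTH >> 15)

/* Hypothetical helper: does a message of this length fit in one transaction? */
static int nmk_len_fits(uint32_t len)
{
	return len <= I2C_MAX_MSG_LENGTH;
}
```

An 11-bit field holds values 0..2^11 - 1, which is where 2047 comes from; anything longer cannot be programmed into I2C_MCR in one go.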
Signed-off-by: Dmitry Guzman <Dmitry.Guzman@mobileye.com>
---
drivers/i2c/busses/i2c-nomadik.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/i2c/busses/i2c-nomadik.c b/drivers/i2c/busses/i2c-nomadik.c
index 7d93fb3876dc1324003dd19884e3fd5cbba9cfbb..9cff0c2757fafeaf809395e02a5e754570f65e08 100644
--- a/drivers/i2c/busses/i2c-nomadik.c
+++ b/drivers/i2c/busses/i2c-nomadik.c
@@ -79,6 +79,9 @@
#define I2C_MCR_STOP BIT(14) /* Stop condition */
#define I2C_MCR_LENGTH GENMASK(25, 15) /* Transaction length */
+/* Controller hardware limitation of the message length */
+#define I2C_MAX_MSG_LENGTH (I2C_MCR_LENGTH >> 15)
+
/* Status register (SR) */
#define I2C_SR_OP GENMASK(1, 0) /* Operation */
#define I2C_SR_STATUS GENMASK(3, 2) /* controller status */
@@ -238,6 +241,12 @@ static int fault_codes[] = {
EIO
};
+static const struct i2c_adapter_quirks nmk_i2c_quirks = {
+ .flags = I2C_AQ_NO_ZERO_LEN_READ,
+ .max_read_len = I2C_MAX_MSG_LENGTH,
+ .max_write_len = I2C_MAX_MSG_LENGTH,
+};
+
static inline void i2c_set_bit(void __iomem *reg, u32 mask)
{
writel(readl(reg) | mask, reg);
@@ -1162,6 +1171,7 @@ static int nmk_i2c_probe(struct amba_device *adev, const struct amba_id *id)
adap->class = I2C_CLASS_DEPRECATED;
adap->algo = &nmk_i2c_algo;
adap->timeout = usecs_to_jiffies(priv->timeout_usecs);
+ adap->quirks = &nmk_i2c_quirks;
snprintf(adap->name, sizeof(adap->name),
"Nomadik I2C at %pR", &adev->res);
--
2.43.0
^ permalink raw reply related
* [PATCH 5/7] i2c: nomadik: change print level for fault messages to debug
From: Dmitry Guzman @ 2026-06-23 16:31 UTC (permalink / raw)
To: Andi Shyti, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Linus Walleij
Cc: linux-i2c, linux-kernel, linux-trace-kernel, linux-arm-kernel,
Dmitry Guzman
In-Reply-To: <20260623-i2c-fault-reporting-v1-0-6db1a8aabf18@mobileye.com>
The i2c-nomadik driver prints an error message on every faulted message.
This is not good practice, because in I2C a fault is not always an error;
sometimes it is the expected result. For example, scanning a bus with
`i2cdetect` prints over 100 messages in dmesg (two messages per target
address).
To avoid excessive prints in the log, change the print level from err to
debug.
Signed-off-by: Dmitry Guzman <Dmitry.Guzman@mobileye.com>
---
drivers/i2c/busses/i2c-nomadik.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/i2c/busses/i2c-nomadik.c b/drivers/i2c/busses/i2c-nomadik.c
index e19ace904e79cd2d83171d9f38fc103a6c5e023b..7d93fb3876dc1324003dd19884e3fd5cbba9cfbb 100644
--- a/drivers/i2c/busses/i2c-nomadik.c
+++ b/drivers/i2c/busses/i2c-nomadik.c
@@ -627,7 +627,7 @@ static int write_i2c(struct nmk_i2c_dev *priv, u16 flags)
if (!xfer_done) {
/* Controller timed out */
- dev_err(&priv->adev->dev, "write to slave 0x%x timed out\n",
+ dev_dbg(&priv->adev->dev, "write to slave 0x%x timed out\n",
priv->cli.slave_adr);
status = -ETIMEDOUT;
}
@@ -661,7 +661,7 @@ static int nmk_i2c_xfer_one(struct nmk_i2c_dev *priv, u16 flags)
i2c_sr = readl(priv->virtbase + I2C_SR);
if (FIELD_GET(I2C_SR_STATUS, i2c_sr) == I2C_ABORT) {
cause = FIELD_GET(I2C_SR_CAUSE, i2c_sr);
- dev_err(&priv->adev->dev, "%s\n",
+ dev_dbg(&priv->adev->dev, "%s\n",
cause >= ARRAY_SIZE(abort_causes) ?
"unknown reason" :
abort_causes[cause]);
--
2.43.0
^ permalink raw reply related
* [PATCH 4/7] i2c: nomadik: return proper fault codes
From: Dmitry Guzman @ 2026-06-23 16:31 UTC (permalink / raw)
To: Andi Shyti, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Linus Walleij
Cc: linux-i2c, linux-kernel, linux-trace-kernel, linux-arm-kernel,
Dmitry Guzman
In-Reply-To: <20260623-i2c-fault-reporting-v1-0-6db1a8aabf18@mobileye.com>
Documentation/i2c/fault-codes.rst defines fault codes for the different
negative results of an I2C transmission. Previously, the i2c-nomadik
driver didn't implement them properly: it returned ETIMEDOUT on most
errors and EIO on master arbitration lost.
To comply with the documentation, return the proper fault codes for
different conditions, namely:
- EAGAIN if arbitration lost
- EOVERFLOW if message is too long (>2047 bytes)
- ENXIO if target address is not acknowledged
- EIO on other errors detected by controller (for example, NACK on data)
- ETIMEDOUT if driver gets timeout waiting for message completion
without any fault condition detected by the controller (for example,
too long message, or SDA/SCL line stuck on 0).
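The mapping in the list above can be sketched as a lookup table. This is a minimal userspace illustration, assuming the table order from the patch; `nmk_cause_to_errno()` and its out-of-range fallback to -EIO are hypothetical additions, not part of the patch:

```c
#include <errno.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Abort cause (index) to errno, in the order of the patch's fault_codes[] */
static const int fault_codes[] = {
	ENXIO,		/* target address not acknowledged */
	EIO,
	EIO,
	EAGAIN,		/* arbitration lost */
	EIO,
	EIO,
	EOVERFLOW,	/* message too long (>2047 bytes) */
	EIO,
};

/* Hypothetical helper: never index past the table; fall back to -EIO. */
static int nmk_cause_to_errno(unsigned int cause)
{
	if (cause >= ARRAY_SIZE(fault_codes))
		return -EIO;
	return -fault_codes[cause];
}
```

The bounds check is a defensive choice for the sketch; whether the driver needs one depends on the width of the hardware cause field.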
Signed-off-by: Dmitry Guzman <Dmitry.Guzman@mobileye.com>
---
drivers/i2c/busses/i2c-nomadik.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-nomadik.c b/drivers/i2c/busses/i2c-nomadik.c
index 3eb06988c026e5c573fcf55d83de7136b5ca7ac9..e19ace904e79cd2d83171d9f38fc103a6c5e023b 100644
--- a/drivers/i2c/busses/i2c-nomadik.c
+++ b/drivers/i2c/busses/i2c-nomadik.c
@@ -226,6 +226,18 @@ static const char *abort_causes[] = {
"overflow, maxsize is 2047 bytes",
};
+/* Linux fault codes for controller abort causes */
+static int fault_codes[] = {
+ ENXIO,
+ EIO,
+ EIO,
+ EAGAIN,
+ EIO,
+ EIO,
+ EOVERFLOW,
+ EIO
+};
+
static inline void i2c_set_bit(void __iomem *reg, u32 mask)
{
writel(readl(reg) | mask, reg);
@@ -653,6 +665,8 @@ static int nmk_i2c_xfer_one(struct nmk_i2c_dev *priv, u16 flags)
cause >= ARRAY_SIZE(abort_causes) ?
"unknown reason" :
abort_causes[cause]);
+ priv->result = -fault_codes[cause];
+ status = priv->result;
}
init_hw(priv);
@@ -865,7 +879,7 @@ static irqreturn_t i2c_irq_handler(int irq, void *arg)
/* Master Arbitration lost interrupt */
case I2C_IT_MAL:
- priv->result = -EIO;
+ priv->result = -EAGAIN;
init_hw(priv);
i2c_set_bit(priv->virtbase + I2C_ICR, I2C_IT_MAL);
--
2.43.0
^ permalink raw reply related
* [PATCH 3/7] i2c: nomadik: do not try to retransmit I2C message series on errors
From: Dmitry Guzman @ 2026-06-23 16:31 UTC (permalink / raw)
To: Andi Shyti, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Linus Walleij
Cc: linux-i2c, linux-kernel, linux-trace-kernel, linux-arm-kernel,
Dmitry Guzman
In-Reply-To: <20260623-i2c-fault-reporting-v1-0-6db1a8aabf18@mobileye.com>
In its `xfer` callback, the i2c-nomadik I2C bus controller driver
retransmits the whole message series in case of any fault, and returns
the fault only after the third failed attempt. This behavior contradicts
the API: it not only hides hardware faults but also re-sends messages,
which are not guaranteed to be idempotent.
Remove the triple attempt to send messages in the `xfer` callback.
Signed-off-by: Dmitry Guzman <Dmitry.Guzman@mobileye.com>
---
drivers/i2c/busses/i2c-nomadik.c | 30 ++++++++++++------------------
1 file changed, 12 insertions(+), 18 deletions(-)
diff --git a/drivers/i2c/busses/i2c-nomadik.c b/drivers/i2c/busses/i2c-nomadik.c
index e4e5c6943c66144058fba857d7bf6c0be79ed5bd..3eb06988c026e5c573fcf55d83de7136b5ca7ac9 100644
--- a/drivers/i2c/busses/i2c-nomadik.c
+++ b/drivers/i2c/busses/i2c-nomadik.c
@@ -716,27 +716,21 @@ static int nmk_i2c_xfer(struct i2c_adapter *i2c_adap,
int status = 0;
int i;
struct nmk_i2c_dev *priv = i2c_get_adapdata(i2c_adap);
- int j;
pm_runtime_get_sync(&priv->adev->dev);
- /* Attempt three times to send the message queue */
- for (j = 0; j < 3; j++) {
- /* setup the i2c controller */
- setup_i2c_controller(priv);
-
- for (i = 0; i < num_msgs; i++) {
- priv->cli.slave_adr = msgs[i].addr;
- priv->cli.buffer = msgs[i].buf;
- priv->cli.count = msgs[i].len;
- priv->stop = (i < (num_msgs - 1)) ? 0 : 1;
- priv->result = 0;
-
- status = nmk_i2c_xfer_one(priv, msgs[i].flags);
- if (status != 0)
- break;
- }
- if (status == 0)
+ /* setup the i2c controller */
+ setup_i2c_controller(priv);
+
+ for (i = 0; i < num_msgs; i++) {
+ priv->cli.slave_adr = msgs[i].addr;
+ priv->cli.buffer = msgs[i].buf;
+ priv->cli.count = msgs[i].len;
+ priv->stop = (i < (num_msgs - 1)) ? 0 : 1;
+ priv->result = 0;
+
+ status = nmk_i2c_xfer_one(priv, msgs[i].flags);
+ if (status != 0)
break;
}
--
2.43.0
^ permalink raw reply related
* [PATCH 2/7] i2c: nomadik: optimize layout of struct nmk_i2c_dev
From: Dmitry Guzman @ 2026-06-23 16:31 UTC (permalink / raw)
To: Andi Shyti, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Linus Walleij
Cc: linux-i2c, linux-kernel, linux-trace-kernel, linux-arm-kernel,
Dmitry Guzman
In-Reply-To: <20260623-i2c-fault-reporting-v1-0-6db1a8aabf18@mobileye.com>
Put the two bool variables `xfer_done` and `has_32b_bus` together with
the two char variables `tft` and `rft` in order to reduce the struct
space wasted on padding.
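The effect of grouping the byte-sized members can be illustrated with two reduced structs. These are simplified stand-ins that keep only the scalar fields (not the real `struct nmk_i2c_dev`), and the exact sizes depend on the ABI, so the sketch only compares them:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Old order: each lone bool is followed by padding up to int alignment */
struct layout_before {
	uint32_t clk_freq;
	unsigned char tft;
	unsigned char rft;
	uint32_t timeout_usecs;
	int stop;
	bool xfer_done;
	int result;
	bool has_32b_bus;
};

/* New order: the four byte-sized members share one aligned slot */
struct layout_after {
	uint32_t clk_freq;
	unsigned char tft;
	unsigned char rft;
	bool xfer_done;
	bool has_32b_bus;
	uint32_t timeout_usecs;
	int stop;
	int result;
};

/* Bytes saved by the reordering on the current ABI (0 if none) */
static size_t layout_bytes_saved(void)
{
	size_t before = sizeof(struct layout_before);
	size_t after = sizeof(struct layout_after);

	return before > after ? before - after : 0;
}
```

On a real build, a tool such as pahole can show the remaining holes in the full structure.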
Signed-off-by: Dmitry Guzman <Dmitry.Guzman@mobileye.com>
---
drivers/i2c/busses/i2c-nomadik.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/i2c/busses/i2c-nomadik.c b/drivers/i2c/busses/i2c-nomadik.c
index b63ee51c1652080e414f4302bee16905914c1288..e4e5c6943c66144058fba857d7bf6c0be79ed5bd 100644
--- a/drivers/i2c/busses/i2c-nomadik.c
+++ b/drivers/i2c/busses/i2c-nomadik.c
@@ -187,13 +187,13 @@ struct i2c_nmk_client {
* @clk_freq: clock frequency for the operation mode
* @tft: Tx FIFO Threshold in bytes
* @rft: Rx FIFO Threshold in bytes
+ * @xfer_done: xfer done boolean.
+ * @has_32b_bus: controller is on a bus that only supports 32-bit accesses.
* @timeout_usecs: Slave response timeout
* @sm: speed mode
* @stop: stop condition.
* @xfer_wq: xfer done wait queue.
- * @xfer_done: xfer done boolean.
* @result: controller propogated result.
- * @has_32b_bus: controller is on a bus that only supports 32-bit accesses.
*/
struct nmk_i2c_dev {
struct i2c_vendor_data *vendor;
@@ -206,13 +206,13 @@ struct nmk_i2c_dev {
u32 clk_freq;
unsigned char tft;
unsigned char rft;
+ bool xfer_done;
+ bool has_32b_bus;
u32 timeout_usecs;
enum i2c_freq_mode sm;
int stop;
struct wait_queue_head xfer_wq;
- bool xfer_done;
int result;
- bool has_32b_bus;
};
/* controller's abort causes */
--
2.43.0
^ permalink raw reply related
* [PATCH 1/7] i2c: add I2C_XFER_V2 - support for detailed transfer reporting
From: Dmitry Guzman @ 2026-06-23 16:31 UTC (permalink / raw)
To: Andi Shyti, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Linus Walleij
Cc: linux-i2c, linux-kernel, linux-trace-kernel, linux-arm-kernel,
Dmitry Guzman
In-Reply-To: <20260623-i2c-fault-reporting-v1-0-6db1a8aabf18@mobileye.com>
The I2C subsystem has an API that allows sending/receiving a number of
messages in a single call. The I2C_RDWR ioctl, as well as the i2c_transfer
kernel API function, returns only a single error code. In case of a fault,
there is no way to know which message in the series caused the fault, or
how many bytes were sent or received before it.
This commit introduces the i2c_transfer_v2 kernel API function and the
I2C_RDWR_V2 ioctl. They provide the same functionality as the old ones,
but also accept an additional pointer to an `i2c_transfer_report` structure
and fill it with a detailed fault report: the number of messages
transferred successfully, the index of the message that caused the fault,
and the number of bytes transferred (if the fault occurred in the middle
of the last message).
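To make the report semantics concrete, here is a small userspace sketch of how a caller might interpret one. The struct mirrors the field names proposed in this series (with `int32_t` standing in for `__s32`); `report_total_bytes()` is a hypothetical helper, not part of the patch:

```c
#include <stdint.h>

/* Userspace mirror of the proposed report layout */
struct i2c_transfer_report {
	int32_t fault_msg_idx;
	int32_t msgs_cplt;
	int32_t bytes_cplt;
};

/*
 * Hypothetical helper: total payload bytes known to have been transferred.
 * lens[] holds the length of each of the num messages. Returns -1 when the
 * controller could not fill in the report (negative fields) or the report
 * is out of range.
 */
static long report_total_bytes(const struct i2c_transfer_report *r,
			       const uint16_t *lens, int num)
{
	long total = 0;
	int i;

	if (r->msgs_cplt < 0 || r->bytes_cplt < 0 || r->msgs_cplt > num)
		return -1;

	/* Fully completed messages count with their whole length */
	for (i = 0; i < r->msgs_cplt; i++)
		total += lens[i];

	/* bytes_cplt refers to the message at index msgs_cplt; 0 if no fault */
	return total + r->bytes_cplt;
}
```

With the old API, the same caller would only see the negative errno and could not compute this total.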
An I2C bus controller driver may implement either both callbacks or just
one of them. Implementing both may make sense if precise detection of
the fault position requires different handling of the hardware that
causes extra CPU load or other consequences that are unwanted when a
precise fault report is not required. If precise fault detection is
free, the driver may implement only the `xfer_v2` callback: the
infrastructure will provide a pointer to a dummy fault report that is
dropped if the client uses the old API.
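The callback selection described above can be sketched in userspace as follows. `struct algo`, `struct report`, `do_transfer()` and the toy callbacks are simplified, hypothetical stand-ins for the kernel types; only the decision logic is meant to follow the patch:

```c
#include <errno.h>
#include <stddef.h>

struct report {
	int fault_msg_idx;
	int msgs_cplt;
	int bytes_cplt;
};

/* Simplified stand-in for struct i2c_algorithm */
struct algo {
	int (*xfer)(int num);
	int (*xfer_v2)(int num, struct report *report);
};

static int do_transfer(const struct algo *algo, int num, struct report *report)
{
	struct report dummy;

	/* A report was requested but the driver cannot produce one */
	if (report && !algo->xfer_v2)
		return -EOPNOTSUPP;
	/* The driver supports no I2C-level transfers at all */
	if (!algo->xfer && !algo->xfer_v2)
		return -EOPNOTSUPP;
	/* v2-only driver, legacy caller: hand in a throwaway report */
	if (!algo->xfer && !report)
		report = &dummy;

	return report ? algo->xfer_v2(num, report) : algo->xfer(num);
}

/* Toy callbacks used only to exercise the dispatch */
static int toy_xfer(int num)
{
	return num;
}

static int toy_xfer_v2(int num, struct report *report)
{
	report->fault_msg_idx = num;
	report->msgs_cplt = num;
	report->bytes_cplt = 0;
	return num;
}
```

Note that a driver providing both callbacks still gets the legacy one called when no report is requested, which is what lets it skip the extra bookkeeping.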
Signed-off-by: Dmitry Guzman <Dmitry.Guzman@mobileye.com>
---
Documentation/i2c/dev-interface.rst | 46 ++++++++++++++++
drivers/i2c/i2c-core-base.c | 107 +++++++++++++++++++++++++-----------
drivers/i2c/i2c-dev.c | 79 ++++++++++++++++++++++----
include/linux/i2c.h | 12 ++++
include/trace/events/i2c.h | 6 +-
include/uapi/linux/i2c-dev.h | 9 +++
include/uapi/linux/i2c.h | 21 +++++++
7 files changed, 232 insertions(+), 48 deletions(-)
diff --git a/Documentation/i2c/dev-interface.rst b/Documentation/i2c/dev-interface.rst
index c277a8e1202b51403a8d00d6c92fca13da1afc58..45a8b94f585b57889c153fbb110a5826879484f6 100644
--- a/Documentation/i2c/dev-interface.rst
+++ b/Documentation/i2c/dev-interface.rst
@@ -140,6 +140,52 @@ The following IOCTLs are defined:
The slave address and whether to use ten bit address mode has to be
set in each message, overriding the values set with the above ioctl's.
+``ioctl(file, I2C_RDWR_V2, struct i2c_rdwr_v2_ioctl_data *msgset)``
+ Does the same combined read/write transaction as I2C_RDWR, but also
+ provides detailed fault report. The argument is a pointer to a::
+
+ struct i2c_rdwr_v2_ioctl_data {
+ struct i2c_rdwr_ioctl_data rdwr_data;
+ struct i2c_transfer_report report;
+ };
+
+ The rdwr_data is the same structure as the argument for I2C_RDWR ioctl.
+ The report is the structure that the transfer report is written to::
+
+ struct i2c_transfer_report {
+ __s32 fault_msg_idx;
+ __s32 msgs_cplt;
+ __s32 bytes_cplt;
+ };
+
+ msgs_cplt is the number of messages that have been sent or received
+ successfully. If there are read messages within this range, the returned
+ data is guaranteed to be valid. If a message has been read from the
+ device but the read data is lost (for example, FIFO is flushed before
+ CPU read it), this message must not be counted. If the controller cannot
+ determine the number of completed messages, the value is -EOPNOTSUPP.
+
+ fault_msg_idx is the index of the message that caused a fault. In case of a
+ fault, it is not necessarily equal to msgs_cplt. For example, if the driver
+ validates the whole batch before starting transmission and detects that it
+ cannot send it, it returns -EOPNOTSUPP immediately, so msgs_cplt is 0,
+ while fault_msg_idx points to the message that cannot be sent. Another
+ case where these two values may differ is an I2C controller that
+ flushes its RX FIFO when an error is detected before the CPU reads data from it.
+
+ If there is no fault, the fault_msg_idx value is equal to msgs_cplt.
+
+ bytes_cplt indicates the number of bytes sent/received in the message at
+ index msgs_cplt. If this is a read message, it is guaranteed that these
+ bytes in the message data buffer are valid. If the controller cannot
+ determine the byte number, the value should be -EOPNOTSUPP. If there was
+ no fault, the value should be 0.
+
+ To discover if the device supports detailed fault reporting, use I2C_RDWR_V2
+ ioctl with nmsgs = 0. If the driver supports it, the return value shall be 0.
+ If the driver supports only legacy I2C_RDWR, the return value shall be
+ -EOPNOTSUPP. In any case, nothing is done on the bus.
+
``ioctl(file, I2C_SMBUS, struct i2c_smbus_ioctl_data *args)``
If possible, use the provided ``i2c_smbus_*`` methods described below instead
of issuing direct ioctls.
diff --git a/drivers/i2c/i2c-core-base.c b/drivers/i2c/i2c-core-base.c
index 3ec04787a7373f113a15ee3fb35db425ae470427..c3694618b94fbdfd79a71d7cbd8d7c69c9638a17 100644
--- a/drivers/i2c/i2c-core-base.c
+++ b/drivers/i2c/i2c-core-base.c
@@ -2170,15 +2170,17 @@ module_exit(i2c_exit);
/* Check if val is exceeding the quirk IFF quirk is non 0 */
#define i2c_quirk_exceeded(val, quirk) ((quirk) && ((val) > (quirk)))
-static int i2c_quirk_error(struct i2c_adapter *adap, struct i2c_msg *msg, char *err_msg)
+static struct i2c_msg *i2c_quirk_error(struct i2c_adapter *adap,
+ struct i2c_msg *msg, char *err_msg)
{
dev_err_ratelimited(&adap->dev, "adapter quirk: %s (addr 0x%04x, size %u, %s)\n",
err_msg, msg->addr, msg->len,
str_read_write(msg->flags & I2C_M_RD));
- return -EOPNOTSUPP;
+ return msg;
}
-static int i2c_check_for_quirks(struct i2c_adapter *adap, struct i2c_msg *msgs, int num)
+static struct i2c_msg *i2c_check_for_quirks(struct i2c_adapter *adap,
+ struct i2c_msg *msgs, int num)
{
const struct i2c_adapter_quirks *q = adap->quirks;
int max_num = q->max_num_msgs, i;
@@ -2229,31 +2231,51 @@ static int i2c_check_for_quirks(struct i2c_adapter *adap, struct i2c_msg *msgs,
}
}
- return 0;
+ return NULL;
}
/**
- * __i2c_transfer - unlocked flavor of i2c_transfer
+ * __i2c_transfer_v2 - unlocked flavor of i2c_transfer_v2
* @adap: Handle to I2C bus
* @msgs: One or more messages to execute before STOP is issued to
* terminate the operation; each message begins with a START.
* @num: Number of messages to be executed.
+ * @report: The buffer for detailed transfer report (may be NULL if not required)
*
* Returns negative errno, else the number of messages executed.
+ * Writes the detailed transfer report to the structure pointed by 'report'.
*
* Adapter lock must be held when calling this function. No debug logging
* takes place.
*/
-int __i2c_transfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num)
+int __i2c_transfer_v2(struct i2c_adapter *adap, struct i2c_msg *msgs, int num,
+ struct i2c_transfer_report *report)
{
+ struct i2c_transfer_report dummy_report;
unsigned long orig_jiffies;
int ret, try;
- if (!adap->algo->master_xfer) {
+ if (report) {
+ report->msgs_cplt = -EOPNOTSUPP;
+ report->bytes_cplt = -EOPNOTSUPP;
+ report->fault_msg_idx = -EOPNOTSUPP;
+
+ if (!adap->algo->xfer_v2)
+ return -EOPNOTSUPP;
+ }
+
+ if (!adap->algo->master_xfer && !adap->algo->xfer_v2) {
dev_dbg(&adap->dev, "I2C level transfers not supported\n");
return -EOPNOTSUPP;
}
+ /*
+ * If the controller only supports "v2" callback and the report is not requested,
+ * provide pointer to a dummy report.
+ */
+ if (!(adap->algo->master_xfer) && (!report))
+ report = &dummy_report;
+
if (WARN_ON(!msgs || num < 1))
return -EINVAL;
@@ -2261,8 +2283,18 @@ int __i2c_transfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num)
if (ret)
return ret;
- if (adap->quirks && i2c_check_for_quirks(adap, msgs, num))
- return -EOPNOTSUPP;
+ if (adap->quirks) {
+ struct i2c_msg *bad_msg = i2c_check_for_quirks(adap, msgs, num);
+
+ if (bad_msg) {
+ if (report) {
+ report->msgs_cplt = 0;
+ report->bytes_cplt = 0;
+ report->fault_msg_idx = bad_msg - msgs;
+ }
+ return -EOPNOTSUPP;
+ }
+ }
/*
* i2c_trace_msg_key gets enabled when tracepoint i2c_transfer gets
@@ -2283,8 +2315,12 @@ int __i2c_transfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num)
for (ret = 0, try = 0; try <= adap->retries; try++) {
if (i2c_in_atomic_xfer_mode() && adap->algo->master_xfer_atomic)
ret = adap->algo->master_xfer_atomic(adap, msgs, num);
- else
- ret = adap->algo->master_xfer(adap, msgs, num);
+ else {
+ if (report)
+ ret = adap->algo->xfer_v2(adap, msgs, num, report);
+ else
+ ret = adap->algo->master_xfer(adap, msgs, num);
+ }
if (ret != -EAGAIN)
break;
@@ -2293,58 +2329,63 @@ int __i2c_transfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num)
}
if (static_branch_unlikely(&i2c_trace_msg_key)) {
- int i;
- for (i = 0; i < ret; i++)
+ int n;
+
+ if (report)
+ n = report->msgs_cplt;
+ else
+ n = ret;
+ for (int i = 0; i < n; i++)
if (msgs[i].flags & I2C_M_RD)
- trace_i2c_reply(adap, &msgs[i], i);
+ trace_i2c_reply(adap, &msgs[i], msgs[i].len, i);
+ if (report && report->bytes_cplt > 0 && msgs[n].flags & I2C_M_RD)
+ trace_i2c_reply(adap, &msgs[n], report->bytes_cplt, n);
trace_i2c_result(adap, num, ret);
}
return ret;
}
+EXPORT_SYMBOL(__i2c_transfer_v2);
+
+int __i2c_transfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num)
+{
+ return __i2c_transfer_v2(adap, msgs, num, NULL);
+}
EXPORT_SYMBOL(__i2c_transfer);
/**
- * i2c_transfer - execute a single or combined I2C message
+ * i2c_transfer_v2 - execute a single or combined I2C message
* @adap: Handle to I2C bus
* @msgs: One or more messages to execute before STOP is issued to
* terminate the operation; each message begins with a START.
* @num: Number of messages to be executed.
+ * @report: Pointer for transmission fault report.
*
* Returns negative errno, else the number of messages executed.
*
* Note that there is no requirement that each message be sent to
* the same slave address, although that is the most common model.
*/
-int i2c_transfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num)
+int i2c_transfer_v2(struct i2c_adapter *adap, struct i2c_msg *msgs, int num,
+ struct i2c_transfer_report *report)
{
int ret;
- /* REVISIT the fault reporting model here is weak:
- *
- * - When we get an error after receiving N bytes from a slave,
- * there is no way to report "N".
- *
- * - When we get a NAK after transmitting N bytes to a slave,
- * there is no way to report "N" ... or to let the master
- * continue executing the rest of this combined message, if
- * that's the appropriate response.
- *
- * - When for example "num" is two and we successfully complete
- * the first message but get an error part way through the
- * second, it's unclear whether that should be reported as
- * one (discarding status on the second message) or errno
- * (discarding status on the first one).
- */
ret = __i2c_lock_bus_helper(adap);
if (ret)
return ret;
- ret = __i2c_transfer(adap, msgs, num);
+ ret = __i2c_transfer_v2(adap, msgs, num, report);
i2c_unlock_bus(adap, I2C_LOCK_SEGMENT);
return ret;
}
+EXPORT_SYMBOL(i2c_transfer_v2);
+
+int i2c_transfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num)
+{
+ return i2c_transfer_v2(adap, msgs, num, NULL);
+}
EXPORT_SYMBOL(i2c_transfer);
/**
diff --git a/drivers/i2c/i2c-dev.c b/drivers/i2c/i2c-dev.c
index ccaac5e29f906bec0bf3b0e4a259391053469c2f..90456e6c04b4131dde9a9c2a5bd2075dd32b4baf 100644
--- a/drivers/i2c/i2c-dev.c
+++ b/drivers/i2c/i2c-dev.c
@@ -240,12 +240,18 @@ static int i2cdev_check_addr(struct i2c_adapter *adapter, unsigned int addr)
return result;
}
-static noinline int i2cdev_ioctl_rdwr(struct i2c_client *client,
- unsigned nmsgs, struct i2c_msg *msgs)
+static noinline int i2cdev_ioctl_rdwr_v2(struct i2c_client *client,
+ unsigned int nmsgs, struct i2c_msg *msgs,
+ struct i2c_transfer_report __user *user_report)
{
+ struct i2c_transfer_report report;
u8 __user **data_ptrs;
int i, res;
+ report.msgs_cplt = -EOPNOTSUPP;
+ report.fault_msg_idx = -EOPNOTSUPP;
+ report.bytes_cplt = -EOPNOTSUPP;
+
/* Adapter must support I2C transfers */
if (!i2c_check_functionality(client->adapter, I2C_FUNC_I2C))
return -EOPNOTSUPP;
@@ -259,6 +265,7 @@ static noinline int i2cdev_ioctl_rdwr(struct i2c_client *client,
/* Limit the size of the message to a sane amount */
if (msgs[i].len > 8192) {
res = -EINVAL;
+ report.fault_msg_idx = i;
break;
}
@@ -266,6 +273,7 @@ static noinline int i2cdev_ioctl_rdwr(struct i2c_client *client,
msgs[i].buf = memdup_user(data_ptrs[i], msgs[i].len);
if (IS_ERR(msgs[i].buf)) {
res = PTR_ERR(msgs[i].buf);
+ report.fault_msg_idx = i;
break;
}
/* memdup_user allocates with GFP_KERNEL, so DMA is ok */
@@ -289,6 +297,7 @@ static noinline int i2cdev_ioctl_rdwr(struct i2c_client *client,
I2C_SMBUS_BLOCK_MAX) {
i++;
res = -EINVAL;
+ report.fault_msg_idx = i - 1;
break;
}
@@ -303,9 +312,34 @@ static noinline int i2cdev_ioctl_rdwr(struct i2c_client *client,
return res;
}
- res = i2c_transfer(client->adapter, msgs, nmsgs);
+ if (user_report) {
+ res = i2c_transfer_v2(client->adapter, msgs, nmsgs, &report);
+ i = report.msgs_cplt;
+ } else {
+ res = i2c_transfer(client->adapter, msgs, nmsgs);
+ if (res < 0)
+ i = 0;
+ else
+ i = nmsgs;
+ }
+
+ if (user_report && copy_to_user(user_report, &report, sizeof(report)))
+ res = -EFAULT;
+
+ /* Number of messages transferred completely or partially */
+ if (report.bytes_cplt > 0) {
+ msgs[i].len = report.bytes_cplt;
+ i++;
+ }
+
+ if (i > (int)nmsgs) {
+ pr_err("Bad i2c_transfer_report: msgs_cplt = %i, bytes_cplt = %i, nmsgs = %i\n",
+ report.msgs_cplt, report.bytes_cplt, nmsgs);
+ i = nmsgs;
+ }
+
while (i-- > 0) {
- if (res >= 0 && (msgs[i].flags & I2C_M_RD)) {
+ if (msgs[i].flags & I2C_M_RD) {
if (copy_to_user(data_ptrs[i], msgs[i].buf,
msgs[i].len))
res = -EFAULT;
@@ -439,18 +473,39 @@ static long i2cdev_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
funcs = i2c_get_functionality(client->adapter);
return put_user(funcs, (unsigned long __user *)arg);
- case I2C_RDWR: {
+ case I2C_RDWR:
+ case I2C_RDWR_V2:
+ {
+ struct i2c_rdwr_ioctl_data __user *user_arg;
+ struct i2c_transfer_report __user *user_rep;
struct i2c_rdwr_ioctl_data rdwr_arg;
struct i2c_msg *rdwr_pa;
int res;
- if (copy_from_user(&rdwr_arg,
- (struct i2c_rdwr_ioctl_data __user *)arg,
- sizeof(rdwr_arg)))
+ if (cmd == I2C_RDWR_V2) {
+ user_arg = &((struct i2c_rdwr_v2_ioctl_data __user *)arg)->rdwr_data;
+ user_rep = &((struct i2c_rdwr_v2_ioctl_data __user *)arg)->report;
+ } else {
+ user_arg = (struct i2c_rdwr_ioctl_data __user *)arg;
+ user_rep = NULL;
+ }
+
+ if (copy_from_user(&rdwr_arg, user_arg, sizeof(rdwr_arg)))
return -EFAULT;
- if (!rdwr_arg.msgs || rdwr_arg.nmsgs == 0)
- return -EINVAL;
+ if (!rdwr_arg.msgs || rdwr_arg.nmsgs == 0) {
+ /*
+ * An I2C_RDWR_V2 ioctl with nmsgs == 0 is used to
+ * discover whether the controller can return
+ * detailed fault reports.
+ */
+ if (cmd == I2C_RDWR)
+ return -EINVAL;
+ if (client->adapter->algo->xfer_v2)
+ return 0;
+ else
+ return -EOPNOTSUPP;
+ }
/*
* Put an arbitrary limit on the number of messages that can
@@ -464,7 +519,7 @@ static long i2cdev_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
if (IS_ERR(rdwr_pa))
return PTR_ERR(rdwr_pa);
- res = i2cdev_ioctl_rdwr(client, rdwr_arg.nmsgs, rdwr_pa);
+ res = i2cdev_ioctl_rdwr_v2(client, rdwr_arg.nmsgs, rdwr_pa, user_rep);
kfree(rdwr_pa);
return res;
}
@@ -572,7 +627,7 @@ static long compat_i2cdev_ioctl(struct file *file, unsigned int cmd, unsigned lo
};
}
- res = i2cdev_ioctl_rdwr(client, rdwr_arg.nmsgs, rdwr_pa);
+ res = i2cdev_ioctl_rdwr_v2(client, rdwr_arg.nmsgs, rdwr_pa, NULL);
kfree(rdwr_pa);
return res;
}
diff --git a/include/linux/i2c.h b/include/linux/i2c.h
index 20fd41b51d5c85ee1665395c07345faafd8e2fca..0305d4daa157c27d700f31c15faf0c3984114ce0 100644
--- a/include/linux/i2c.h
+++ b/include/linux/i2c.h
@@ -131,6 +131,14 @@ int i2c_transfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num);
/* Unlocked flavor */
int __i2c_transfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num);
+/* Transfer with detailed transfer reporting.
+ */
+int i2c_transfer_v2(struct i2c_adapter *adap, struct i2c_msg *msgs, int num,
+ struct i2c_transfer_report *report);
+/* Unlocked flavor */
+int __i2c_transfer_v2(struct i2c_adapter *adap, struct i2c_msg *msgs, int num,
+ struct i2c_transfer_report *report);
+
/* This is the very generalized SMBus access routine. You probably do not
want to use this, though; one of the functions below may be much easier,
and probably just as fast.
@@ -567,6 +575,10 @@ struct i2c_algorithm {
unsigned short flags, char read_write,
u8 command, int size, union i2c_smbus_data *data);
+ /* Same as xfer with detailed reporting */
+ int (*xfer_v2)(struct i2c_adapter *adap, struct i2c_msg *msgs,
+ int num, struct i2c_transfer_report *report);
+
/* To determine what the adapter supports */
u32 (*functionality)(struct i2c_adapter *adap);
diff --git a/include/trace/events/i2c.h b/include/trace/events/i2c.h
index 142a23c6593c611de9abc2a89a146b95550b23cd..2ea8e9805edf591d63dcb589340b0704fd6d38f7 100644
--- a/include/trace/events/i2c.h
+++ b/include/trace/events/i2c.h
@@ -88,8 +88,8 @@ TRACE_EVENT_FN(i2c_read,
*/
TRACE_EVENT_FN(i2c_reply,
TP_PROTO(const struct i2c_adapter *adap, const struct i2c_msg *msg,
- int num),
- TP_ARGS(adap, msg, num),
+ int data_len, int num),
+ TP_ARGS(adap, msg, data_len, num),
TP_STRUCT__entry(
__field(int, adapter_nr )
__field(__u16, msg_nr )
@@ -102,7 +102,7 @@ TRACE_EVENT_FN(i2c_reply,
__entry->msg_nr = num;
__entry->addr = msg->addr;
__entry->flags = msg->flags;
- __entry->len = msg->len;
+ __entry->len = data_len;
memcpy(__get_dynamic_array(buf), msg->buf, msg->len);
),
TP_printk("i2c-%d #%u a=%03x f=%04x l=%u [%*phD]",
diff --git a/include/uapi/linux/i2c-dev.h b/include/uapi/linux/i2c-dev.h
index 1c4cec4ddd84d739193b234d33cae7860856738e..5097568a31490e2c9c2036a7d94ab47588413beb 100644
--- a/include/uapi/linux/i2c-dev.h
+++ b/include/uapi/linux/i2c-dev.h
@@ -11,11 +11,13 @@
#include <linux/types.h>
#include <linux/compiler.h>
+#include <linux/i2c.h>
/* /dev/i2c-X ioctl commands. The ioctl's parameter is always an
* unsigned long, except for:
* - I2C_FUNCS, takes pointer to an unsigned long
* - I2C_RDWR, takes pointer to struct i2c_rdwr_ioctl_data
+ * - I2C_RDWR_V2, takes pointer to struct i2c_rdwr_v2_ioctl_data
* - I2C_SMBUS, takes pointer to struct i2c_smbus_ioctl_data
*/
#define I2C_RETRIES 0x0701 /* number of times a device address should
@@ -33,6 +35,7 @@
#define I2C_FUNCS 0x0705 /* Get the adapter functionality mask */
#define I2C_RDWR 0x0707 /* Combined R/W transfer (one STOP only) */
+#define I2C_RDWR_V2 0x0709 /* I2C_RDWR with detailed fault reporting */
#define I2C_PEC 0x0708 /* != 0 to use PEC with SMBus */
#define I2C_SMBUS 0x0720 /* SMBus transfer */
@@ -52,6 +55,12 @@ struct i2c_rdwr_ioctl_data {
__u32 nmsgs; /* number of i2c_msgs */
};
+/* This is the structure as used in the I2C_RDWR_V2 ioctl call */
+struct i2c_rdwr_v2_ioctl_data {
+ struct i2c_rdwr_ioctl_data rdwr_data;
+ struct i2c_transfer_report report;
+};
+
#define I2C_RDWR_IOCTL_MAX_MSGS 42
/* Originally defined with a typo, keep it for compatibility */
#define I2C_RDRW_IOCTL_MAX_MSGS I2C_RDWR_IOCTL_MAX_MSGS
diff --git a/include/uapi/linux/i2c.h b/include/uapi/linux/i2c.h
index 2a226657d9f8238365453121321fd70dc11dac02..5e8e7d3536c85f2fe604a285258b070f2efffbb2 100644
--- a/include/uapi/linux/i2c.h
+++ b/include/uapi/linux/i2c.h
@@ -135,6 +135,27 @@ struct i2c_msg {
I2C_FUNC_SMBUS_READ_BLOCK_DATA | \
I2C_FUNC_SMBUS_BLOCK_PROC_CALL)
+/* Detailed transfer report */
+
+struct i2c_transfer_report {
+ __s32 fault_msg_idx; /* In case of a fault, index of the message that caused
+ * the fault. If the bus driver cannot determine it, it
+ * stores a negative error code. If there is no fault, the
+ * value equals the number of messages transferred.
+ */
+ __s32 msgs_cplt; /* Number of messages that are known to be transferred
+ * successfully. If the bus driver cannot determine it, it
+ * stores a negative error code. If there is no fault, the
+ * value equals the number of messages transferred.
+ */
+ __s32 bytes_cplt; /* In case of a fault, number of bytes in the message at
+ * index `msgs_cplt` that are known to be transferred
+ * successfully. If the bus driver cannot determine the
+ * number of bytes, it stores a negative error code.
+ * If there is no fault, the value is 0.
+ */
+};
+
/*
* Data for SMBus Messages
*/
--
2.43.0
* [PATCH 0/7] I2C - detailed transfer reporting in case of a fault
From: Dmitry Guzman @ 2026-06-23 16:31 UTC (permalink / raw)
To: Andi Shyti, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Linus Walleij
Cc: linux-i2c, linux-kernel, linux-trace-kernel, linux-arm-kernel,
Dmitry Guzman
The existing API function `i2c_transfer` transfers one or more messages
and returns only a single error code if the transfer fails. The caller
cannot tell how many messages were transferred successfully, or how many
bytes were transferred in the message that caused the fault, and all
data received from the target device before the fault is dropped. There
is a comment about this in drivers/i2c/i2c-core-base.c: "REVISIT the
fault reporting model here is weak".
This patch series adds a new API function `i2c_transfer_v2` that does
the same as `i2c_transfer` but also returns a detailed transfer report,
including the number of messages and bytes transferred before the fault.
This also lets the client retrieve the bytes read from the target before
the fault occurred.
For user space clients, a new ioctl `I2C_RDWR_V2` is introduced.
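As a rough illustration of how a caller might consume such a report, here is a minimal user-space sketch in C. The struct layout is copied from the uapi header added in patch 1; `classify_report`, `enum xfer_outcome` and the `valid_msgs`/`valid_bytes` out-parameters are hypothetical names used only for this example and are not part of the series.

```c
#include <assert.h>
#include <stdint.h>

/* Local copy of the struct proposed in patch 1 (include/uapi/linux/i2c.h). */
struct i2c_transfer_report {
	int32_t fault_msg_idx;
	int32_t msgs_cplt;
	int32_t bytes_cplt;
};

/* Hypothetical classification, for illustration only. */
enum xfer_outcome {
	XFER_OK,	/* every message completed */
	XFER_PARTIAL,	/* fault, but the driver reported how far it got */
	XFER_UNKNOWN,	/* fault, and the driver could not say how far it got */
};

/*
 * ret is the transfer return value (negative errno or message count),
 * nmsgs the number of messages submitted. On return, *valid_msgs holds the
 * number of fully transferred messages and *valid_bytes the number of bytes
 * known to be good in the message at index *valid_msgs (0 if none).
 */
static enum xfer_outcome classify_report(int ret, int nmsgs,
					 const struct i2c_transfer_report *rep,
					 int *valid_msgs, int *valid_bytes)
{
	assert(rep && valid_msgs && valid_bytes);
	*valid_msgs = 0;
	*valid_bytes = 0;

	if (ret >= 0) {
		*valid_msgs = ret;
		return XFER_OK;
	}
	/* A negative or out-of-range count means "driver does not know". */
	if (rep->msgs_cplt < 0 || rep->msgs_cplt > nmsgs)
		return XFER_UNKNOWN;

	*valid_msgs = rep->msgs_cplt;
	if (rep->bytes_cplt > 0 && rep->msgs_cplt < nmsgs)
		*valid_bytes = rep->bytes_cplt;
	return XFER_PARTIAL;
}
```

With a two-message transfer that fails three bytes into the second message, this sketch would report one valid message and three valid bytes, which is the information the old single error code could not carry.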
In this patchset, the new functionality is implemented in the
`i2c-nomadik` driver. Several other improvements to fault handling in
this driver are also included. The series has been tested on an
EyeQ6H-EPM6 board.
The implementation is split up into patches:
Patch #1 Introduce callback `xfer_v2` in struct `i2c_algorithm`,
function `i2c_transfer_v2`, ioctl `I2C_RDWR_V2`, and structures for I2C
transfer reporting, and implement all driver-independent functionality.
Patch #2 Optimize struct layout in `i2c-nomadik`.
Patch #3 Remove automatic retransfer in `i2c-nomadik`.
Patch #4 Fix error codes returned by `xfer` callback in `i2c-nomadik`.
Patch #5 Replace `dev_err` with `dev_dbg` on I2C faults in `i2c-nomadik`.
Patch #6 Add quirks that describe some limitations of `i2c-nomadik`.
Patch #7 Add support for `xfer_v2` in `i2c-nomadik`.
Signed-off-by: Dmitry Guzman <Dmitry.Guzman@mobileye.com>
---
Dmitry Guzman (7):
i2c: add I2C_XFER_V2 - support for detailed transfer reporting
i2c: nomadik: optimize layout of struct nmk_i2c_dev
i2c: nomadik: do not try to retransmit I2C message series on errors
i2c: nomadik: return proper fault codes
i2c: nomadik: change print level for fault messages to debug
i2c: nomadik: add quirks max_len=2047 and no_zero_len_read
i2c: nomadik: add support for I2C_XFER_V2 - detailed fault reporting
Documentation/i2c/dev-interface.rst | 46 ++++++++++++++++
drivers/i2c/busses/i2c-nomadik.c | 105 ++++++++++++++++++++++++-----------
drivers/i2c/i2c-core-base.c | 107 +++++++++++++++++++++++++-----------
drivers/i2c/i2c-dev.c | 79 ++++++++++++++++++++++----
include/linux/i2c.h | 12 ++++
include/trace/events/i2c.h | 6 +-
include/uapi/linux/i2c-dev.h | 9 +++
include/uapi/linux/i2c.h | 21 +++++++
8 files changed, 306 insertions(+), 79 deletions(-)
---
base-commit: 502d801f0ab03e4f32f9a33d203154ce84887921
change-id: 20260623-i2c-fault-reporting-9236c9affc2d
Best regards,
--
Dmitry Guzman <Dmitry.Guzman@mobileye.com>
* [PATCH v6 8/8] x86/setup: prepend embedded bootconfig cmdline before parse_early_param
From: Breno Leitao @ 2026-06-23 16:15 UTC (permalink / raw)
To: Masami Hiramatsu, Andrew Morton, Nathan Chancellor, paulmck,
Nicolas Schier, Nick Desaulniers, Bill Wendling, Justin Stitt,
Jonathan Corbet, Shuah Khan
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, linux-kernel, linux-trace-kernel, linux-kbuild,
bpf, llvm, linux-doc, Breno Leitao, kernel-team
In-Reply-To: <20260623-bootconfig_using_tools-v6-0-640c2f587a3c@debian.org>
Call xbc_prepend_embedded_cmdline() in setup_arch() right after the
CONFIG_CMDLINE merge and before strscpy(command_line, ...) so the
build-time-rendered embedded bootconfig "kernel" subtree is part of
boot_command_line by the time parse_early_param() runs. early_param()
handlers (mem=, earlycon=, loglevel=, ...) now see values supplied via
CONFIG_BOOT_CONFIG_EMBED_FILE without parsing bootconfig at runtime.
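To make the "prepend" step concrete, here is a small user-space sketch in C of inserting a string at the front of a fixed-size command line buffer. It is not the implementation of xbc_prepend_embedded_cmdline(), whose body is not shown in this patch; `prepend_cmdline` and its refuse-on-overflow behaviour are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: put "extra" in front of the NUL-terminated string
 * in dst (capacity "size" bytes), separated by one space. Returns false
 * and leaves dst untouched if there is nothing to add or no room.
 */
static bool prepend_cmdline(char *dst, size_t size, const char *extra)
{
	size_t elen, dlen, sep;

	assert(dst && extra);
	elen = strlen(extra);
	dlen = strlen(dst);
	sep = (elen && dlen) ? 1 : 0;

	if (elen == 0)
		return false;
	if (elen + sep + dlen + 1 > size)
		return false;

	/* Shift the existing text (including its NUL) to make room. */
	memmove(dst + elen + sep, dst, dlen + 1);
	memcpy(dst, extra, elen);
	if (sep)
		dst[elen] = ' ';
	return true;
}
```

Placing the embedded keys first means anything the user typed on the command line still appears later in the string.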
Gate the prepend on the same opt-in the runtime parser uses: prepend
when "bootconfig" is present on the command line, or when
CONFIG_BOOT_CONFIG_FORCE is set. Detect it with parse_args(), exactly
as setup_boot_config() does, so both agree on what counts as opt-in:
any "bootconfig" key regardless of value (bare, =0, =1, ...), and only
before the "--" that separates init arguments. Sharing the parser keeps
the early and late paths from diverging -- e.g. "bootconfig=0" or a
"-- bootconfig" meant for init must not apply the embedded keys early
while the runtime parser skips them.
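The opt-in rule described above can be sketched in plain C as follows. This is a simplified stand-in for the parse_args() walk, not the kernel code: it splits on spaces only and ignores the quoting that parse_args() handles. `cmdline_has_bootconfig` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if a parameter named exactly "bootconfig" (with or without
 * a value) appears before the "--" that separates init arguments.
 */
static bool cmdline_has_bootconfig(const char *cmdline)
{
	const char *key = "bootconfig";
	const char *p = cmdline;

	assert(cmdline);
	while (*p) {
		size_t len, keylen;

		p += strspn(p, " ");	/* skip separators */
		len = strcspn(p, " ");	/* length of this token */
		if (len == 0)
			break;
		if (len == 2 && !strncmp(p, "--", 2))
			return false;	/* the rest belongs to init */

		/* The key is everything up to an optional '='. */
		keylen = 0;
		while (keylen < len && p[keylen] != '=')
			keylen++;
		if (keylen == strlen(key) && !strncmp(p, key, keylen))
			return true;
		p += len;
	}
	return false;
}
```

In this sketch "bootconfig=0" counts as opt-in while "-- bootconfig" does not, matching the two cases the commit message calls out.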
The prepend necessarily runs before setup_boot_config() detects an
initrd bootconfig, so an initrd cannot override the embedded "kernel"
keys for early_param(). This is intentional: the embedded cmdline acts
like a build-time CONFIG_CMDLINE. An initrd bootconfig's "kernel" keys
never reached early_param() anyway (they apply late via
extra_command_line), so nothing is lost -- the initrd keys still apply
late, with last-wins keeping the embedded values in effect.
Signed-off-by: Breno Leitao <leitao@debian.org>
---
arch/x86/Kconfig | 1 +
arch/x86/kernel/setup.c | 43 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 44 insertions(+)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 0de23e6471973..8ab11199c16d5 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -127,6 +127,7 @@ config X86
select ARCH_SUPPORTS_NUMA_BALANCING if X86_64
select ARCH_SUPPORTS_KMAP_LOCAL_FORCE_MAP if NR_CPUS <= 4096
select ARCH_SUPPORTS_CFI if X86_64
+ select ARCH_SUPPORTS_CMDLINE_FROM_BOOTCONFIG
select ARCH_USES_CFI_TRAPS if X86_64 && CFI
select ARCH_SUPPORTS_LTO_CLANG
select ARCH_SUPPORTS_LTO_CLANG_THIN
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 46882ce79c3a4..c973a2cebcd04 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -6,6 +6,7 @@
* parts of early kernel initialization.
*/
#include <linux/acpi.h>
+#include <linux/bootconfig.h>
#include <linux/console.h>
#include <linux/cpu.h>
#include <linux/crash_dump.h>
@@ -881,6 +882,37 @@ static void __init x86_report_nx(void)
* Note: On x86_64, fixmaps are ready for use even before this is called.
*/
+#ifdef CONFIG_CMDLINE_FROM_BOOTCONFIG
+static int __init bootconfig_optin(char *param, char *val,
+ const char *unused, void *arg)
+{
+ if (!strcmp(param, "bootconfig"))
+ *(bool *)arg = true;
+ return 0;
+}
+
+/*
+ * Did the user opt in to bootconfig on the kernel command line? Use
+ * parse_args() so this matches setup_boot_config() exactly, including
+ * stopping at the "--" that separates init arguments.
+ */
+static bool __init bootconfig_cmdline_requested(void)
+{
+ static char tmp_cmdline[COMMAND_LINE_SIZE] __initdata;
+ bool found = false;
+
+ if (IS_ENABLED(CONFIG_BOOT_CONFIG_FORCE))
+ return true;
+
+ strscpy(tmp_cmdline, boot_command_line, COMMAND_LINE_SIZE);
+ if (IS_ERR(parse_args("bootconfig", tmp_cmdline, NULL, 0, 0, 0,
+ &found, bootconfig_optin)))
+ return false;
+
+ return found;
+}
+#endif
+
void __init setup_arch(char **cmdline_p)
{
#ifdef CONFIG_X86_32
@@ -924,6 +956,17 @@ void __init setup_arch(char **cmdline_p)
builtin_cmdline_added = true;
#endif
+#ifdef CONFIG_CMDLINE_FROM_BOOTCONFIG
+ /*
+ * Prepend the build-time-rendered embedded "kernel" keys here so
+ * parse_early_param() below sees them, gating on the same opt-in
+ * as the runtime parser (see bootconfig_cmdline_requested()).
+ */
+ if (bootconfig_cmdline_requested())
+ xbc_prepend_embedded_cmdline(boot_command_line,
+ COMMAND_LINE_SIZE);
+#endif
+
strscpy(command_line, boot_command_line, COMMAND_LINE_SIZE);
*cmdline_p = command_line;
--
2.53.0-Meta
* [PATCH v6 7/8] bootconfig: skip runtime kernel.* render once prepended early
From: Breno Leitao @ 2026-06-23 16:15 UTC (permalink / raw)
To: Masami Hiramatsu, Andrew Morton, Nathan Chancellor, paulmck,
Nicolas Schier, Nick Desaulniers, Bill Wendling, Justin Stitt,
Jonathan Corbet, Shuah Khan
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, linux-kernel, linux-trace-kernel, linux-kbuild,
bpf, llvm, linux-doc, Breno Leitao, kernel-team
In-Reply-To: <20260623-bootconfig_using_tools-v6-0-640c2f587a3c@debian.org>
setup_boot_config() folds the embedded bootconfig "kernel" subtree into
the command line via xbc_make_cmdline("kernel"). A subsequent patch lets
an architecture prepend the build-time-rendered embedded "kernel" keys
to boot_command_line early in setup_arch(); rendering them again here
would then duplicate every key in saved_command_line and make
accumulating handlers (console=, earlycon=, ...) re-register the same
value.
Track whether the bootconfig data came from the embedded source
(from_embedded) and skip the runtime render only when the early prepend
actually happened, as reported by xbc_embedded_cmdline_applied(). On
architectures that do not select ARCH_SUPPORTS_CMDLINE_FROM_BOOTCONFIG
that helper is a stub returning false, so this path is unchanged and the
embedded "kernel" keys still reach the cmdline via the runtime parser
exactly as before.
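The skip condition can be read as a two-input truth table. The following stand-alone C sketch mirrors the expression added to setup_boot_config(); the helper names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Render the "kernel" subtree at runtime unless the data came from the
 * embedded source AND the early prepend already put it on the cmdline.
 */
static bool should_render_kernel_keys(bool from_embedded, bool prepend_applied)
{
	return !from_embedded || !prepend_applied;
}

/* Walk all four input combinations and count how many still render. */
static int count_render_cases(void)
{
	int n = 0;

	for (int e = 0; e <= 1; e++)
		for (int a = 0; a <= 1; a++)
			n += should_render_kernel_keys(e, a);

	/* Only (embedded, applied) skips the render. */
	assert(n == 3);
	return n;
}
```

An initrd bootconfig therefore always goes through the runtime render, whether or not an early prepend happened.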
Signed-off-by: Breno Leitao <leitao@debian.org>
---
init/main.c | 25 ++++++++++++++++++++++---
1 file changed, 22 insertions(+), 3 deletions(-)
diff --git a/init/main.c b/init/main.c
index e363232b428b4..260bd5242f94e 100644
--- a/init/main.c
+++ b/init/main.c
@@ -378,12 +378,15 @@ static void __init setup_boot_config(void)
int pos, ret;
size_t size;
char *err;
+ bool from_embedded = false;
/* Cut out the bootconfig data even if we have no bootconfig option */
data = get_boot_config_from_initrd(&size);
/* If there is no bootconfig in initrd, try embedded one. */
- if (!data)
+ if (!data) {
data = xbc_get_embedded_bootconfig(&size);
+ from_embedded = true;
+ }
strscpy(tmp_cmdline, boot_command_line, COMMAND_LINE_SIZE);
err = parse_args("bootconfig", tmp_cmdline, NULL, 0, 0, 0, NULL,
@@ -421,8 +424,24 @@ static void __init setup_boot_config(void)
} else {
xbc_get_info(&ret, NULL);
pr_info("Load bootconfig: %ld bytes %d nodes\n", (long)size, ret);
- /* keys starting with "kernel." are passed via cmdline */
- extra_command_line = xbc_make_cmdline("kernel");
+ /*
+ * keys starting with "kernel." are passed via cmdline. When
+ * this bootconfig came from the embedded source and
+ * setup_arch() already prepended the rendered "kernel" subtree
+ * to boot_command_line, rendering again here would duplicate
+ * the keys in saved_command_line and make accumulating handlers
+ * (console=, earlycon=, ...) re-register the same value. Skip
+ * only when the prepend really happened.
+ *
+ * On arches that do not select ARCH_SUPPORTS_CMDLINE_FROM_BOOTCONFIG,
+ * CONFIG_CMDLINE_FROM_BOOTCONFIG is unselectable and
+ * xbc_embedded_cmdline_applied() collapses to a stub returning
+ * false, so this path still runs and the embedded "kernel"
+ * keys reach the cmdline via the runtime parser exactly as
+ * before this series.
+ */
+ if (!from_embedded || !xbc_embedded_cmdline_applied())
+ extra_command_line = xbc_make_cmdline("kernel");
/* Also, "init." keys are init arguments */
extra_init_args = xbc_make_cmdline("init");
}
--
2.53.0-Meta