* [PATCH 1/2] tracing/syscalls: allow multiple syscall numbers per syscall @ 2016-08-29 9:30 Marcin Nowakowski [not found] ` <1472463007-6469-1-git-send-email-marcin.nowakowski-1AXoQHu6uovQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 18+ messages in thread From: Marcin Nowakowski @ 2016-08-29 9:30 UTC (permalink / raw) To: linux-mips-u79uwXL29TY76Z2rM5mHXA, Steven Rostedt, Ingo Molnar, open list:ABI/API, open list Cc: Marcin Nowakowski Syscall metadata makes an assumption that only a single syscall number corresponds to a given method. This is true for most archs, but can break tracing otherwise. For MIPS platforms, depending on the choice of supported ABIs, up to 3 system call numbers can correspond to the same call - depending on which ABI the userspace app uses. When init_ftrace_syscalls() sets up the syscall_nr field in metadata, it would overwrite that with the highest number matching a given syscall. To avoid this, change the syscall_nr member of syscall_metadata to an array - for most archs the array will be of size 1 and is not going to add any overhead. If an arch requires multiple syscall_nr to be supported, it needs to define its own NR_syscall_tables to override the default behaviour. Signed-off-by: Marcin Nowakowski <marcin.nowakowski-1AXoQHu6uovQT0dZR+AlfA@public.gmane.org> --- include/linux/syscalls.h | 2 +- include/trace/syscall.h | 5 +- kernel/trace/trace_syscalls.c | 103 ++++++++++++++++++++++++++++++------------ 3 files changed, 78 insertions(+), 32 deletions(-) diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index d022390..6f4af11 100644 --- a/include/linux/syscalls.h +++ b/include/linux/syscalls.h @@ -160,7 +160,7 @@ extern struct trace_event_functions exit_syscall_print_funcs; static struct syscall_metadata __used \ __syscall_meta_##sname = { \ .name = "sys"#sname, \ - .syscall_nr = -1, /* Filled in at boot */ \ + .syscall_nr[0 ... (NR_syscall_tables-1)] = -1, /* Filled in at boot */ \ .nb_args = nb, \ .types = nb ? types_##sname : NULL, \ .args = nb ? args_##sname : NULL, \ diff --git a/include/trace/syscall.h b/include/trace/syscall.h index 7434f0f..f7073922 100644 --- a/include/trace/syscall.h +++ b/include/trace/syscall.h @@ -8,6 +8,9 @@ #include <asm/ptrace.h> +#ifndef NR_syscall_tables +#define NR_syscall_tables 1 +#endif /* * A syscall entry in the ftrace syscalls array. @@ -23,7 +26,7 @@ */ struct syscall_metadata { const char *name; - int syscall_nr; + int syscall_nr[NR_syscall_tables]; int nb_args; const char **types; const char **args; diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c index b2b6efc..ed22c50 100644 --- a/kernel/trace/trace_syscalls.c +++ b/kernel/trace/trace_syscalls.c @@ -403,16 +403,24 @@ static int reg_event_syscall_enter(struct trace_event_file *file, { struct trace_array *tr = file->tr; int ret = 0; - int num; + int num, i; - num = ((struct syscall_metadata *)call->data)->syscall_nr; + num = ((struct syscall_metadata *)call->data)->syscall_nr[0]; if (WARN_ON_ONCE(num < 0 || num >= NR_syscalls)) return -ENOSYS; mutex_lock(&syscall_trace_lock); if (!tr->sys_refcount_enter) ret = register_trace_sys_enter(ftrace_syscall_enter, tr); + if (!ret) { - rcu_assign_pointer(tr->enter_syscall_files[num], file); + for (i = 0; i < NR_syscall_tables; i++) { + struct syscall_metadata *metadata = call->data; + + num = metadata->syscall_nr[i]; + if (num > 0 && num < NR_syscalls) + rcu_assign_pointer( + tr->enter_syscall_files[num], file); + } tr->sys_refcount_enter++; } mutex_unlock(&syscall_trace_lock); @@ -423,14 +431,18 @@ static void unreg_event_syscall_enter(struct trace_event_file *file, struct trace_event_call *call) { struct trace_array *tr = file->tr; - int num; + int num, i; - num = ((struct syscall_metadata *)call->data)->syscall_nr; + num = ((struct syscall_metadata *)call->data)->syscall_nr[0]; if (WARN_ON_ONCE(num < 0 || num >= NR_syscalls)) return; mutex_lock(&syscall_trace_lock); tr->sys_refcount_enter--; - RCU_INIT_POINTER(tr->enter_syscall_files[num], NULL); + for (i = 0; i < NR_syscall_tables; i++) { + num = ((struct syscall_metadata *)call->data)->syscall_nr[i]; + if (num > 0 && num < NR_syscalls) + RCU_INIT_POINTER(tr->enter_syscall_files[num], NULL); + } if (!tr->sys_refcount_enter) unregister_trace_sys_enter(ftrace_syscall_enter, tr); mutex_unlock(&syscall_trace_lock); @@ -441,16 +453,23 @@ static int reg_event_syscall_exit(struct trace_event_file *file, { struct trace_array *tr = file->tr; int ret = 0; - int num; + int num, i; - num = ((struct syscall_metadata *)call->data)->syscall_nr; + num = ((struct syscall_metadata *)call->data)->syscall_nr[0]; if (WARN_ON_ONCE(num < 0 || num >= NR_syscalls)) return -ENOSYS; mutex_lock(&syscall_trace_lock); if (!tr->sys_refcount_exit) ret = register_trace_sys_exit(ftrace_syscall_exit, tr); if (!ret) { - rcu_assign_pointer(tr->exit_syscall_files[num], file); + for (i = 0; i < NR_syscall_tables; i++) { + struct syscall_metadata *metadata = call->data; + + num = metadata->syscall_nr[i]; + if (num > 0 && num < NR_syscalls) + rcu_assign_pointer( + tr->exit_syscall_files[num], file); + } tr->sys_refcount_exit++; } mutex_unlock(&syscall_trace_lock); @@ -461,14 +480,18 @@ static void unreg_event_syscall_exit(struct trace_event_file *file, struct trace_event_call *call) { struct trace_array *tr = file->tr; - int num; + int num, i; - num = ((struct syscall_metadata *)call->data)->syscall_nr; + num = ((struct syscall_metadata *)call->data)->syscall_nr[0]; if (WARN_ON_ONCE(num < 0 || num >= NR_syscalls)) return; mutex_lock(&syscall_trace_lock); tr->sys_refcount_exit--; - RCU_INIT_POINTER(tr->exit_syscall_files[num], NULL); + for (i = 0; i < NR_syscall_tables; i++) { + num = ((struct syscall_metadata *)call->data)->syscall_nr[i]; + if (num > 0 && num < NR_syscalls) + RCU_INIT_POINTER(tr->exit_syscall_files[num], NULL); + } if (!tr->sys_refcount_exit) unregister_trace_sys_exit(ftrace_syscall_exit, tr); mutex_unlock(&syscall_trace_lock); @@ -479,7 +502,7 @@ static int __init init_syscall_trace(struct trace_event_call *call) int id; int num; - num = ((struct syscall_metadata *)call->data)->syscall_nr; + num = ((struct syscall_metadata *)call->data)->syscall_nr[0]; if (num < 0 || num >= NR_syscalls) { pr_debug("syscall %s metadata not mapped, disabling ftrace event\n", ((struct syscall_metadata *)call->data)->name); @@ -542,13 +565,19 @@ void __init init_ftrace_syscalls(void) } for (i = 0; i < NR_syscalls; i++) { + int j; addr = arch_syscall_addr(i); meta = find_syscall_meta(addr); if (!meta) continue; - meta->syscall_nr = i; syscalls_metadata[i] = meta; + for (j = 0; j < NR_syscall_tables; j++) { + if (meta->syscall_nr[j] == -1) { + meta->syscall_nr[j] = i; + break; + } + } } } @@ -602,9 +631,7 @@ static void perf_syscall_enter(void *ignore, struct pt_regs *regs, long id) static int perf_sysenter_enable(struct trace_event_call *call) { int ret = 0; - int num; - - num = ((struct syscall_metadata *)call->data)->syscall_nr; + int num, i; mutex_lock(&syscall_trace_lock); if (!sys_perf_refcount_enter) @@ -613,7 +640,13 @@ static int perf_sysenter_enable(struct trace_event_call *call) pr_info("event trace: Could not activate" "syscall entry trace point"); } else { - set_bit(num, enabled_perf_enter_syscalls); + for (i = 0; i < NR_syscall_tables; i++) { + struct syscall_metadata *metadata = call->data; + + num = metadata->syscall_nr[i]; + if (num > 0 && num < NR_syscalls) + set_bit(num, enabled_perf_enter_syscalls); + } sys_perf_refcount_enter++; } mutex_unlock(&syscall_trace_lock); @@ -622,13 +655,17 @@ static int perf_sysenter_enable(struct trace_event_call *call) static void perf_sysenter_disable(struct trace_event_call *call) { - int num; - - num = ((struct syscall_metadata *)call->data)->syscall_nr; + int num, i; mutex_lock(&syscall_trace_lock); sys_perf_refcount_enter--; - clear_bit(num, enabled_perf_enter_syscalls); + for (i = 0; i < NR_syscall_tables; i++) { + struct syscall_metadata *metadata = call->data; + + num = metadata->syscall_nr[i]; + if (num > 0 && num < NR_syscalls) + clear_bit(num, enabled_perf_enter_syscalls); + } if (!sys_perf_refcount_enter) unregister_trace_sys_enter(perf_syscall_enter, NULL); mutex_unlock(&syscall_trace_lock); @@ -674,9 +711,7 @@ static void perf_syscall_exit(void *ignore, struct pt_regs *regs, long ret) static int perf_sysexit_enable(struct trace_event_call *call) { int ret = 0; - int num; - - num = ((struct syscall_metadata *)call->data)->syscall_nr; + int num, i; mutex_lock(&syscall_trace_lock); if (!sys_perf_refcount_exit) @@ -685,7 +720,13 @@ static int perf_sysexit_enable(struct trace_event_call *call) pr_info("event trace: Could not activate" "syscall exit trace point"); } else { - set_bit(num, enabled_perf_exit_syscalls); + for (i = 0; i < NR_syscall_tables; i++) { + struct syscall_metadata *metadata = call->data; + + num = metadata->syscall_nr[i]; + if (num > 0 && num < NR_syscalls) + set_bit(num, enabled_perf_exit_syscalls); + } sys_perf_refcount_exit++; } mutex_unlock(&syscall_trace_lock); @@ -694,13 +735,15 @@ static int perf_sysexit_enable(struct trace_event_call *call) static void perf_sysexit_disable(struct trace_event_call *call) { - int num; - - num = ((struct syscall_metadata *)call->data)->syscall_nr; + int num, i; mutex_lock(&syscall_trace_lock); sys_perf_refcount_exit--; - clear_bit(num, enabled_perf_exit_syscalls); + for (i = 0; i < NR_syscall_tables; i++) { + num = ((struct syscall_metadata *)call->data)->syscall_nr[i]; + if (num > 0 && num < NR_syscalls) + clear_bit(num, enabled_perf_exit_syscalls); + } if (!sys_perf_refcount_exit) unregister_trace_sys_exit(perf_syscall_exit, NULL); mutex_unlock(&syscall_trace_lock); -- 2.7.4 ^ permalink raw reply related [flat|nested] 18+ messages in thread
[parent not found: <1472463007-6469-1-git-send-email-marcin.nowakowski-1AXoQHu6uovQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH 1/2] tracing/syscalls: allow multiple syscall numbers per syscall [not found] ` <1472463007-6469-1-git-send-email-marcin.nowakowski-1AXoQHu6uovQT0dZR+AlfA@public.gmane.org> @ 2016-08-29 23:55 ` Andy Lutomirski [not found] ` <CALCETrW1m9ozck-ugX6AKnL7oNA8rvMTjhFGqtVSvKL9BMXMZA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 18+ messages in thread From: Andy Lutomirski @ 2016-08-29 23:55 UTC (permalink / raw) To: Marcin Nowakowski Cc: Linux API, linux-mips-u79uwXL29TY76Z2rM5mHXA, Ingo Molnar, Steven Rostedt, open list On Aug 29, 2016 11:30 AM, "Marcin Nowakowski" <marcin.nowakowski-1AXoQHu6uovQT0dZR+AlfA@public.gmane.org> wrote: > > Syscall metadata makes an assumption that only a single syscall number > corresponds to a given method. This is true for most archs, but > can break tracing otherwise. > > For MIPS platforms, depending on the choice of supported ABIs, up to 3 > system call numbers can correspond to the same call - depending on which > ABI the userspace app uses. MIPS isn't special here. x86 does the same thing. Why isn't this a problem on x86? Also, you seem to be partially reinventing AUDIT_ARCH here. Can you use that and integrate with syscall_get_arch()? --Andy ^ permalink raw reply [flat|nested] 18+ messages in thread
[parent not found: <CALCETrW1m9ozck-ugX6AKnL7oNA8rvMTjhFGqtVSvKL9BMXMZA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH 1/2] tracing/syscalls: allow multiple syscall numbers per syscall [not found] ` <CALCETrW1m9ozck-ugX6AKnL7oNA8rvMTjhFGqtVSvKL9BMXMZA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2016-08-30 8:14 ` Marcin Nowakowski 2016-08-30 18:52 ` Andy Lutomirski 0 siblings, 1 reply; 18+ messages in thread From: Marcin Nowakowski @ 2016-08-30 8:14 UTC (permalink / raw) To: Andy Lutomirski Cc: Linux API, Ingo Molnar, Steven Rostedt, open list, linux-mips-6z/3iImG2C8G8FEW9MqTrA On 30.08.2016 01:55, Andy Lutomirski wrote: > On Aug 29, 2016 11:30 AM, "Marcin Nowakowski" > <marcin.nowakowski-1AXoQHu6uovQT0dZR+AlfA@public.gmane.org> wrote: >> >> Syscall metadata makes an assumption that only a single syscall number >> corresponds to a given method. This is true for most archs, but >> can break tracing otherwise. >> >> For MIPS platforms, depending on the choice of supported ABIs, up to 3 >> system call numbers can correspond to the same call - depending on which >> ABI the userspace app uses. > > MIPS isn't special here. x86 does the same thing. Why isn't this a > problem on x86? > Hi Andy, My understanding is that MIPS is quite different to what most other architectures do ... First of all x86 disables tracing of compat syscalls as that didn't work properly because of wrong mapping of syscall numbers to syscalls: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=f431b634f Moreover, when trace_syscalls is initialised, the syscall metadata is updated to include the right syscall numbers. That uses arch_syscall_addr method, which has a default implementation in kernel/trace/trace_syscalls.c: unsigned long __init __weak arch_syscall_addr(int nr) { return (unsigned long)sys_call_table[nr]; } that works for x86 and only uses 'native' syscalls, ie. for x86_64 will not map any of the ia32_sys_call_table entries. So on one hand we have the code that disables tracing for x86_64 compat, on the other we only ensure that the native calls are mapped. It is quite different for MIPS where syscall numbers for different ABIs have distinct call numbers, so the following code maps the syscalls (for O32 -> 4xxx, N64 -> 5xxx, N32 -> 6xxx): unsigned long __init arch_syscall_addr(int nr) { if (nr >= __NR_N32_Linux && nr <= __NR_N32_Linux + __NR_N32_Linux_syscalls) return (unsigned long)sysn32_call_table[nr - __NR_N32_Linux]; if (nr >= __NR_64_Linux && nr <= __NR_64_Linux + __NR_64_Linux_syscalls) return (unsigned long)sys_call_table[nr - __NR_64_Linux]; if (nr >= __NR_O32_Linux && nr <= __NR_O32_Linux + __NR_O32_Linux_syscalls) return (unsigned long)sys32_call_table[nr - __NR_O32_Linux]; return (unsigned long) &sys_ni_syscall; } As a result when init_ftrace_syscalls() loops through all the possible syscall numbers, it first finds an O32 implementation, then N64 and finally N32. As the current code doesn't expect multiple references to a given syscall number, it always overrides the metadata with the last found - as a result only N32 syscalls are mapped. This is generally unexpected and wrong behaviour, and to makes things worse - since when N32 support is enabled, it overwrites N64 entries, it becomes impossible to trace native syscalls. > Also, you seem to be partially reinventing AUDIT_ARCH here. Can you > use that and integrate with syscall_get_arch()? Please correct me if I don't understand what you meant here, but I don't see how these can be integrated ... For MIPS syscall_get_arch() properly determines arch type and calling convention, but that information is not enough to determine what call was made and how to map it to syscall metadata from another calling convention. Marcin ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/2] tracing/syscalls: allow multiple syscall numbers per syscall 2016-08-30 8:14 ` Marcin Nowakowski @ 2016-08-30 18:52 ` Andy Lutomirski 2016-08-30 19:29 ` Steven Rostedt 0 siblings, 1 reply; 18+ messages in thread From: Andy Lutomirski @ 2016-08-30 18:52 UTC (permalink / raw) To: Marcin Nowakowski Cc: Linux API, Ingo Molnar, Steven Rostedt, open list, Linux MIPS Mailing List On Tue, Aug 30, 2016 at 1:14 AM, Marcin Nowakowski <marcin.nowakowski@imgtec.com> wrote: > > > On 30.08.2016 01:55, Andy Lutomirski wrote: >> >> On Aug 29, 2016 11:30 AM, "Marcin Nowakowski" >> <marcin.nowakowski@imgtec.com> wrote: >>> >>> >>> Syscall metadata makes an assumption that only a single syscall number >>> corresponds to a given method. This is true for most archs, but >>> can break tracing otherwise. >>> >>> For MIPS platforms, depending on the choice of supported ABIs, up to 3 >>> system call numbers can correspond to the same call - depending on which >>> ABI the userspace app uses. >> >> >> MIPS isn't special here. x86 does the same thing. Why isn't this a >> problem on x86? >> > > Hi Andy, > > My understanding is that MIPS is quite different to what most other > architectures do ... > First of all x86 disables tracing of compat syscalls as that didn't work > properly because of wrong mapping of syscall numbers to syscalls: > http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=f431b634f > > Moreover, when trace_syscalls is initialised, the syscall metadata is > updated to include the right syscall numbers. That uses arch_syscall_addr > method, which has a default implementation in kernel/trace/trace_syscalls.c: > > unsigned long __init __weak arch_syscall_addr(int nr) > { > return (unsigned long)sys_call_table[nr]; > } > > that works for x86 and only uses 'native' syscalls, ie. for x86_64 will not > map any of the ia32_sys_call_table entries. So on one hand we have the code > that disables tracing for x86_64 compat, on the other we only ensure that > the native calls are mapped. > It is quite different for MIPS where syscall numbers for different ABIs have > distinct call numbers, so the following code maps the syscalls > (for O32 -> 4xxx, N64 -> 5xxx, N32 -> 6xxx): x86 has that, too. There are three types of x86 syscalls: i386 (AUDIT_ARCH_I386, low nr), x86_64 (AUDIT_ARCH_X86_64, low nr, nr can overlap i386 with differnt meanings), and x32 (AUDIT_ARCH_X86_64, high nr). > > unsigned long __init arch_syscall_addr(int nr) > { > if (nr >= __NR_N32_Linux && nr <= __NR_N32_Linux + > __NR_N32_Linux_syscalls) > return (unsigned long)sysn32_call_table[nr - > __NR_N32_Linux]; > if (nr >= __NR_64_Linux && nr <= __NR_64_Linux + > __NR_64_Linux_syscalls) > return (unsigned long)sys_call_table[nr - __NR_64_Linux]; > if (nr >= __NR_O32_Linux && nr <= __NR_O32_Linux + > __NR_O32_Linux_syscalls) > return (unsigned long)sys32_call_table[nr - __NR_O32_Linux]; > return (unsigned long) &sys_ni_syscall; > } > > As a result when init_ftrace_syscalls() loops through all the possible > syscall numbers, it first finds an O32 implementation, then N64 and finally > N32. As the current code doesn't expect multiple references to a given > syscall number, it always overrides the metadata with the last found - as a > result only N32 syscalls are mapped. Okay, I think I see what's going on. init_ftrace_syscalls() does: meta = find_syscall_meta(addr); Unless I'm missing some reason why this is a sensible thing to do, this seems overcomplicated and incorrect. There is exactly one caller of find_syscall_meta() and that caller knows the syscall number. Why doesn't it just look up the metadata by *number* instead of by syscall implementation address? There are plenty of architectures for which multiple logically different syscalls can share an implementation (e.g. pretty much everything that calls in_compat_syscall()). Can't this be radically simplified by just calling syscall_nr_to_meta() instead and deleting find_syscall_meta()? Or is there some reason that it makes sense for one syscall_metadata to have multiple syscalls nrs? (Also, keep in mind that, on x86, the nr is insufficient to identify the syscall. You really need to know both nr and arch to identify the syscall, so sticking an array of syscall nrs somewhere doesn't accurately express the x86 situation.) --Andy ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/2] tracing/syscalls: allow multiple syscall numbers per syscall 2016-08-30 18:52 ` Andy Lutomirski @ 2016-08-30 19:29 ` Steven Rostedt [not found] ` <20160830152955.17633511-f9ZlEuEWxVcJvu8Pb33WZ0EMvNT87kid@public.gmane.org> 0 siblings, 1 reply; 18+ messages in thread From: Steven Rostedt @ 2016-08-30 19:29 UTC (permalink / raw) To: Andy Lutomirski Cc: Marcin Nowakowski, Linux API, Ingo Molnar, open list, Linux MIPS Mailing List On Tue, 30 Aug 2016 11:52:39 -0700 Andy Lutomirski <luto@amacapital.net> wrote: > Okay, I think I see what's going on. init_ftrace_syscalls() does: > > meta = find_syscall_meta(addr); > > Unless I'm missing some reason why this is a sensible thing to do, > this seems overcomplicated and incorrect. There is exactly one caller > of find_syscall_meta() and that caller knows the syscall number. Why > doesn't it just look up the metadata by *number* instead of by syscall > implementation address? There are plenty of architectures for which > multiple logically different syscalls can share an implementation > (e.g. pretty much everything that calls in_compat_syscall()). The problem is that the meta data is created at the syscalls themselves. Look at all the macro magic in include/linux/syscalls.h, and search for __syscall_metadata. The meta data is created via linker magic, and the find_syscall_meta() is what finds a specific system call and the meta data associated with it. Then it can use the number to system call mapping. Yes, this code needs some loving. -- Steve ^ permalink raw reply [flat|nested] 18+ messages in thread
[parent not found: <20160830152955.17633511-f9ZlEuEWxVcJvu8Pb33WZ0EMvNT87kid@public.gmane.org>]
* Re: [PATCH 1/2] tracing/syscalls: allow multiple syscall numbers per syscall [not found] ` <20160830152955.17633511-f9ZlEuEWxVcJvu8Pb33WZ0EMvNT87kid@public.gmane.org> @ 2016-08-30 19:53 ` Andy Lutomirski 2016-08-30 20:58 ` Steven Rostedt [not found] ` <CALCETrXA0XbYuqYzWMJA+KjFS31YL0cTVtrvCnmRY_GMK6oNpw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 2 replies; 18+ messages in thread From: Andy Lutomirski @ 2016-08-30 19:53 UTC (permalink / raw) To: Steven Rostedt Cc: Marcin Nowakowski, Linux API, Ingo Molnar, open list, Linux MIPS Mailing List On Tue, Aug 30, 2016 at 12:29 PM, Steven Rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org> wrote: > On Tue, 30 Aug 2016 11:52:39 -0700 > Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote: > > >> Okay, I think I see what's going on. init_ftrace_syscalls() does: >> >> meta = find_syscall_meta(addr); >> >> Unless I'm missing some reason why this is a sensible thing to do, >> this seems overcomplicated and incorrect. There is exactly one caller >> of find_syscall_meta() and that caller knows the syscall number. Why >> doesn't it just look up the metadata by *number* instead of by syscall >> implementation address? There are plenty of architectures for which >> multiple logically different syscalls can share an implementation >> (e.g. pretty much everything that calls in_compat_syscall()). > > The problem is that the meta data is created at the syscalls > themselves. Look at all the macro magic in include/linux/syscalls.h, > and search for __syscall_metadata. The meta data is created via linker > magic, and the find_syscall_meta() is what finds a specific system call > and the meta data associated with it. Egads! OK, I see why this is a mess. I guess we should be creating the metadata from the syscall tables instead of from the syscall definitions, but I guess that's currently a nasty per-arch mess. Could we at least have an array of (arch, nr) instead of just an array of nrs in the metadata? --Andy ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/2] tracing/syscalls: allow multiple syscall numbers per syscall 2016-08-30 19:53 ` Andy Lutomirski @ 2016-08-30 20:58 ` Steven Rostedt [not found] ` <20160830165830.5e494c43-f9ZlEuEWxVcJvu8Pb33WZ0EMvNT87kid@public.gmane.org> [not found] ` <CALCETrXA0XbYuqYzWMJA+KjFS31YL0cTVtrvCnmRY_GMK6oNpw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 1 sibling, 1 reply; 18+ messages in thread From: Steven Rostedt @ 2016-08-30 20:58 UTC (permalink / raw) To: Andy Lutomirski Cc: Marcin Nowakowski, Linux API, Ingo Molnar, open list, Linux MIPS Mailing List On Tue, 30 Aug 2016 12:53:53 -0700 Andy Lutomirski <luto@amacapital.net> wrote: > Egads! OK, I see why this is a mess. :-) > > I guess we should be creating the metadata from the syscall tables > instead of from the syscall definitions, but I guess that's currently > a nasty per-arch mess. Yep. > > Could we at least have an array of (arch, nr) instead of just an array > of nrs in the metadata? I guess I'm not following you on what would be used for "arch". -- Steve ^ permalink raw reply [flat|nested] 18+ messages in thread
[parent not found: <20160830165830.5e494c43-f9ZlEuEWxVcJvu8Pb33WZ0EMvNT87kid@public.gmane.org>]
* Re: [PATCH 1/2] tracing/syscalls: allow multiple syscall numbers per syscall [not found] ` <20160830165830.5e494c43-f9ZlEuEWxVcJvu8Pb33WZ0EMvNT87kid@public.gmane.org> @ 2016-08-30 21:45 ` Andy Lutomirski 2016-08-30 22:03 ` Steven Rostedt 0 siblings, 1 reply; 18+ messages in thread From: Andy Lutomirski @ 2016-08-30 21:45 UTC (permalink / raw) To: Steven Rostedt Cc: Linux API, Ingo Molnar, open list, Linux MIPS Mailing List, Marcin Nowakowski On Aug 30, 2016 1:58 PM, "Steven Rostedt" <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org> wrote: > > On Tue, 30 Aug 2016 12:53:53 -0700 > Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote: > > > > Egads! OK, I see why this is a mess. > > :-) > > > > > I guess we should be creating the metadata from the syscall tables > > instead of from the syscall definitions, but I guess that's currently > > a nasty per-arch mess. > > Yep. > I wonder: could more of it be dynamically allocated? I.e. statically generate metadata with args and name and whatever but without any nr. Then dynamically allocate the map from nr to metadata? > > > > Could we at least have an array of (arch, nr) instead of just an array > > of nrs in the metadata? > > I guess I'm not following you on what would be used for "arch". Whatever syscall_get_arch() would return for the syscall. For x86, for example, most syscalls have a compat nr and a non-compat nr. How does tracing currently handle that? --Andy ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/2] tracing/syscalls: allow multiple syscall numbers per syscall 2016-08-30 21:45 ` Andy Lutomirski @ 2016-08-30 22:03 ` Steven Rostedt [not found] ` <20160830180328.4e579db3-f9ZlEuEWxVcJvu8Pb33WZ0EMvNT87kid@public.gmane.org> 0 siblings, 1 reply; 18+ messages in thread From: Steven Rostedt @ 2016-08-30 22:03 UTC (permalink / raw) To: Andy Lutomirski Cc: Linux API, Ingo Molnar, open list, Linux MIPS Mailing List, Marcin Nowakowski On Tue, 30 Aug 2016 14:45:05 -0700 Andy Lutomirski <luto@amacapital.net> wrote: > I wonder: could more of it be dynamically allocated? I.e. statically > generate metadata with args and name and whatever but without any nr. > Then dynamically allocate the map from nr to metadata? Any ideas on how to do that? > > > > > > > Could we at least have an array of (arch, nr) instead of just an array > > > of nrs in the metadata? > > > > I guess I'm not following you on what would be used for "arch". > > Whatever syscall_get_arch() would return for the syscall. For x86, > for example, most syscalls have a compat nr and a non-compat nr. How > does tracing currently handle that? We currently disable tracing compat syscalls. What the current code does, is that the macro and linker magic creates a list of meta data structures, that have a name attached to them. Then on boot up, we scan the list of syscall numbers and then ask the arch for the system call they represent to get the actual function itself: addr = arch_syscall_addr(i); where 'i' is the system call nr. Then the find_syscall_meta(addr) will do a ksyms_lookup to convert the addr into the system call name, and then search the meta data for one that has that name attached to it. Yes it is ugly. But we don't currently have a method to automatically match the meta data with the system call numbers. The system call macros only have access to the names and the parameters, not the numbers that are associated with them. -- Steve ^ permalink raw reply [flat|nested] 18+ messages in thread
[parent not found: <20160830180328.4e579db3-f9ZlEuEWxVcJvu8Pb33WZ0EMvNT87kid@public.gmane.org>]
* Re: [PATCH 1/2] tracing/syscalls: allow multiple syscall numbers per syscall [not found] ` <20160830180328.4e579db3-f9ZlEuEWxVcJvu8Pb33WZ0EMvNT87kid@public.gmane.org> @ 2016-08-30 22:08 ` Andy Lutomirski [not found] ` <CALCETrWjpcKqFHvxS35Csd3An1QNMXW8yiHChuWfuWTvVu8_ig-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 18+ messages in thread From: Andy Lutomirski @ 2016-08-30 22:08 UTC (permalink / raw) To: Steven Rostedt Cc: Linux API, Ingo Molnar, open list, Linux MIPS Mailing List, Marcin Nowakowski On Tue, Aug 30, 2016 at 3:03 PM, Steven Rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org> wrote: > On Tue, 30 Aug 2016 14:45:05 -0700 > Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote: > >> I wonder: could more of it be dynamically allocated? I.e. statically >> generate metadata with args and name and whatever but without any nr. >> Then dynamically allocate the map from nr to metadata? > > Any ideas on how to do that? This might be as simple as dropping the syscall_nr field from syscall_metadata. I admit I'm not familiar with this code at all, but I'm not really sure why that field is needed. init_ftrace_syscalls is already dynamically allocating an array that maps nr to metadata, and I don't see what in the code actually needs that mapping to be one-to-one or needs the reverse mapping. > >> >> > > >> > > Could we at least have an array of (arch, nr) instead of just an array >> > > of nrs in the metadata? >> > >> > I guess I'm not following you on what would be used for "arch". >> >> Whatever syscall_get_arch() would return for the syscall. For x86, >> for example, most syscalls have a compat nr and a non-compat nr. How >> does tracing currently handle that? > > We currently disable tracing compat syscalls. > > What the current code does, is that the macro and linker magic creates > a list of meta data structures, that have a name attached to them. > > Then on boot up, we scan the list of syscall numbers and then ask the > arch for the system call they represent to get the actual function > itself: > > addr = arch_syscall_addr(i); > > where 'i' is the system call nr. > > Then the find_syscall_meta(addr) will do a ksyms_lookup to convert the > addr into the system call name, and then search the meta data for one > that has that name attached to it. > > Yes it is ugly. But we don't currently have a method to automatically > match the meta data with the system call numbers. The system call > macros only have access to the names and the parameters, not the > numbers that are associated with them. > Yeah, I think I get it now. But I think my suggestion of removing syscall_nr entirely might actually work. You'd have to initialize more than one syscalls_metadata array, but they could share the underlying metadata objects. --Andy -- Andy Lutomirski AMA Capital Management, LLC ^ permalink raw reply [flat|nested] 18+ messages in thread
[parent not found: <CALCETrWjpcKqFHvxS35Csd3An1QNMXW8yiHChuWfuWTvVu8_ig-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH 1/2] tracing/syscalls: allow multiple syscall numbers per syscall [not found] ` <CALCETrWjpcKqFHvxS35Csd3An1QNMXW8yiHChuWfuWTvVu8_ig-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2016-08-30 22:30 ` Steven Rostedt [not found] ` <20160830183030.3e9f67f0-f9ZlEuEWxVcJvu8Pb33WZ0EMvNT87kid@public.gmane.org> 0 siblings, 1 reply; 18+ messages in thread From: Steven Rostedt @ 2016-08-30 22:30 UTC (permalink / raw) To: Andy Lutomirski Cc: Linux API, Ingo Molnar, open list, Linux MIPS Mailing List, Marcin Nowakowski On Tue, 30 Aug 2016 15:08:19 -0700 Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote: > On Tue, Aug 30, 2016 at 3:03 PM, Steven Rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org> wrote: > > On Tue, 30 Aug 2016 14:45:05 -0700 > > Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote: > > > >> I wonder: could more of it be dynamically allocated? I.e. statically > >> generate metadata with args and name and whatever but without any nr. > >> Then dynamically allocate the map from nr to metadata? > > > > Any ideas on how to do that? > > This might be as simple as dropping the syscall_nr field from > syscall_metadata. I admit I'm not familiar with this code at all, but > I'm not really sure why that field is needed. init_ftrace_syscalls is > already dynamically allocating an array that maps nr to metadata, and > I don't see what in the code actually needs that mapping to be > one-to-one or needs the reverse mapping. The issue is that the syscall trace points are called by a single location, that passes in the syscall_nr, and we need a way to map that syscall_nr to the metadata. System calls are really a meta tracepoint. They share a single real tracepoint called raw_syscalls:sys_enter and raw_syscalls:sys_exit. When you enable a system call like sys_enter_read, what really happens is that the sys_enter tracepoint is attached with a function called ftrace_syscall_enter(). This calls trace_get_syscall_nr(current, regs), to extract the actual syscall_nr that was called. This is used to find the "file" that is mapped to the system call (the tracefs file that enabled the system call). trace_file = tr->enter_syscall_files[syscall_nr]; And the meta data (what is used to tell us what to save) is found with the syscall_nr_to_meta() function. Now the metadata is used to extract the arguments of the system call: syscall_get_arguments(current, regs, 0, sys_data->nb_args, etnry->args); As well as the size needed. There's no need to map syscall meta to nr, we need a way to map the nr to the syscall metadata, and when there's more than a one to one mapping, we need a way to differentiate that in the raw syscall tracepoints. -- Steve ^ permalink raw reply [flat|nested] 18+ messages in thread
[parent not found: <20160830183030.3e9f67f0-f9ZlEuEWxVcJvu8Pb33WZ0EMvNT87kid@public.gmane.org>]
* Re: [PATCH 1/2] tracing/syscalls: allow multiple syscall numbers per syscall [not found] ` <20160830183030.3e9f67f0-f9ZlEuEWxVcJvu8Pb33WZ0EMvNT87kid@public.gmane.org> @ 2016-08-30 23:09 ` Andy Lutomirski [not found] ` <CALCETrWx+1Pdob8mU_X1hOWPWqj31ihVL3Z1R0PqjVeExZ_HAA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 18+ messages in thread From: Andy Lutomirski @ 2016-08-30 23:09 UTC (permalink / raw) To: Steven Rostedt Cc: Linux API, Ingo Molnar, open list, Linux MIPS Mailing List, Marcin Nowakowski On Tue, Aug 30, 2016 at 3:30 PM, Steven Rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org> wrote: > On Tue, 30 Aug 2016 15:08:19 -0700 > Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote: > >> On Tue, Aug 30, 2016 at 3:03 PM, Steven Rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org> wrote: >> > On Tue, 30 Aug 2016 14:45:05 -0700 >> > Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote: >> > >> >> I wonder: could more of it be dynamically allocated? I.e. statically >> >> generate metadata with args and name and whatever but without any nr. >> >> Then dynamically allocate the map from nr to metadata? >> > >> > Any ideas on how to do that? >> >> This might be as simple as dropping the syscall_nr field from >> syscall_metadata. I admit I'm not familiar with this code at all, but >> I'm not really sure why that field is needed. init_ftrace_syscalls is >> already dynamically allocating an array that maps nr to metadata, and >> I don't see what in the code actually needs that mapping to be >> one-to-one or needs the reverse mapping. > > The issue is that the syscall trace points are called by a single > location, that passes in the syscall_nr, and we need a way to map that > syscall_nr to the metadata. > > System calls are really a meta tracepoint. They share a single real > tracepoint called raw_syscalls:sys_enter and raw_syscalls:sys_exit. > When you enable a system call like sys_enter_read, what really happens > is that the sys_enter tracepoint is attached with a function called > ftrace_syscall_enter(). > > This calls trace_get_syscall_nr(current, regs), to extract the actual > syscall_nr that was called. This is used to find the "file" that is > mapped to the system call (the tracefs file that enabled the system > call). > > trace_file = tr->enter_syscall_files[syscall_nr]; > > And the meta data (what is used to tell us what to save) is found with > the syscall_nr_to_meta() function. > > Now the metadata is used to extract the arguments of the system call: > > syscall_get_arguments(current, regs, 0, sys_data->nb_args, > etnry->args); > > As well as the size needed. > > There's no need to map syscall meta to nr, we need a way to map the nr > to the syscall metadata, and when there's more than a one to one > mapping, we need a way to differentiate that in the raw syscall > tracepoints. But none of this should be a problem at all for MIPS, right? AFAICT the only problem for MIPS is that there *is* a mapping from metadata to nr. If that mapping got removed, MIPS should just work, right? For x86 compat, I think that adding arch should be sufficient. Specifically, rather than having just one enter_syscall_files array, have one per audit arch. Then call syscall_get_arch() as well as syscall_get_nr() and use both to lookup the metadata. AFAIK this should work on all architectures, although you might need some arch helpers to enumerate all the arches and their respective syscall tables (and max syscall nrs). --Andy ^ permalink raw reply [flat|nested] 18+ messages in thread
[parent not found: <CALCETrWx+1Pdob8mU_X1hOWPWqj31ihVL3Z1R0PqjVeExZ_HAA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH 1/2] tracing/syscalls: allow multiple syscall numbers per syscall [not found] ` <CALCETrWx+1Pdob8mU_X1hOWPWqj31ihVL3Z1R0PqjVeExZ_HAA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2016-08-30 23:28 ` Steven Rostedt 2016-08-31 0:01 ` Andy Lutomirski 2016-08-31 7:00 ` Marcin Nowakowski 0 siblings, 2 replies; 18+ messages in thread From: Steven Rostedt @ 2016-08-30 23:28 UTC (permalink / raw) To: Andy Lutomirski Cc: Linux API, Ingo Molnar, open list, Linux MIPS Mailing List, Marcin Nowakowski On Tue, 30 Aug 2016 16:09:04 -0700 Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote: > But none of this should be a problem at all for MIPS, right? AFAICT > the only problem for MIPS is that there *is* a mapping from metadata > to nr. If that mapping got removed, MIPS should just work, right? Wait, where's the mapping of metadata to nr. I don't see that, nor do I see a need for that. The issue is that we have metadata that expresses how to record a syscall, and we map syscall nr to metadata, because when tracing is active, the only thing we have to find that metadata is the syscall nr. Now if a syscall nr has more than one way to record (a single nr for multiple syscalls), then we get into trouble. That's why we have trouble with compat syscalls. The same number maps to different syscalls, and we don't know how to differentiate that. > > For x86 compat, I think that adding arch should be sufficient. > Specifically, rather than having just one enter_syscall_files array, > have one per audit arch. Then call syscall_get_arch() as well as > syscall_get_nr() and use both to lookup the metadata. AFAIK this > should work on all architectures, although you might need some arch > helpers to enumerate all the arches and their respective syscall > tables (and max syscall nrs). OK, if the regs can get us to the arch, then this might work. That is, perhaps we can have multiple tables (not really sure how to make that happen in an arch agnostic way), and then have two functions: trace_get_syscall_nr(current, regs) trace_get_syscall_arch(current, regs) Although, that "arch" may be confusing. -- Steve ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/2] tracing/syscalls: allow multiple syscall numbers per syscall 2016-08-30 23:28 ` Steven Rostedt @ 2016-08-31 0:01 ` Andy Lutomirski 2016-08-31 14:08 ` Marcin Nowakowski 2016-08-31 7:00 ` Marcin Nowakowski 1 sibling, 1 reply; 18+ messages in thread From: Andy Lutomirski @ 2016-08-31 0:01 UTC (permalink / raw) To: Steven Rostedt Cc: Linux API, Ingo Molnar, open list, Linux MIPS Mailing List, Marcin Nowakowski On Tue, Aug 30, 2016 at 4:28 PM, Steven Rostedt <rostedt@goodmis.org> wrote: > On Tue, 30 Aug 2016 16:09:04 -0700 > Andy Lutomirski <luto@amacapital.net> wrote: > >> But none of this should be a problem at all for MIPS, right? AFAICT >> the only problem for MIPS is that there *is* a mapping from metadata >> to nr. If that mapping got removed, MIPS should just work, right? > > Wait, where's the mapping of metadata to nr. I don't see that, nor do I > see a need for that. The issue is that we have metadata that expresses > how to record a syscall, and we map syscall nr to metadata, because > when tracing is active, the only thing we have to find that metadata is > the syscall nr. It's in init_ftrace_syscalls(): meta->syscall_nr = i; and everything that uses that. I think that this is the main problem that the patch that started this thread changes, and I think that deleting it would be cleaner than this patch. > > Now if a syscall nr has more than one way to record (a single nr for > multiple syscalls), then we get into trouble. That's why we have > trouble with compat syscalls. The same number maps to different > syscalls, and we don't know how to differentiate that. > > >> >> For x86 compat, I think that adding arch should be sufficient. >> Specifically, rather than having just one enter_syscall_files array, >> have one per audit arch. Then call syscall_get_arch() as well as >> syscall_get_nr() and use both to lookup the metadata. AFAIK this >> should work on all architectures, although you might need some arch >> helpers to enumerate all the arches and their respective syscall >> tables (and max syscall nrs). > > OK, if the regs can get us to the arch, then this might work. > > That is, perhaps we can have multiple tables (not really sure how to > make that happen in an arch agnostic way), and then have two functions: > > trace_get_syscall_nr(current, regs) > trace_get_syscall_arch(current, regs) Sadly, syscall_get_arch() doesn't take a regs parameter -- it looks at current. If it were made more general, it would need a task pointer, not a regs pointer, but would just looking at current be okay for tracing? syscall_get_arch() does work on all archs that support seccomp filters, though. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/2] tracing/syscalls: allow multiple syscall numbers per syscall 2016-08-31 0:01 ` Andy Lutomirski @ 2016-08-31 14:08 ` Marcin Nowakowski 0 siblings, 0 replies; 18+ messages in thread From: Marcin Nowakowski @ 2016-08-31 14:08 UTC (permalink / raw) To: Andy Lutomirski, Steven Rostedt Cc: Linux API, Ingo Molnar, open list, Linux MIPS Mailing List On 31.08.2016 02:01, Andy Lutomirski wrote: > On Tue, Aug 30, 2016 at 4:28 PM, Steven Rostedt <rostedt@goodmis.org> wrote: >> On Tue, 30 Aug 2016 16:09:04 -0700 >> Andy Lutomirski <luto@amacapital.net> wrote: >> >>> But none of this should be a problem at all for MIPS, right? AFAICT >>> the only problem for MIPS is that there *is* a mapping from metadata >>> to nr. If that mapping got removed, MIPS should just work, right? >> >> Wait, where's the mapping of metadata to nr. I don't see that, nor do I >> see a need for that. The issue is that we have metadata that expresses >> how to record a syscall, and we map syscall nr to metadata, because >> when tracing is active, the only thing we have to find that metadata is >> the syscall nr. > > It's in init_ftrace_syscalls(): > > meta->syscall_nr = i; > > and everything that uses that. I think that this is the main problem > that the patch that started this thread changes, and I think that > deleting it would be cleaner than this patch. > >> >> Now if a syscall nr has more than one way to record (a single nr for >> multiple syscalls), then we get into trouble. That's why we have >> trouble with compat syscalls. The same number maps to different >> syscalls, and we don't know how to differentiate that. > >> >> >>> >>> For x86 compat, I think that adding arch should be sufficient. >>> Specifically, rather than having just one enter_syscall_files array, >>> have one per audit arch. Then call syscall_get_arch() as well as >>> syscall_get_nr() and use both to lookup the metadata. AFAIK this >>> should work on all architectures, although you might need some arch >>> helpers to enumerate all the arches and their respective syscall >>> tables (and max syscall nrs). >> >> OK, if the regs can get us to the arch, then this might work. >> >> That is, perhaps we can have multiple tables (not really sure how to >> make that happen in an arch agnostic way), and then have two functions: >> >> trace_get_syscall_nr(current, regs) >> trace_get_syscall_arch(current, regs) > > Sadly, syscall_get_arch() doesn't take a regs parameter -- it looks at > current. If it were made more general, it would need a task pointer, > not a regs pointer, but would just looking at current be okay for > tracing? I think using current should be enough here? Anyway in all other cases where syscall_get_arch is used, any other methods that require a task pointer take it from current as well. > syscall_get_arch() does work on all archs that support seccomp filters, though. And as it happens I think it works on all archs that have COMPAT and that pose any problems here? So for any arch that doesn't support syscall_get_arch it should be safe to have a fallback that defaults to indicating a single arch type. However, it seems to me that the method used currently for mapping the syscall metadata to the syscall numbers wouldn't be possible when the extra dimension of arch type is added? Would it require some sort of metadata pointer table generation at build time? If all archs used the same method for building syscall tables (as suggested by Arnd) then we could have a single build-time method for adding the syscall metadata map to the syscall tables? (as all the symbol names are known at that time and could be coalesced to find the metadata location from a syscall number/arch and name)... Marcin ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/2] tracing/syscalls: allow multiple syscall numbers per syscall 2016-08-30 23:28 ` Steven Rostedt 2016-08-31 0:01 ` Andy Lutomirski @ 2016-08-31 7:00 ` Marcin Nowakowski 1 sibling, 0 replies; 18+ messages in thread From: Marcin Nowakowski @ 2016-08-31 7:00 UTC (permalink / raw) To: Steven Rostedt, Andy Lutomirski Cc: Linux API, Ingo Molnar, open list, Linux MIPS Mailing List On 31.08.2016 01:28, Steven Rostedt wrote: > On Tue, 30 Aug 2016 16:09:04 -0700 > Andy Lutomirski <luto@amacapital.net> wrote: > >> But none of this should be a problem at all for MIPS, right? AFAICT >> the only problem for MIPS is that there *is* a mapping from metadata >> to nr. If that mapping got removed, MIPS should just work, right? > > Wait, where's the mapping of metadata to nr. I don't see that, nor do I > see a need for that. The issue is that we have metadata that expresses > how to record a syscall, and we map syscall nr to metadata, because > when tracing is active, the only thing we have to find that metadata is > the syscall nr. > > Now if a syscall nr has more than one way to record (a single nr for > multiple syscalls), then we get into trouble. That's why we have > trouble with compat syscalls. The same number maps to different > syscalls, and we don't know how to differentiate that. Generally it looks like the MIPS case is more simple than other arches - as the syscall numbers for different ABIs are not overlapping, so when trying to get syscall metadata the number is sufficient. It's just a case that the current code doesn't handle correctly (and just to makes things worse, while the compat syscalls are not supported at the moment for any arch, because of how the syscalls are numbered, tracing for native 64-bit code is currently broken). But I agree with all the earlier comments that the current solution isn't very nice - and even though my patch would solve the issue for MIPS, all the existing issues would still remain elsewhere. >> >> For x86 compat, I think that adding arch should be sufficient. >> Specifically, rather than having just one enter_syscall_files array, >> have one per audit arch. Then call syscall_get_arch() as well as >> syscall_get_nr() and use both to lookup the metadata. AFAIK this >> should work on all architectures, although you might need some arch >> helpers to enumerate all the arches and their respective syscall >> tables (and max syscall nrs). > > OK, if the regs can get us to the arch, then this might work. > > That is, perhaps we can have multiple tables (not really sure how to > make that happen in an arch agnostic way), and then have two functions: > > trace_get_syscall_nr(current, regs) > trace_get_syscall_arch(current, regs) > > Although, that "arch" may be confusing. > > -- Steve > ^ permalink raw reply [flat|nested] 18+ messages in thread
[parent not found: <CALCETrXA0XbYuqYzWMJA+KjFS31YL0cTVtrvCnmRY_GMK6oNpw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH 1/2] tracing/syscalls: allow multiple syscall numbers per syscall [not found] ` <CALCETrXA0XbYuqYzWMJA+KjFS31YL0cTVtrvCnmRY_GMK6oNpw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2016-08-31 8:24 ` Arnd Bergmann 2016-09-01 15:24 ` Steven Rostedt 0 siblings, 1 reply; 18+ messages in thread From: Arnd Bergmann @ 2016-08-31 8:24 UTC (permalink / raw) To: Andy Lutomirski Cc: Steven Rostedt, Marcin Nowakowski, Linux API, Ingo Molnar, open list, Linux MIPS Mailing List On Tuesday, August 30, 2016 12:53:53 PM CEST Andy Lutomirski wrote: > Egads! OK, I see why this is a mess. > > I guess we should be creating the metadata from the syscall tables > instead of from the syscall definitions, but I guess that's currently > a nasty per-arch mess. > I've been thinking for a while about how to improve the situation around adding new syscalls, which currently involves adding a number and an entry in a .S file on most architectures (some already have their own method to simplify it, and others using a shared table in asm-generic). I was thinking of extending the x86 way of doing this to all architectures, and adding a way to have all future syscalls require only one addition in a single file that gets included by the architecture specific files for the existing syscalls. Assuming we do this, would that work for generating the metadata from the same file like we do with arch/x86/entry/syscalls/syscall{tbl,hdr}.sh ? Arnd ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/2] tracing/syscalls: allow multiple syscall numbers per syscall 2016-08-31 8:24 ` Arnd Bergmann @ 2016-09-01 15:24 ` Steven Rostedt 0 siblings, 0 replies; 18+ messages in thread From: Steven Rostedt @ 2016-09-01 15:24 UTC (permalink / raw) To: Arnd Bergmann Cc: Andy Lutomirski, Marcin Nowakowski, Linux API, Ingo Molnar, open list, Linux MIPS Mailing List On Wed, 31 Aug 2016 10:24:56 +0200 Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org> wrote: > On Tuesday, August 30, 2016 12:53:53 PM CEST Andy Lutomirski wrote: > > Egads! OK, I see why this is a mess. > > > > I guess we should be creating the metadata from the syscall tables > > instead of from the syscall definitions, but I guess that's currently > > a nasty per-arch mess. > > > > I've been thinking for a while about how to improve the situation > around adding new syscalls, which currently involves adding a number > and an entry in a .S file on most architectures (some already have > their own method to simplify it, and others using a shared table > in asm-generic). > > I was thinking of extending the x86 way of doing this to all > architectures, and adding a way to have all future syscalls require > only one addition in a single file that gets included by the > architecture specific files for the existing syscalls. > > Assuming we do this, would that work for generating the metadata > from the same file like we do with > arch/x86/entry/syscalls/syscall{tbl,hdr}.sh ? I can't answer this because I'm not sure exactly how you would do this. Perhaps you could give it a try and code will be the answer to all my questions ;-) -- Steve ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2016-09-01 15:24 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-08-29 9:30 [PATCH 1/2] tracing/syscalls: allow multiple syscall numbers per syscall Marcin Nowakowski [not found] ` <1472463007-6469-1-git-send-email-marcin.nowakowski-1AXoQHu6uovQT0dZR+AlfA@public.gmane.org> 2016-08-29 23:55 ` Andy Lutomirski [not found] ` <CALCETrW1m9ozck-ugX6AKnL7oNA8rvMTjhFGqtVSvKL9BMXMZA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2016-08-30 8:14 ` Marcin Nowakowski 2016-08-30 18:52 ` Andy Lutomirski 2016-08-30 19:29 ` Steven Rostedt [not found] ` <20160830152955.17633511-f9ZlEuEWxVcJvu8Pb33WZ0EMvNT87kid@public.gmane.org> 2016-08-30 19:53 ` Andy Lutomirski 2016-08-30 20:58 ` Steven Rostedt [not found] ` <20160830165830.5e494c43-f9ZlEuEWxVcJvu8Pb33WZ0EMvNT87kid@public.gmane.org> 2016-08-30 21:45 ` Andy Lutomirski 2016-08-30 22:03 ` Steven Rostedt [not found] ` <20160830180328.4e579db3-f9ZlEuEWxVcJvu8Pb33WZ0EMvNT87kid@public.gmane.org> 2016-08-30 22:08 ` Andy Lutomirski [not found] ` <CALCETrWjpcKqFHvxS35Csd3An1QNMXW8yiHChuWfuWTvVu8_ig-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2016-08-30 22:30 ` Steven Rostedt [not found] ` <20160830183030.3e9f67f0-f9ZlEuEWxVcJvu8Pb33WZ0EMvNT87kid@public.gmane.org> 2016-08-30 23:09 ` Andy Lutomirski [not found] ` <CALCETrWx+1Pdob8mU_X1hOWPWqj31ihVL3Z1R0PqjVeExZ_HAA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2016-08-30 23:28 ` Steven Rostedt 2016-08-31 0:01 ` Andy Lutomirski 2016-08-31 14:08 ` Marcin Nowakowski 2016-08-31 7:00 ` Marcin Nowakowski [not found] ` <CALCETrXA0XbYuqYzWMJA+KjFS31YL0cTVtrvCnmRY_GMK6oNpw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2016-08-31 8:24 ` Arnd Bergmann 2016-09-01 15:24 ` Steven Rostedt
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).