All of lore.kernel.org
 help / color / mirror / Atom feed
From: Gregory Price <gregory.price@memverge.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Gregory Price <gourry.memverge@gmail.com>,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	oleg@redhat.com, avagin@gmail.com, peterz@infradead.org,
	luto@kernel.org, krisman@collabora.com, corbet@lwn.net,
	shuah@kernel.org
Subject: Re: [PATCH v9 1/1] ptrace,syscall_user_dispatch: checkpoint/restore support for SUD
Date: Mon, 13 Feb 2023 16:00:17 -0500	[thread overview]
Message-ID: <Y+qk4fkIph40KyDh@memverge.com> (raw)
In-Reply-To: <871qmttiqa.ffs@tglx>

On Mon, Feb 13, 2023 at 09:26:21PM +0100, Thomas Gleixner wrote:
> On Fri, Feb 10 2023 at 02:25, Gregory Price wrote:
> >  
> >  As the ABI of these intercepted syscalls is unknown to Linux, these
> > -syscalls are not instrumentable via ptrace or the syscall tracepoints.
> > +syscalls are not instrumentable via ptrace or the syscall tracepoints,
> > +however an interfaces to suspend, checkpoint, and restore syscall user
> > +dispatch configuration has been added to ptrace to assist userland
> > +checkpoint/restart software.
> 
> The above is incomprehensible word salad to me. Once the ptrace
> functions are there then they can be used independent of CRIU, no?
> 

The verbiage here is half-baked, I'll just break out a separate
paragraph to explain better (or drop entirely, if that's preferable).

Since SUD isn't really designed for anything other than syscall
emulation, there's not much of a use for these get/set interfaces
outside the context of checkpoint/restart.  GDB and friends are already
perfectly happy and capable of debugging SUD enabled software in the
absense of these interfaces and have no need to disable the feature.

> > + * struct ptrace_sud_config - Per-task configuration for SUD
> > + * @mode:	One of PR_SYS_DISPATCH_ON or PR_SYS_DISPATCH_OFF
> > + * @selector:	Tracee's user virtual address of SUD selector
> > + * @offset:	SUD exclusion area (virtual address)
> > + * @len:	Length of SUD exclusion area
> > + *
> > + * Used to get/set the syscall user dispatch configuration for tracee.
> > + * process.  Selector is optional (may be NULL), and if invalid will produce
> > + * a SIGSEGV in the tracee upon first access.
> > + *
> > + * If mode is PR_SYS_DISPATCH_ON, syscall dispatch will be enabled. If
> > + * PR_SYS_DISPATCH_OFF, syscall dispatch will be disabled and all other
> > + * parameters must be 0.  The value in *selector (if not null), also determines
> > + * whether syscall dispatch will occur.
> > + *
> > + * The SUD Exclusion area described by offset/len is the virtual address space
> > + * from which syscalls will not produce a user dispatch.
> > + */
> > +struct ptrace_sud_config {
> > +	__u64 mode;
> > +	__s8 *selector;
> 
> How is this correct for a 32bit ptracer running on a 64bit kernel? Aside
> of not wiring up the compat syscall without any argumentation in the
> changelog.
> 

you're right, these would need to be unsigned long/pointers, i will
take a closer look at how ptrace manages this elsewhere and come back.

> 
> > --- a/kernel/entry/syscall_user_dispatch.c
> > +++ b/kernel/entry/syscall_user_dispatch.c
> 
> This section:
> 
> > -int set_syscall_user_dispatch(unsigned long mode, unsigned long offset,
> > -			      unsigned long len, char __user *selector)
> > +static int task_set_syscall_user_dispatch(struct task_struct *task, unsigned long mode,
> > +		                          unsigned long offset, unsigned long len,
> > +		                          char __user *selector)
> >  {
> >  	switch (mode) {
> >  	case PR_SYS_DISPATCH_OFF:
> > @@ -94,15 +96,60 @@ int set_syscall_user_dispatch(unsigned long mode, unsigned long offset,
> >  		return -EINVAL;
> >  	}
> >  
> > -	current->syscall_dispatch.selector = selector;
> > -	current->syscall_dispatch.offset = offset;
> > -	current->syscall_dispatch.len = len;
> > -	current->syscall_dispatch.on_dispatch = false;
> > +	task->syscall_dispatch.selector = selector;
> > +	task->syscall_dispatch.offset = offset;
> > +	task->syscall_dispatch.len = len;
> > +	task->syscall_dispatch.on_dispatch = false;
> >  
> >  	if (mode == PR_SYS_DISPATCH_ON)
> > -		set_syscall_work(SYSCALL_USER_DISPATCH);
> > +		set_task_syscall_work(task, SYSCALL_USER_DISPATCH);
> >  	else
> > -		clear_syscall_work(SYSCALL_USER_DISPATCH);
> > +		clear_task_syscall_work(task, SYSCALL_USER_DISPATCH);
> >  
> >  	return 0;
> >  }
> > +
> > +int set_syscall_user_dispatch(unsigned long mode, unsigned long offset,
> > +		              unsigned long len, char __user *selector)
> > +{
> > +	return task_set_syscall_user_dispatch(current, mode, offset, len, selector);
> > +}
> 
> until here want's to be a seperate preparatory patch.
> 

I had considered this, but didn't know what was preferable, given that
there's not much reason to create the functions outside the context of
this patch.

Will do.

> > +++ b/tools/testing/selftests/ptrace/get_set_sud.c
> > +	child = fork();
> > +	ASSERT_GE(child, 0);
> > +	if (child == 0) {
> > +		ASSERT_EQ(0, sys_ptrace(PTRACE_TRACEME, 0, 0, 0)) {
> > +			TH_LOG("PTRACE_TRACEME: %m");
> > +		}
> > +		kill(getpid(), SIGSTOP);
> > +		sleep(2);
> 
> The purpose of this sleep is what?
> 

artifact of taking other tests as an outline, will drop it and rerun.

> > +		_exit(1);
> > +	}
> > +
> > +	waitpid(child, &status, 0);
> > +
> > +	config.mode = PR_SYS_DISPATCH_ON;
> > +	config.selector = (void*)0xDEADBEEF;
> > +	config.offset = 0x12345678;
> > +	config.len = 0x87654321;
> 
> What's the purpose of these magic numbers? memset(&config, 0xff,...) is
> sufficient, no?
> 
> Thanks,
> 
>         tglx

Nothing, will drop.

~Gregory

  reply	other threads:[~2023-02-13 21:00 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-10  7:25 [PATCH v9 0/1] Checkpoint Support for Syscall User Dispatch Gregory Price
2023-02-10  7:25 ` [PATCH v9 1/1] ptrace,syscall_user_dispatch: checkpoint/restore support for SUD Gregory Price
2023-02-13 20:26   ` Thomas Gleixner
2023-02-13 21:00     ` Gregory Price [this message]
2023-02-14  0:00     ` Gregory Price
2023-02-16 13:33       ` Oleg Nesterov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y+qk4fkIph40KyDh@memverge.com \
    --to=gregory.price@memverge.com \
    --cc=avagin@gmail.com \
    --cc=corbet@lwn.net \
    --cc=gourry.memverge@gmail.com \
    --cc=krisman@collabora.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    --cc=shuah@kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.