From: Andrei Vagin <avagin@gmail.com>
To: Jann Horn <jannh@google.com>
Cc: Dmitry Safonov <dima@arista.com>,
kernel list <linux-kernel@vger.kernel.org>,
Adrian Reber <adrian@lisas.de>, Andrei Vagin <avagin@openvz.org>,
Andy Lutomirski <luto@kernel.org>, Arnd Bergmann <arnd@arndb.de>,
Christian Brauner <christian.brauner@ubuntu.com>,
Cyrill Gorcunov <gorcunov@openvz.org>,
Dmitry Safonov <0x7f454c46@gmail.com>,
"Eric W. Biederman" <ebiederm@xmission.com>,
"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
Jeff Dike <jdike@addtoit.com>, Oleg Nesterov <oleg@redhat.com>,
Pavel Emelyanov <xemul@virtuozzo.com>,
Shuah Khan <shuah@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Vincenzo Frascino <vincenzo.frascino@arm.com>,
containers@lists.linux-foundation.org, criu@openvz.org,
Linux API <linux-api@vger.kernel.org>
Subject: Re: [PATCHv3 19/27] timens/fs/proc: Introduce /proc/pid/timens_offsets
Date: Wed, 1 May 2019 23:08:43 -0700 [thread overview]
Message-ID: <20190502060842.GA4608@gmail.com> (raw)
In-Reply-To: <CAG48ez1ws9qXyNHfScY1RoajG=pquFe4y9QpM1c_xtPSeC2SNA@mail.gmail.com>
Hi Jann,
Thank you for the review. Here are a few comments inline.
On Thu, Apr 25, 2019 at 08:16:41PM +0200, Jann Horn wrote:
> On Thu, Apr 25, 2019 at 6:15 PM Dmitry Safonov <dima@arista.com> wrote:
> > API to set time namespace offsets for children processes, i.e.:
> > echo "clockid off_ses off_nsec" > /proc/self/timens_offsets
> [...]
> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > index 6a803a0b75df..76d58e9b5178 100644
> > --- a/fs/proc/base.c
> > +++ b/fs/proc/base.c
> [...]
> > @@ -1521,6 +1522,103 @@ static const struct file_operations proc_pid_sched_autogroup_operations = {
> >
> > #endif /* CONFIG_SCHED_AUTOGROUP */
> >
> > +#ifdef CONFIG_TIME_NS
> > +static int timens_offsets_show(struct seq_file *m, void *v)
> > +{
> > + struct inode *inode = m->private;
> > + struct task_struct *p;
> > +
> > + p = get_proc_task(inode);
>
> (FYI, this could also be "p = get_proc_task(file_inode(m->file));".
> But this works, too.)
>
> > + if (!p)
> > + return -ESRCH;
> > + proc_timens_show_offsets(p, m);
> > +
> > + put_task_struct(p);
> > +
> > + return 0;
> > +}
> > +
> > +static ssize_t
> > +timens_offsets_write(struct file *file, const char __user *buf,
> > + size_t count, loff_t *ppos)
> > +{
> > + struct inode *inode = file_inode(file);
> > + struct proc_timens_offset offsets[2];
> > + char *kbuf = NULL, *pos, *next_line;
> > + struct task_struct *p;
> > + int ret, noffsets;
> > +
> > + /* Only allow < page size writes at the beginning of the file */
> > + if ((*ppos != 0) || (count >= PAGE_SIZE))
> > + return -EINVAL;
> > +
> > + /* Slurp in the user data */
> > + kbuf = memdup_user_nul(buf, count);
> > + if (IS_ERR(kbuf))
> > + return PTR_ERR(kbuf);
> > +
> > + /* Parse the user data */
> > + ret = -EINVAL;
> > + noffsets = 0;
> > + pos = kbuf;
> > + for (; pos; pos = next_line) {
> > + struct proc_timens_offset *off = &offsets[noffsets];
> > + int err;
> > +
> > + /* Find the end of line and ensure I don't look past it */
> > + next_line = strchr(pos, '\n');
> > + if (next_line) {
> > + *next_line = '\0';
> > + next_line++;
> > + if (*next_line == '\0')
> > + next_line = NULL;
> > + }
> > +
> > + err = sscanf(pos, "%u %lld %lu", &off->clockid,
> > + &off->val.tv_sec, &off->val.tv_nsec);
> > + if (err != 3 || off->val.tv_nsec >= NSEC_PER_SEC)
> > + goto out;
> > + if (noffsets++ == ARRAY_SIZE(offsets))
> > + break;
>
> This is equivalent to:
>
> if (noffsets == ARRAY_SIZE(offsets)) {
noffsets++;
> break;
}
> noffsets++;
>
> So we can reach the start of the loop with
> noffsets==ARRAY_SIZE(offsets), right? Which means that an
> out-of-bounds write can happen?
good catch. I will add a test for this case.
>
> I think that for code like this, it makes sense to write the increment
> and the check out separately so that it's easier to spot problems;
> e.g. like this:
>
> noffsets++;
> if (noffsets == ARRAY_SIZE(offsets))
> break;
will rewrite this way. Thanks!
>
> > + }
> > +
> > + ret = -ESRCH;
> > + p = get_proc_task(inode);
> > + if (!p)
> > + goto out;
> > + ret = proc_timens_set_offset(p, offsets, noffsets);
> > + put_task_struct(p);
> > + if (ret)
> > + goto out;
> > +
> > + ret = count;
> > +out:
> > + kfree(kbuf);
> > + return ret;
> > +}
> > +
> > +static int timens_offsets_open(struct inode *inode, struct file *filp)
> > +{
> > + int ret;
> > +
> > + ret = single_open(filp, timens_offsets_show, NULL);
> > + if (!ret) {
> > + struct seq_file *m = filp->private_data;
> > +
> > + m->private = inode;
> > + }
> > + return ret;
> > +}
>
> Why did you write it like this? Wouldn't the following be equivalent?
Probably I looked at sched_autogroup_open
>
> static int timens_offsets_open(struct inode *inode, struct file *file)
> {
> return single_open(file, timens_offsets_show, inode);
> }
>
> (But also, you can reach the inode of a seq_file as file_inode(m->file).)
>
> [...]
> > diff --git a/kernel/time_namespace.c b/kernel/time_namespace.c
> > index e806accc4eaf..9ad4b63c4ed2 100644
> > --- a/kernel/time_namespace.c
> > +++ b/kernel/time_namespace.c
> [...]
> > +
> > +int proc_timens_set_offset(struct task_struct *p,
> > + struct proc_timens_offset *offsets, int noffsets)
> > +{
> > + struct ns_common *ns;
> > + struct time_namespace *time_ns;
> > + struct timens_offsets *ns_offsets;
> > + struct timespec64 *offset;
> > + struct timespec64 tp;
> > + int i, err;
> > +
> > + ns = timens_for_children_get(p);
> > + if (!ns)
> > + return -ESRCH;
> > + time_ns = to_time_ns(ns);
> > +
> > + if (!time_ns->offsets || time_ns->initialized ||
> > + !ns_capable(time_ns->user_ns, CAP_SYS_TIME)) {
>
> Capability checks in VFS read/write handlers are bad. Please pass
> through the file pointer to this function and replace the call with
> "file_ns_capable(file, time_ns->user_ns, CAP_SYS_TIME)".
Will fix. Thanks!
>
> > + put_time_ns(time_ns);
> > + return -EPERM;
> > + }
> [...]
> > +}
WARNING: multiple messages have this Message-ID (diff)
From: Andrei Vagin <avagin@gmail.com>
To: Jann Horn <jannh@google.com>
Cc: Dmitry Safonov <dima@arista.com>,
kernel list <linux-kernel@vger.kernel.org>,
Adrian Reber <adrian@lisas.de>, Andrei Vagin <avagin@openvz.org>,
Andy Lutomirski <luto@kernel.org>, Arnd Bergmann <arnd@arndb.de>,
Christian Brauner <christian.brauner@ubuntu.com>,
Cyrill Gorcunov <gorcunov@openvz.org>,
Dmitry Safonov <0x7f454c46@gmail.com>,
"Eric W. Biederman" <ebiederm@xmission.com>,
"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
Jeff Dike <jdike@addtoit.com>, Oleg Nesterov <oleg@redhat.com>,
Pavel Emelyanov <xemul@virtuozzo.com>,
Shuah Khan <shuah@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Vincenzo Frascino <vincenzo.frascino@arm.com>,
containers@lists.linux-foundation.org, criu@openvz.org,
Linux API <linux-api@vger.kernel.org>,
the arch/x86 maintainers <x86@kernel.org>
Subject: Re: [PATCHv3 19/27] timens/fs/proc: Introduce /proc/pid/timens_offsets
Date: Wed, 1 May 2019 23:08:43 -0700 [thread overview]
Message-ID: <20190502060842.GA4608@gmail.com> (raw)
In-Reply-To: <CAG48ez1ws9qXyNHfScY1RoajG=pquFe4y9QpM1c_xtPSeC2SNA@mail.gmail.com>
Hi Jann,
Thank you for the review. Here are a few comments inline.
On Thu, Apr 25, 2019 at 08:16:41PM +0200, Jann Horn wrote:
> On Thu, Apr 25, 2019 at 6:15 PM Dmitry Safonov <dima@arista.com> wrote:
> > API to set time namespace offsets for children processes, i.e.:
> > echo "clockid off_ses off_nsec" > /proc/self/timens_offsets
> [...]
> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > index 6a803a0b75df..76d58e9b5178 100644
> > --- a/fs/proc/base.c
> > +++ b/fs/proc/base.c
> [...]
> > @@ -1521,6 +1522,103 @@ static const struct file_operations proc_pid_sched_autogroup_operations = {
> >
> > #endif /* CONFIG_SCHED_AUTOGROUP */
> >
> > +#ifdef CONFIG_TIME_NS
> > +static int timens_offsets_show(struct seq_file *m, void *v)
> > +{
> > + struct inode *inode = m->private;
> > + struct task_struct *p;
> > +
> > + p = get_proc_task(inode);
>
> (FYI, this could also be "p = get_proc_task(file_inode(m->file));".
> But this works, too.)
>
> > + if (!p)
> > + return -ESRCH;
> > + proc_timens_show_offsets(p, m);
> > +
> > + put_task_struct(p);
> > +
> > + return 0;
> > +}
> > +
> > +static ssize_t
> > +timens_offsets_write(struct file *file, const char __user *buf,
> > + size_t count, loff_t *ppos)
> > +{
> > + struct inode *inode = file_inode(file);
> > + struct proc_timens_offset offsets[2];
> > + char *kbuf = NULL, *pos, *next_line;
> > + struct task_struct *p;
> > + int ret, noffsets;
> > +
> > + /* Only allow < page size writes at the beginning of the file */
> > + if ((*ppos != 0) || (count >= PAGE_SIZE))
> > + return -EINVAL;
> > +
> > + /* Slurp in the user data */
> > + kbuf = memdup_user_nul(buf, count);
> > + if (IS_ERR(kbuf))
> > + return PTR_ERR(kbuf);
> > +
> > + /* Parse the user data */
> > + ret = -EINVAL;
> > + noffsets = 0;
> > + pos = kbuf;
> > + for (; pos; pos = next_line) {
> > + struct proc_timens_offset *off = &offsets[noffsets];
> > + int err;
> > +
> > + /* Find the end of line and ensure I don't look past it */
> > + next_line = strchr(pos, '\n');
> > + if (next_line) {
> > + *next_line = '\0';
> > + next_line++;
> > + if (*next_line == '\0')
> > + next_line = NULL;
> > + }
> > +
> > + err = sscanf(pos, "%u %lld %lu", &off->clockid,
> > + &off->val.tv_sec, &off->val.tv_nsec);
> > + if (err != 3 || off->val.tv_nsec >= NSEC_PER_SEC)
> > + goto out;
> > + if (noffsets++ == ARRAY_SIZE(offsets))
> > + break;
>
> This is equivalent to:
>
> if (noffsets == ARRAY_SIZE(offsets)) {
noffsets++;
> break;
}
> noffsets++;
>
> So we can reach the start of the loop with
> noffsets==ARRAY_SIZE(offsets), right? Which means that an
> out-of-bounds write can happen?
good catch. I will add a test for this case.
>
> I think that for code like this, it makes sense to write the increment
> and the check out separately so that it's easier to spot problems;
> e.g. like this:
>
> noffsets++;
> if (noffsets == ARRAY_SIZE(offsets))
> break;
will rewrite this way. Thanks!
>
> > + }
> > +
> > + ret = -ESRCH;
> > + p = get_proc_task(inode);
> > + if (!p)
> > + goto out;
> > + ret = proc_timens_set_offset(p, offsets, noffsets);
> > + put_task_struct(p);
> > + if (ret)
> > + goto out;
> > +
> > + ret = count;
> > +out:
> > + kfree(kbuf);
> > + return ret;
> > +}
> > +
> > +static int timens_offsets_open(struct inode *inode, struct file *filp)
> > +{
> > + int ret;
> > +
> > + ret = single_open(filp, timens_offsets_show, NULL);
> > + if (!ret) {
> > + struct seq_file *m = filp->private_data;
> > +
> > + m->private = inode;
> > + }
> > + return ret;
> > +}
>
> Why did you write it like this? Wouldn't the following be equivalent?
Probably I looked at sched_autogroup_open
>
> static int timens_offsets_open(struct inode *inode, struct file *file)
> {
> return single_open(file, timens_offsets_show, inode);
> }
>
> (But also, you can reach the inode of a seq_file as file_inode(m->file).)
>
> [...]
> > diff --git a/kernel/time_namespace.c b/kernel/time_namespace.c
> > index e806accc4eaf..9ad4b63c4ed2 100644
> > --- a/kernel/time_namespace.c
> > +++ b/kernel/time_namespace.c
> [...]
> > +
> > +int proc_timens_set_offset(struct task_struct *p,
> > + struct proc_timens_offset *offsets, int noffsets)
> > +{
> > + struct ns_common *ns;
> > + struct time_namespace *time_ns;
> > + struct timens_offsets *ns_offsets;
> > + struct timespec64 *offset;
> > + struct timespec64 tp;
> > + int i, err;
> > +
> > + ns = timens_for_children_get(p);
> > + if (!ns)
> > + return -ESRCH;
> > + time_ns = to_time_ns(ns);
> > +
> > + if (!time_ns->offsets || time_ns->initialized ||
> > + !ns_capable(time_ns->user_ns, CAP_SYS_TIME)) {
>
> Capability checks in VFS read/write handlers are bad. Please pass
> through the file pointer to this function and replace the call with
> "file_ns_capable(file, time_ns->user_ns, CAP_SYS_TIME)".
Will fix. Thanks!
>
> > + put_time_ns(time_ns);
> > + return -EPERM;
> > + }
> [...]
> > +}
next prev parent reply other threads:[~2019-05-02 6:08 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-04-25 16:13 [PATCHv3 00/27] kernel: Introduce Time Namespace Dmitry Safonov
2019-04-25 16:13 ` [PATCHv3 01/27] ns: " Dmitry Safonov
2019-04-25 19:10 ` Jann Horn
2019-04-25 19:10 ` Jann Horn
2019-04-25 16:13 ` [PATCHv3 02/27] timens: Add timens_offsets Dmitry Safonov
2019-04-25 16:13 ` [PATCHv3 03/27] timens: Introduce CLOCK_MONOTONIC offsets Dmitry Safonov
2019-04-25 19:52 ` Thomas Gleixner
2019-04-25 16:13 ` [PATCHv3 04/27] timens: Introduce CLOCK_BOOTTIME offset Dmitry Safonov
2019-04-25 20:08 ` Cyrill Gorcunov
2019-04-25 16:13 ` [PATCHv3 05/27] timerfd/timens: Take into account ns clock offsets Dmitry Safonov
2019-04-25 21:28 ` Thomas Gleixner
2019-05-03 7:00 ` Andrei Vagin
2019-04-25 16:13 ` [PATCHv3 06/27] posix-timers/timens: Take into account " Dmitry Safonov
2019-04-25 21:45 ` Thomas Gleixner
2019-04-25 16:13 ` [PATCHv3 07/27] timens/kernel: Take into account timens clock offsets in clock_nanosleep Dmitry Safonov
2019-04-25 16:13 ` [PATCHv3 08/27] timens: Shift /proc/uptime Dmitry Safonov
2019-04-25 16:13 ` [PATCHv3 09/27] x86/vdso2c: Correct err messages on file opening Dmitry Safonov
2019-04-25 16:13 ` [PATCHv3 10/27] x86/vdso2c: Convert iterator to unsigned Dmitry Safonov
2019-04-25 16:14 ` [PATCHv3 11/27] x86/vdso/Makefile: Add vobjs32 Dmitry Safonov
2019-04-25 16:14 ` [PATCHv3 12/27] x86/vdso: Restrict splitting VVAR VMA Dmitry Safonov
2019-04-25 18:41 ` Jann Horn
2019-04-25 18:41 ` Jann Horn
2019-04-25 18:46 ` Dmitry Safonov
2019-04-25 16:14 ` [PATCHv3 13/27] x86/vdso: Rename vdso_image {.data=>.text} Dmitry Safonov
2019-04-25 16:14 ` [PATCHv3 14/27] x86/vdso: Add offsets page in vvar Dmitry Safonov
2019-04-25 16:14 ` [PATCHv3 15/27] x86/vdso: Allocate timens vdso Dmitry Safonov
2019-04-25 18:32 ` Jann Horn
2019-04-25 18:32 ` Jann Horn
2019-04-25 19:05 ` Dmitry Safonov
2019-04-25 16:14 ` [PATCHv3 16/27] x86/vdso: Switch image on setns()/unshare()/clone() Dmitry Safonov
2019-04-25 17:53 ` Jann Horn
2019-04-25 17:53 ` Jann Horn
2019-04-25 18:02 ` Dmitry Safonov
2019-04-25 18:02 ` Dmitry Safonov
2019-04-25 16:14 ` [PATCHv3 17/27] vdso: introduce timens_static_branch Dmitry Safonov
2019-04-25 16:14 ` [PATCHv3 18/27] timens: Add align for timens_offsets Dmitry Safonov
2019-04-25 16:14 ` [PATCHv3 19/27] timens/fs/proc: Introduce /proc/pid/timens_offsets Dmitry Safonov
2019-04-25 18:16 ` Jann Horn
2019-04-25 18:16 ` Jann Horn
2019-05-02 6:08 ` Andrei Vagin [this message]
2019-05-02 6:08 ` Andrei Vagin
2019-04-25 16:14 ` [PATCHv3 20/27] selftest/timens: Add Time Namespace test for supported clocks Dmitry Safonov
2019-04-25 16:14 ` [PATCHv3 21/27] selftest/timens: Add a test for timerfd Dmitry Safonov
2019-04-25 16:14 ` [PATCHv3 22/27] selftest/timens: Add a test for clock_nanosleep() Dmitry Safonov
2019-04-25 16:14 ` [PATCHv3 23/27] selftest/timens: Add procfs selftest Dmitry Safonov
2019-04-25 16:14 ` [PATCHv3 24/27] selftest/timens: Add timer offsets test Dmitry Safonov
2019-04-25 16:14 ` [PATCHv3 25/27] x86/vdso: Align VDSO functions by CPU L1 cache line Dmitry Safonov
2019-04-25 16:14 ` [PATCHv3 26/27] selftests: Add a simple perf test for clock_gettime() Dmitry Safonov
2019-04-25 16:14 ` [PATCHv3 27/27] selftest/timens: Check that a right vdso is mapped after fork and exec Dmitry Safonov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190502060842.GA4608@gmail.com \
--to=avagin@gmail.com \
--cc=0x7f454c46@gmail.com \
--cc=adrian@lisas.de \
--cc=arnd@arndb.de \
--cc=avagin@openvz.org \
--cc=christian.brauner@ubuntu.com \
--cc=containers@lists.linux-foundation.org \
--cc=criu@openvz.org \
--cc=dima@arista.com \
--cc=ebiederm@xmission.com \
--cc=gorcunov@openvz.org \
--cc=hpa@zytor.com \
--cc=jannh@google.com \
--cc=jdike@addtoit.com \
--cc=linux-api@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@kernel.org \
--cc=mingo@redhat.com \
--cc=oleg@redhat.com \
--cc=shuah@kernel.org \
--cc=tglx@linutronix.de \
--cc=vincenzo.frascino@arm.com \
--cc=xemul@virtuozzo.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.