linux-kernel.vger.kernel.org archive mirror
From: Cyrill Gorcunov <gorcunov@gmail.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>,
	Alexey Dobriyan <adobriyan@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Pavel Emelyanov <xemul@parallels.com>,
	Andrey Vagin <avagin@openvz.org>, Ingo Molnar <mingo@elte.hu>,
	Thomas Gleixner <tglx@linutronix.de>,
	Glauber Costa <glommer@parallels.com>,
	Andi Kleen <andi@firstfloor.org>, Tejun Heo <tj@kernel.org>,
	Matt Helsley <matthltc@us.ibm.com>,
	Pekka Enberg <penberg@kernel.org>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Vasiliy Kulikov <segoon@openwall.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Valdis.Kletnieks@vt.edu
Subject: Re: [RFC] syscalls, x86: Add __NR_kcmp syscall
Date: Wed, 18 Jan 2012 12:01:03 +0400
Message-ID: <20120118080103.GA2889@moon>
In-Reply-To: <m1obu29fnf.fsf@fess.ebiederm.org>

On Tue, Jan 17, 2012 at 01:35:00PM -0800, Eric W. Biederman wrote:
> "H. Peter Anvin" <hpa@zytor.com> writes:
> 
> > On 01/17/2012 06:44 AM, Cyrill Gorcunov wrote:
> >> On Tue, Jan 17, 2012 at 04:38:14PM +0200, Alexey Dobriyan wrote:
> >>> On 1/17/12, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> >>>> +#define KCMP_EQ		0
> >>>> +#define KCMP_LT		1
> >>>> +#define KCMP_GT		2
> >>>
> >>> LT and GT are meaningless.
> >>>
> >> 
> >> I found symbolic names better than open-coded values. But sure,
> >> if this is problem it could be dropped.
> >> 
> >> Or you mean that in general anything but 'equal' is useless?
> >> 
> >
> > Why on Earth would user space need to know which order in memory certain
> > kernel objects are?
> 
> For checkpoint restart and for some other kinds of introspection what is
> needed is a comparison function to see if two processes share the same
> object.  The most interesting of these objects from a checkpoint restart case
> are file descriptors, and there can be a lot of file descriptors.
> 
> The order in memory does not matter.  What does matter is that the
> comparison function return some ordering between objects.  The algorithm
> for figuring out of N items which of them are duplicates is O(N^2) if
> the comparison function can only return equal or not equal.  The
> algorithm for finding duplications is only O(NlogN) if the comparison
> function will return an ordering among the objects.
> 
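
To make the complexity argument above concrete, here is a rough sketch of
duplicate detection with a three-way comparison. Everything in it is
hypothetical: struct obj_ref and its `id` field stand in for whatever opaque
ordering the kernel would expose, and in a real tool each comparison would be
a syscall rather than a field compare.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kernel object reference; `id` plays the role
 * of an opaque key that is equal iff two entries name the same object. */
struct obj_ref {
	int fd;           /* descriptor as seen by the task */
	unsigned long id; /* assumed opaque ordering key */
};

/* Three-way comparison: <0, 0, >0, the shape qsort() expects. */
static int obj_cmp(const void *a, const void *b)
{
	const struct obj_ref *x = a, *y = b;

	return (x->id > y->id) - (x->id < y->id);
}

/* Sort first, then count adjacent equal entries. With an eq/ne-only
 * comparison the sort step is not available and every pair must be
 * checked instead. */
static size_t count_duplicates(struct obj_ref *v, size_t n)
{
	size_t i, dups = 0;

	qsort(v, n, sizeof(*v), obj_cmp);
	for (i = 1; i < n; i++)
		if (obj_cmp(&v[i - 1], &v[i]) == 0)
			dups++;
	return dups;
}
```

The sort typically costs on the order of N log N comparisons (the C standard
does not promise a bound for qsort()), and the final pass is linear.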

Yes, thanks Eric, I missed this text in the patch description, my bad. And
yes, performance will degrade with a plain eq/ne approach. But as Pavel
stated in another email

 | We can compare the e.g. files' target inodes (ino + dev) and positions and
 | comparing each-to-each only for those having these pairs equal. Looking at
 | the existing large containers with tens thousands of fd-s we have this
 | gives us maximum 6 files to compare, and performing 15 syscalls for this suits
 | us for now.
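
A rough sketch of the pre-filtering Pavel describes, to show where the
"6 files, 15 syscalls" figure comes from: group descriptors by
(dev, ino, pos) and compare each-to-each only inside a group. struct fd_info
and the helper names are made up for illustration; this is not code from our
tools.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-descriptor record a checkpoint tool might collect. */
struct fd_info {
	int fd;
	unsigned long dev, ino;
	long long pos;
};

/* Order records by (dev, ino, pos) so that candidates become adjacent. */
static int key_cmp(const void *a, const void *b)
{
	const struct fd_info *x = a, *y = b;

	if (x->dev != y->dev) return x->dev < y->dev ? -1 : 1;
	if (x->ino != y->ino) return x->ino < y->ino ? -1 : 1;
	if (x->pos != y->pos) return x->pos < y->pos ? -1 : 1;
	return 0;
}

/* Each-to-each comparisons needed inside a group of k candidates. */
static size_t pairs(size_t k)
{
	return k * (k - 1) / 2;
}

/* Sort by key, then sum pairs() over every run of equal keys. */
static size_t eq_calls_needed(struct fd_info *v, size_t n)
{
	size_t i, run = 1, total = 0;

	if (n == 0)
		return 0;
	qsort(v, n, sizeof(*v), key_cmp);
	for (i = 1; i <= n; i++) {
		if (i < n && key_cmp(&v[i - 1], &v[i]) == 0) {
			run++;
		} else {
			total += pairs(run);
			run = 1;
		}
	}
	return total;
}
```

For a group of 6 candidates, pairs(6) is 6 * 5 / 2 = 15, which matches the
number quoted above.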

> > Keep in mind that this is *exactly* the kind of information which makes
> > rootkits easier.
> 
> I would be very surprised if basic in memory ordering information was
> not already available from simple creation ordering.
> 

I think Peter means the scenario where, say, we have some bug in the slab/slub
code which triggers on some Nth allocation, and an attacker somehow learns
at least one memory address of a struct file. Using such a syscall, the
attacker might then inspect a series of fds (and the associated struct files)
and guess the addresses of the rest of the "struct file" objects. In most cases
this won't help (if the system is under more or less high load and opens/closes
files fast enough, because "struct file" comes from kmem caches), but on a
lightly loaded machine this might do the trick and narrow down the addresses
(if, say, there are only 10 fds allocated from the cache in a row and you
somehow know the address of one associated struct file).

In short -- I don't know whether this is a really serious issue or not
(since from my POV it would require at least a couple of bugs in a row
 before an attacker could use this information). OTOH, shit
happens exactly in 'impossible' scenarios ;)

> If using the in memory ordering is a problem in practice there are a lot
> of other possible ways to order the kernel objects.  Allocating sequence
> numbers for the kernel objects, passing the pointers through a
> cryptographically secure hash before comparing them, etc.
> 

We've been trying this already ;)
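
For illustration only, one way to get an ordering that is unrelated to the
address order is to scramble the pointer value with secret constants before
comparing. The sketch below uses a plain XOR plus multiplication by an odd
constant, which is NOT a cryptographically secure hash, just a simpler
stand-in; the cookie values and function names are invented here.

```c
#include <stdint.h>

/* Hypothetical per-boot secrets; a real implementation would draw them
 * from a random source. The multiplier must be odd. */
static const uint64_t cookie_xor = 0x9e3779b97f4a7c15ULL;
static const uint64_t cookie_mul = 0xbf58476d1ce4e5b9ULL; /* odd */

/* XOR, then multiply by an odd constant modulo 2^64. Both steps are
 * bijections, so distinct inputs always give distinct outputs. */
static uint64_t obfuscate(uint64_t v)
{
	return (v ^ cookie_xor) * cookie_mul;
}

/* Three-way compare on the scrambled keys: 0 equal, 1 less, 2 greater,
 * following the KCMP_EQ/LT/GT values quoted earlier in this thread. */
static int cmp_obfuscated(uint64_t a, uint64_t b)
{
	uint64_t x = obfuscate(a), y = obfuscate(b);

	if (x == y)
		return 0;
	return x < y ? 1 : 2;
}
```

Equality is preserved exactly and the resulting order is still total, so
sorting keeps working, while the result no longer says which object sits
lower in memory.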

> It does look like Cyrill's patch description lacked the important bit of
> information about the algorithm complexity requiring an ordering among
> kernel objects.  Cyrill you probably want to describe more prominently
> what is happening now and why in your patch description rather than give
> the history of different approaches.
> 

Yeah, I'll write a detailed changelog, gimme some time. Thanks Eric!

Btw, extending this syscall to an lt/gt variant will be easy, so this is
not a problem I think. At the moment we guarantee to return 0/1 on success,
and < 0 on error, so if we start returning 2/3 for the sake of ordering,
applications which were using only the 0/1 values won't crash (unless they
are badly written).
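
A sketch of what a forward-compatible caller could look like. It assumes
that 0 means "same object" and that any other non-negative value means
"different"; that mapping is my assumption for the example, as are the enum
and function names.

```c
/* Outcome as a careful caller might classify a comparison result. */
enum cmp_outcome { CMP_ERROR, CMP_SAME, CMP_DIFFERENT };

/* Assumption: 0 means "same object", any other non-negative value means
 * "different" (1 today, possibly 2/3 if ordering is reported later), and
 * a negative value is an error. */
static enum cmp_outcome classify(long ret)
{
	if (ret < 0)
		return CMP_ERROR;
	return ret == 0 ? CMP_SAME : CMP_DIFFERENT;
}
```

A caller written this way never tests for the literal value 1, so extra
non-negative return codes do not change its behaviour.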

	Cyrill
