[kernel-hardening] GSoC midterm accomplishments

public inbox for kernel-hardening@lists.openwall.com

* [kernel-hardening] GSoC midterm accomplishments
@ 2011-07-19  9:27 Vasiliy Kulikov
  2011-07-22 14:49 ` [kernel-hardening] kernel infoleaks (was: GSoC midterm accomplishments) Vasiliy Kulikov
  0 siblings, 1 reply; 3+ messages in thread
From: Vasiliy Kulikov @ 2011-07-19  9:27 UTC
  To: kernel-hardening

Hi,

This is a summary of what has been done and what remains to be done for
upstream kernel hardening during my GSoC participation.

[+] means success (applied or to be applied)
[-] means failure
[...] means "patch needs reworking" or additional patches are required
[F] means "didn't try, but will work on it"

HARDEN_VM86 [-]

Currently it lacks a proper configuration mechanism.  Given that a more
generic (though slightly different) feature, seccomp, was rejected, there
is no way it will be applied in its current form.  New ideas are needed.

HARDEN_PROC [...]

The first try was unsuccessful as it was not generic enough.  A more
generic list of world-r/w files was suggested; I'll try to implement this
variant.  In the future it can be extended by the feature suggested by
Andrew Morton: chmod'able and chown'able /proc/PID/* files inheritable
via fork (this specific feature seems unrelated to our goal).

Another interface for getting process information, taskstats, should be
restricted too.  However, given that the procfs restriction is no longer
a yes/no switch, the taskstats and procfs restrictions have to be
separated.  Also, there is little chance that taskstats will be
restricted by default (recalling Linus' complaints).

As to networking restrictions, the NETLINK_INET_DIAG netlink socket
should be restricted too.  I don't know yet how to define these
restrictions in a consistent way, though (for procfs and netlink).

While arguing for the usefulness of the feature, 2 security bugs were
fixed: a taskstats local DoS and a taskstats/procfs io infoleak which
could be used to learn e.g. another user's password length.  (The
taskstats fix is still pending, however.)

The discussions on LKML also gave me some new ideas, see below.

HARDEN_SHM [+]

The code was accepted into Andrew Morton's -mm tree and even got some
feedback about optimizations.  I hope it will be accepted into Linux 3.1.

HARDEN_EXECVE [+]

This feature was accepted by Linus (a surprise for me).  It is not yet
applied, pending...

HARDEN_STACK{,_SMART} [...]

I've posted the current issues with the feature in a separate thread.
Given the way the kernel and glibc handle GNU_STACK, it is not as simple
as just saying "kernel, enforce NX stack".

GRKERNSEC_SYSFS_RESTRICT [-/...]

The sysfs umask=/gid= mount options were rejected by GregKH as the
threats presented were not convincing enough.  We probably need to state
the threats more clearly or identify other threats, like side channel
attacks or learning some information that might be used to steer a
future attack.

Introducing debugfs mount options is also worth trying.  As GregKH
states, "there are no rules in debugfs", so breaking something via a
debugfs restriction is probably not a big deal (unless it is perf :-).

GRKERNSEC_MODHARDEN [F]

I haven't tried to push it yet.  To do so, I have to ping Dan to find
out what blocked its acceptance.

GRKERNSEC_SOCKET [F]

Didn't try it yet.  I'll try to implement it, but it is a low priority
task for me.

PAX_USERCOPY [...]

RFCv1 was NAK'ed by Linus as a crazy thing.  I wrongly thought that
"release early" is a good strategy for the kernel :)  v2 got some
feedback, which I'll address soon.

PAX_MPROTECT [F]

This is really worth trying to push.

PAX_KERNEXEC [+/...]

I'll try to push some minor bits of it (I'll have no time for learning
this part of PaX, it is much more complicated than USERCOPY/MPROTECT) -
at least the struct *ops constification parts.

PAX_REFCOUNT [F]

I'll try to push it.  At the very least it will help clarify what
atomic_t is for ;-)

printk() user supplied strings [+/F]

One solution to the issue was accepted by Linus: %s is for 7-bit ASCII
strings only, without control characters.  The fix requires identifying
and patching %s data containing \n and UTF-8 strings (the latter requires
introducing a new format specifier without filtering).

With the help of coccinelle I hope I'll finish it.

Solar recommended creating a wiki page to track accomplishments and
collect links to LKML discussions; I'll create it soon.

Now about newly created tasks.  While thinking about arguments for
restricting world access to procfs files and taskstats, I've identified
some sources of possible infoleaks that could be used in side channel
attacks.  The files/interfaces are as follows:

/proc/PID/{limits,sched_*,stat,statm,status,wchan}
inotify_add_watch(2)
ustat(2)
*statfs(2)
*statvfs(2)
sysinfo(2)

I didn't precisely investigate in what situations these infoleaks might
be useful to an attacker, but I found some cases where inotify
discloses somewhat private information.  I'll post about it in a few
days once I've organized my thoughts.

Thanks,

-- 
Vasiliy


* [kernel-hardening] kernel infoleaks (was: GSoC midterm accomplishments)
  2011-07-19  9:27 [kernel-hardening] GSoC midterm accomplishments Vasiliy Kulikov
@ 2011-07-22 14:49 ` Vasiliy Kulikov
  2011-07-28 10:41   ` [kernel-hardening] Re: kernel infoleaks Vasiliy Kulikov
  0 siblings, 1 reply; 3+ messages in thread
From: Vasiliy Kulikov @ 2011-07-22 14:49 UTC
  To: kernel-hardening

Hi,

On Tue, Jul 19, 2011 at 13:27 +0400, Vasiliy Kulikov wrote:
> Now about newly created tasks.  While thinking about arguments to
> agitate to restrict world access to procfs files and taskstats, I've
> identified some sources of possible infoleaks that could be used in side
> channel attacks.  The files/interfaces are as follows:
> 
> /proc/PID/{limits,sched_*,stat,statm,status,wchan}
> inotify_add_watch(2)
> ustat(2)
> *statfs(2)
> *statvfs(2)
> sysinfo(2)
> 
> I didn't precisely investigate in what situations these infoleaks might
> be useful to an attacker, but I found some cases where inotify
> disclosures somewhat private information.  I'll post about it in a few
> days when I adjust my thoughts.

This is a follow-up on possible kernel infoleaks.  I've divided them
into groups.

* procfs

Historically almost all /proc/PID/* files are readable by all users, IMO
without actual need.

- sched_* files might be used to simplify timing attacks.  A "classical"
  timing attack would measure the time delta, but such a measurement
  might be distorted by the scheduler.  Here the kernel hands out
  already measured numbers.

- status reveals memory usage.  It might reveal whether an mmap() was
  done, how much stack was used, how much memory is locked.  If
  malloc() expands the heap, it is visible too.

  Also I think a task's capabilities are something other users have no
  need to know (the same for limits).

- stat, statm reveal process' times and rss.

- mountinfo, mounts might reveal path information of private namespaces.
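
A minimal sketch of the status case, assuming a Linux system where
/proc/PID/status is world-readable.  The helper names (parse_status,
read_status, MEMORY_FIELDS) are hypothetical and exist only for this
illustration; the Vm* field names are the ones the kernel prints in the
status file.

```python
# Fields of /proc/PID/status that expose a task's memory behaviour.
MEMORY_FIELDS = ("VmSize", "VmRSS", "VmStk", "VmLck", "VmData")


def parse_status(text):
    """Return {field: value_in_kB} for the memory fields found in `text`."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key in MEMORY_FIELDS:
            # Values look like "    1234 kB"; keep only the number.
            result[key] = int(value.split()[0])
    return result


def read_status(pid):
    """Read the status file of any process (hypothetical helper).

    No privilege is needed as long as the file is world-readable.
    """
    with open(f"/proc/{pid}/status") as f:
        return parse_status(f.read())
```

Calling read_status() repeatedly on another user's PID and comparing
VmStk, VmLck or VmData between samples is enough to notice stack growth,
mlock() usage or heap expansion without touching the target process.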

* inotify_add_watch(2)

From the manpage: "The inotify API provides a mechanism for monitoring
file system events.".  It allows users to monitor fs changes (e.g. for
re-indexing) and accesses.  I see 2 issues here:

1) While fs changes can be monitored via getdents(2)/*stat(2), that is a
polling mechanism and it is exposed to races (unlike inotify, where
delivery is "for sure").  If it is *known* that some fs activity exposes
some private information then inotify simplifies gathering this
information.  Surely it depends on scheduler load, disk load, number of
files in the directory, etc. etc.  But if the event is very rare (e.g.
a daily/weekly cron job) even the 20x decrease of race win chance is
good.

2) Inotify exposes information that cannot be gathered (AFAICS) by other
means: file reads, file writes, and closing of the file descriptor
associated with the file (closing an RO fd and closing an RW fd are
different events).

Some ways to (ab)use inotify:

- If there is a PAM module with the "requisite" control field, the
  modules following it read some /etc/ files, and the directories where
  these files are located are readable by a user, then the user may
  learn that this specific requisite module failed.  This might be a
  /etc/pam.d/ misconfiguration though.

- If there is a PAM module that checks user's authority against 2 files
  sequentially, then watching for accesses of the second file reveals
  information whether the first check failed (similar to requisite).
  This might be a PAM module infoleak though, which is probably
  identifiable via time measurement.

- Watching for /etc/passwd and /etc/.pwd.lock might reveal information
  whether a user changes his password.  It is not inotify specific, the
  same can be learned via stat'ing the lock file.  (Note: watching for
  passwd process in /proc/ is not sufficient as a user is able to
  terminate passwd without actual pass change.)

- Watching for /dev/null opening/closing may reveal whether significant
  events happened (e.g. privilege dropping).  I couldn't find any such
  event that is not visible via procfs (euid change).

- Watching for /lib/ reveals DSO usage.  This differs from monitoring
  binaries run from $PATH, as the latter are identifiable via
  /proc/PID/cmdline.  If a DSO is used for handling a specific file type
  (e.g. a media/compression format), the fact that such a file is being
  opened is revealed.

- Watching for / reveals root's "ls /root/".

- Watching for /var/run/screen/ can be used to monitor "screen -r"
  events.  The polling alternative is still procfs.

- Bash uses /etc/bash_completion.d/* to initialize the completion engine
  at start time.  However, some db files can be used for the actual
  completion.  Watching for these db files reveals the user's intent to
  run this command (compared to /proc/pid/cmdline it happens _before_
  the command is run and even if it is not run at all).

- Watching for /tmp/ may reveal activity in private (not accessible by
  world) /tmp/*/ directories.
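
The kind of watcher discussed above can be sketched as follows.  This is
only an illustration, assuming Linux and calling inotify_init(2) and
inotify_add_watch(2) through ctypes; parse_events and watch are
hypothetical helper names, while the IN_* values and the event header
layout follow <sys/inotify.h>.

```python
import ctypes
import os
import struct

# Event bits from <sys/inotify.h>.
IN_ACCESS = 0x001         # file was read
IN_MODIFY = 0x002         # file was written
IN_CLOSE_WRITE = 0x008    # fd opened for writing was closed
IN_CLOSE_NOWRITE = 0x010  # read-only fd was closed
IN_OPEN = 0x020           # file was opened

# struct inotify_event header: int wd; uint32_t mask, cookie, len;
_HEADER = struct.Struct("iIII")


def parse_events(buf):
    """Decode bytes read from an inotify fd into (wd, mask, name) tuples."""
    events, offset = [], 0
    while offset + _HEADER.size <= len(buf):
        wd, mask, _cookie, name_len = _HEADER.unpack_from(buf, offset)
        offset += _HEADER.size
        raw = buf[offset:offset + name_len]
        offset += name_len
        # The name is NUL-padded; it is empty when the watched object
        # itself (not a directory entry) triggered the event.
        events.append((wd, mask, raw.rstrip(b"\0").decode(errors="replace")))
    return events


def watch(path, mask=IN_OPEN | IN_ACCESS | IN_CLOSE_WRITE | IN_CLOSE_NOWRITE):
    """Yield events for `path` forever (Linux only, hypothetical helper)."""
    libc = ctypes.CDLL(None, use_errno=True)
    fd = libc.inotify_init()
    if fd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init failed")
    if libc.inotify_add_watch(fd, os.fsencode(path), mask) < 0:
        raise OSError(ctypes.get_errno(), "inotify_add_watch failed")
    while True:
        yield from parse_events(os.read(fd, 4096))
```

Iterating over watch("/etc") as an unprivileged user would report opens,
reads and closes of the files in that directory, with RO and RW closes
distinguishable by IN_CLOSE_NOWRITE and IN_CLOSE_WRITE.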

* ustat(2), statfs(2), statvfs(2).

It's possible to learn the precise number of free inodes and free
blocks.  It's possible to call statvfs() in a loop and get somewhat
precise information about other users' activity.  If there are 2 users
logged in, one may learn how many files the other user created/removed
and how much data (rounded to the block size) he added/removed (daemons'
activity introduces some error, but anyway).  On SMP it's possible to
get information about every inode creation/deletion event.
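
The polling loop described above might look like the following sketch.
It uses os.statvfs() from the Python standard library; Snapshot,
snapshot, delta and monitor are hypothetical names, and the 10 ms
polling interval is an arbitrary assumption.

```python
import os
import time
from collections import namedtuple

Snapshot = namedtuple("Snapshot", "free_blocks free_inodes block_size")


def snapshot(path):
    """Take one statvfs sample of the filesystem containing `path`."""
    st = os.statvfs(path)
    # f_bfree is counted in units of f_frsize.
    return Snapshot(st.f_bfree, st.f_ffree, st.f_frsize)


def delta(prev, cur):
    """Return (bytes_consumed, inodes_consumed) between two samples.

    Positive numbers mean somebody allocated space or created files,
    negative numbers mean space or inodes were released.
    """
    blocks = prev.free_blocks - cur.free_blocks
    inodes = prev.free_inodes - cur.free_inodes
    return blocks * cur.block_size, inodes


def monitor(path, interval=0.01):
    """Yield every non-zero change observed on the filesystem."""
    prev = snapshot(path)
    while True:
        time.sleep(interval)
        cur = snapshot(path)
        change = delta(prev, cur)
        if change != (0, 0):
            yield change
        prev = cur
```

Each value yielded by monitor("/home") is the net change since the
previous sample, so activity from several users or daemons within one
interval is merged into a single number.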

* sysinfo(2).

The same as *stat*, but now with free memory.  Also it is related to
kernel activity, so if there is a correlation between a significant
memory allocation and a private event, the event might be disclosed.

Suggestions about what to do with these things or how they can be abused
in other ways are welcome.

Thanks,

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments


* [kernel-hardening] Re: kernel infoleaks
  2011-07-22 14:49 ` [kernel-hardening] kernel infoleaks (was: GSoC midterm accomplishments) Vasiliy Kulikov
@ 2011-07-28 10:41   ` Vasiliy Kulikov
  0 siblings, 0 replies; 3+ messages in thread
From: Vasiliy Kulikov @ 2011-07-28 10:41 UTC
  To: kernel-hardening; +Cc: security

(CC'ed security@kernel.org.  This is not a high-severity issue and
security@ might already know about it, but better safe than sorry.)

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments
