* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface [not found] <20150309204321.AAF412E0@viggo.jf.intel.com> @ 2015-03-09 21:31 ` Kees Cook [not found] ` <20150309204322.50DA6B5D@viggo.jf.intel.com> ` (2 subsequent siblings) 3 siblings, 0 replies; 14+ messages in thread From: Kees Cook @ 2015-03-09 21:31 UTC (permalink / raw) To: Dave Hansen Cc: Eric W. Biederman, Andrew Morton, Theodore Ts'o, Oleg Nesterov, LKML, Dave Hansen On Mon, Mar 9, 2015 at 1:43 PM, Dave Hansen <dave@sr71.net> wrote: > > From: Dave Hansen <dave.hansen@linux.intel.com> > > Physical addresses are sensitive information. There are > existing, known exploits that are made easier if physical > information is available. Here is one example: > > http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf > > If you know the physical address of something you also know at > which kernel virtual address you can find something (modulo > highmem). It means that things that keep the kernel from > accessing user mappings (like SMAP/SMEP) can be worked around > because the _kernel_ mapping can get used instead. > > But, /proc/$pid/pagemap exposes the physical addresses of all > pages accessible to userspace. This works against all of the > efforts to keep kernel addresses out of places where unprivileged > apps can find them. > > This patch introduces a "paranoid" option for /proc. It can be > enabled like this: > > mount -o remount,paranoid /proc > > Or when /proc is mounted initially. When 'paranoid' mode is > active, opens to /proc/$pid/pagemap will return -EPERM for users > without CAP_SYS_RAWIO. It can be disabled like this: > > mount -o remount,notparanoid /proc > > The option is applied to the pid namespace, so an app that wanted > a separate policy from the rest of the system could get run in > its own pid namespace. > > I'm not really that stuck on the name. I'm not opposed to making > it apply only to pagemap or to giving it a pagemap-specific > name. 
> > pagemap is also the kind of feature that could be used to escalate > privileged from root in to the kernel. It probably needs to be > protected in the same way that /dev/mem or module loading is in > cases where the kernel needs to be protected from root, thus the > choice to use CAP_SYS_RAWIO. > > Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Seems reasonable. I would note that even CAP_SYS_RAWIO isn't enough to actually do anything with RAM in /dev/mem. That's entirely controlled by CONFIG_STRICT_DEVMEM. I think /proc/kpagecount and /proc/kpageflags should get filtered as well, instead of them relying on the uid=0 check. Reviewed-by: Kees Cook <keescook@chromium.org> -Kees > --- > > b/fs/proc/root.c | 10 +++++++++- > b/fs/proc/task_mmu.c | 11 +++++++++++ > b/include/linux/pid_namespace.h | 1 + > 3 files changed, 21 insertions(+), 1 deletion(-) > > diff -puN fs/proc/root.c~privileged-pagemap fs/proc/root.c > --- a/fs/proc/root.c~privileged-pagemap 2015-03-09 13:33:12.104796793 -0700 > +++ b/fs/proc/root.c 2015-03-09 13:33:12.111797109 -0700 > @@ -39,10 +39,12 @@ static int proc_set_super(struct super_b > } > > enum { > - Opt_gid, Opt_hidepid, Opt_err, > + Opt_gid, Opt_hidepid, Opt_paranoid, Opt_notparanoid, Opt_err, > }; > > static const match_table_t tokens = { > + {Opt_paranoid, "paranoid"}, > + {Opt_notparanoid, "notparanoid"}, > {Opt_hidepid, "hidepid=%u"}, > {Opt_gid, "gid=%u"}, > {Opt_err, NULL}, > @@ -70,6 +72,12 @@ static int proc_parse_options(char *opti > return 0; > pid->pid_gid = make_kgid(current_user_ns(), option); > break; > + case Opt_paranoid: > + pid->paranoid = 1; > + break; > + case Opt_notparanoid: > + pid->paranoid = 0; > + break; > case Opt_hidepid: > if (match_int(&args[0], &option)) > return 0; > diff -puN fs/proc/task_mmu.c~privileged-pagemap fs/proc/task_mmu.c > --- a/fs/proc/task_mmu.c~privileged-pagemap 2015-03-09 13:33:12.106796883 -0700 > +++ b/fs/proc/task_mmu.c 2015-03-09 13:33:12.112797154 -0700 > @@ -1322,9 
+1322,20 @@ out: > > static int pagemap_open(struct inode *inode, struct file *file) > { > + struct pid_namespace *ns = inode->i_sb->s_fs_info; > + > pr_warn_once("Bits 55-60 of /proc/PID/pagemap entries are about " > "to stop being page-shift some time soon. See the " > "linux/Documentation/vm/pagemap.txt for details.\n"); > + > + /* > + * Use the RAWIO capability bit. If you can not go open > + * /dev/mem, then you also have no business knowing the > + * physical addresses of things. > + */ > + if (ns->paranoid && !capable(CAP_SYS_RAWIO)) > + return -EPERM; > + > return 0; > } > > diff -puN include/linux/pid_namespace.h~privileged-pagemap include/linux/pid_namespace.h > --- a/include/linux/pid_namespace.h~privileged-pagemap 2015-03-09 13:33:12.108796973 -0700 > +++ b/include/linux/pid_namespace.h 2015-03-09 13:33:12.112797154 -0700 > @@ -43,6 +43,7 @@ struct pid_namespace { > struct work_struct proc_work; > kgid_t pid_gid; > int hide_pid; > + int paranoid; > int reboot; /* group exit code if this pidns was rebooted */ > struct ns_common ns; > }; > _ -- Kees Cook Chrome OS Security ^ permalink raw reply [flat|nested] 14+ messages in thread
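
The chain of reasoning in the quoted changelog (pagemap entry, then PFN, then physical address, then kernel direct-map address) can be sketched in userspace C. This is only an illustration and is not part of the patch. It assumes the entry layout described in Documentation/vm/pagemap.txt (bit 63 = page present, bits 0-54 = PFN), 4 KiB pages, and an unrandomized x86_64 direct map based at 0xffff880000000000. All helper and macro names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define PM_PRESENT          (1ULL << 63)        /* bit 63: page is present in RAM */
#define PM_PFN_MASK         ((1ULL << 55) - 1)  /* bits 0-54: PFN when present */
#define ASSUMED_PAGE_SHIFT  12                  /* assumption: 4 KiB pages */
#define ASSUMED_PAGE_OFFSET 0xffff880000000000ULL /* assumption: x86_64 direct map base */

/* Nonzero if the pagemap entry describes a page that is present in RAM. */
static int pagemap_entry_present(uint64_t entry)
{
	return (entry & PM_PRESENT) != 0;
}

/* Extract the page frame number from a pagemap entry. */
static uint64_t pagemap_entry_pfn(uint64_t entry)
{
	return entry & PM_PFN_MASK;
}

/* Kernel virtual address at which the direct map aliases this PFN. */
static uint64_t pfn_to_directmap(uint64_t pfn)
{
	return ASSUMED_PAGE_OFFSET + (pfn << ASSUMED_PAGE_SHIFT);
}

/*
 * Read the pagemap entry covering vaddr in the calling process.
 * Returns 0 on success or a negative errno value on failure.
 */
static int read_pagemap_entry(uintptr_t vaddr, uint64_t *entry)
{
	int fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return -errno;	/* the open itself may be refused */

	/* One 64-bit entry per virtual page, indexed by virtual page number. */
	off_t off = (off_t)(vaddr >> ASSUMED_PAGE_SHIFT) * (off_t)sizeof(*entry);
	ssize_t n = pread(fd, entry, sizeof(*entry), off);
	int err = (n == (ssize_t)sizeof(*entry)) ? 0 : (n < 0 ? -errno : -EIO);

	close(fd);
	return err;
}
```

With the proposed option active, read_pagemap_entry() would be expected to return -EPERM for a caller without CAP_SYS_RAWIO, because the check sits in pagemap_open().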
* Re: [RFC][PATCH 2/2] proc: config options for making privileged /proc the default [not found] ` <20150309204322.50DA6B5D@viggo.jf.intel.com> @ 2015-03-09 21:32 ` Kees Cook 0 siblings, 0 replies; 14+ messages in thread From: Kees Cook @ 2015-03-09 21:32 UTC (permalink / raw) To: Dave Hansen Cc: Eric W. Biederman, Andrew Morton, Theodore Ts'o, Oleg Nesterov, LKML, Dave Hansen On Mon, Mar 9, 2015 at 1:43 PM, Dave Hansen <dave@sr71.net> wrote: > > From: Dave Hansen <dave.hansen@linux.intel.com> > > This is for folks where /proc is mounted very early or where it > is not convenient to go changing fstab everywhere. > > Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> It'd be nice to have a way to do this in a more arbitrary fashion. For example, select a CONFIG to also add things like hidepid=2, paranoid, etc. -Kees > --- > > b/fs/proc/Kconfig | 17 +++++++++++++++++ > b/kernel/pid_namespace.c | 1 + > 2 files changed, 18 insertions(+) > > diff -puN fs/proc/Kconfig~privileged-pagemap-default-config fs/proc/Kconfig > --- a/fs/proc/Kconfig~privileged-pagemap-default-config 2015-03-09 13:32:23.628610423 -0700 > +++ b/fs/proc/Kconfig 2015-03-09 13:32:23.633610649 -0700 > @@ -71,3 +71,20 @@ config PROC_PAGE_MONITOR > /proc/pid/smaps, /proc/pid/clear_refs, /proc/pid/pagemap, > /proc/kpagecount, and /proc/kpageflags. Disabling these > interfaces will reduce the size of the kernel by approximately 4kb. > + > +config PROC_PARANOID_DEFAULT > + default y > + depends on PROC_FS > + bool "Enable paranoid /proc mount option by default" > + help > + Access to some sensitive /proc files is restricted when the > + "paranoid" mount option is specified: > + > + mount -o paranoid -t proc none /proc > + > + Enabling this config option will set the "paranoid" option > + by default on all /proc mounts. 
It may still be disabled at > + mount or remount time: > + > + mount -o remount,notparanoid -/proc > + > diff -puN kernel/pid_namespace.c~privileged-pagemap-default-config kernel/pid_namespace.c > --- a/kernel/pid_namespace.c~privileged-pagemap-default-config 2015-03-09 13:32:23.630610514 -0700 > +++ b/kernel/pid_namespace.c 2015-03-09 13:32:23.633610649 -0700 > @@ -115,6 +115,7 @@ static struct pid_namespace *create_pid_ > ns->parent = get_pid_ns(parent_pid_ns); > ns->user_ns = get_user_ns(user_ns); > ns->nr_hashed = PIDNS_HASH_ADDING; > + ns->paranoid = IS_ENABLED(CONFIG_PROC_PARANOID_DEFAULT); > INIT_WORK(&ns->proc_work, proc_cleanup_work); > > set_bit(0, ns->pidmap[0].page); > _ -- Kees Cook Chrome OS Security ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface [not found] <20150309204321.AAF412E0@viggo.jf.intel.com> 2015-03-09 21:31 ` [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface Kees Cook [not found] ` <20150309204322.50DA6B5D@viggo.jf.intel.com> @ 2015-03-09 22:13 ` Eric W. Biederman 2015-03-09 22:22 ` Kees Cook 2015-03-12 22:35 ` Andrew Morton 3 siblings, 1 reply; 14+ messages in thread From: Eric W. Biederman @ 2015-03-09 22:13 UTC (permalink / raw) To: Dave Hansen Cc: Andrew Morton, Kees Cook, tytso, Oleg Nesterov, linux-kernel, dave.hansen Dave Hansen <dave@sr71.net> writes: > From: Dave Hansen <dave.hansen@linux.intel.com> > > Physical addresses are sensitive information. There are > existing, known exploits that are made easier if physical > information is available. Here is one example: > > http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf > > If you know the physical address of something you also know at > which kernel virtual address you can find something (modulo > highmem). It means that things that keep the kernel from > accessing user mappings (like SMAP/SMEP) can be worked around > because the _kernel_ mapping can get used instead. > > But, /proc/$pid/pagemap exposes the physical addresses of all > pages accessible to userspace. This works against all of the > efforts to keep kernel addresses out of places where unprivileged > apps can find them. > > This patch introduces a "paranoid" option for /proc. It can be > enabled like this: > > mount -o remount,paranoid /proc > > Or when /proc is mounted initially. When 'paranoid' mode is > active, opens to /proc/$pid/pagemap will return -EPERM for users > without CAP_SYS_RAWIO. It can be disabled like this: > > mount -o remount,notparanoid /proc > > The option is applied to the pid namespace, so an app that wanted > a separate policy from the rest of the system could get run in > its own pid namespace. > > I'm not really that stuck on the name. 
I'm not opposed to making > it apply only to pagemap or to giving it a pagemap-specific > name. > > pagemap is also the kind of feature that could be used to escalate > privileged from root in to the kernel. It probably needs to be > protected in the same way that /dev/mem or module loading is in > cases where the kernel needs to be protected from root, thus the > choice to use CAP_SYS_RAWIO. There is already a way to make pagemap go away. It is called CONFIG_PROC_PAGE_MONITOR. I suspect the right answer here is if you enable kernel address randomization you disable CONFIG_PROC_PAGE_MONITOR. Aka you make the two options conflict with each other. That is a lot less code and a lot less to maintain. On the other hand if this is truly a valuable interface that you can't part with we need an alternative to pagemaps that does the same job without the exploit potential. And I don't know how to do that. Arguing in favor of just making the options conflict is the fact that kernel address randomization is pretty much snake oil. At least on x86_64 the address pool is so small it can be trivially brute forced. I think there are maybe 10 bits you can randomize within. As for a way to disable this I expect it would do better with something like a set once flag that prevents a process and all of its children from accessing this file. *Blink* *Blink* Did you say you are worried about escalating privileges from root into the kernel space? That is nonsense. We give root the power to shoot themselves in the foot and any proc option will be something that root will be able to get around. The pieces of the patch description don't add up. Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> Eric
* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface 2015-03-09 22:13 ` [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface Eric W. Biederman @ 2015-03-09 22:22 ` Kees Cook 2015-03-09 23:08 ` Eric W. Biederman 0 siblings, 1 reply; 14+ messages in thread From: Kees Cook @ 2015-03-09 22:22 UTC (permalink / raw) To: Eric W. Biederman Cc: Dave Hansen, Andrew Morton, Theodore Ts'o, Oleg Nesterov, LKML, Dave Hansen On Mon, Mar 9, 2015 at 3:13 PM, Eric W. Biederman <ebiederm@xmission.com> wrote: > Dave Hansen <dave@sr71.net> writes: > >> From: Dave Hansen <dave.hansen@linux.intel.com> >> >> Physical addresses are sensitive information. There are >> existing, known exploits that are made easier if physical >> information is available. Here is one example: >> >> http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf >> >> If you know the physical address of something you also know at >> which kernel virtual address you can find something (modulo >> highmem). It means that things that keep the kernel from >> accessing user mappings (like SMAP/SMEP) can be worked around >> because the _kernel_ mapping can get used instead. >> >> But, /proc/$pid/pagemap exposes the physical addresses of all >> pages accessible to userspace. This works against all of the >> efforts to keep kernel addresses out of places where unprivileged >> apps can find them. >> >> This patch introduces a "paranoid" option for /proc. It can be >> enabled like this: >> >> mount -o remount,paranoid /proc >> >> Or when /proc is mounted initially. When 'paranoid' mode is >> active, opens to /proc/$pid/pagemap will return -EPERM for users >> without CAP_SYS_RAWIO. It can be disabled like this: >> >> mount -o remount,notparanoid /proc >> >> The option is applied to the pid namespace, so an app that wanted >> a separate policy from the rest of the system could get run in >> its own pid namespace. >> >> I'm not really that stuck on the name. 
I'm not opposed to making >> it apply only to pagemap or to giving it a pagemap-specific >> name. >> >> pagemap is also the kind of feature that could be used to escalate >> privileged from root in to the kernel. It probably needs to be >> protected in the same way that /dev/mem or module loading is in >> cases where the kernel needs to be protected from root, thus the >> choice to use CAP_SYS_RAWIO. > > > There is already a way to make pagemap go away. It is called > CONFIG_PROC_PAGE_MONITOR. > > I suspect the right answer here is if you enable kernel address > randomization you disable CONFIG_PROC_PAGE_MONITOR. Aka you make the > two options conflict with each other. It's not a good idea to make CONFIG options conflict with each other like this as it puts distros in a tricky spot to decide which to use. Allowing both and having a runtime flag of some kind tends to be the better option (e.g. kASLR vs Hibernation). > That is a lot less code and a lot less to maintain. > > On the other hand if this is truly a valuable interface that you can't > part with we need an alternative to pagemaps that does the same job > without the exploit potential. And I don't know how to do that. > > Arguing in favor of just making the options conflict is the fact that > kernel address randomization is pretty much snake oil. At least on > x86_64 the address pool is so small it can be trivially brute forced. I > think there are maybe 10 bits you can randomize within. > > As for a way to disable this I expect it would do better with something > like a set once flag that prevents a process and all of its children > from accessing this file. > > *Blink* *Blink* Did you say you are worried about escalating privileges > from root into the kernel space? That is nonsense. We give root the > power to shoot themselves in the foot and any proc option will be > something that root will be able to get around. > > The pieces of the patch description don't add up. No, that's an entirely valid use-case.
You can trust the kernel but not root. This is the point of the "trusted_kernel" patch series that disables all sorts of dangerous interfaces that allow root to get at physical memory. This situation is more a memory leak than a direct compromise, so it seems like providing at least some runtime control of it (separate from potential future "trusted_kernel" stuff) makes sense. -Kees > > Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> > > Eric -- Kees Cook Chrome OS Security ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface 2015-03-09 22:22 ` Kees Cook @ 2015-03-09 23:08 ` Eric W. Biederman 2015-03-09 23:40 ` Kees Cook ` (2 more replies) 0 siblings, 3 replies; 14+ messages in thread From: Eric W. Biederman @ 2015-03-09 23:08 UTC (permalink / raw) To: Kees Cook Cc: Dave Hansen, Andrew Morton, Theodore Ts'o, Oleg Nesterov, LKML, Dave Hansen Kees Cook <keescook@chromium.org> writes: > On Mon, Mar 9, 2015 at 3:13 PM, Eric W. Biederman <ebiederm@xmission.com> wrote: >> Dave Hansen <dave@sr71.net> writes: >> >>> From: Dave Hansen <dave.hansen@linux.intel.com> >>> >>> Physical addresses are sensitive information. There are >>> existing, known exploits that are made easier if physical >>> information is available. Here is one example: >>> >>> http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf >>> >>> If you know the physical address of something you also know at >>> which kernel virtual address you can find something (modulo >>> highmem). It means that things that keep the kernel from >>> accessing user mappings (like SMAP/SMEP) can be worked around >>> because the _kernel_ mapping can get used instead. >>> >>> But, /proc/$pid/pagemap exposes the physical addresses of all >>> pages accessible to userspace. This works against all of the >>> efforts to keep kernel addresses out of places where unprivileged >>> apps can find them. >>> >>> This patch introduces a "paranoid" option for /proc. It can be >>> enabled like this: >>> >>> mount -o remount,paranoid /proc >>> >>> Or when /proc is mounted initially. When 'paranoid' mode is >>> active, opens to /proc/$pid/pagemap will return -EPERM for users >>> without CAP_SYS_RAWIO. It can be disabled like this: >>> >>> mount -o remount,notparanoid /proc >>> >>> The option is applied to the pid namespace, so an app that wanted >>> a separate policy from the rest of the system could get run in >>> its own pid namespace. >>> >>> I'm not really that stuck on the name. 
I'm not opposed to making >>> it apply only to pagemap or to giving it a pagemap-specific >>> name. >>> >>> pagemap is also the kind of feature that could be used to escalate >>> privileged from root in to the kernel. It probably needs to be >>> protected in the same way that /dev/mem or module loading is in >>> cases where the kernel needs to be protected from root, thus the >>> choice to use CAP_SYS_RAWIO. >> >> >> There is already a way to make pagemap go away. It is called >> CONFIG_PROC_PAGE_MONITOR. >> >> I suspect the right answer here is if you enable kernel address >> randomization you disable CONFIG_PROC_PAGE_MONTIOR. Aka you make the >> two options conflict with each other. > > It's not a good idea to make CONFIG options conflict with each other > like this as it puts distros is a tricky spot to decide which to use. > Allowing both and having a runtime flag of some kind tends to be the > better option (e.g. kASLR vs Hibernation). But there is a fundamental conflict. As such it might as well be expressed in Kconfig. >> That is a lot less code and a lot less to maintain. >> >> On the other hand if this is truly a valuable interface that you can't >> part with we need an alternative to pagemaps that does the same job >> with out the exploit potential. And I don't how to do that. >> >> Arguing in favor of just making the options conflict is the fact that >> kernel address randomization is pretty much snake oil. At least on >> x86_64 the address pool is so small it can be trivially brute forced. I >> think there are maybe 10 bits you can randomize within. >> >> As for a way to disable this I expect it would do better with something >> like a set once flag that prevents a process and all of it's children >> from accessing this file. >> >> *Blink* *Blink* Did you say you are worried about escalting privileges >> from root into the kernel space. That is non-sense. 
We give root the >> power to shoot themselves in the foot and any proc option will be >> something that root will be able to get around. >> >> The pieces of the patch description don't add up. > > No, that's an entirely valid use-case. You can trust the kernel but > not root. This is the point of the "trusted_kernel" patch series that > disables all sorts of dangerous interfaces that allow root to get at > physical memory. > > This situation is more a memory leak than a direct compromise, so it > seems like providing at least some runtime control of it (separate > from potential future "trusted_kernel" stuff) makes sense. I am too tired to argue about the kASLR snake-oil. I do not think a proc mount option is at all appropriate for controlling the behavior of the pagemap file. And "paranoid" is entirely too generic of a string to have any meaning. Either just tighten the permissions when kASLR is enabled, or have the file go away entirely. If you want run-time knobs there are all kinds of run-time knobs you can use. If the concern is to protect against root getting into the kernel (the "trusted_kernel" snake-oil), just compile out the pagemap file. Nothing else is remotely interesting from a maintenance point of view. As I said. Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> Eric
* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface 2015-03-09 23:08 ` Eric W. Biederman @ 2015-03-09 23:40 ` Kees Cook 2015-03-09 23:43 ` Eric W. Biederman 2015-03-10 2:28 ` Dave Hansen 2 siblings, 0 replies; 14+ messages in thread From: Kees Cook @ 2015-03-09 23:40 UTC (permalink / raw) To: Eric W. Biederman Cc: Dave Hansen, Andrew Morton, Theodore Ts'o, Oleg Nesterov, LKML, Dave Hansen On Mon, Mar 9, 2015 at 4:08 PM, Eric W. Biederman <ebiederm@xmission.com> wrote: > Kees Cook <keescook@chromium.org> writes: > >> On Mon, Mar 9, 2015 at 3:13 PM, Eric W. Biederman <ebiederm@xmission.com> wrote: >>> Dave Hansen <dave@sr71.net> writes: >>> >>>> From: Dave Hansen <dave.hansen@linux.intel.com> >>>> >>>> Physical addresses are sensitive information. There are >>>> existing, known exploits that are made easier if physical >>>> information is available. Here is one example: >>>> >>>> http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf >>>> >>>> If you know the physical address of something you also know at >>>> which kernel virtual address you can find something (modulo >>>> highmem). It means that things that keep the kernel from >>>> accessing user mappings (like SMAP/SMEP) can be worked around >>>> because the _kernel_ mapping can get used instead. >>>> >>>> But, /proc/$pid/pagemap exposes the physical addresses of all >>>> pages accessible to userspace. This works against all of the >>>> efforts to keep kernel addresses out of places where unprivileged >>>> apps can find them. >>>> >>>> This patch introduces a "paranoid" option for /proc. It can be >>>> enabled like this: >>>> >>>> mount -o remount,paranoid /proc >>>> >>>> Or when /proc is mounted initially. When 'paranoid' mode is >>>> active, opens to /proc/$pid/pagemap will return -EPERM for users >>>> without CAP_SYS_RAWIO. 
It can be disabled like this: >>>> >>>> mount -o remount,notparanoid /proc >>>> >>>> The option is applied to the pid namespace, so an app that wanted >>>> a separate policy from the rest of the system could get run in >>>> its own pid namespace. >>>> >>>> I'm not really that stuck on the name. I'm not opposed to making >>>> it apply only to pagemap or to giving it a pagemap-specific >>>> name. >>>> >>>> pagemap is also the kind of feature that could be used to escalate >>>> privileged from root in to the kernel. It probably needs to be >>>> protected in the same way that /dev/mem or module loading is in >>>> cases where the kernel needs to be protected from root, thus the >>>> choice to use CAP_SYS_RAWIO. >>> >>> >>> There is already a way to make pagemap go away. It is called >>> CONFIG_PROC_PAGE_MONITOR. >>> >>> I suspect the right answer here is if you enable kernel address >>> randomization you disable CONFIG_PROC_PAGE_MONTIOR. Aka you make the >>> two options conflict with each other. >> >> It's not a good idea to make CONFIG options conflict with each other >> like this as it puts distros is a tricky spot to decide which to use. >> Allowing both and having a runtime flag of some kind tends to be the >> better option (e.g. kASLR vs Hibernation). > > But there is a fundamental conflict. As such it might as well be > expressed in Kconfig. Hm? I was using kASLR vs Hibernation as an example of something that while even at odds with each other currently is available as a runtime selectable option (putting "kaslr" on the command line enables it and disables hibernation, rather than forcing a CONFIG choice to pick one or the other). > >>> That is a lot less code and a lot less to maintain. >>> >>> On the other hand if this is truly a valuable interface that you can't >>> part with we need an alternative to pagemaps that does the same job >>> with out the exploit potential. And I don't how to do that. 
>>> >>> Arguing in favor of just making the options conflict is the fact that >>> kernel address randomization is pretty much snake oil. At least on >>> x86_64 the address pool is so small it can be trivially brute forced. I >>> think there are maybe 10 bits you can randomize within. >>> >>> As for a way to disable this I expect it would do better with something >>> like a set once flag that prevents a process and all of it's children >>> from accessing this file. >>> >>> *Blink* *Blink* Did you say you are worried about escalting privileges >>> from root into the kernel space. That is non-sense. We give root the >>> power to shot themselves in the foot and any proc option will be >>> something that root will be able to get around. >>> >>> The pieces of the patch description don't add up. >> >> No, that's an entirely valid use-case. You can trust the kernel but >> not root. This is the point of the "trusted_kernel" patch series that >> disables all sorts of dangerous interfaces that allow root to get at >> physical memory. >> >> This situation is more a memory leak than a direct compromise, so it >> seems like providing at least some runtime control of it (separate >> from potential future "trusted_kernel" stuff) makes sense. > > I am too tired to argue about the kASLR snake-oil. No problem. :) > > I do not think a proc mount option is at all apropriate for controlling > the behavior of the pagemap file. And "paranoid" is entirely too > generic of a string to have any meaning. > > Either just tighten the permissions when kASLR is enabled, or have the > file go away entirely. > > If you want run-time knobs there are all kinds of run-time knobs you can > use. > > If the concern is to protect against root getting into the kernel the > "trusted_kernel" snake-oil just compile out the pagemap file. Nothing > else is remotely interesting from a mainenance point of view. Distros cannot opt to compile out the pagemap file. 
They want to provide end users with one kernel that can do both, selectable at runtime. If I want to make it harder for things that need physical page maps to attack my system, I'd like to be able to turn it on in my distro. And since I can remove CAP_SYS_RAWIO from init during my initramfs, I would love to have this flag. -Kees > > As I said. > Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com> > > Eric -- Kees Cook Chrome OS Security ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
  2015-03-09 23:08 ` Eric W. Biederman
  2015-03-09 23:40 ` Kees Cook
@ 2015-03-09 23:43 ` Eric W. Biederman
  2015-03-10 0:03 ` Kees Cook
  2015-03-10 2:28 ` Dave Hansen
  2 siblings, 1 reply; 14+ messages in thread
From: Eric W. Biederman @ 2015-03-09 23:43 UTC
To: Kees Cook
Cc: Dave Hansen, Andrew Morton, Theodore Ts'o, Oleg Nesterov, LKML, Dave Hansen

A 1 to 1 blinding function like integer multiplication modulo 2^32 by an
appropriate random number ought to keep from revealing page numbers or
page adjacencies while not requiring any changes in userspace.

That way the revealed pfn and the physical pfn would be different but
you could still use pagemap for its intended purpose.

Eric
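
The blinding idea above can be sketched in C as follows. This is a rough illustration, not code from the thread. It assumes that "appropriate" means an odd multiplier, since multiplication modulo 2^32 is one-to-one exactly when the multiplier is odd, and it uses Newton iteration to compute the inverse needed to undo the blinding. The function names are hypothetical.

```c
#include <stdint.h>

/*
 * Multiplicative inverse of an odd 32-bit value modulo 2^32.
 * For odd k, k*k == 1 (mod 8), so x = k is already correct to 3 bits;
 * each Newton step x = x * (2 - k*x) doubles the number of correct bits.
 */
static uint32_t inverse_mod_2_32(uint32_t k)
{
	uint32_t x = k;

	for (int i = 0; i < 5; i++)
		x *= 2u - k * x;	/* unsigned arithmetic wraps modulo 2^32 */
	return x;
}

/* Blind a PFN; key must be odd for the mapping to be one-to-one. */
static uint32_t blind_pfn(uint32_t pfn, uint32_t key)
{
	return pfn * key;
}

/* Recover the real PFN from a blinded value using the same key. */
static uint32_t unblind_pfn(uint32_t blinded, uint32_t key)
{
	return blinded * inverse_mod_2_32(key);
}
```

In this sketch the kernel would hold the key, hand out blind_pfn() values through pagemap, and apply unblind_pfn() wherever a blinded value comes back in as an index.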
* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
  2015-03-09 23:43 ` Eric W. Biederman
@ 2015-03-10 0:03 ` Kees Cook
  2015-03-10 2:51 ` Dave Hansen
  0 siblings, 1 reply; 14+ messages in thread
From: Kees Cook @ 2015-03-10 0:03 UTC
To: Eric W. Biederman
Cc: Dave Hansen, Andrew Morton, Theodore Ts'o, Oleg Nesterov, LKML, Dave Hansen

On Mon, Mar 9, 2015 at 4:43 PM, Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> A 1 to 1 blinding function like integer multiplication modulo 2^32 by an
> appropriate random number ought to keep from revealing page numbers or
> page adjacencies while not requiring any changes in userspace.
>
> That way the revealed pfn and the physical pfn would be different but
> you could still use pagemap for its intended purpose.

If this could be done in a way where it was sufficiently hard to
expose the random number, we should absolutely do this. And this could
be done for socket handles in INET_DIAG too. We have a lot of these
kinds of "handle" leaks where the handles can be regarded as private
information leakage.

-Kees

--
Kees Cook
Chrome OS Security
* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
  2015-03-10 0:03 ` Kees Cook
@ 2015-03-10 2:51 ` Dave Hansen
  2015-03-10 4:49 ` Eric W. Biederman
  0 siblings, 1 reply; 14+ messages in thread
From: Dave Hansen @ 2015-03-10 2:51 UTC
To: Kees Cook, Eric W. Biederman
Cc: Andrew Morton, Theodore Ts'o, Oleg Nesterov, LKML, Dave Hansen

On 03/09/2015 05:03 PM, Kees Cook wrote:
> On Mon, Mar 9, 2015 at 4:43 PM, Eric W. Biederman <ebiederm@xmission.com> wrote:
>> A 1 to 1 blinding function like integer multiplication modulo 2^32 by an
>> appropriate random number ought to keep from revealing page numbers or
>> page adjacencies while not requiring any changes in userspace.
>>
>> That way the revealed pfn and the physical pfn would be different but
>> you could still use pagemap for its intended purpose.
>
> If this could be done in a way where it was sufficiently hard to
> expose the random number, we should absolutely do this.

We would need something which is both reversible (so that the given
offsets can still be used in /proc/kpagemap) and also hard to do a
known-plaintext-type attack on it.

Transparent huge pages are a place where userspace knows the
relationship between 512 adjacent physical addresses. That represents a
good chunk of known data. Surely there are more of these kinds of things.

Right now, for instance, the ways in which a series of sequential
allocations come out of the page allocator are fairly deterministic. We
would also need to do some kind of allocator randomization to ensure
that userspace couldn't make good guesses about the physical addresses
of things coming out of the allocator.

Or, we just be sure and turn the darn thing off. :)
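
The known-plaintext concern can be made concrete for the multiplicative scheme suggested earlier in the thread. This is a sketch under that assumption only, with hypothetical function names. If userspace knows that two pages are physically adjacent, as within a transparent huge page, the difference of their blinded values is the key itself, because (p + 1) * k - p * k = k modulo 2^32.

```c
#include <stdint.h>

/* Multiplicative blinding as proposed earlier in the thread (key is secret). */
static uint32_t blind_pfn(uint32_t pfn, uint32_t key)
{
	return pfn * key;	/* wraps modulo 2^32 */
}

/*
 * Attacker's view: given the blinded values of two PFNs known to be
 * adjacent (pfn and pfn + 1), the secret key falls out by subtraction.
 */
static uint32_t recover_key(uint32_t blinded_lo, uint32_t blinded_hi)
{
	return blinded_hi - blinded_lo;
}
```

A single pair of adjacent pages is enough here, which is why the discussion turns toward a real cipher or simply switching the interface off.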
* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
  2015-03-10  2:51 ` Dave Hansen
@ 2015-03-10  4:49 ` Eric W. Biederman
  0 siblings, 0 replies; 14+ messages in thread
From: Eric W. Biederman @ 2015-03-10  4:49 UTC
To: Dave Hansen
Cc: Kees Cook, Andrew Morton, Theodore Ts'o, Oleg Nesterov, LKML, Dave Hansen

Dave Hansen <dave.hansen@intel.com> writes:

> On 03/09/2015 05:03 PM, Kees Cook wrote:
>> On Mon, Mar 9, 2015 at 4:43 PM, Eric W. Biederman <ebiederm@xmission.com> wrote:
>>> A 1 to 1 blinding function like integer multiplication modulo 2^32 by an
>>> appropriate random number ought to keep from revealing page numbers or
>>> page adjacencies while not requiring any changes in userspace.
>>>
>>> That way the revealed pfn and the physical pfn would be different but
>>> you could still use pagemap for its intended purpose.
>>
>> If this could be done in a way where it was sufficiently hard to
>> expose the random number, we should absolutely do this.
>
> We would need something which is both reversible (so that the given
> offsets can still be used in /proc/kpagemap) and also hard to do a
> known-plaintext-type attack on it.
>
> Transparent huge pages are a place where userspace knows the
> relationship between 512 adjacent physical addresses. That represents a
> good chunk of known data. Surely there are more of these kinds of things.
>
> Right now, for instance, the ways in which a series of sequential
> allocations come out of the page allocator are fairly deterministic. We
> would also need to do some kind of allocator randomization to ensure
> that userspace couldn't make good guesses about the physical addresses
> of things coming out of the allocator.
>
> Or, we just be sure and turn the darn thing off. :)

Yes. If we are worried about something, a big off switch is fine.

As for a one-to-one transform that is resistant to plaintext attacks, I
think that is the definition of a cipher. That is, we should just use
AES or something well known to encrypt the page frame numbers if we want
to hide them. I don't know if the block mode of AES would be a problem
or not.

Eric
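
One way to read the open question about AES is as a size mismatch: AES
works on 128-bit blocks, while the PFN field of a pagemap entry is 55
bits wide (bits 0-54 in the kernel's pagemap documentation), so an AES
ciphertext would not fit back into the field. A common workaround is a
small-domain keyed permutation. The Python sketch below is an
illustration under stated assumptions, not something proposed in the
thread: it uses a 56-bit Feistel network with keyed BLAKE2b as the round
function, plus cycle-walking to stay inside the 55-bit range. The round
count, helper names, and key handling are hypothetical, and the
construction has not been reviewed as cryptography.

```python
import hashlib

PFN_BITS = 55                 # width of the PFN field in a pagemap entry
HALF_BITS = 28                # Feistel halves: 2 * 28 = 56 bits
HALF_MASK = (1 << HALF_BITS) - 1
ROUNDS = 4                    # assumption; not a vetted parameter


def _round(key, rnd, half):
    """Keyed round function: BLAKE2b over (round number, half)."""
    data = bytes([rnd]) + half.to_bytes(4, "little")
    digest = hashlib.blake2b(data, key=key, digest_size=4).digest()
    return int.from_bytes(digest, "little") & HALF_MASK


def _encrypt56(key, value):
    left, right = value >> HALF_BITS, value & HALF_MASK
    for rnd in range(ROUNDS):
        left, right = right, left ^ _round(key, rnd, right)
    return (left << HALF_BITS) | right


def _decrypt56(key, value):
    left, right = value >> HALF_BITS, value & HALF_MASK
    for rnd in reversed(range(ROUNDS)):
        left, right = right ^ _round(key, rnd, left), left
    return (left << HALF_BITS) | right


def scramble_pfn(key, pfn):
    """Map a 55-bit PFN to another 55-bit value, one-to-one."""
    if not 0 <= pfn < (1 << PFN_BITS):
        raise ValueError("pfn out of range")
    out = _encrypt56(key, pfn)
    while out >= (1 << PFN_BITS):   # cycle-walk back into the domain
        out = _encrypt56(key, out)
    return out


def unscramble_pfn(key, value):
    """Inverse of scramble_pfn()."""
    if not 0 <= value < (1 << PFN_BITS):
        raise ValueError("value out of range")
    out = _decrypt56(key, value)
    while out >= (1 << PFN_BITS):
        out = _decrypt56(key, out)
    return out
```

Unlike the multiplicative blinding, the outputs for adjacent PFNs have no
simple arithmetic relationship, at the cost of several hash invocations
per lookup.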
* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
  2015-03-09 23:08 ` Eric W. Biederman
  2015-03-09 23:40 ` Kees Cook
  2015-03-09 23:43 ` Eric W. Biederman
@ 2015-03-10  2:28 ` Dave Hansen
  2 siblings, 0 replies; 14+ messages in thread
From: Dave Hansen @ 2015-03-10  2:28 UTC
To: Eric W. Biederman, Kees Cook
Cc: Andrew Morton, Theodore Ts'o, Oleg Nesterov, LKML, Dave Hansen

On 03/09/2015 04:08 PM, Eric W. Biederman wrote:
> If the concern is to protect against root getting into the kernel, the
> "trusted_kernel" snake-oil, just compile out the pagemap file. Nothing
> else is remotely interesting from a maintenance point of view.

The paper I linked to showed one example of how pagemap makes a
user->kernel exploit _easier_. Note that the authors had another way of
actually doing the exploit when pagemap was not available, but it
required some more trouble than if pagemap was around.

I mentioned the "trusted_kernel" stuff as an aside. It's really not the
main concern.
* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
  [not found] <20150309204321.AAF412E0@viggo.jf.intel.com>
  ` (2 preceding siblings ...)
  2015-03-09 22:13 ` [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface Eric W. Biederman
@ 2015-03-12 22:35 ` Andrew Morton
  2015-03-13 15:56 ` Dave Hansen
  3 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2015-03-12 22:35 UTC
To: Dave Hansen
Cc: Eric W. Biederman, Kees Cook, tytso, Oleg Nesterov, linux-kernel, dave.hansen

On Mon, 09 Mar 2015 13:43:21 -0700 Dave Hansen <dave@sr71.net> wrote:

>
> From: Dave Hansen <dave.hansen@linux.intel.com>
>
> Physical addresses are sensitive information. There are
> existing, known exploits that are made easier if physical
> information is available. Here is one example:
>
>     http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf
>
> If you know the physical address of something you also know at
> which kernel virtual address you can find something (modulo
> highmem). It means that things that keep the kernel from
> accessing user mappings (like SMAP/SMEP) can be worked around
> because the _kernel_ mapping can get used instead.
>
> But, /proc/$pid/pagemap exposes the physical addresses of all
> pages accessible to userspace. This works against all of the
> efforts to keep kernel addresses out of places where unprivileged
> apps can find them.
>
> This patch introduces a "paranoid" option for /proc. It can be
> enabled like this:
>
>     mount -o remount,paranoid /proc
>
> Or when /proc is mounted initially. When 'paranoid' mode is
> active, opens to /proc/$pid/pagemap will return -EPERM for users
> without CAP_SYS_RAWIO. It can be disabled like this:
>
>     mount -o remount,notparanoid /proc
>
> The option is applied to the pid namespace, so an app that wanted
> a separate policy from the rest of the system could get run in
> its own pid namespace.
>
> I'm not really that stuck on the name. I'm not opposed to making
> it apply only to pagemap or to giving it a pagemap-specific
> name.

Do we really need to disable pagemap entirely? What happens if we just
obscure the addresses (i.e., zero them)?

> pagemap is also the kind of feature that could be used to escalate
> privileges from root into the kernel. It probably needs to be
> protected in the same way that /dev/mem or module loading is in
> cases where the kernel needs to be protected from root, thus the
> choice to use CAP_SYS_RAWIO.

Confused. If you have root, you can do mount -o notparanoid.
* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
  2015-03-12 22:35 ` Andrew Morton
@ 2015-03-13 15:56 ` Dave Hansen
  2015-03-13 17:16 ` Eric W. Biederman
  0 siblings, 1 reply; 14+ messages in thread
From: Dave Hansen @ 2015-03-13 15:56 UTC
To: Andrew Morton
Cc: Eric W. Biederman, Kees Cook, tytso, Oleg Nesterov, linux-kernel

On 03/12/2015 03:35 PM, Andrew Morton wrote:
> On Mon, 09 Mar 2015 13:43:21 -0700 Dave Hansen <dave@sr71.net> wrote:
>> From: Dave Hansen <dave.hansen@linux.intel.com>
>>
>> Physical addresses are sensitive information. There are
>> existing, known exploits that are made easier if physical
>> information is available. Here is one example:
>>
>>     http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf
> Do we really need to disable pagemap entirely? What happens if we just
> obscure the addresses (ie: zero them)?

I think we have 3 basic options:

1. Disable it entirely (-EPERM or whatever). Apps using it break
   quickly and fairly obviously (diagnosable with an strace).
2. Zero it, or return some nonsensical thing for the physical address
   portion, but maintain exporting the PTE flags. Apps only caring
   about PTE flags work, but anything trying to do lookups in
   /proc/kpageflags breaks. If we zero it, apps may get confused
   thinking they have the _actual_ pfn=0.
3. Scramble it in some way obscuring the physical address. Unscramble
   it upon access to /proc/kpageflags.

I think you're suggesting (2). Doesn't that risk silently breaking
apps?

>> pagemap is also the kind of feature that could be used to escalate
>> privileges from root into the kernel. It probably needs to be
>> protected in the same way that /dev/mem or module loading is in
>> cases where the kernel needs to be protected from root, thus the
>> choice to use CAP_SYS_RAWIO.
>
> Confused. If you have root, you can do mount -o notparanoid.

Good point. I guess it doesn't protect us much here unless we also
restrict the ability to remount.
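
The three options in the message above are easier to compare against the
layout of a pagemap entry. The Python sketch below assumes the layout
given in the kernel's pagemap documentation (bit 63 present, bit 62
swapped, bits 0-54 the PFN when the page is present) and 8-byte entries
in both /proc/$pid/pagemap and /proc/kpageflags; the function names and
the 4096-byte default page size are illustrative assumptions. It shows
why option 2 is ambiguous: once the PFN field is zeroed, a present page
decodes exactly like one that really sits at pfn 0.

```python
ENTRY_SIZE = 8                 # bytes per entry in pagemap and kpageflags
PFN_MASK = (1 << 55) - 1       # bits 0-54
SWAPPED = 1 << 62
PRESENT = 1 << 63


def pagemap_offset(vaddr, page_size=4096):
    """File offset in /proc/$pid/pagemap of the entry covering vaddr."""
    return (vaddr // page_size) * ENTRY_SIZE


def kpageflags_offset(pfn):
    """File offset in /proc/kpageflags of the flags word for pfn."""
    return pfn * ENTRY_SIZE


def decode_entry(entry):
    """Split a 64-bit pagemap entry into the fields discussed above."""
    present = bool(entry & PRESENT)
    return {
        "present": present,
        "swapped": bool(entry & SWAPPED),
        "pfn": entry & PFN_MASK if present else None,
    }


def zero_pfn(entry):
    """Option 2: keep the flag bits, clear the PFN field."""
    return entry & ~PFN_MASK
```

For example, `decode_entry(zero_pfn(PRESENT | 0x1234))` reports a present
page with pfn 0, so a tool that feeds the result to `kpageflags_offset()`
silently reads the flags of the wrong page instead of failing.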
* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
  2015-03-13 15:56 ` Dave Hansen
@ 2015-03-13 17:16 ` Eric W. Biederman
  0 siblings, 0 replies; 14+ messages in thread
From: Eric W. Biederman @ 2015-03-13 17:16 UTC
To: Dave Hansen
Cc: Andrew Morton, Kees Cook, tytso, Oleg Nesterov, linux-kernel

Dave Hansen <dave@sr71.net> writes:

> On 03/12/2015 03:35 PM, Andrew Morton wrote:
>> On Mon, 09 Mar 2015 13:43:21 -0700 Dave Hansen <dave@sr71.net> wrote:
>>> From: Dave Hansen <dave.hansen@linux.intel.com>
>>>
>>> Physical addresses are sensitive information. There are
>>> existing, known exploits that are made easier if physical
>>> information is available. Here is one example:
>>>
>>>     http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf
>> Do we really need to disable pagemap entirely? What happens if we just
>> obscure the addresses (ie: zero them)?
>
> I think we have 3 basic options:
>
> 1. Disable it entirely (-EPERM or whatever). Apps using it break
>    quickly and fairly obviously (diagnosable with an strace).
> 2. Zero it, or return some nonsensical thing for the physical address
>    portion, but maintain exporting the PTE flags. Apps only caring
>    about PTE flags work, but anything trying to do lookups in
>    /proc/kpageflags breaks. If we zero it, apps may get confused
>    thinking they have the _actual_ pfn=0.
> 3. Scramble it in some way obscuring the physical address. Unscramble
>    it upon access to /proc/kpageflags.
>
> I think you're suggesting (2). Doesn't that risk silently breaking
> apps?

I think 3, where the scramble is something like AES crypto, is likely to
scramble this well and still protect us from plaintext attacks.

>>> pagemap is also the kind of feature that could be used to escalate
>>> privileges from root into the kernel. It probably needs to be
>>> protected in the same way that /dev/mem or module loading is in
>>> cases where the kernel needs to be protected from root, thus the
>>> choice to use CAP_SYS_RAWIO.
>>
>> Confused. If you have root, you can do mount -o notparanoid.
>
> Good point. I guess it doesn't protect us much here unless we also
> restrict the ability to remount.

And the ability to unmount...

A write-once sysctl or a boot-time-only parameter is much more likely to
be useful in the scenario where you are concerned about root.

Eric
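
The difference between a remountable option and the write-once control
suggested above can be modeled as a one-way latch. The following is a
purely illustrative Python model, not kernel sysctl code; the class and
method names are hypothetical. The point is only that the first write
fixes the policy and later writes fail, whoever issues them.

```python
class WriteOnceFlag:
    """Policy flag that accepts exactly one write, then becomes read-only."""

    def __init__(self, default=False):
        self._value = bool(default)
        self._locked = False

    @property
    def value(self):
        return self._value

    def write(self, value):
        # After the first write, the setting can no longer be changed,
        # which is the property a remountable option lacks.
        if self._locked:
            raise PermissionError("flag has already been written")
        self._value = bool(value)
        self._locked = True
```

By contrast, a mount option such as the proposed "paranoid" behaves like a
plain boolean that any sufficiently privileged process can flip back.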
Thread overview: 14+ messages
[not found] <20150309204321.AAF412E0@viggo.jf.intel.com>
2015-03-09 21:31 ` [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface Kees Cook
[not found] ` <20150309204322.50DA6B5D@viggo.jf.intel.com>
2015-03-09 21:32 ` [RFC][PATCH 2/2] proc: config options for making privileged /proc the default Kees Cook
2015-03-09 22:13 ` [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface Eric W. Biederman
2015-03-09 22:22 ` Kees Cook
2015-03-09 23:08 ` Eric W. Biederman
2015-03-09 23:40 ` Kees Cook
2015-03-09 23:43 ` Eric W. Biederman
2015-03-10 0:03 ` Kees Cook
2015-03-10 2:51 ` Dave Hansen
2015-03-10 4:49 ` Eric W. Biederman
2015-03-10 2:28 ` Dave Hansen
2015-03-12 22:35 ` Andrew Morton
2015-03-13 15:56 ` Dave Hansen
2015-03-13 17:16 ` Eric W. Biederman