public inbox for linux-kernel@vger.kernel.org
From: Willy Tarreau <w@1wt.eu>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Kees Cook <keescook@chromium.org>,
	Andy Lutomirski <luto@amacapital.net>, Willy Tarreau <w@1wt.eu>
Subject: [ 09/48] x86_64, vdso: Fix the vdso address randomization algorithm
Date: Fri, 15 May 2015 10:05:39 +0200
Message-ID: <20150515080530.680848837@1wt.eu>
In-Reply-To: <9c2783dfae10ef2d1e9b08bcc1e562c5@local>

2.6.32-longterm review patch.  If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <luto@amacapital.net>

commit 394f56fe480140877304d342dec46d50dc823d46 upstream

The theory behind vdso randomization is that it's mapped at a random
offset above the top of the stack.  To avoid wasting a page of
memory for an extra page table, the vdso isn't supposed to extend
past the lowest PMD into which it can fit.  Other than that, the
address should be a uniformly distributed address that meets all of
the alignment requirements.

The current algorithm is buggy: the vdso has about a 50% probability
of being at the very end of a PMD.  The current algorithm also has a
decent chance of failing outright due to incorrect handling of the
case where the top of the stack is near the top of its PMD.
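
As an illustration only (not part of the patch), the pre-fix calculation
can be modelled in user space roughly as follows.  This sketch assumes
x86_64 constants (4 KiB pages, 2 MiB PMDs, 512 PTEs per page table),
takes the random value as a parameter rnd in place of get_random_int(),
and omits the TASK_SIZE_MAX clamp.  The names old_vdso_addr() and
old_clamped_count() are hypothetical.

```c
#include <stdint.h>

/* Assumed x86_64 values; the kernel takes these from its own headers. */
#define PAGE_SHIFT   12
#define PMD_SIZE     (1ULL << 21)
#define PMD_MASK     (~(PMD_SIZE - 1))
#define PTRS_PER_PTE 512

/*
 * User-space model of the pre-fix vdso_addr().  rnd stands in for
 * get_random_int(); the TASK_SIZE_MAX clamp is left out for brevity.
 */
static uint64_t old_vdso_addr(uint64_t start, unsigned len, unsigned rnd)
{
	uint64_t addr, end;
	unsigned offset;

	end = (start + PMD_SIZE - 1) & PMD_MASK;
	end -= len;
	offset = rnd & (PTRS_PER_PTE - 1);
	addr = start + ((uint64_t)offset << PAGE_SHIFT);
	if (addr >= end)
		addr = end;
	return addr;
}

/* Count how many of the 512 possible offsets end up at the clamp value. */
static unsigned old_clamped_count(uint64_t start, unsigned len)
{
	uint64_t end = ((start + PMD_SIZE - 1) & PMD_MASK) - len;
	unsigned rnd, n = 0;

	for (rnd = 0; rnd < PTRS_PER_PTE; rnd++)
		if (old_vdso_addr(start, len, rnd) == end)
			n++;
	return n;
}
```

In this model, with start = 0x7fffe0100000 (halfway through a PMD) and
len = 0x2000 (chosen to match the fe000 endings in the test output
further down), 258 of the 512 offsets collapse onto 0x7fffe01fe000.
With start = 0x7fffe01ff000 the returned address is below start, which
is presumably the outright-failure case.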

This fixes the implementation.  The paxtest estimate of vdso
"randomisation" improves from 11 bits to 18 bits.  (Disclaimer: I
don't know what the paxtest code is actually calculating.)

It's worth noting that this algorithm is inherently biased: the vdso
is more likely to end up near the end of its PMD than near the
beginning.  Ideally we would either nix the PMD sharing requirement
or jointly randomize the vdso and the stack to reduce the bias.
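
A rough way to quantify that bias, as a simplified model rather than a
measurement: assume the page-aligned stack top is equally likely to be
any page of the PMD from which the vdso still fits, and that the offset
is uniform over the remaining pages.  The function name and the
512-pages-per-PMD constant below are assumptions for illustration.

```c
#define PAGES_PER_PMD 512u	/* assumed: 2 MiB PMD / 4 KiB pages */

/*
 * Probability that the vdso starts in the upper half of its PMD under
 * the simplified model described above.  len_pages is the vdso size
 * in pages and must be smaller than PAGES_PER_PMD.
 */
static double upper_half_probability(unsigned len_pages)
{
	unsigned last = PAGES_PER_PMD - len_pages;	/* highest start page that fits */
	double upper = 0.0;
	unsigned s, p;

	for (s = 0; s <= last; s++) {
		unsigned slots = last - s + 1;	/* candidate pages for this stack top */

		for (p = s; p <= last; p++)
			if (p >= PAGES_PER_PMD / 2)
				upper += 1.0 / slots;
	}
	return upper / (last + 1);
}
```

For a two-page vdso this model puts the start address in the upper half
of the PMD well over half of the time, because a high stack top leaves
only high pages to choose from.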

In the meantime, this is a considerable improvement with basically
no risk of compatibility issues, since the allowed outputs of the
algorithm are unchanged.
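
For comparison, a user-space sketch of the fixed calculation, again as
an illustration rather than the kernel code itself.  It assumes the same
x86_64 constants, passes the random value in as rnd instead of calling
get_random_int(), and omits the TASK_SIZE_MAX clamp.  The name
new_vdso_addr() is hypothetical.

```c
#include <stdint.h>

/* Assumed x86_64 values; the kernel takes these from its own headers. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)
#define PMD_SIZE   (1ULL << 21)
#define PMD_MASK   (~(PMD_SIZE - 1))

/*
 * User-space model of the fixed vdso_addr().  rnd stands in for
 * get_random_int(); the TASK_SIZE_MAX clamp is left out for brevity.
 */
static uint64_t new_vdso_addr(uint64_t start, unsigned len, unsigned rnd)
{
	uint64_t addr, end;
	unsigned offset;

	/* Equivalent of PAGE_ALIGN(): round start up to a page boundary. */
	start = (start + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);

	/* Round the lowest possible end address up to a PMD boundary. */
	end = (start + len + PMD_SIZE - 1) & PMD_MASK;
	end -= len;

	if (end > start) {
		/* Every page from start to end inclusive is a candidate. */
		offset = rnd % (((end - start) >> PAGE_SHIFT) + 1);
		addr = start + ((uint64_t)offset << PAGE_SHIFT);
	} else {
		addr = start;
	}
	return addr;
}
```

In this model, start = 0x7fffe0100000 with len = 0x2000 gives 255
candidate pages from start up to 0x7fffe01fe000, each selected by a
distinct residue of rnd, and a start of 0x7fffe01ff000 moves on to the
next PMD instead of returning an address below start.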

As an easy test, doing this:

for i in `seq 10000`
  do grep -P vdso /proc/self/maps |cut -d- -f1
done |sort |uniq -d

used to produce lots of output (1445 lines on my most recent run).
A tiny subset looks like this:

7fffdfffe000
7fffe01fe000
7fffe05fe000
7fffe07fe000
7fffe09fe000
7fffe0bfe000
7fffe0dfe000

Note the suspicious fe000 endings.  With the fix, I get a much more
palatable 76 repeated addresses.

Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
[bwh: Backported to 2.6.32:
 - The whole file is only built for x86_64; adjust context and comment for this
 - We don't have align_vdso_addr()]
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 arch/x86/vdso/vma.c | 36 ++++++++++++++++++++++++++----------
 1 file changed, 26 insertions(+), 10 deletions(-)

diff --git a/arch/x86/vdso/vma.c b/arch/x86/vdso/vma.c
index 21e1aeb..3efc633 100644
--- a/arch/x86/vdso/vma.c
+++ b/arch/x86/vdso/vma.c
@@ -77,23 +77,39 @@ __initcall(init_vdso_vars);
 
 struct linux_binprm;
 
-/* Put the vdso above the (randomized) stack with another randomized offset.
-   This way there is no hole in the middle of address space.
-   To save memory make sure it is still in the same PTE as the stack top.
-   This doesn't give that many random bits */
+/*
+ * Put the vdso above the (randomized) stack with another randomized
+ * offset.  This way there is no hole in the middle of address space.
+ * To save memory make sure it is still in the same PTE as the stack
+ * top.  This doesn't give that many random bits.
+ *
+ * Note that this algorithm is imperfect: the distribution of the vdso
+ * start address within a PMD is biased toward the end.
+ */
 static unsigned long vdso_addr(unsigned long start, unsigned len)
 {
 	unsigned long addr, end;
 	unsigned offset;
-	end = (start + PMD_SIZE - 1) & PMD_MASK;
+
+	/*
+	 * Round up the start address.  It can start out unaligned as a result
+	 * of stack start randomization.
+	 */
+	start = PAGE_ALIGN(start);
+
+	/* Round the lowest possible end address up to a PMD boundary. */
+	end = (start + len + PMD_SIZE - 1) & PMD_MASK;
 	if (end >= TASK_SIZE_MAX)
 		end = TASK_SIZE_MAX;
 	end -= len;
-	/* This loses some more bits than a modulo, but is cheaper */
-	offset = get_random_int() & (PTRS_PER_PTE - 1);
-	addr = start + (offset << PAGE_SHIFT);
-	if (addr >= end)
-		addr = end;
+
+	if (end > start) {
+		offset = get_random_int() % (((end - start) >> PAGE_SHIFT) + 1);
+		addr = start + (offset << PAGE_SHIFT);
+	} else {
+		addr = start;
+	}
+
 	return addr;
 }
 
-- 
1.7.12.2.21.g234cd45.dirty