Date: Sat, 20 Mar 2021 12:58:20 +0100
From: Ingo Molnar
To: Kees Cook
Cc: Thomas Gleixner, Elena Reshetova, x86@kernel.org, Andy Lutomirski,
 Peter Zijlstra, Catalin Marinas, Will Deacon, Mark Rutland,
 Alexander Potapenko, Alexander Popov, Ard Biesheuvel, Jann Horn,
 Vlastimil Babka, David Hildenbrand, Mike Rapoport, Andrew Morton,
 Jonathan Corbet, Randy Dunlap, kernel-hardening@lists.openwall.com,
 linux-hardening@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Linus Torvalds,
 Borislav Petkov
Subject: Re: [PATCH v7 4/6] x86/entry: Enable random_kstack_offset support
Message-ID: <20210320115820.GA4151166@gmail.com>
References: <20210319212835.3928492-1-keescook@chromium.org>
 <20210319212835.3928492-5-keescook@chromium.org>
In-Reply-To: <20210319212835.3928492-5-keescook@chromium.org>

* Kees Cook wrote:

> Allow for a randomized stack offset on a per-syscall basis, with roughly
> 5-6 bits of entropy, depending on compiler and word size.
> Since the method of offsetting uses macros, this cannot live in the
> common entry code (the stack offset needs to be retained for the life
> of the syscall, which means it needs to happen at the actual entry
> point).

> __visible noinstr void do_syscall_64(unsigned long nr, struct pt_regs *regs)
> {
> +	add_random_kstack_offset();
> 	nr = syscall_enter_from_user_mode(regs, nr);

> @@ -83,6 +84,7 @@ __visible noinstr void do_int80_syscall_32(struct pt_regs *regs)
> {
> 	unsigned int nr = syscall_32_enter(regs);
>
> +	add_random_kstack_offset();

> 	unsigned int nr = syscall_32_enter(regs);
> 	int res;
>
> +	add_random_kstack_offset();

> @@ -70,6 +71,13 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
> 	 */
> 	current_thread_info()->status &= ~(TS_COMPAT | TS_I386_REGS_POKED);
> #endif
> +
> +	/*
> +	 * x86_64 stack alignment means 3 bits are ignored, so keep
> +	 * the top 5 bits. x86_32 needs only 2 bits of alignment, so
> +	 * the top 6 bits will be used.
> +	 */
> +	choose_random_kstack_offset(rdtsc() & 0xFF);
> }

1)

Wondering why the calculation of the kstack offset (which happens in
every syscall) is separated from the entry-time logic and happens
during return to user-space?
The two methods:

+#define add_random_kstack_offset() do {					\
+	if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,	\
+				&randomize_kstack_offset)) {		\
+		u32 offset = this_cpu_read(kstack_offset);		\
+		u8 *ptr = __builtin_alloca(offset & 0x3FF);		\
+		asm volatile("" : "=m"(*ptr) :: "memory");		\
+	}								\
+} while (0)
+
+#define choose_random_kstack_offset(rand) do {				\
+	if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,	\
+				&randomize_kstack_offset)) {		\
+		u32 offset = this_cpu_read(kstack_offset);		\
+		offset ^= (rand);					\
+		this_cpu_write(kstack_offset, offset);		\
+	}								\
+} while (0)

choose_random_kstack_offset() basically calculates the offset and
stores it in a percpu variable (mixing it with the previous offset
value), while add_random_kstack_offset() uses it in an alloca()
dynamic stack allocation.

Wouldn't it be (slightly) lower combined overhead to just do it in a
single step? There would be duplication along the 3 syscall entry
points, but this should be marginal, as the code looks small and the
entry points would probably be cache-hot.

2)

Another detail I noticed: add_random_kstack_offset() limits the offset
to 0x3FF, i.e. 1k (10 bits), but the RDTSC mask is 0xFF (8 bits):

+	/*
+	 * x86_64 stack alignment means 3 bits are ignored, so keep
+	 * the top 5 bits. x86_32 needs only 2 bits of alignment, so
+	 * the top 6 bits will be used.
+	 */
+	choose_random_kstack_offset(rdtsc() & 0xFF);

alloca() itself works in byte units and will round the allocation up
to 8 bytes on x86-64 and to 4 bytes on x86-32; this is what the
'ignored bits' reference in the comment refers to, right?

Why is there a 0x3FF mask for the alloca() call but only a 0xFF mask
for the RDTSC randomizing value? Shouldn't the two be synced up? Or
was the intention to shift the RDTSC value to the left by 3 bits?

3)

Finally, kstack_offset is a percpu variable:

#ifdef CONFIG_HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
...
	DEFINE_PER_CPU(u32, kstack_offset);

This is inherited across tasks on scheduling, and new syscalls will
mix in new RDTSC values to continue to randomize the offset.

Wouldn't it make sense to further mix values into this across context
switching boundaries? A really inexpensive way would be to take the
kernel stack value and mix it into the offset, and maybe even the
randomized t->stack_canary value?

This would further isolate the syscall kernel stack offsets of
separate execution contexts from each other, should an attacker find
a way to estimate or influence likely RDTSC values.

Thanks,

	Ingo