From: Jeff Xu
Date: Wed, 16 Aug 2023 15:46:41 -0700
Subject: Re: [PATCH v2 4/5] memfd: replace ratcheting feature from vm.memfd_noexec with hierarchy
To: Dominique Martinet
Cc: Aleksa Sarai, Andrew Morton, Shuah Khan, Kees Cook, Daniel Verkamp,
    Christian Brauner, stable@vger.kernel.org, linux-api@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-kselftest@vger.kernel.org
References: <20230814-memfd-vm-noexec-uapi-fixes-v2-0-7ff9e3e10ba6@cyphar.com>
    <20230814-memfd-vm-noexec-uapi-fixes-v2-4-7ff9e3e10ba6@cyphar.com>

On Tue, Aug 15, 2023 at 10:44 PM Dominique Martinet wrote:
>
> Jeff Xu wrote on Tue, Aug 15, 2023 at 10:13:18PM -0700:
> > > Given that it is possible for CAP_SYS_ADMIN users to create executable
> > > binaries without memfd_create(2) and without touching the host
> > > filesystem (not to mention the many other things a CAP_SYS_ADMIN process
> > > would be able to do that would be equivalent or worse), it seems strange
> > > to cause a fair amount of headache to admins when there
> > > doesn't appear to be an actual security benefit to blocking this. There
> > > appear to be concerns about confused-deputy-esque attacks[2] but a
> > > confused deputy that can write to arbitrary sysctls is a bigger security
> > > issue than executable memfds.
> > >
> > Something to point out: The demo code might be enough to prove your
> > case in other distributions, however, in ChromeOS, you can't run this
> > code. The executable in ChromeOS are all from known sources and
> > verified at boot.
> > If an attacker could run this code in ChromeOS, that means the
> > attacker already acquired arbitrary code execution through other ways,
> > at that point, the attacker no longer needs to create/find an
> > executable memfd, they already have the vehicle. You can't use an
> > example of an attacker already running arbitrary code to prove that
> > disable downgrading is useless.
> > I agree it is a big problem that an attacker already can modify a
> > sysctl. Assuming this can happen by controlling arguments passed into
> > sysctl, at the time, the attacker might not have full arbitrary code
> > execution yet, that is the reason the original design is so
> > restrictive.
>
> I don't understand how you can say an attacker cannot run arbitrary code
> within a process here, yet assert that they'd somehow run memfd_create +
> execveat on it if this sysctl is lowered -- the two look equivalent to
> me?
>

It might require multiple steps for this attack. One possible scenario:

1> Control a write primitive in a CAP_SYS_ADMIN process's memory, change
   the arguments of a sysctl call, and downgrade the setting for memfd,
   e.g. change it to =0 to revert to the old behavior (creating
   executable memfds by default).
2> Control a non-privileged process that creates and writes to a memfd,
   and fill it with the binary that the attacker wants. This process
   only needs a non-executable memfd, but hasn't been updated yet.
3> Confuse a non-privileged process into executing the memfd the
   attacker wrote in step 2.

In ChromeOS, because all executables come from verified sources,
attackers typically can't easily use step 3 alone (without step 2), and
memfd was such a hole that enabled an unverified executable.

In the original design, downgrading is not allowed, so the attack chain
of steps 2 and 3 is completely blocked. With this new approach,
attackers will try to find an additional step (step 1) to make the old
attack (steps 2 and 3) work again. It is difficult, but I can't say it
is impossible.

> CAP_SYS_ADMIN is a kludge of a capability that pretty much gives root as
> soon as you can run arbitrary code (just have a look at the various
> container escape example when the capability is given); I see little
> point in trying to harden just this here.

I'm not an expert in containers; if the industry is giving up on
privileged containers, then the reasoning makes sense. From a ChromeOS
point of view, we don't use runc currently, so I think it makes more
sense for runc users to drive these features. The original design was
made with runc in mind, and even privileged containers can't downgrade
their own setting.

> It'd make more sense to limit all sysctl modifications in the context
> you're thinking of through e.g. selinux or another LSM.
>

I agree, now that I think more about this. Security features fit an LSM
better: an LSM can do additional "allow/deny" on otherwise allowed
behavior from user space code. Based on that, "disallow downgrading"
fits an LSM better.

Also, by the same reasoning, I have second thoughts on "=2". Originally,
MFD_EXEC was left out of it due to the thinking that if user code
explicitly sets MFD_EXEC, the sysctl should not block it; that is the
work of an LSM. However, "=2" has evolved to block MFD_EXEC completely
... alas ... it might be too late to revert this; if this is what devs
want, it can be that way.
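
For readers following the "=0" and "=2" discussion, here is a minimal
sketch of how the vm.memfd_noexec levels are understood to treat
memfd_create(2) flags. It is a Python model, not kernel code. The helper
names are hypothetical, the flag values are assumed to match
linux/memfd.h, the behavior of "=1" is taken from the sysctl
documentation rather than from this thread, and the hierarchy rule
assumes "hierarchy" means the strictest setting among a pid namespace
and its parent applies.

```python
import errno

# Flag values assumed to match include/uapi/linux/memfd.h.
MFD_NOEXEC_SEAL = 0x0008
MFD_EXEC = 0x0010


def effective_memfd_flags(sysctl_level: int, flags: int) -> int:
    """Model (not kernel code) of how vm.memfd_noexec adjusts the flags.

    Hypothetical helper: returns the flags the memfd would end up with,
    or raises PermissionError where the call is assumed to be refused.
    """
    if sysctl_level not in (0, 1, 2):
        raise ValueError("vm.memfd_noexec is assumed to take 0, 1 or 2")

    # Caller chose neither flag: the sysctl level picks the default.
    if not flags & (MFD_EXEC | MFD_NOEXEC_SEAL):
        flags |= MFD_NOEXEC_SEAL if sysctl_level >= 1 else MFD_EXEC

    # Level 2 refuses anything that is not sealed non-executable,
    # including an explicit MFD_EXEC request.
    if sysctl_level >= 2 and not flags & MFD_NOEXEC_SEAL:
        raise PermissionError(errno.EACCES, "executable memfd blocked")

    return flags


def effective_level(own_level: int, parent_level: int) -> int:
    """Assumed hierarchy rule: a child is never laxer than its parent."""
    return max(own_level, parent_level)
```

In this model, a caller that passes no flags gets an executable memfd at
level 0, which is the state step 1 of the scenario tries to reach, while
at level 2 even an explicit MFD_EXEC request is refused.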

Thanks
Best regards,
-Jeff

> (in the context of users making their own containers, my suggestion is
> always to never use CAP_SYS_ADMIN, or if they must give it to a separate
> minimal container where they can limit user interaction)
>
>
> FWIW, I also think the proposed =2 behaviour makes more sense, but this
> is something we already discussed last month so I won't come back to it
> as not really involved here.
>
> --
> Dominique Martinet | Asmadeus