Date: Wed, 30 Oct 2024 18:25:18 +0900
From: Hajime Tazaki
To: benjamin@sipsolutions.net
Cc: linux-um@lists.infradead.org, jdike@addtoit.com, richard@nod.at,
    anton.ivanov@cambridgegreys.com, johannes@sipsolutions.net,
    ricarkol@google.com
Subject: Re: [RFC PATCH 00/13] nommu UML
In-Reply-To: <1246523a29aeca3d727a359ebf8ccd931631e1ef.camel@sipsolutions.net>

Hello,

On Mon, 28 Oct 2024 22:32:43 +0900,
Benjamin Berg wrote:

> > > > - a crash on userspace programs crashes a UML kernel, not signaling
> > > >   with SIGSEGV to the program.
> > > > - commit c27e618 (during v6.12-rc1 merge) introduces invalid access to
> > > >   a vma structure for our case, which updates the internal procedure
> > > >   of maple_tree subsystem.  We're trying to fix issue but still a
> > > >   random process on exit(2) crashes.
> > >
> > > Btw. are you handling FP register save/restore? If it is not there, it
> > > probably would not be too hard to add (XSAVE, etc.), though it might
> > > add a bit of additional overhead.
> > > Especially as UML always saves the FP
> > > state rather than optimizing it like the x86 architectures.
> >
> > The patch handles fp register on entry/leave at syscall; [07/13] patch
> > contains this part.
>
> That looks like FS/GS registers which are for thread-local storage. I
> was talking about floating point registers. Maybe you meant another
> patch?

Oh, this is my terrible mistake... No, the patch doesn't handle FP
registers at all.

> > I'm not familiar with that but what kind of optimizations does x86
> > architecture do for fp register handling ?
>
> The kernel does not usually need the FP registers. So it optimizes the
> pretty common case of a userspace -> kernel -> userspace switch that
> happens for a syscall by simply not saving/restoring these registers at
> all.
>
> Obviously, it then still needs to do the work when the task is switched
> or in the rare case that the kernel wants to use floating point itself.

Thanks for the information.

> > > I am a bit confused overall. I mean, zpoline seems kind of neat, but a
> > > requirement on patching userspace code also seems like a lot.
> > >
> > > To me, it seems much more natural to catch the userspace syscalls using
> > > a SECCOMP filter[1]. While quite a lot slower, that should be much more
> > > portable across architectures. For improved speed one could still do
> > > architecture specific things inside the vDSO or by using zpoline. But
> > > those would then "just" be optimizations and unpatched code would still
> > > work correctly (e.g. JIT).
> >
> > I'm not proposing this patch to replace existing UML implementations;
> > for instance, the patchset cannot run CONFIG_MMU code in the whole
> > kernel tree so, existing ptrace-based implementation still has real
> > usecase.  and ptrace based syscall hook is not indeed fast and the
> > improvements with seccomp filter instead clearly has benefits.  I
> > think it's independent to this patchset.
>
> Of course.
> nommu mode is a completely independent feature.
>
> I am still wondering a bit about the users for such a mode. It is not
> interesting for us as we use it for testing. Of course, speed is nice
> but it is not the primary objective.
>
> I understand that it can be an approach for a small "container", but
> then you would need a very strict SECCOMP filter for the kernel itself.

I didn't specifically describe the use cases in the v1 patch, but here
is the list I have in mind:

1) container-like use cases (the original work was aimed at this),
2) testing nommu code in the kernel,
3) faster I/O workloads that issue a large number of syscalls over UML.

I think this list covers most of the reasons to have a !MMU mode
alongside the current MMU-full UML.

Speed may indeed not be the primary objective, but if you have dozens
of test cases that each issue many syscalls (which I think is a
plausible case), this might be helpful.

(snip)

> > > For me, a big argument in favour of such an approach is its simplicity.
> > > I am mostly basing that on the fact that this patchset should properly
> > > handle other signals like SIGFPE and SIGSEGV. And, once it does that,
> > > you will already have all the infrastructure to do the correct register
> > > save/restore using the host mcontext, which is what is needed in the
> > > SIGSYS handler when using SECCOMP. The filter itself should be simple
> > > as it just needs to catch all syscalls within valid userspace
> > > executable memory[2] ranges.
> >
> > I agree with your observation that the approach is simple.
> > I don't have a good idea on how to handle SIGSEGV, but will try to see
> > with your inputs.
>
> You can probably use "[RFC PATCH v2 5/9] um: Add helper functions to
> get/set state for SECCOMP" for getting the registers and also writing
> them back if you want to restore using rt_sigreturn.

Thanks. I'm still trying various ways to deliver SIGSEGV to userspace,
but no luck so far... I will get back to you once I come up with
something that works.

(snip)

> > > [2] I am assuming that userspace executable code is already confined to
> > > a certain address space within the UML process. Obviously, the kernel
> > > itself and loaded modules need to be free to do host syscalls and
> > > should not be affected by the SECCOMP filter.
> >
> > I think our !MMU UML doesn't break this assumption.  But did you see
> > something to our patchset ?
>
> I also assume that is fine. One just needs to understand this when
> writing a SECCOMP filter for syscall emulation in nommu mode.

Okay, thanks for the clarification.

-- Hajime
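
A footnote on the filter described in [2]. What follows is only a
minimal sketch under stated assumptions, not code from this patchset or
from the SECCOMP series. It models a classic-BPF seccomp program that
returns SECCOMP_RET_TRAP (i.e. SIGSYS) only when the syscall's
instruction pointer lies inside the userspace executable range, and
lets everything else (kernel, modules) reach the host. The range
USER_START/USER_END, the helper names and the tiny interpreter are
hypothetical and exist only for illustration. The opcode values,
return actions and struct seccomp_data offsets are the ones from the
Linux UAPI headers, assuming a little-endian host. A real filter would
also check the arch field and would be installed with seccomp(2) or
prctl(PR_SET_SECCOMP).

```python
import struct

# Classic BPF opcodes (values from the Linux UAPI headers).
BPF_LD_W_ABS = 0x20   # BPF_LD | BPF_W | BPF_ABS
BPF_JEQ_K = 0x15      # BPF_JMP | BPF_JEQ | BPF_K
BPF_JGE_K = 0x35      # BPF_JMP | BPF_JGE | BPF_K
BPF_RET_K = 0x06      # BPF_RET | BPF_K

# Return actions from <linux/seccomp.h>.
SECCOMP_RET_TRAP = 0x00030000   # deliver SIGSYS to the calling thread
SECCOMP_RET_ALLOW = 0x7FFF0000  # let the host syscall through

# Offsets into struct seccomp_data (little-endian host assumed).
IP_LO_OFFSET = 8    # low 32 bits of instruction_pointer
IP_HI_OFFSET = 12   # high 32 bits of instruction_pointer


def build_filter(start, end):
    """Return (code, jt, jf, k) tuples trapping syscalls with start <= ip < end.

    Assumption for brevity: the range does not cross a 4 GiB boundary,
    so a single comparison on the upper 32 bits is enough.
    """
    assert start < end and (start >> 32) == (end >> 32)
    hi = start >> 32
    start_lo = start & 0xFFFFFFFF
    end_lo = end & 0xFFFFFFFF
    return [
        (BPF_LD_W_ABS, 0, 0, IP_HI_OFFSET),
        (BPF_JEQ_K, 0, 4, hi),            # different upper half -> ALLOW
        (BPF_LD_W_ABS, 0, 0, IP_LO_OFFSET),
        (BPF_JGE_K, 0, 2, start_lo),      # below start -> ALLOW
        (BPF_JGE_K, 1, 0, end_lo),        # at or past end -> ALLOW
        (BPF_RET_K, 0, 0, SECCOMP_RET_TRAP),
        (BPF_RET_K, 0, 0, SECCOMP_RET_ALLOW),
    ]


def pack_seccomp_data(nr, arch, ip, args=(0,) * 6):
    """Pack nr, arch, instruction_pointer and args[6] like struct seccomp_data."""
    return struct.pack("<iIQ6Q", nr, arch, ip, *args)


def run_filter(prog, data):
    """Toy interpreter for the four opcodes used above (a model, not the kernel)."""
    acc, pc = 0, 0
    while True:
        code, jt, jf, k = prog[pc]
        pc += 1
        if code == BPF_LD_W_ABS:
            acc = struct.unpack_from("<I", data, k)[0]
        elif code == BPF_JEQ_K:
            pc += jt if acc == k else jf
        elif code == BPF_JGE_K:
            pc += jt if acc >= k else jf
        elif code == BPF_RET_K:
            return k
        else:
            raise ValueError(f"unsupported opcode {code:#x}")


# Hypothetical userspace text range and an x86_64 arch value.
USER_START, USER_END = 0x7F0000001000, 0x7F0000100000
AUDIT_ARCH_X86_64 = 0xC000003E
prog = build_filter(USER_START, USER_END)
inside = pack_seccomp_data(1, AUDIT_ARCH_X86_64, USER_START + 0x10)
outside = pack_seccomp_data(1, AUDIT_ARCH_X86_64, 0x60001000)
print(hex(run_filter(prog, inside)), hex(run_filter(prog, outside)))
```

The range check is done on the two 32-bit halves because classic BPF
only loads 32-bit words. A range spanning a 4 GiB boundary would need
an extra comparison on the upper half.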