From mboxrd@z Thu Jan 1 00:00:00 1970
From: Russell Coker
To: selinux-refpolicy@vger.kernel.org, Rahul Sandhu
Subject: Re: RFC: earlyinit_t
Date: Fri, 19 Jun 2026 17:29:25 +1000
Message-ID: <2460841.VLH7GnMWUR@dojacat>
In-Reply-To:
References: <5504686.lxrEIsy0gb@xev>
X-Mailing-List: selinux-refpolicy@vger.kernel.org

On Friday, 19 June 2026 03:18:44 AEST Rahul Sandhu wrote:
> > That was never a problem in the traditional Unix design.
>
> However, it is a problem now!

It is a problem that we are left to solve after others have created it. I am
not opposed to solving it, but we should be clear about where blame lies.

> > This is a bad idea. The initramfs should just do whatever is needed to
> > mount real root and nothing more.
>
> I think there are _some_ advantages (e.g., logging; I find journald to
> be quite exceptional in this area at least). And yes, I too do find the
> extent of stuff created in initramfs frustrating now. Unfortunately, we
> don't really have the ability to change this for all of desktop linux,
> being outside of the purview of refpolicy's scope, leaving it for us to
> deal with.

For this to be needed for logging we need the following conditions to hold:

1) There are useful things to log in early boot which can't be logged in
other ways (EG through the kernel message log or storing a file in a tmpfs).
2) We don't have a "stop initramfs stuff before starting real root stuff"
cutoff.
3) Programs which log via /dev/log can't reconnect when syslogd restarts.

Condition 1 is dubious, and 2 and 3 don't hold AFAIK. If 3 does hold then we
could have some mechanism similar to "telinit u" or "systemctl daemon-reexec"
to restart it, which we would need anyway to allow correctly upgrading
whatever is listening to /dev/log without a reboot.

The recent systemd functionality of listening to sockets on behalf of daemons
allows the "systemctl daemon-reexec" functionality to be centralised. Moving
some of that back out to other processes seems like a backward step. I can't
imagine any scenario where such sockets need to be preserved that doesn't
lead to the inability to restart running processes to fix security bugs
without a reboot or "systemctl soft-reboot". But maybe even "systemctl
soft-reboot" will be insufficient to fix such things.

NB I'm not opposing your proposed solution to the problem that is being
imposed on us. But we have to keep in mind that it is a problem and will have
wider scope than just what we are doing here, including the potential for
security problems.

> > If people continue with the bad idea of processes running from initramfs
> > to
> > multiuser mode there is no other choice.
>
> That's not what's occurring here. systemd holds open various resources
> for processes which die off and then later start again, or sometimes do
> not even start until later. This means that, using the example of the
> journald socket which is created and held open by systemd-init very
> early in the boot process, no logs are ever lost, regardless of whether
> or not journald is stopped to be re-exec'd.

If it's only systemd holding the resources open then "systemctl daemon-reexec"
can solve the problem and cause a domain transition; in fact that's what
currently happens.

> > Probably best to have a new module and make it optional for systems that
> > don't do that sort of thing.
>
> I don't see what is served by making it optional; the sid exists either
> way, it's just kernel_t. If anything, systems which don't do this sort
> of thing stand the most to gain in terms of security improvements.

It can't be optional at link time but it can be optional at compile time.

> > Systems without the unconfined module work well currently with a few
> > tweaks.
> They do! I'm not denying that at all, and I think they work well at the
> moment because of the various "subsystem unconfined" stuff that exists,
> an example being files_manage_all_files(). The point is exactly that:
> for a fair few domains, no real meaningful confinement exists (e.g. the
> init process for systemd). I'm not saying this is a fault of policy; if

system_u:system_r:init_t:s0 etbe 2561 0.0 0.0 23380 1136 ? S May13 0:00 (sd-pam)

Situations like the above are a reason for having some confinement of init_t.
The possibility of sd-pam being exploited to use one of the recent kernel
exploits to get UID 0 isn't a good one.

> anything quite the opposite, and I think it's wise to take the step of
> accepting some of these things as somewhat scopeless. To further expand
> on my systemd example, it basically needs to read any file in theory,
> it would also need to getattr and mount on any file for sandboxing. It
> can load SELinux policy, and has such broad, sweeping access that most
> of the rules are simply there to grant it as close to full access of
> the system as possible. Without commenting on this design from init's
> perspective, I _do_ think that the _policy_ choice of accepting init's
> scope under systemd is a good one, namely because trying to fight what
> the upstream of a piece of software expects breaks robustness of policy
> with practically zero security advancements. Even if we meticulously
> went around adding type attributes to files systemd can sandbox, or the
> dev nodes it can relabel, etc, the accesses would be pretty much just
> as broad anyway, and even if they weren't, a compromised systemd-init
> can just simply load a new policy, or modify the boot chain, etc.

I agree that complexity makes security more difficult and that simplification
can in many situations be helpful for security. Having 36 different domains
for systemd stuff is excessive. We have to make some tough decisions about
these things and I think that 36 systemd domains was not the right decision.

> > The problem with this is demonstrated by all the ifdef(`distro_ubuntu',`
> > sections in the current policy, they have 19 domains unconfined that
> > everyone else has confined. Removing the unconfined module is a way of
> > quickly fixing that on an Ubuntu system.
>
> Hm, that's frustrating. Maybe we could gatekeep this behind a tunable
> or something akin to that for Ubuntu users?

I am not opposed to such a tunable; hopefully we could do it in a way that
the Ubuntu people don't object to.

-- 
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/