Date: Fri, 16 Feb 2024 10:17:29 +1100
From: Dave Chinner
To: Adrian Vovk
Cc: Jan Kara, Matthew Wilcox, Christian Brauner, lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-btrfs@vger.kernel.org, linux-block@vger.kernel.org, Christoph Hellwig
Subject: Re: [LSF/MM/BPF TOPIC] Dropping page cache of individual fs

On Thu, Feb 15, 2024 at 02:46:52PM -0500, Adrian Vovk wrote:
> On 2/15/24 08:57, Jan Kara wrote:
> > On Mon 29-01-24 19:13:17, Adrian Vovk wrote:
> > > Hello! I'm the "GNOME people" who Christian is referring to
> > Got back to thinking about this after a while...
> >
> > > On 1/17/24 09:52, Matthew Wilcox wrote:
> > > > I feel like we're in an XY trap [1]. What Christian actually wants is
> > > > to not be able to access the contents of a file while the device it's
> > > > on is suspended, and we've gone from there to "must drop the page cache".
> > > What we really want is for the plaintext contents of the files to be gone
> > > from memory while the dm-crypt device backing them is suspended.
> > >
> > > Ultimately my goal is to limit the chance that an attacker with access to a
> > > user's suspended laptop will be able to access the user's encrypted data. I
> > > need to achieve this without forcing the user to completely log out/power
> > > off/etc their system; it must be invisible to the user. The key word here is
> > > limit; if we can remove _most_ files from memory _most_ of the time I think
> > > luksSuspend would be a lot more useful against cold boot than it is today.
> > Well, but if your attack vector are cold-boot attacks, then how does
> > freeing pages from the page cache help you? I mean sure the page allocator
> > will start tracking those pages with potentially sensitive content as free
> > but unless you also zero all of them, this doesn't help anything against
> > cold-boot attacks? The sensitive memory content is still there...
> >
> > So you would also have to enable something like zero-on-page-free and
> > generally the cost of this is going to be pretty big?
>
> Yes you are right. Just marking pages as free isn't enough.
>
> I'm sure it's reasonable enough to zero out the pages that are getting
> free'd at our request. But the difficulty here is to try and clear pages
> that were freed previously for other reasons, unless we're zeroing out all
> pages on free. So I suppose that leaves me with a couple questions:
>
> - As far as I know, the kernel only naturally frees pages from the page
> cache when they're about to be given to some program for imminent use.

Memory pressure does cause cache reclaim. Not just page cache, but
also slab caches and anything else various subsystems can clean up
to free memory.

> But
> then in the case the page isn't only free'd, but also zero'd out before it's
> handed over to the program (because giving a program access to a page filled
> with potentially sensitive data is a bad idea!). Is this correct?

Memory exposed to userspace is zeroed before userspace can access it.
Kernel memory is not zeroed unless the caller specifically asks for
it to be zeroed.

> - Are there other situations (aside from drop_caches) where the kernel frees
> pages from the page cache? Especially without having to zero them anyway? In

truncate(), fallocate(), direct IO, fadvise(), madvise(), etc. IOWs,
there are lots of runtime vectors that cause page cache to be
freed.

> other words, what situations would turning on some zero-pages-on-free
> setting actually hurt performance?

Lots. Page contents are typically cold when the page is freed, so
the zeroing is typically memory latency and bandwidth bound. And
doing it on free means there aren't any of the "cache priming"
performance benefits that we get with zeroing at allocation, because
the page contents are not going to be immediately accessed by the
kernel or userspace.

> - Does dismounting a filesystem completely zero out the removed fs's pages
> from the page cache?

No. It just frees them. No explicit zeroing.

> - I remember hearing somewhere of some Linux support for zeroing out all
> pages in memory if they're free'd from the page cache. However, I spent a
> while trying to find this (how to turn it on, benchmarks) and I couldn't
> find it. Do you know if such a thing exists, and if so how to turn it on?
> I'm curious of the actual performance impact of it.

You can test it for yourself: the init_on_free kernel command line
option controls whether the kernel zeroes on free.

Typical distro configuration is:

$ sudo dmesg |grep auto-init
[    0.018882] mem auto-init: stack:all(zero), heap alloc:on, heap free:off
$

So this kernel zeroes all stack memory, page and heap memory on
allocation, and does nothing on free...

-Dave.
-- 
Dave Chinner
david@fromorbit.com
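
For scripting the check above, a minimal Python sketch of parsing the
"mem auto-init" line might look like the following. The names
parse_auto_init and zeroes_on_free are hypothetical helpers, the sample
is the dmesg line quoted above, and the "key:value, key:value" layout is
assumed to match that sample; other kernels may print more fields.

```python
# Sample copied from the dmesg output quoted in the message above.
SAMPLE = "[    0.018882] mem auto-init: stack:all(zero), heap alloc:on, heap free:off"


def parse_auto_init(line):
    """Split a 'mem auto-init' dmesg line into a {setting: value} dict."""
    _, marker, settings = line.partition("mem auto-init:")
    if not marker:
        raise ValueError("not a 'mem auto-init' line")
    result = {}
    for field in settings.split(","):
        # Split on the last colon so a key such as 'heap free' stays whole.
        key, _, value = field.strip().rpartition(":")
        result[key] = value
    return result


def zeroes_on_free(line):
    """Return True if the line reports that freed memory is zeroed."""
    return parse_auto_init(line).get("heap free") == "on"


if __name__ == "__main__":
    print(parse_auto_init(SAMPLE))
    print("zero on free:", zeroes_on_free(SAMPLE))
```

On a kernel booted with init_on_free=1, the same line would be expected
to report "heap free:on", which is the case zeroes_on_free() looks for.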