linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Andy Lutomirski" <luto@kernel.org>
To: "H. Peter Anvin" <hpa@zytor.com>,
	"Dave Hansen" <dave.hansen@intel.com>,
	"Nadav Amit" <nadav.amit@gmail.com>
Cc: "Pasha Tatashin" <pasha.tatashin@soleen.com>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"the arch/x86 maintainers" <x86@kernel.org>,
	"Borislav Petkov" <bp@alien8.de>,
	"Christian Brauner" <brauner@kernel.org>,
	bristot@redhat.com, "Ben Segall" <bsegall@google.com>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	dianders@chromium.org, dietmar.eggemann@arm.com,
	eric.devolder@oracle.com, hca@linux.ibm.com,
	"hch@infradead.org" <hch@infradead.org>,
	"Jacob Pan" <jacob.jun.pan@linux.intel.com>,
	"Jason Gunthorpe" <jgg@ziepe.ca>,
	jpoimboe@kernel.org, "Joerg Roedel" <jroedel@suse.de>,
	juri.lelli@redhat.com,
	"Kent Overstreet" <kent.overstreet@linux.dev>,
	kinseyho@google.com,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	lstoakes@gmail.com, mgorman@suse.de, mic@digikod.net,
	michael.christie@oracle.com, "Ingo Molnar" <mingo@redhat.com>,
	mjguzik@gmail.com, "Michael S. Tsirkin" <mst@redhat.com>,
	"Nicholas Piggin" <npiggin@gmail.com>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	"Petr Mladek" <pmladek@suse.com>,
	"Rick P Edgecombe" <rick.p.edgecombe@intel.com>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Suren Baghdasaryan" <surenb@google.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Uladzislau Rezki" <urezki@gmail.com>,
	vincent.guittot@linaro.org, vschneid@redhat.com
Subject: Re: [RFC 11/14] x86: add support for Dynamic Kernel Stacks
Date: Mon, 11 Mar 2024 19:16:38 -0700	[thread overview]
Message-ID: <0645946c-f4f5-43b1-a9a0-03ed139036b3@app.fastmail.com> (raw)
In-Reply-To: <D4D747C5-2B0E-44B0-A550-BA05BE0AF2FA@zytor.com>

On Mon, Mar 11, 2024, at 6:25 PM, H. Peter Anvin wrote:
> On March 11, 2024 5:53:33 PM PDT, Dave Hansen <dave.hansen@intel.com> wrote:
>>On 3/11/24 16:56, Nadav Amit wrote:
>>> So you can look on the dirty-bit, which is not being set
>>> speculatively and save yourself one problem.
>>Define "set speculatively". :)
>>
>>> If software on one logical processor writes to a page while software
>>> on another logical processor concurrently clears the R/W flag in the
>>> paging-structure entry that maps the page, execution on some
>>> processors may result in the entry’s dirty flag being set (due to the
>>> write on the first logical processor) and the entry’s R/W flag being
>>> clear (due to the update to the entry on the second logical
>>> processor).
>>
>>In other words, you'll see both a fault *AND* the dirty bit.  The write
>>never retired and the dirty bit is set.
>>
>>Does that count as being set speculatively?
>>
>>That's just the behavior that the SDM explicitly admits to.
>
> Indeed; both the A and D bits are by design permissive; that is, the 
> hardware can set them at any time.
>
> The only guarantees are:
>
> 1. The hardware will not set the A bit on a not present late, nor the D 
> bit on a read only page.

Wait a sec.  What about setting the D bit on a not-present page?

I always assumed that the actual intended purpose of the D bit was for things like file mapping.  Imagine an alternate universe in which Linux used hardware dirty tracking instead of relying on do_wp_page, etc.

mmap(..., MAP_SHARED): PTE is created, read-write, clean

user program may or may not write to the page.

Now either munmap is called or the kernel needs to reclaim memory.  So the kernel checks if the page is dirty and, if so, writes it back, and then unmaps it.

Now some silly people invented SMP, so this needs an atomic operation:  xchg the PTE to all-zeros, see if the dirty bit is set, and, if itt's set, write back the page.  Otherwise discard it.

Does this really not work on Intel CPU?


  reply	other threads:[~2024-03-12  2:17 UTC|newest]

Thread overview: 98+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-11 16:46 [RFC 00/14] Dynamic Kernel Stacks Pasha Tatashin
2024-03-11 16:46 ` [RFC 01/14] task_stack.h: remove obsolete __HAVE_ARCH_KSTACK_END check Pasha Tatashin
2024-03-17 14:36   ` Christophe JAILLET
2024-03-17 15:13     ` Pasha Tatashin
2024-03-11 16:46 ` [RFC 02/14] fork: Clean-up ifdef logic around stack allocation Pasha Tatashin
2024-03-11 16:46 ` [RFC 03/14] fork: Clean-up naming of vm_strack/vm_struct variables in vmap stacks code Pasha Tatashin
2024-03-17 14:42   ` Christophe JAILLET
2024-03-19 16:32     ` Pasha Tatashin
2024-03-11 16:46 ` [RFC 04/14] fork: Remove assumption that vm_area->nr_pages equals to THREAD_SIZE Pasha Tatashin
2024-03-17 14:45   ` Christophe JAILLET
2024-03-17 15:14     ` Pasha Tatashin
2024-03-11 16:46 ` [RFC 05/14] fork: check charging success before zeroing stack Pasha Tatashin
2024-03-12 15:57   ` Kirill A. Shutemov
2024-03-12 16:52     ` Pasha Tatashin
2024-03-11 16:46 ` [RFC 06/14] fork: zero vmap stack using clear_page() instead of memset() Pasha Tatashin
2024-03-12  7:15   ` Nikolay Borisov
2024-03-12 16:53     ` Pasha Tatashin
2024-03-14  7:55       ` Christophe Leroy
2024-03-14 13:52         ` Pasha Tatashin
2024-03-17 14:48   ` Christophe JAILLET
2024-03-17 15:15     ` Pasha Tatashin
2024-03-11 16:46 ` [RFC 07/14] fork: use the first page in stack to store vm_stack in cached_stacks Pasha Tatashin
2024-03-11 16:46 ` [RFC 08/14] fork: separate vmap stack alloction and free calls Pasha Tatashin
2024-03-14 15:18   ` Jeff Xie
2024-03-14 17:14     ` Pasha Tatashin
2024-03-17 14:51   ` Christophe JAILLET
2024-03-17 15:15     ` Pasha Tatashin
2024-03-11 16:46 ` [RFC 09/14] mm/vmalloc: Add a get_vm_area_node() and vmap_pages_range_noflush() public functions Pasha Tatashin
2024-03-11 16:46 ` [RFC 10/14] fork: Dynamic Kernel Stacks Pasha Tatashin
2024-03-11 19:32   ` Randy Dunlap
2024-03-11 19:55     ` Pasha Tatashin
2024-03-11 16:46 ` [RFC 11/14] x86: add support for " Pasha Tatashin
2024-03-11 22:17   ` Andy Lutomirski
2024-03-11 23:10     ` Pasha Tatashin
2024-03-11 23:33       ` Thomas Gleixner
2024-03-11 23:34       ` Andy Lutomirski
2024-03-12  0:08         ` Pasha Tatashin
2024-03-12  0:23           ` Pasha Tatashin
2024-03-11 23:34     ` Dave Hansen
2024-03-11 23:41       ` Andy Lutomirski
2024-03-11 23:56         ` Nadav Amit
2024-03-12  0:02           ` Andy Lutomirski
2024-03-12  7:20             ` Nadav Amit
2024-03-12  0:53           ` Dave Hansen
2024-03-12  1:25             ` H. Peter Anvin
2024-03-12  2:16               ` Andy Lutomirski [this message]
2024-03-12  2:20                 ` H. Peter Anvin
2024-03-12 21:58   ` Andi Kleen
2024-03-13 10:23   ` Thomas Gleixner
2024-03-13 13:43     ` Pasha Tatashin
2024-03-13 15:28       ` Pasha Tatashin
2024-03-13 16:12         ` Thomas Gleixner
2024-03-14 14:03           ` Pasha Tatashin
2024-03-14 18:26             ` Thomas Gleixner
2024-03-11 16:46 ` [RFC 12/14] task_stack.h: Clean-up stack_not_used() implementation Pasha Tatashin
2024-03-11 16:46 ` [RFC 13/14] task_stack.h: Add stack_not_used() support for dynamic stack Pasha Tatashin
2024-03-11 16:46 ` [RFC 14/14] fork: Dynamic Kernel Stack accounting Pasha Tatashin
2024-03-11 17:09 ` [RFC 00/14] Dynamic Kernel Stacks Mateusz Guzik
2024-03-11 18:58   ` Pasha Tatashin
2024-03-11 19:21     ` Mateusz Guzik
2024-03-11 19:55       ` Pasha Tatashin
2024-03-12 17:18 ` H. Peter Anvin
2024-03-12 19:45   ` Pasha Tatashin
2024-03-12 21:36     ` H. Peter Anvin
2024-03-14 19:05       ` Kent Overstreet
2024-03-14 19:23         ` Pasha Tatashin
2024-03-14 19:28           ` Kent Overstreet
2024-03-14 19:34             ` Pasha Tatashin
2024-03-14 19:49               ` Kent Overstreet
2024-03-12 22:18     ` David Laight
2024-03-14 19:43   ` Matthew Wilcox
2024-03-14 19:53     ` Kent Overstreet
2024-03-14 19:57       ` Matthew Wilcox
2024-03-14 19:58         ` Kent Overstreet
2024-03-15  3:13         ` Pasha Tatashin
2024-03-15  3:39           ` H. Peter Anvin
2024-03-16 19:17             ` Pasha Tatashin
2024-03-17  0:41               ` Matthew Wilcox
2024-03-17  1:32                 ` Kent Overstreet
2024-03-17 14:19                 ` Pasha Tatashin
2024-03-17 14:43               ` Brian Gerst
2024-03-17 16:15                 ` Pasha Tatashin
2024-03-17 21:30                   ` Brian Gerst
2024-03-18 14:59                     ` Pasha Tatashin
2024-03-18 21:02                       ` Brian Gerst
2024-03-19 14:56                         ` Pasha Tatashin
2024-03-17 18:57               ` David Laight
2024-03-18 15:09                 ` Pasha Tatashin
2024-03-18 15:13                   ` Pasha Tatashin
2024-03-18 15:19                   ` Matthew Wilcox
2024-03-18 15:30                     ` Pasha Tatashin
2024-03-18 15:53                       ` David Laight
2024-03-18 16:57                         ` Pasha Tatashin
2024-03-18 15:38               ` David Laight
2024-03-18 17:00                 ` Pasha Tatashin
2024-03-18 17:37                   ` Pasha Tatashin
2024-03-15  4:17           ` H. Peter Anvin
2024-03-17  0:47     ` H. Peter Anvin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0645946c-f4f5-43b1-a9a0-03ed139036b3@app.fastmail.com \
    --to=luto@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=bp@alien8.de \
    --cc=brauner@kernel.org \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=dave.hansen@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=dianders@chromium.org \
    --cc=dietmar.eggemann@arm.com \
    --cc=eric.devolder@oracle.com \
    --cc=hca@linux.ibm.com \
    --cc=hch@infradead.org \
    --cc=hpa@zytor.com \
    --cc=jacob.jun.pan@linux.intel.com \
    --cc=jgg@ziepe.ca \
    --cc=jpoimboe@kernel.org \
    --cc=jroedel@suse.de \
    --cc=juri.lelli@redhat.com \
    --cc=kent.overstreet@linux.dev \
    --cc=kinseyho@google.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lstoakes@gmail.com \
    --cc=mgorman@suse.de \
    --cc=mic@digikod.net \
    --cc=michael.christie@oracle.com \
    --cc=mingo@redhat.com \
    --cc=mjguzik@gmail.com \
    --cc=mst@redhat.com \
    --cc=nadav.amit@gmail.com \
    --cc=npiggin@gmail.com \
    --cc=pasha.tatashin@soleen.com \
    --cc=peterz@infradead.org \
    --cc=pmladek@suse.com \
    --cc=rick.p.edgecombe@intel.com \
    --cc=rostedt@goodmis.org \
    --cc=surenb@google.com \
    --cc=tglx@linutronix.de \
    --cc=urezki@gmail.com \
    --cc=vincent.guittot@linaro.org \
    --cc=vschneid@redhat.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).