All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andi Kleen <ak@suse.de>
To: discuss@x86-64.org
Cc: "Jan Beulich" <jbeulich@novell.com>, linux-kernel@vger.kernel.org
Subject: Re: [discuss] change_page_attr() and global_flush_tlb()
Date: Thu, 5 Apr 2007 14:43:12 +0200	[thread overview]
Message-ID: <200704051443.12505.ak@suse.de> (raw)
In-Reply-To: <4614DE61.76E4.0078.0@novell.com>

On Thursday 05 April 2007 11:32:49 Jan Beulich wrote:
> Looking at both the i386 and x86-64 implementations I fail to understand why
> there is an explicit requirement on calling global_flush_tlb() after
> change_page_attr(), yet actual TLB flushing will not normally happen (on i386
> it will happen if the CPU doesn't support clflush, but if it does, or on x86-64,
> the flushing depends on the list of deferred pages being non-empty, which
> can only happen when a large page gets re-combined). Is it assumed that
> the callers additionally call tlb_flush_all() (I think none of them do)?

Not sure I understand the question? global_flush_tlb is perhaps a little
misnamed, but it only flushes the pages changed in change_page_attr.
This works because it uses INVLPG which should ignore the G bits,
so not additional global flush is needed.

Linus wanted it done this two step way because he was worried about too 
many IPIs. I guess it doesn't make too much difference though because near all
callers only change single pages. The flush could be probably folded
back into c_p_a().

BTW we know the sequence for doing this is not quite as recommended
by Intel (TODO item) but afaik the TLB flushing works. 

> Further, change_page_attr()'s reference counting in a split large page's
> page table appears to imply that attributes are only changed from or back to
> the reference attributes, but not from one kind of non-default ones to the
> same or another set of non-default ones (otherwise the reference count
> will never again drop to zero), > and also not from default to default (i.e. the 
> caller trying to revert attributes to normal not knowing what state they are
> currently in) - this would BUG() if the large page was already reverted, or
> screw the reference count otherwise. Is all of this intentional? I think it
> will need to be changed as a prerequisite to supporting on-the-fly attribute
> changes in the SMP alternatives code, which was requested as a follow-up
> to the tightening of the CONFIG_DEBUG_RODATA effects.

The reference count is just to count pages that have a non default attribute
in the PMD range so that we know when to revert to a large page.

Originally attribute was only the caching attribute, but later changed
to include NX and RW for slab (but that area was always a bit hackish
and might have some bugs)

For non default to another non default changes the count should not change.

If it doesn't work this way that would be a bug.

> Finally, at least for the kernel image range it would seem to me that it might
> be beneficial to recombine mappings into large ones even when the
> attributes are not at their default anymore, but consistent across an entire
> 2Mb/4Mb range (i.e. after write-protecting .text). At the same time I wonder,
> though, whether it wouldn't be safer to remove execute permission from
> anything but .text along with write-protecting read-only regions under
> CONFIG_DEBUG_RODATA.

Yes I guess that would be a useful optimization.  Just getting
the reference counting right with that might be tricky.

At least the i386 code will probably change significantly soon as I clean 
up the GB page patches, which require some restructuring in c_p_a().

-Andi

  reply	other threads:[~2007-04-05 12:43 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-04-05  9:32 change_page_attr() and global_flush_tlb() Jan Beulich
2007-04-05 12:43 ` Andi Kleen [this message]
2007-04-05 13:52   ` [discuss] " Jan Beulich
2007-04-05 14:09     ` Andi Kleen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200704051443.12505.ak@suse.de \
    --to=ak@suse.de \
    --cc=discuss@x86-64.org \
    --cc=jbeulich@novell.com \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.