From: Manfred Spraul <manfred@colorfullife.com>
To: Linus Torvalds <torvalds@transmeta.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: x86 ptep_get_and_clear question
Date: Thu, 15 Feb 2001 22:26:50 +0100
Message-ID: <3A8C499A.E0370F63@colorfullife.com>
In-Reply-To: <3A8C254F.17334682@colorfullife.com> <200102151905.LAA62688@google.engr.sgi.com> <20010215201945.A2505@pcep-jamie.cern.ch> <96heaj$bvs$1@penguin.transmeta.com>
Linus Torvalds wrote:
>
> In article <20010215201945.A2505@pcep-jamie.cern.ch>,
> Jamie Lokier <lk@tantalophile.demon.co.uk> wrote:
> >> >   << lock;
> >> >      read pte
> >> >      if (!present(pte))
> >> >              do_page_fault();
> >> >      pte |= dirty
> >> >      write pte.
> >> >   >> end lock;
> >>
> >> No, it is a little more complicated. You also have to include the
> >> TLB state in this algorithm, since that is what we are talking about.
> >> Specifically, what does the processor do when it has a tlb entry allowing
> >> RW, the processor has only done reads using the translation, and the
> >> in-memory pte is clear?
> >
> >Yes (no to the no): Manfred's pseudo-code is exactly the question you're
> >asking. Because when the TLB entry is non-dirty and you do a write, we
> >_know_ the processor will do a locked memory cycle to update the dirty
> >bit. A locked memory cycle implies read-modify-write, not "write TLB
> >entry + dirty" (which would be a plain write) or anything like that.
> >
> >Given you know it's a locked cycle, the only sensible design from Intel
> >is going to be one of Manfred's scenarios.
>
> Not necessarily, and this is NOT guaranteed by the docs I've seen.
>
> It _could_ be that the TLB data actually also contains the pointer to
> the place where it was fetched, and a "mark dirty" becomes
>
>       read *ptr locked
>       val |= D
>       write *ptr unlock
>
Jamie wrote "one of my scenarios", that's the other option ;-)
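
The two hardware behaviours under discussion can be modelled in user space with C11 atomics. This is only a sketch: the bit values follow the i386 page-table layout (present = bit 0, dirty = bit 6), while the function names and the use of a compare-and-swap loop to stand in for the locked cycle are assumptions made for illustration, not a description of what any CPU actually does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT 0x001u  /* i386 present bit */
#define PTE_DIRTY   0x040u  /* i386 dirty bit */

/*
 * Variant 1 (the quoted pseudo-code): re-check the present bit inside
 * the locked cycle. Returns false where the hardware would fault.
 */
static bool mark_dirty_checked(_Atomic uint32_t *ptep)
{
    assert(ptep);
    uint32_t old = atomic_load(ptep);
    do {
        if (!(old & PTE_PRESENT))
            return false;           /* would raise a page fault */
    } while (!atomic_compare_exchange_weak(ptep, &old, old | PTE_DIRTY));
    return true;
}

/*
 * Variant 2 (the "pointer kept in the TLB" idea): OR in the dirty bit
 * without looking at the present bit at all.
 */
static void mark_dirty_blind(_Atomic uint32_t *ptep)
{
    assert(ptep);
    atomic_fetch_or(ptep, PTE_DIRTY);
}
```

With a cleared pte, the first variant leaves memory untouched and reports a fault, while the second leaves a stray D bit behind in an otherwise empty entry. That stray bit is the case the OS would have to cope with.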
> Now, I will agree that I suspect most x86 _implementations_ will not do
> this. TLBs are too timing-critical, and nobody tends to want to make
> them bigger than necessary - so saving off the source address is
> unlikely. Also, setting the D bit is not a very common operation, so
> it's easy enough to say that an internal D-bit-fault will just cause a
> TLB re-load, where the TLB re-load just sets the A and D bits as it
> fetches the entry (and then page fault handling is an automatic result
> of the reload).
>
But then the cpu would also set the D bit in the page directory entry
during that reload, and it doesn't.

Probably Kanoj is right: the current code is not guaranteed by the
specs.
But if we change the interface, could we think about the poor s390
developers?

s390 only has a "clear the present bit in the pte and flush the tlb"
instruction.

From your other post:
> pte = ptep_get_and_clear(page_table);
> flush_tlb_page(vma, address);
>+ pte = ptep_update_after_flush(page_table, pte);
What about a single arch-specific

	pte = ptep_get_and_invalidate(vma, address, page_table);

On i386 SMP it would do

	{
		pte = *page_table;
		if (!present(pte))
			return pte;
		lock; andl $0xfffffffe, *page_table;	/* clear the present bit */
		flush_tlb_page(vma, address);
		return *page_table | 1;	/* re-read after the flush, present bit restored */
	}
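
The proposed sequence can be tried out in a user-space model. This is a sketch under stated assumptions: `ptep_get_and_invalidate_model`, the `flush_fn` callback, and `flush_with_late_dirty` are hypothetical stand-ins (the last one plays the role of another CPU that sets the D bit through a stale TLB entry before the flush completes), and C11 atomics replace the `lock; andl`. It is not kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define PTE_PRESENT 0x001u  /* i386 present bit */
#define PTE_DIRTY   0x040u  /* i386 dirty bit */

/* Hypothetical stand-in for flush_tlb_page(). */
typedef void (*flush_fn)(void);

/* pte that the simulated remote CPU still has cached in its TLB. */
static _Atomic uint32_t *model_remote_pte;

/* Simulated flush during which a remote CPU dirties the page late. */
static void flush_with_late_dirty(void)
{
    if (model_remote_pte)
        atomic_fetch_or(model_remote_pte, PTE_DIRTY);
}

/*
 * Clear the present bit atomically, flush, then re-read the entry so
 * that a D bit set before the flush finished is not lost. The returned
 * value has the present bit restored, as in the pseudo-code above.
 */
static uint32_t ptep_get_and_invalidate_model(_Atomic uint32_t *ptep,
                                              flush_fn flush)
{
    assert(ptep);
    uint32_t pte = atomic_load(ptep);

    if (!(pte & PTE_PRESENT))
        return pte;
    atomic_fetch_and(ptep, ~PTE_PRESENT);   /* models lock; andl */
    if (flush)
        flush();
    return atomic_load(ptep) | PTE_PRESENT;
}
```

In this model the returned value carries the late D bit even though the first read saw a clean pte, which is the point of re-reading after the flush rather than returning the value read before it.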
>
> The "gather" operation could possibly be improved to make the other
> CPU's do useful work while being shot down (ie schedule away to another
> mm), but that has its own pitfalls too.
>
IMHO scheduling away is the best long-term solution.
Perhaps try to schedule away, just to improve the probability that
mm->cpu_vm_mask is clear.

I just benchmarked a single flush_tlb_page().
Pentium II 350: ~ 2000 cpu ticks.
Pentium III 850: ~ 3000 cpu ticks.
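
For comparison, the tick counts can be converted to wall-clock time. The sketch below assumes that "ticks" means core clock cycles and that the model numbers 350 and 850 are the clock rates in MHz; neither is stated explicitly in the measurement. The helper name `ticks_to_us` is made up for the example.

```c
#include <assert.h>

/* Convert a cycle count to microseconds; mhz is cycles per microsecond. */
static double ticks_to_us(double ticks, double mhz)
{
    assert(mhz > 0.0);
    return ticks / mhz;
}
```

Under those assumptions, `ticks_to_us(2000, 350)` is about 5.7 microseconds and `ticks_to_us(3000, 850)` is about 3.5 microseconds, so the faster machine spends more cycles but less time per flush.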
--
Manfred