All of lore.kernel.org
 help / color / mirror / Atom feed
From: Rik van Riel <riel@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Peter Anvin <hpa@zytor.com>,
	jmario@redhat.com, dzickus@redhat.com,
	Ingo Molnar <mingo@redhat.com>
Subject: Re: [PATCH] sched,x86: optimize switch_mm for multi-threaded workloads
Date: Wed, 31 Jul 2013 18:39:20 -0400	[thread overview]
Message-ID: <51F99218.4060104@redhat.com> (raw)
In-Reply-To: <CA+55aFwj+6P4y8MgGTGbiK_EtfY7LJ_bL_7k9zFYLx4S8F0rJQ@mail.gmail.com>

On 07/31/2013 06:21 PM, Linus Torvalds wrote:
> Ummm.. The race is to the testing of the bit, not setting. The testing
> of the bit is not valid before we have set the tlb state, AFAIK.

I believe the bit is cleared and set by the current CPU.

Clearing is done from the TLB flush IPI handler, or by directly
calling leave_mm from ptep_flush_clear if the flush originated
locally.  The exception is clear_tasks_mm_cpumask, which may
only be called for an already offlined CPU.

I believe setting is only ever done in switch_mm.

Interrupts are blocked inside switch_mm, so I think we
are safe.

Would you like a comment to this effect in the code, or
are there other things we need to check first?

> On Jul 31, 2013 3:16 PM, "Rik van Riel" <riel@redhat.com
> <mailto:riel@redhat.com>> wrote:
>
>     On 07/31/2013 06:07 PM, Linus Torvalds wrote:
>
>         On Wed, Jul 31, 2013 at 2:43 PM, Rik van Riel <riel@redhat.com
>         <mailto:riel@redhat.com>> wrote:
>
>
>             The cause turned out to be unnecessary atomic accesses to the
>             mm_cpumask. When in lazy TLB mode, the CPU is only removed from
>             the mm_cpumask if there is a TLB flush event.
>
>             Most of the time, no such TLB flush happens, and the kernel
>             skips the TLB reload.  It can also skip the atomic memory
>             set & test.
>
>
>         The patch looks obvious, and I'm not seeing any very clear
>         reasons for
>         why we would want that test-and-set to be atomic. That said, I'd
>         like
>         to have some explicit comments about exactly why it doesn't need the
>         atomicity. Because afaik, there actually are concurrent readers
>         _and_
>         writers of that mask, and the accesses are not locked by anything
>         here.
>
>      >
>
>         I _think_ the reason for this all being safe is simply that the only
>         real race is "We need to set the bit before we load the page table,
>         and we're protected against that bit being cleared because the TLB
>         state is TLBSTATE_OK and thus TLB flushing will no longer leave that
>         mm".
>
>         But damn, it all looks subtle as hell. That code does:
>
>                           this_cpu_write(cpu_tlbstate.__state, TLBSTATE_OK);
>                           BUG_ON(this_cpu_read(cpu___tlbstate.active_mm)
>         != next);
>
>                           if (!cpumask_test_and_set_cpu(__cpu,
>         mm_cpumask(next))) {
>
>         and I'm wondering if we need a barrier to make sure that that
>         TLBSTATE_OK write happens *before* we test the cpumask. With
>         test_and_set(), we have the barrier in the test-and-set. But
>         with just
>         test_bit, I'm not seeing why the compiler couldn't re-order them. I
>         suspect it won't, but...
>
>
>     cpumask_set_bit expands to set_bit, which is atomic
>
>     Do we need any explicit compiler barrier in addition to the
>     atomic operation performed by set_bit?
>
>     I would be happy to rewrite the comment right above the
>     cpumask_set_cpu call if you want.
>
>     --
>     All rights reversed
>


-- 
All rights reversed

  parent reply	other threads:[~2013-07-31 22:39 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-07-31 21:43 [PATCH] sched,x86: optimize switch_mm for multi-threaded workloads Rik van Riel
2013-07-31 21:46 ` Paul Turner
2013-07-31 22:07 ` Linus Torvalds
2013-07-31 22:16   ` Rik van Riel
     [not found]     ` <CA+55aFwj+6P4y8MgGTGbiK_EtfY7LJ_bL_7k9zFYLx4S8F0rJQ@mail.gmail.com>
2013-07-31 22:39       ` Rik van Riel [this message]
     [not found]         ` <CA+55aFxKSEHSkdsCkTvjwwo4MnEpN0TwJrek2jd1QJCyUTb-=Q@mail.gmail.com>
2013-07-31 23:12           ` Rik van Riel
2013-07-31 23:14             ` Paul Turner
2013-08-01  0:41               ` Linus Torvalds
2013-08-01  1:58                 ` Rik van Riel
2013-08-01  2:14                   ` Linus Torvalds
2013-08-01  2:14                 ` [PATCH -v2] " Rik van Riel
2013-08-01  2:25                   ` Linus Torvalds
2013-08-01  7:04                     ` Ingo Molnar
2013-08-02  9:07                   ` [tip:sched/core] sched/x86: Optimize switch_mm() " tip-bot for Rik van Riel
2013-08-02  9:12                     ` Ingo Molnar
2013-08-02 12:44                       ` Joe Mario
2013-08-03  1:18                       ` Greg KH
2013-08-01 15:37             ` [PATCH] sched,x86: optimize switch_mm " Jörn Engel
2013-08-01 17:45               ` Linus Torvalds
2013-08-01 17:54                 ` Jörn Engel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51F99218.4060104@redhat.com \
    --to=riel@redhat.com \
    --cc=dzickus@redhat.com \
    --cc=hpa@zytor.com \
    --cc=jmario@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.