From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Lai Jiangshan <eag0628@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
Waiman Long <waiman.long@hp.com>,
linux-kernel@vger.kernel.org, mingo@elte.hu, dipankar@in.ibm.com,
akpm@linux-foundation.org, mathieu.desnoyers@efficios.com,
josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de,
peterz@infradead.org, Valdis.Kletnieks@vt.edu,
dhowells@redhat.com, edumazet@google.com, darren@dvhart.com,
fweisbec@gmail.com, sbw@mit.edu, torvalds@linux-foundation.org,
Davidlohr Bueso <davidlohr.bueso@hp.com>
Subject: Re: [PATCH RFC ticketlock] Auto-queued ticketlock
Date: Wed, 12 Jun 2013 07:21:02 -0700 [thread overview]
Message-ID: <20130612142102.GT5146@linux.vnet.ibm.com> (raw)
In-Reply-To: <CACvQF50DvfOGosHgiN+D5GTUAJHjX+9xoxdrdGgzbh2ZbsdK=Q@mail.gmail.com>
On Wed, Jun 12, 2013 at 07:06:53PM +0800, Lai Jiangshan wrote:
> On Wed, Jun 12, 2013 at 9:58 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> > On Wed, 2013-06-12 at 09:19 +0800, Lai Jiangshan wrote:
> >
> >> > +
> >> > +/*
> >> > + * Hand the lock off to the first CPU on the queue.
> >> > + */
> >> > +void tkt_q_do_wake(arch_spinlock_t *lock)
> >> > +{
> >> > + struct tkt_q_head *tqhp;
> >> > + struct tkt_q *tqp;
> >> > +
> >> > + /* If the queue is still being set up, wait for it. */
> >> > + while ((tqhp = tkt_q_find_head(lock)) == NULL)
> >> > + cpu_relax();
> >> > +
> >> > + for (;;) {
> >> > +
> >> > + /* Find the first queue element. */
> >> > + tqp = ACCESS_ONCE(tqhp->spin);
> >> > + if (tqp != NULL)
> >> > + break; /* Element exists, hand off lock. */
> >> > + if (tkt_q_try_unqueue(lock, tqhp))
> >> > + return; /* No element, successfully removed queue. */
> >> > + cpu_relax();
> >> > + }
> >> > + if (ACCESS_ONCE(tqhp->head_tkt) != -1)
> >> > + ACCESS_ONCE(tqhp->head_tkt) = -1;
> >> > + smp_mb(); /* Order pointer fetch and assignment against handoff. */
> >> > + ACCESS_ONCE(tqp->cpu) = -1;
> >> > +}
> >> > +EXPORT_SYMBOL(tkt_q_do_wake);
> >> > +
> >> > +/*
> >> > + * Given a lock that already has a queue associated with it, spin on
> >> > + * that queue. Return false if there was no queue (which means we do not
> >> > + * hold the lock) and true otherwise (meaning we -do- hold the lock).
> >> > + */
> >> > +bool tkt_q_do_spin(arch_spinlock_t *lock, struct __raw_tickets inc)
> >> > +{
> >> > + struct tkt_q **oldtail;
> >> > + struct tkt_q tq;
> >> > + struct tkt_q_head *tqhp;
> >> > +
> >> > + /*
> >> > + * Ensure that accesses to queue header happen after sensing
> >> > + * the lock's have-queue bit.
> >> > + */
> >> > + smp_mb(); /* See above block comment. */
> >> > +
> >> > + /* If there no longer is a queue, leave. */
> >> > + tqhp = tkt_q_find_head(lock);
> >> > + if (tqhp == NULL)
> >> > + return false;
> >> > +
> >> > + /* Initialize our queue element. */
> >> > + tq.cpu = raw_smp_processor_id();
> >> > + tq.tail = inc.tail;
> >> > + tq.next = NULL;
> >>
> >> I guess a mb() is needed here for between read tqhp->ref and read
> >> tqhp->head_tkt.
> >> you can move the above mb() to here.
> >
> > Do we?
> >
> > The only way to get into here is if you either set up the queue
> > yourself, or you saw the LSB set in head.
> >
> > If you were the one to set it up yourself, then there's nothing to worry
> > about because you are also the one that set head_tkt.
> >
> > If you didn't set up the queue, then someone else set the LSB in head,
> > which is done with a cmpxchg() which is also a full mb. This would make
> > head_tkt visible as well because it's set before cmpxchg is called.
> >
> > Thus, to come into this function you must have seen head & 1 set, and
> > the smp_mb() above will also make head_tkt visible.
Agreed, after looking again.
> > The only thing I can see now is that it might not find tqhp because ref
> > may not be set yet. If that's the case, then it will fall out back to
> > the main loop. But if it finds ref, then I don't see how it can't see
> > head_tkt up to date as well.
> >
> > Maybe I'm missing something.
>
> No, you are right.
>
> When I lay on the bed in the night, I was thinking about the V1,
> I wrongly considered the V2 has the same problem without deeper
> thought in this morning.
>
> V2 has not such problem. sorry for the noisy.
Not a problem -- you did cause me to spot a missing ACCESS_ONCE() in
tkt_q_find_head(), which I have now added. I also added a comment
to tkt_q_do_wake() noting that the caller's xadd() provides the needed
memory ordering.
Thank you both for looking this over!
Thanx, Paul
> Thanks,
> Lai
>
> >
> > -- Steve
> >
> >
> >>
> >> > +
> >> > + /* Check to see if we already hold the lock. */
> >> > + if (ACCESS_ONCE(tqhp->head_tkt) == inc.tail) {
> >> > + /* The last holder left before queue formed, we hold lock. */
> >> > + tqhp->head_tkt = -1;
> >> > + return true;
> >> > + }
> >> > +
> >> > + /*
> >> > + * Add our element to the tail of the queue. Note that if the
> >> > + * queue is empty, the ->spin_tail pointer will reference
> >> > + * the queue's head pointer, namely ->spin.
> >> > + */
> >> > + oldtail = xchg(&tqhp->spin_tail, &tq.next);
> >> > + ACCESS_ONCE(*oldtail) = &tq;
> >> > +
> >> > + /* Spin until handoff. */
> >> > + while (ACCESS_ONCE(tq.cpu) != -1)
> >> > + cpu_relax();
> >> > +
> >> > + /*
> >> > + * Remove our element from the queue. If the queue is now empty,
> >> > + * update carefully so that the next acquisition will enqueue itself
> >> > + * at the head of the list. Of course, the next enqueue operation
> >> > + * might be happening concurrently, and this code needs to handle all
> >> > + * of the possible combinations, keeping in mind that the enqueue
> >> > + * operation happens in two stages: (1) update the tail pointer and
> >> > + * (2) update the predecessor's ->next pointer. With this in mind,
> >> > + * the following code needs to deal with three scenarios:
> >> > + *
> >> > + * 1. tq is the last entry. In this case, we use cmpxchg to
> >> > + * point the list tail back to the list head (->spin). If
> >> > + * the cmpxchg fails, that indicates that we are instead
> >> > + * in scenario 2 below. If the cmpxchg succeeds, the next
> >> > + * enqueue operation's tail-pointer exchange will enqueue
> >> > + * the next element at the queue head, because the ->spin_tail
> >> > + * pointer now references the queue head.
> >> > + *
> >> > + * 2. tq is the last entry, and the next entry has updated the
> >> > + * tail pointer but has not yet updated tq.next. In this
> >> > + * case, tq.next is NULL, the cmpxchg will fail, and the
> >> > + * code will wait for the enqueue to complete before completing
> >> > + * removal of tq from the list.
> >> > + *
> >> > + * 3. tq is not the last pointer. In this case, tq.next is non-NULL,
> >> > + * so the following code simply removes tq from the list.
> >> > + */
> >> > + if (tq.next == NULL) {
> >> > +
> >> > + /* Mark the queue empty. */
> >> > + tqhp->spin = NULL;
> >> > +
> >> > + /* Try to point the tail back at the head. */
> >> > + if (cmpxchg(&tqhp->spin_tail,
> >> > + &tq.next,
> >> > + &tqhp->spin) == &tq.next)
> >> > + return true; /* Succeeded, queue is now empty. */
> >> > +
> >> > + /* Failed, if needed, wait for the enqueue to complete. */
> >> > + while (tq.next == NULL)
> >> > + cpu_relax();
> >> > +
> >> > + /* The following code will repair the head. */
> >> > + }
> >> > + smp_mb(); /* Force ordering between handoff and critical section. */
> >> > +
> >> > + /*
> >> > + * Advance list-head pointer. This same task will be the next to
> >> > + * access this when releasing the lock, so no need for a memory
> >> > + * barrier after the following assignment.
> >> > + */
> >> > + ACCESS_ONCE(tqhp->spin) = tq.next;
> >> > + return true;
> >> > +}
> >> > +
> >> > +/*
> >> > + * Given a lock that does not have a queue, attempt to associate the
> >> > + * i-th queue with it, returning true if successful (meaning we hold
> >> > + * the lock) or false otherwise (meaning we do -not- hold the lock).
> >> > + * Note that the caller has already filled in ->ref with 0x1, so we
> >> > + * own the queue.
> >> > + */
> >> > +static bool
> >> > +tkt_q_init_contend(int i, arch_spinlock_t *lock, struct __raw_tickets inc)
> >> > +{
> >> > + arch_spinlock_t asold;
> >> > + arch_spinlock_t asnew;
> >> > + struct tkt_q_head *tqhp;
> >> > +
> >> > + /* Initialize the i-th queue header. */
> >> > + tqhp = &tkt_q_heads[i];
> >> > + tqhp->spin = NULL;
> >> > + tqhp->spin_tail = &tqhp->spin;
> >> > +
> >> > + /* Each pass through this loop attempts to mark the lock as queued. */
> >> > + do {
> >> > + asold.head_tail = ACCESS_ONCE(lock->head_tail);
> >> > + asnew = asold;
> >> > + if (asnew.tickets.head & 0x1) {
> >> > +
> >> > + /* Someone beat us to it, back out. */
> >> > + smp_mb();
> >> > + ACCESS_ONCE(tqhp->ref) = NULL;
> >> > +
> >> > + /* Spin on the queue element they set up. */
> >> > + return tkt_q_do_spin(lock, inc);
> >> > + }
> >> > +
> >> > + /*
> >> > + * Record the head counter in case one of the spinning
> >> > + * CPUs already holds the lock but doesn't realize it yet.
> >> > + */
> >> > + tqhp->head_tkt = asold.tickets.head;
> >> > +
> >> > + /* The low-order bit in the head counter says "queued". */
> >> > + asnew.tickets.head |= 0x1;
> >> > + } while (cmpxchg(&lock->head_tail,
> >> > + asold.head_tail,
> >> > + asnew.head_tail) != asold.head_tail);
> >> > +
> >> > + /* Point the queue at the lock and go spin on it. */
> >> > + ACCESS_ONCE(tqhp->ref) = lock;
> >> > + return tkt_q_do_spin(lock, inc);
> >> > +}
> >> > +
> >> > +/*
> >> > + * Start handling a period of high contention by finding a queue to associate
> >> > + * with this lock. Returns true if successful (in which case we hold the
> >> > + * lock) and false otherwise (in which case we do -not- hold the lock).
> >> > + */
> >> > +bool tkt_q_start_contend(arch_spinlock_t *lock, struct __raw_tickets inc)
> >> > +{
> >> > + int i;
> >> > + int start;
> >> > +
> >> > + /* Hash the lock address to find a starting point. */
> >> > + start = i = tkt_q_hash(lock);
> >> > +
> >> > + /*
> >> > + * Each pass through the following loop attempts to associate
> >> > + * the lock with the corresponding queue.
> >> > + */
> >> > + do {
> >> > + /*
> >> > + * Use 0x1 to mark the queue in use, but also avoiding
> >> > + * any spinners trying to use it before we get it all
> >> > + * initialized.
> >> > + */
> >> > + if (tkt_q_heads[i].ref)
> >> > + continue;
> >> > + if (cmpxchg(&tkt_q_heads[i].ref,
> >> > + NULL,
> >> > + (arch_spinlock_t *)0x1) == NULL) {
> >> > +
> >> > + /* Succeeded, now go initialize it. */
> >> > + return tkt_q_init_contend(i, lock, inc);
> >> > + }
> >> > +
> >> > + /* If someone beat us to it, go spin on their queue. */
> >> > + if (ACCESS_ONCE(lock->tickets.head) & 0x1)
> >> > + return tkt_q_do_spin(lock, inc);
> >> > + } while ((i = tkt_q_next_slot(i)) != start);
> >> > +
> >> > + /* All the queues are in use, revert to spinning on the ticket lock. */
> >> > + return false;
> >> > +}
> >> > +
> >> > +bool tkt_spin_pass(arch_spinlock_t *ap, struct __raw_tickets inc)
> >> > +{
> >> > + if (unlikely(inc.head & 0x1)) {
> >> > +
> >> > + /* This lock has a queue, so go spin on the queue. */
> >> > + if (tkt_q_do_spin(ap, inc))
> >> > + return true;
> >> > +
> >> > + /* Get here if the queue is in transition: Retry next time. */
> >> > +
> >> > + } else if (inc.tail - TKT_Q_SWITCH == inc.head) {
> >> > +
> >> > + /*
> >> > + * This lock has lots of spinners, but no queue.
> >> > + * Go create a queue to spin on.
> >> > + */
> >> > + if (tkt_q_start_contend(ap, inc))
> >> > + return true;
> >> > +
> >> > + /* Get here if the queue is in transition: Retry next time. */
> >> > + }
> >> > +
> >> > + /* Either no need for a queue or the queue is in transition. Spin. */
> >> > + return false;
> >> > +}
> >> > +EXPORT_SYMBOL(tkt_spin_pass);
> >> >
> >> > --
> >> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> >> > the body of a message to majordomo@vger.kernel.org
> >> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >> > Please read the FAQ at http://www.tux.org/lkml/
> >
> >
>
next prev parent reply other threads:[~2013-06-12 14:21 UTC|newest]
Thread overview: 96+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-06-09 19:36 [PATCH RFC ticketlock] Auto-queued ticketlock Paul E. McKenney
2013-06-10 20:47 ` Steven Rostedt
2013-06-10 20:57 ` Paul E. McKenney
2013-06-10 21:01 ` Thomas Gleixner
2013-06-10 21:15 ` Paul E. McKenney
2013-06-10 21:08 ` Steven Rostedt
2013-06-10 21:30 ` Paul E. McKenney
2013-06-10 21:35 ` Eric Dumazet
2013-06-10 21:54 ` Paul E. McKenney
2013-06-10 23:02 ` Steven Rostedt
2013-06-11 0:22 ` Paul E. McKenney
2013-06-11 0:44 ` Steven Rostedt
2013-06-11 0:51 ` Linus Torvalds
2013-06-11 7:53 ` Lai Jiangshan
2013-06-11 10:14 ` Paul E. McKenney
2013-06-11 15:22 ` Steven Rostedt
2013-06-11 16:45 ` Paul E. McKenney
2013-06-11 10:06 ` Paul E. McKenney
2013-06-11 17:53 ` Davidlohr Bueso
2013-06-11 18:05 ` Paul E. McKenney
2013-06-11 18:10 ` Steven Rostedt
2013-06-11 18:14 ` Davidlohr Bueso
2013-06-11 18:46 ` Paul E. McKenney
2013-06-12 17:50 ` Davidlohr Bueso
2013-06-12 18:15 ` Linus Torvalds
2013-06-12 20:03 ` Davidlohr Bueso
2013-06-12 20:26 ` Linus Torvalds
2013-06-12 20:40 ` Davidlohr Bueso
2013-06-12 21:06 ` Raymond Jennings
2013-06-12 23:32 ` Al Viro
2013-06-13 0:01 ` Linus Torvalds
2013-06-13 0:20 ` Al Viro
2013-06-13 0:38 ` Linus Torvalds
2013-06-13 0:49 ` Al Viro
2013-06-13 0:59 ` Linus Torvalds
2013-06-14 15:00 ` Waiman Long
2013-06-14 15:37 ` Linus Torvalds
2013-06-14 18:17 ` Waiman Long
2013-06-15 1:26 ` Benjamin Herrenschmidt
2013-06-15 3:36 ` Waiman Long
2013-06-12 20:37 ` Linus Torvalds
2013-06-12 18:18 ` Steven Rostedt
2013-06-11 9:56 ` Paul E. McKenney
2013-06-11 15:00 ` Paul E. McKenney
2013-06-11 1:04 ` Steven Rostedt
2013-06-11 9:52 ` Paul E. McKenney
2013-06-11 14:48 ` Lai Jiangshan
2013-06-11 15:10 ` Lai Jiangshan
2013-06-11 16:48 ` Paul E. McKenney
2013-06-11 17:17 ` Linus Torvalds
2013-06-11 17:30 ` Paul E. McKenney
2013-06-11 16:21 ` Paul E. McKenney
2013-06-11 15:57 ` Waiman Long
2013-06-11 16:20 ` Steven Rostedt
2013-06-11 16:43 ` Paul E. McKenney
2013-06-11 17:13 ` Steven Rostedt
2013-06-11 17:43 ` Paul E. McKenney
2013-06-11 17:35 ` Waiman Long
2013-06-11 16:36 ` Paul E. McKenney
2013-06-11 17:01 ` Steven Rostedt
2013-06-11 17:16 ` Paul E. McKenney
2013-06-11 18:41 ` Waiman Long
2013-06-11 18:54 ` Davidlohr Bueso
2013-06-11 19:49 ` Paul E. McKenney
2013-06-11 20:09 ` Steven Rostedt
2013-06-11 20:32 ` Paul E. McKenney
2013-06-11 20:53 ` Steven Rostedt
2013-06-11 20:25 ` Jason Low
2013-06-11 20:36 ` Paul E. McKenney
2013-06-11 20:56 ` Steven Rostedt
2013-06-11 21:09 ` Paul E. McKenney
2013-06-12 1:19 ` Lai Jiangshan
2013-06-12 1:58 ` Steven Rostedt
2013-06-12 10:12 ` Paul E. McKenney
2013-06-12 11:06 ` Lai Jiangshan
2013-06-12 14:21 ` Paul E. McKenney [this message]
2013-06-12 14:15 ` Lai Jiangshan
2013-06-12 14:44 ` Paul E. McKenney
2013-06-11 17:02 ` [PATCH RFC ticketlock] v2 " Paul E. McKenney
2013-06-11 17:35 ` Linus Torvalds
2013-06-11 17:49 ` Paul E. McKenney
2013-06-11 17:36 ` Steven Rostedt
2013-06-11 17:52 ` Paul E. McKenney
2013-06-12 15:40 ` [PATCH RFC ticketlock] v3 " Paul E. McKenney
2013-06-12 16:13 ` Lai Jiangshan
2013-06-12 16:59 ` Paul E. McKenney
2013-06-13 2:55 ` Lai Jiangshan
2013-06-13 15:22 ` Paul E. McKenney
2013-06-13 23:25 ` Lai Jiangshan
2013-06-13 23:57 ` Paul E. McKenney
2013-06-14 1:28 ` Lai Jiangshan
2013-06-14 23:49 ` Paul E. McKenney
2013-06-14 7:12 ` Lai Jiangshan
2013-06-14 23:46 ` Paul E. McKenney
[not found] ` <CAC4Lta3dpTDc19rXLVQkZrxbu8AJL+Foc6ocAktUAozCpk2-Mg@mail.gmail.com>
2013-07-01 9:19 ` Raghavendra KT
2013-07-02 5:56 ` Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130612142102.GT5146@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=Valdis.Kletnieks@vt.edu \
--cc=akpm@linux-foundation.org \
--cc=darren@dvhart.com \
--cc=davidlohr.bueso@hp.com \
--cc=dhowells@redhat.com \
--cc=dipankar@in.ibm.com \
--cc=eag0628@gmail.com \
--cc=edumazet@google.com \
--cc=fweisbec@gmail.com \
--cc=josh@joshtriplett.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=mingo@elte.hu \
--cc=niv@us.ibm.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=sbw@mit.edu \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=waiman.long@hp.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).