* [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
@ 2016-01-21  6:53 Ding Tianhong
  2016-01-21  7:29 ` Ingo Molnar
  0 siblings, 1 reply; 24+ messages in thread
From: Ding Tianhong @ 2016-01-21  6:53 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, linux-kernel@vger.kernel.org

I built a script that creates several processes which call an ioctl in a
loop; the ioctl calls into the kernel roughly like this:

xx_ioctl {
	...
	rtnl_lock();
	function();
	rtnl_unlock();
	...
}

The function may sleep for several milliseconds, but it will not hang.
At the same time, another user service may call ifconfig to change the
state of the ethernet device, and after several hours the hung task
thread reports this problem:

========================================================================
[149738.039038] INFO: task ifconfig:11890 blocked for more than 120 seconds.
[149738.040597] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[149738.042280] ifconfig D ffff88061ec13680 0 11890 11573 0x00000080
[149738.042284] ffff88052449bd40 0000000000000082 ffff88053a33f300 ffff88052449bfd8
[149738.042286] ffff88052449bfd8 ffff88052449bfd8 ffff88053a33f300 ffffffff819e6240
[149738.042288] ffffffff819e6244 ffff88053a33f300 00000000ffffffff ffffffff819e6248
[149738.042290] Call Trace:
[149738.042300] [<ffffffff8160d219>] schedule_preempt_disabled+0x29/0x70
[149738.042303] [<ffffffff8160af65>] __mutex_lock_slowpath+0xc5/0x1c0
[149738.042305] [<ffffffff8160a3cf>] mutex_lock+0x1f/0x2f
[149738.042309] [<ffffffff8150d945>] rtnl_lock+0x15/0x20
[149738.042311] [<ffffffff81514e3a>] dev_ioctl+0xda/0x590
[149738.042314] [<ffffffff816121cc>] ? __do_page_fault+0x21c/0x560
[149738.042318] [<ffffffff814e42c5>] sock_do_ioctl+0x45/0x50
[149738.042320] [<ffffffff814e49d0>] sock_ioctl+0x1f0/0x2c0
[149738.042324] [<ffffffff811dc9b5>] do_vfs_ioctl+0x2e5/0x4c0
[149738.042327] [<ffffffff811e6a00>] ? fget_light+0xa0/0xd0

================================ cut here ================================

I captured the vmcore and found that ifconfig had already been on the
wait_list of the rtnl_lock for 120 seconds, while my processes could each
take and release the rtnl_lock several times within one second. That
means my processes kept jumping the queue, and ifconfig could never get
the rtnl lock. I checked the mutex slow path and found that a lock
acquirer may spin on the owner regardless of whether the wait list is
empty, which lets tasks on the wait list be cut in line forever. So add
a test for pending waiters to mutex_can_spin_on_owner() to avoid this
problem.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
 kernel/locking/mutex.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 0551c21..596b341 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -256,7 +256,7 @@ static inline int mutex_can_spin_on_owner(struct mutex *lock)
 	struct task_struct *owner;
 	int retval = 1;
 
-	if (need_resched())
+	if (need_resched() || atomic_read(&lock->count) == -1)
 		return 0;
 
 	rcu_read_lock();
@@ -283,10 +283,11 @@ static inline bool mutex_try_to_acquire(struct mutex *lock)
 /*
  * Optimistic spinning.
  *
- * We try to spin for acquisition when we find that the lock owner
- * is currently running on a (different) CPU and while we don't
- * need to reschedule. The rationale is that if the lock owner is
- * running, it is likely to release the lock soon.
+ * We try to spin for acquisition when we find that there are no
+ * pending waiters and the lock owner is currently running on a
+ * (different) CPU and while we don't need to reschedule. The
+ * rationale is that if the lock owner is running, it is likely
+ * to release the lock soon.
  *
  * Since this needs the lock owner, and this mutex implementation
  * doesn't track the owner atomically in the lock field, we need to
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 24+ messages in thread
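[For readers unfamiliar with the counter the patch tests: mutexes of this
era (before the 4.10 rework) encode lock state in lock->count, and -1 is
the "locked, waiters queued" state. The sketch below is illustrative
only -- mutex_has_waiters_hint() is a hypothetical helper, not a kernel
function -- but it spells out the assumption the one-line check relies on:

/*
 * Assumed lock->count encoding (pre-4.10 mutexes):
 *    1  - unlocked
 *    0  - locked, no waiters
 *   -1  - locked, one or more tasks possibly queued on ->wait_list
 */
static inline bool mutex_has_waiters_hint(struct mutex *lock)
{
	/*
	 * A racy, lock-free hint: if the count reads -1, somebody is
	 * (probably) already sleeping on the wait list, so the patch
	 * skips the optimistic spin and queues up fairly behind them.
	 */
	return atomic_read(&lock->count) == -1;
}
]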
* Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
  2016-01-21  6:53 [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL Ding Tianhong
@ 2016-01-21  7:29 ` Ingo Molnar
  2016-01-21  9:04   ` Ding Tianhong
  0 siblings, 1 reply; 24+ messages in thread
From: Ingo Molnar @ 2016-01-21  7:29 UTC (permalink / raw)
  To: Ding Tianhong
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel@vger.kernel.org,
	Davidlohr Bueso, Linus Torvalds, Paul E. McKenney, Thomas Gleixner,
	Peter Zijlstra, Will Deacon, Jason Low, Tim Chen, Waiman Long

Cc:-ed other gents who touched the mutex code recently. Mail quoted below.

Thanks,

	Ingo

* Ding Tianhong <dingtianhong@huawei.com> wrote:

> I built a script that creates several processes which call an ioctl in a
> loop; the ioctl calls into the kernel roughly like this:
>
> xx_ioctl {
> 	...
> 	rtnl_lock();
> 	function();
> 	rtnl_unlock();
> 	...
> }
>
> The function may sleep for several milliseconds, but it will not hang.
> At the same time, another user service may call ifconfig to change the
> state of the ethernet device, and after several hours the hung task
> thread reports this problem:
>
> ========================================================================
> [149738.039038] INFO: task ifconfig:11890 blocked for more than 120 seconds.
> [149738.040597] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [149738.042280] ifconfig D ffff88061ec13680 0 11890 11573 0x00000080
> [149738.042284] ffff88052449bd40 0000000000000082 ffff88053a33f300 ffff88052449bfd8
> [149738.042286] ffff88052449bfd8 ffff88052449bfd8 ffff88053a33f300 ffffffff819e6240
> [149738.042288] ffffffff819e6244 ffff88053a33f300 00000000ffffffff ffffffff819e6248
> [149738.042290] Call Trace:
> [149738.042300] [<ffffffff8160d219>] schedule_preempt_disabled+0x29/0x70
> [149738.042303] [<ffffffff8160af65>] __mutex_lock_slowpath+0xc5/0x1c0
> [149738.042305] [<ffffffff8160a3cf>] mutex_lock+0x1f/0x2f
> [149738.042309] [<ffffffff8150d945>] rtnl_lock+0x15/0x20
> [149738.042311] [<ffffffff81514e3a>] dev_ioctl+0xda/0x590
> [149738.042314] [<ffffffff816121cc>] ? __do_page_fault+0x21c/0x560
> [149738.042318] [<ffffffff814e42c5>] sock_do_ioctl+0x45/0x50
> [149738.042320] [<ffffffff814e49d0>] sock_ioctl+0x1f0/0x2c0
> [149738.042324] [<ffffffff811dc9b5>] do_vfs_ioctl+0x2e5/0x4c0
> [149738.042327] [<ffffffff811e6a00>] ? fget_light+0xa0/0xd0
>
> ================================ cut here ================================
>
> I captured the vmcore and found that ifconfig had already been on the
> wait_list of the rtnl_lock for 120 seconds, while my processes could each
> take and release the rtnl_lock several times within one second. That
> means my processes kept jumping the queue, and ifconfig could never get
> the rtnl lock. I checked the mutex slow path and found that a lock
> acquirer may spin on the owner regardless of whether the wait list is
> empty, which lets tasks on the wait list be cut in line forever. So add
> a test for pending waiters to mutex_can_spin_on_owner() to avoid this
> problem.
>
> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
> ---
>  kernel/locking/mutex.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index 0551c21..596b341 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -256,7 +256,7 @@ static inline int mutex_can_spin_on_owner(struct mutex *lock)
>  	struct task_struct *owner;
>  	int retval = 1;
>  
> -	if (need_resched())
> +	if (need_resched() || atomic_read(&lock->count) == -1)
>  		return 0;
>  
>  	rcu_read_lock();
> @@ -283,10 +283,11 @@ static inline bool mutex_try_to_acquire(struct mutex *lock)
>  /*
>   * Optimistic spinning.
>   *
> - * We try to spin for acquisition when we find that the lock owner
> - * is currently running on a (different) CPU and while we don't
> - * need to reschedule. The rationale is that if the lock owner is
> - * running, it is likely to release the lock soon.
> + * We try to spin for acquisition when we find that there are no
> + * pending waiters and the lock owner is currently running on a
> + * (different) CPU and while we don't need to reschedule. The
> + * rationale is that if the lock owner is running, it is likely
> + * to release the lock soon.
>   *
>   * Since this needs the lock owner, and this mutex implementation
>   * doesn't track the owner atomically in the lock field, we need to
> -- 
> 2.5.0
>

^ permalink raw reply	[flat|nested] 24+ messages in thread
* Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
  2016-01-21  7:29 ` Ingo Molnar
@ 2016-01-21  9:04   ` Ding Tianhong
  0 siblings, 0 replies; 24+ messages in thread
From: Ding Tianhong @ 2016-01-21  9:04 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel@vger.kernel.org,
	Davidlohr Bueso, Linus Torvalds, Paul E. McKenney, Thomas Gleixner,
	Peter Zijlstra, Will Deacon, Jason Low, Tim Chen, Waiman Long

On 2016/1/21 15:29, Ingo Molnar wrote:
>
> Cc:-ed other gents who touched the mutex code recently. Mail quoted below.
>

Ok, thanks.

Ding

> Thanks,
>
> 	Ingo
>
> * Ding Tianhong <dingtianhong@huawei.com> wrote:
>
>> I built a script that creates several processes which call an ioctl in a
>> loop; the ioctl calls into the kernel roughly like this:
>>
>> [...]
>>
>> -- 
>> 2.5.0
>>
>
> .
>

^ permalink raw reply	[flat|nested] 24+ messages in thread
* [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
@ 2016-01-21  9:29 Ding Tianhong
  2016-01-21 21:23 ` Tim Chen
  2016-01-21 23:02 ` Waiman Long
  0 siblings, 2 replies; 24+ messages in thread
From: Ding Tianhong @ 2016-01-21  9:29 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, linux-kernel@vger.kernel.org
  Cc: Davidlohr Bueso, Linus Torvalds, Paul E. McKenney, Thomas Gleixner,
	Will Deacon, Jason Low, Tim Chen, Waiman Long

I built a script that creates several processes which call an ioctl in a
loop; the ioctl calls into the kernel roughly like this:

xx_ioctl {
	...
	rtnl_lock();
	function();
	rtnl_unlock();
	...
}

The function may sleep for several milliseconds, but it will not hang.
At the same time, another user service may call ifconfig to change the
state of the ethernet device, and after several hours the hung task
thread reports this problem:

========================================================================
[149738.039038] INFO: task ifconfig:11890 blocked for more than 120 seconds.
[149738.040597] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[149738.042280] ifconfig D ffff88061ec13680 0 11890 11573 0x00000080
[149738.042284] ffff88052449bd40 0000000000000082 ffff88053a33f300 ffff88052449bfd8
[149738.042286] ffff88052449bfd8 ffff88052449bfd8 ffff88053a33f300 ffffffff819e6240
[149738.042288] ffffffff819e6244 ffff88053a33f300 00000000ffffffff ffffffff819e6248
[149738.042290] Call Trace:
[149738.042300] [<ffffffff8160d219>] schedule_preempt_disabled+0x29/0x70
[149738.042303] [<ffffffff8160af65>] __mutex_lock_slowpath+0xc5/0x1c0
[149738.042305] [<ffffffff8160a3cf>] mutex_lock+0x1f/0x2f
[149738.042309] [<ffffffff8150d945>] rtnl_lock+0x15/0x20
[149738.042311] [<ffffffff81514e3a>] dev_ioctl+0xda/0x590
[149738.042314] [<ffffffff816121cc>] ? __do_page_fault+0x21c/0x560
[149738.042318] [<ffffffff814e42c5>] sock_do_ioctl+0x45/0x50
[149738.042320] [<ffffffff814e49d0>] sock_ioctl+0x1f0/0x2c0
[149738.042324] [<ffffffff811dc9b5>] do_vfs_ioctl+0x2e5/0x4c0
[149738.042327] [<ffffffff811e6a00>] ? fget_light+0xa0/0xd0

================================ cut here ================================

I captured the vmcore and found that ifconfig had already been on the
wait_list of the rtnl_lock for 120 seconds, while my processes could each
take and release the rtnl_lock several times within one second. That
means my processes kept jumping the queue, and ifconfig could never get
the rtnl lock. I checked the mutex slow path and found that a lock
acquirer may spin on the owner regardless of whether the wait list is
empty, which lets tasks on the wait list be cut in line forever. So add
a test for pending waiters to mutex_can_spin_on_owner() to avoid this
problem.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Jason Low <jason.low2@hp.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Waiman Long <Waiman.Long@hp.com>
---
 kernel/locking/mutex.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 0551c21..596b341 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -256,7 +256,7 @@ static inline int mutex_can_spin_on_owner(struct mutex *lock)
 	struct task_struct *owner;
 	int retval = 1;
 
-	if (need_resched())
+	if (need_resched() || atomic_read(&lock->count) == -1)
 		return 0;
 
 	rcu_read_lock();
@@ -283,10 +283,11 @@ static inline bool mutex_try_to_acquire(struct mutex *lock)
 /*
  * Optimistic spinning.
  *
- * We try to spin for acquisition when we find that the lock owner
- * is currently running on a (different) CPU and while we don't
- * need to reschedule. The rationale is that if the lock owner is
- * running, it is likely to release the lock soon.
+ * We try to spin for acquisition when we find that there are no
+ * pending waiters and the lock owner is currently running on a
+ * (different) CPU and while we don't need to reschedule. The
+ * rationale is that if the lock owner is running, it is likely
+ * to release the lock soon.
 *
  * Since this needs the lock owner, and this mutex implementation
  * doesn't track the owner atomically in the lock field, we need to
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 24+ messages in thread
* Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
  2016-01-21  9:29 [PATCH RFC] " Ding Tianhong
@ 2016-01-21 21:23 ` Tim Chen
  2016-01-22  2:41   ` Paul E. McKenney
  0 siblings, 1 reply; 24+ messages in thread
From: Tim Chen @ 2016-01-21 21:23 UTC (permalink / raw)
  To: Ding Tianhong
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel@vger.kernel.org,
	Davidlohr Bueso, Linus Torvalds, Paul E. McKenney, Thomas Gleixner,
	Will Deacon, Jason Low, Waiman Long

On Thu, 2016-01-21 at 17:29 +0800, Ding Tianhong wrote:
>
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index 0551c21..596b341 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -256,7 +256,7 @@ static inline int mutex_can_spin_on_owner(struct mutex *lock)
>  	struct task_struct *owner;
>  	int retval = 1;
>  
> -	if (need_resched())
> +	if (need_resched() || atomic_read(&lock->count) == -1)
>  		return 0;
>

One concern I have is that this change will eliminate any optimistic
spinning as long as there is a waiter. Is there a middle ground where we
can allow only one spinner if there are waiters?

In other words, we allow spinning when
atomic_read(&lock->count) == -1 but no one holds the osq lock that
queues up the spinners (i.e. no other process is doing optimistic
spinning).

This could allow a bit of spinning without starving out the waiters.

Tim

^ permalink raw reply	[flat|nested] 24+ messages in thread
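[A rough sketch of that middle ground, for illustration only:
osq_is_locked() is a real helper in kernels of this era, but this exact
combination is just one rendering of the suggestion above, not a tested
patch:

static inline int mutex_can_spin_on_owner(struct mutex *lock)
{
	struct task_struct *owner;
	int retval = 1;

	if (need_resched())
		return 0;

	/*
	 * Middle ground: when waiters are queued (count == -1), permit
	 * spinning only if the OSQ is empty, so at most one optimistic
	 * spinner competes with the sleeping waiters instead of a whole
	 * stream of them.
	 */
	if (atomic_read(&lock->count) == -1 && osq_is_locked(&lock->osq))
		return 0;

	...	/* rest of the function unchanged */
}
]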
* Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
  2016-01-21 21:23 ` Tim Chen
@ 2016-01-22  2:41   ` Paul E. McKenney
  2016-01-22  2:48     ` Davidlohr Bueso
  0 siblings, 1 reply; 24+ messages in thread
From: Paul E. McKenney @ 2016-01-22  2:41 UTC (permalink / raw)
  To: Tim Chen
  Cc: Ding Tianhong, Peter Zijlstra, Ingo Molnar,
	linux-kernel@vger.kernel.org, Davidlohr Bueso, Linus Torvalds,
	Thomas Gleixner, Will Deacon, Jason Low, Waiman Long

On Thu, Jan 21, 2016 at 01:23:09PM -0800, Tim Chen wrote:
> On Thu, 2016-01-21 at 17:29 +0800, Ding Tianhong wrote:
> >
> > diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> > index 0551c21..596b341 100644
> > --- a/kernel/locking/mutex.c
> > +++ b/kernel/locking/mutex.c
> > @@ -256,7 +256,7 @@ static inline int mutex_can_spin_on_owner(struct mutex *lock)
> >  	struct task_struct *owner;
> >  	int retval = 1;
> >  
> > -	if (need_resched())
> > +	if (need_resched() || atomic_read(&lock->count) == -1)
> >  		return 0;
> >
>
> One concern I have is that this change will eliminate any optimistic
> spinning as long as there is a waiter. Is there a middle ground where we
> can allow only one spinner if there are waiters?
>
> In other words, we allow spinning when
> atomic_read(&lock->count) == -1 but no one holds the osq lock that
> queues up the spinners (i.e. no other process is doing optimistic
> spinning).
>
> This could allow a bit of spinning without starving out the waiters.

I did some testing, which exposed it to the 0day test robot, which
did note some performance differences. I was hoping that it would
clear up some instability from other patches, but no such luck. ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 24+ messages in thread
* Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
  2016-01-22  2:41   ` Paul E. McKenney
@ 2016-01-22  2:48     ` Davidlohr Bueso
  2016-01-22  3:13       ` Paul E. McKenney
  0 siblings, 1 reply; 24+ messages in thread
From: Davidlohr Bueso @ 2016-01-22  2:48 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Tim Chen, Ding Tianhong, Peter Zijlstra, Ingo Molnar,
	linux-kernel@vger.kernel.org, Linus Torvalds, Thomas Gleixner,
	Will Deacon, Jason Low, Waiman Long

On Thu, 21 Jan 2016, Paul E. McKenney wrote:

>I did some testing, which exposed it to the 0day test robot, which
>did note some performance differences. I was hoping that it would
>clear up some instability from other patches, but no such luck. ;-)

Oh, that explains why we got a performance regression report :)

Thanks,
Davidlohr

^ permalink raw reply	[flat|nested] 24+ messages in thread
* Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
  2016-01-22  2:48     ` Davidlohr Bueso
@ 2016-01-22  3:13       ` Paul E. McKenney
  0 siblings, 0 replies; 24+ messages in thread
From: Paul E. McKenney @ 2016-01-22  3:13 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Tim Chen, Ding Tianhong, Peter Zijlstra, Ingo Molnar,
	linux-kernel@vger.kernel.org, Linus Torvalds, Thomas Gleixner,
	Will Deacon, Jason Low, Waiman Long

On Thu, Jan 21, 2016 at 06:48:54PM -0800, Davidlohr Bueso wrote:
> On Thu, 21 Jan 2016, Paul E. McKenney wrote:
>
> >I did some testing, which exposed it to the 0day test robot, which
> >did note some performance differences. I was hoping that it would
> >clear up some instability from other patches, but no such luck. ;-)
>
> Oh, that explains why we got a performance regression report :)

Plus I suspected that you wanted some extra email. ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 24+ messages in thread
* Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
  2016-01-21  9:29 [PATCH RFC] " Ding Tianhong
  2016-01-21 21:23 ` Tim Chen
@ 2016-01-21 23:02 ` Waiman Long
  2016-01-22  6:09   ` Davidlohr Bueso
  2016-01-22  8:54   ` Peter Zijlstra
  1 sibling, 2 replies; 24+ messages in thread
From: Waiman Long @ 2016-01-21 23:02 UTC (permalink / raw)
  To: Ding Tianhong
  Cc: Peter Zijlstra, Ingo Molnar, linux-kernel@vger.kernel.org,
	Davidlohr Bueso, Linus Torvalds, Paul E. McKenney, Thomas Gleixner,
	Will Deacon, Jason Low, Tim Chen, Waiman Long

[-- Attachment #1: Type: text/plain, Size: 4432 bytes --]

On 01/21/2016 04:29 AM, Ding Tianhong wrote:
> I built a script that creates several processes which call an ioctl in a
> loop; the ioctl calls into the kernel roughly like this:
>
> xx_ioctl {
> 	...
> 	rtnl_lock();
> 	function();
> 	rtnl_unlock();
> 	...
> }
>
> The function may sleep for several milliseconds, but it will not hang.
> At the same time, another user service may call ifconfig to change the
> state of the ethernet device, and after several hours the hung task
> thread reports this problem:
>
> ========================================================================
> [149738.039038] INFO: task ifconfig:11890 blocked for more than 120 seconds.
> [149738.040597] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [149738.042280] ifconfig D ffff88061ec13680 0 11890 11573 0x00000080
> [149738.042284] ffff88052449bd40 0000000000000082 ffff88053a33f300 ffff88052449bfd8
> [149738.042286] ffff88052449bfd8 ffff88052449bfd8 ffff88053a33f300 ffffffff819e6240
> [149738.042288] ffffffff819e6244 ffff88053a33f300 00000000ffffffff ffffffff819e6248
> [149738.042290] Call Trace:
> [149738.042300] [<ffffffff8160d219>] schedule_preempt_disabled+0x29/0x70
> [149738.042303] [<ffffffff8160af65>] __mutex_lock_slowpath+0xc5/0x1c0
> [149738.042305] [<ffffffff8160a3cf>] mutex_lock+0x1f/0x2f
> [149738.042309] [<ffffffff8150d945>] rtnl_lock+0x15/0x20
> [149738.042311] [<ffffffff81514e3a>] dev_ioctl+0xda/0x590
> [149738.042314] [<ffffffff816121cc>] ? __do_page_fault+0x21c/0x560
> [149738.042318] [<ffffffff814e42c5>] sock_do_ioctl+0x45/0x50
> [149738.042320] [<ffffffff814e49d0>] sock_ioctl+0x1f0/0x2c0
> [149738.042324] [<ffffffff811dc9b5>] do_vfs_ioctl+0x2e5/0x4c0
> [149738.042327] [<ffffffff811e6a00>] ? fget_light+0xa0/0xd0
>
> ================================ cut here ================================
>
> I captured the vmcore and found that ifconfig had already been on the
> wait_list of the rtnl_lock for 120 seconds, while my processes could each
> take and release the rtnl_lock several times within one second. That
> means my processes kept jumping the queue, and ifconfig could never get
> the rtnl lock. I checked the mutex slow path and found that a lock
> acquirer may spin on the owner regardless of whether the wait list is
> empty, which lets tasks on the wait list be cut in line forever. So add
> a test for pending waiters to mutex_can_spin_on_owner() to avoid this
> problem.
>
> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Paul E. McKenney <paulmck@us.ibm.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Jason Low <jason.low2@hp.com>
> Cc: Tim Chen <tim.c.chen@linux.intel.com>
> Cc: Waiman Long <Waiman.Long@hp.com>
> ---
>  kernel/locking/mutex.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index 0551c21..596b341 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -256,7 +256,7 @@ static inline int mutex_can_spin_on_owner(struct mutex *lock)
>  	struct task_struct *owner;
>  	int retval = 1;
>  
> -	if (need_resched())
> +	if (need_resched() || atomic_read(&lock->count) == -1)
>  		return 0;
>  
>  	rcu_read_lock();
> @@ -283,10 +283,11 @@ static inline bool mutex_try_to_acquire(struct mutex *lock)
>  /*
>   * Optimistic spinning.
>   *
> - * We try to spin for acquisition when we find that the lock owner
> - * is currently running on a (different) CPU and while we don't
> - * need to reschedule. The rationale is that if the lock owner is
> - * running, it is likely to release the lock soon.
> + * We try to spin for acquisition when we find that there are no
> + * pending waiters and the lock owner is currently running on a
> + * (different) CPU and while we don't need to reschedule. The
> + * rationale is that if the lock owner is running, it is likely
> + * to release the lock soon.
>   *
>   * Since this needs the lock owner, and this mutex implementation
>   * doesn't track the owner atomically in the lock field, we need to

This patch will largely defeat the performance benefit of optimistic
spinning. I have an alternative solution to this live-lock problem.
Would you mind trying out the attached patch to see if it can fix your
problem?

Cheers,
Longman

[-- Attachment #2: 0001-locking-mutex-Enable-optimistic-spinning-of-woken-ta.patch --]
[-- Type: text/plain, Size: 5460 bytes --]

From 1bbb5a4434d395f48163abc5435c5c720a15d327 Mon Sep 17 00:00:00 2001
From: Waiman Long <Waiman.Long@hpe.com>
Date: Thu, 21 Jan 2016 17:53:14 -0500
Subject: [PATCH] locking/mutex: Enable optimistic spinning of woken task in
 wait list

Ding Tianhong reported a live-lock situation where a constant stream
of incoming optimistic spinners blocked a task in the wait list from
getting the mutex.

This patch attempts to fix this live-lock condition by enabling a
woken task in the wait list to enter the optimistic spinning loop
itself, with precedence over the ones in the OSQ. This should prevent
the live-lock condition from happening.

Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
---
 include/linux/mutex.h  |    2 +
 kernel/locking/mutex.c |   95 +++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 95 insertions(+), 2 deletions(-)

diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index 2cb7531..2c55ecd 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -57,6 +57,8 @@ struct mutex {
 #endif
 #ifdef CONFIG_MUTEX_SPIN_ON_OWNER
 	struct optimistic_spin_queue osq; /* Spinner MCS lock */
+	/* Set if wait list head actively spinning */
+	int wlh_spinning;
 #endif
 #ifdef CONFIG_DEBUG_MUTEXES
 	void *magic;
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 0551c21..8b27b03 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -55,6 +55,7 @@ __mutex_init(struct mutex *lock, const char *name, struct lock_class_key *key)
 	mutex_clear_owner(lock);
 #ifdef CONFIG_MUTEX_SPIN_ON_OWNER
 	osq_lock_init(&lock->osq);
+	lock->wlh_spinning = false;
 #endif
 
 	debug_mutex_init(lock, name, key);
@@ -346,8 +347,12 @@ static bool mutex_optimistic_spin(struct mutex *lock,
 		if (owner && !mutex_spin_on_owner(lock, owner))
 			break;
 
-		/* Try to acquire the mutex if it is unlocked. */
-		if (mutex_try_to_acquire(lock)) {
+		/*
+		 * Try to acquire the mutex if it is unlocked and the wait
+		 * list head isn't spinning on the lock.
+		 */
+		if (!READ_ONCE(lock->wlh_spinning) &&
+		    mutex_try_to_acquire(lock)) {
 			lock_acquired(&lock->dep_map, ip);
 
 			if (use_ww_ctx) {
@@ -398,12 +403,91 @@ done:
 	return false;
 }
 
+/*
+ * Wait list head optimistic spinning
+ *
+ * The wait list head, when woken up, will try to spin on the lock if the
+ * lock owner is active. It will also set the wlh_spinning flag to give
+ * itself a higher chance of getting the lock than the other optimistically
+ * spinning lockers in the OSQ.
+ */
+static bool mutex_wlh_opt_spin(struct mutex *lock,
+			       struct ww_acquire_ctx *ww_ctx, const bool use_ww_ctx)
+{
+	struct task_struct *owner, *task = current;
+	int gotlock = false;
+
+	WRITE_ONCE(lock->wlh_spinning, true);
+	while (true) {
+		if (use_ww_ctx && ww_ctx->acquired > 0) {
+			struct ww_mutex *ww;
+
+			ww = container_of(lock, struct ww_mutex, base);
+			/*
+			 * If ww->ctx is set the contents are undefined, only
+			 * by acquiring wait_lock there is a guarantee that
+			 * they are not invalid when reading.
+			 *
+			 * As such, when deadlock detection needs to be
+			 * performed the optimistic spinning cannot be done.
+			 */
+			if (READ_ONCE(ww->ctx))
+				break;
+		}
+
+		/*
+		 * If there's an owner, wait for it to either
+		 * release the lock or go to sleep.
+		 */
+		owner = READ_ONCE(lock->owner);
+		if (owner && !mutex_spin_on_owner(lock, owner))
+			break;
+
+		/*
+		 * Try to acquire the mutex if it is unlocked. The mutex
+		 * value is set to -1 which will be changed to 0 later on
+		 * if the wait list becomes empty.
+		 */
+		if (!mutex_is_locked(lock) &&
+		    (atomic_cmpxchg_acquire(&lock->count, 1, -1) == 1)) {
+			gotlock = true;
+			break;
+		}
+
+		/*
+		 * When there's no owner, we might have preempted between the
+		 * owner acquiring the lock and setting the owner field. If
+		 * we're an RT task that will live-lock because we won't let
+		 * the owner complete.
+		 */
+		if (!owner && (need_resched() || rt_task(task)))
+			break;
+
+		/*
+		 * The cpu_relax() call is a compiler barrier which forces
+		 * everything in this loop to be re-loaded. We don't need
+		 * memory barriers as we'll eventually observe the right
+		 * values at the cost of a few extra spins.
+		 */
+		cpu_relax_lowlatency();
+	}
+	WRITE_ONCE(lock->wlh_spinning, false);
+	return gotlock;
+}
 #else
 static bool mutex_optimistic_spin(struct mutex *lock,
 				  struct ww_acquire_ctx *ww_ctx, const bool use_ww_ctx)
 {
 	return false;
 }
+
+static bool mutex_wlh_opt_spin(struct mutex *lock,
+			       struct ww_acquire_ctx *ww_ctx, const bool use_ww_ctx)
+{
+	return false;
+}
 #endif
 
 __visible __used noinline
@@ -543,6 +627,8 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 	lock_contended(&lock->dep_map, ip);
 
 	for (;;) {
+		int gotlock;
+
 		/*
 		 * Lets try to take the lock again - this is needed even if
 		 * we get here for the first time (shortly after failing to
@@ -577,7 +663,12 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 		/* didn't get the lock, go to sleep: */
 		spin_unlock_mutex(&lock->wait_lock, flags);
 		schedule_preempt_disabled();
+
+		/* optimistically spin on the mutex without the wait lock */
+		gotlock = mutex_wlh_opt_spin(lock, ww_ctx, use_ww_ctx);
 		spin_lock_mutex(&lock->wait_lock, flags);
+		if (gotlock)
+			break;
 	}
 	__set_task_state(task, TASK_RUNNING);
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 24+ messages in thread
* Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
  2016-01-21 23:02 ` Waiman Long
@ 2016-01-22  6:09   ` Davidlohr Bueso
  2016-01-22 13:38     ` Waiman Long
  0 siblings, 1 reply; 24+ messages in thread
From: Davidlohr Bueso @ 2016-01-22  6:09 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ding Tianhong, Peter Zijlstra, Ingo Molnar,
	linux-kernel@vger.kernel.org, Linus Torvalds, Paul E. McKenney,
	Thomas Gleixner, Will Deacon, Jason Low, Tim Chen, Waiman Long

On Thu, 21 Jan 2016, Waiman Long wrote:

>On 01/21/2016 04:29 AM, Ding Tianhong wrote:

>>I captured the vmcore and found that ifconfig had already been on the
>>wait_list of the rtnl_lock for 120 seconds, while my processes could each
>>take and release the rtnl_lock several times within one second. That
>>means my processes kept jumping the queue, and ifconfig could never get
>>the rtnl lock. I checked the mutex slow path and found that a lock
>>acquirer may spin on the owner regardless of whether the wait list is
>>empty, which lets tasks on the wait list be cut in line forever. So add
>>a test for pending waiters to mutex_can_spin_on_owner() to avoid this
>>problem.

So this has been somewhat always known, at least in theory, until now.
It's the cost of spinning without going through the wait-queue, unlike
other locks.

>> [...]

>From: Waiman Long <Waiman.Long@hpe.com>
>Date: Thu, 21 Jan 2016 17:53:14 -0500
>Subject: [PATCH] locking/mutex: Enable optimistic spinning of woken task in wait list
>
>Ding Tianhong reported a live-lock situation where a constant stream
>of incoming optimistic spinners blocked a task in the wait list from
>getting the mutex.
>
>This patch attempts to fix this live-lock condition by enabling a
>woken task in the wait list to enter the optimistic spinning loop
>itself, with precedence over the ones in the OSQ. This should prevent
>the live-lock condition from happening.

And one of the reasons why we never bothered 'fixing' things was the
additional branching out in the slowpath (and the lack of a real issue,
although this one is so damn pathological). I fear that your approach is
one of those scenarios where the code ends up being bloated, albeit most
of it is actually duplicated and can be refactored *sigh*. So now we'd
spin, then sleep, then try spinning, then sleep again... phew. Not to
mention the performance implications, i.e. losing the benefits of osq
over waiter spinning in scenarios that would otherwise have more osq
spinners as opposed to waiter spinners, or in setups where it is
actually best to block instead of spinning.

Thanks,
Davidlohr

^ permalink raw reply	[flat|nested] 24+ messages in thread
* Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
  2016-01-22  6:09   ` Davidlohr Bueso
@ 2016-01-22 13:38     ` Waiman Long
  2016-01-22 16:46       ` Davidlohr Bueso
  0 siblings, 1 reply; 24+ messages in thread
From: Waiman Long @ 2016-01-22 13:38 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Ding Tianhong, Peter Zijlstra, Ingo Molnar,
	linux-kernel@vger.kernel.org, Linus Torvalds, Paul E. McKenney,
	Thomas Gleixner, Will Deacon, Jason Low, Tim Chen, Waiman Long

On 01/22/2016 01:09 AM, Davidlohr Bueso wrote:
> On Thu, 21 Jan 2016, Waiman Long wrote:
>
>> On 01/21/2016 04:29 AM, Ding Tianhong wrote:
>
>>> I captured the vmcore and found that ifconfig had already been on the
>>> wait_list of the rtnl_lock for 120 seconds, while my processes could
>>> each take and release the rtnl_lock several times within one second.
>>> That means my processes kept jumping the queue, and ifconfig could
>>> never get the rtnl lock. I checked the mutex slow path and found that
>>> a lock acquirer may spin on the owner regardless of whether the wait
>>> list is empty, which lets tasks on the wait list be cut in line
>>> forever. So add a test for pending waiters to
>>> mutex_can_spin_on_owner() to avoid this problem.
>
> So this has been somewhat always known, at least in theory, until now.
> It's the cost of spinning without going through the wait-queue, unlike
> other locks.
>
>>> [...]
>
>> From: Waiman Long <Waiman.Long@hpe.com>
>> Date: Thu, 21 Jan 2016 17:53:14 -0500
>> Subject: [PATCH] locking/mutex: Enable optimistic spinning of woken
>> task in wait list
>>
>> Ding Tianhong reported a live-lock situation where a constant stream
>> of incoming optimistic spinners blocked a task in the wait list from
>> getting the mutex.
>>
>> This patch attempts to fix this live-lock condition by enabling a
>> woken task in the wait list to enter the optimistic spinning loop
>> itself, with precedence over the ones in the OSQ. This should prevent
>> the live-lock condition from happening.
>
> And one of the reasons why we never bothered 'fixing' things was the
> additional branching out in the slowpath (and the lack of a real issue,
> although this one is so damn pathological). I fear that your approach is
> one of those scenarios where the code ends up being bloated, albeit most
> of it is actually duplicated and can be refactored *sigh*. So now we'd
> spin, then sleep, then try spinning, then sleep again... phew. Not to
> mention the performance implications, i.e. losing the benefits of osq
> over waiter spinning in scenarios that would otherwise have more osq
> spinners as opposed to waiter spinners, or in setups where it is
> actually best to block instead of spinning.

The patch that I sent out is just a proof of concept to make sure that
it can fix that particular case. I do plan to refactor it if I decide to
go ahead with an official one. Unlike the OSQ, there can be no more than
one waiter spinner, as the wakeup function is directed only at the first
task in the wait list and the spinning won't happen until the task is
first woken up. In the worst-case scenario, there are only 2 spinners
spinning on the lock and the owner field, one from the OSQ and one from
the wait list. That shouldn't put too much cacheline contention traffic
on the system.

Cheers,
Longman

^ permalink raw reply	[flat|nested] 24+ messages in thread
* Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
  2016-01-22 13:38     ` Waiman Long
@ 2016-01-22 16:46       ` Davidlohr Bueso
  0 siblings, 0 replies; 24+ messages in thread
From: Davidlohr Bueso @ 2016-01-22 16:46 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ding Tianhong, Peter Zijlstra, Ingo Molnar,
	linux-kernel@vger.kernel.org, Linus Torvalds, Paul E. McKenney,
	Thomas Gleixner, Will Deacon, Jason Low, Tim Chen, Waiman Long

On Fri, 22 Jan 2016, Waiman Long wrote:

>The patch that I sent out is just a proof of concept to make sure that
>it can fix that particular case. I do plan to refactor it if I decide to
>go ahead with an official one. Unlike the OSQ, there can be no more than
>one waiter spinner, as the wakeup function is directed only at the first
>task in the wait list and the spinning won't happen until the task is
>first woken up. In the worst-case scenario, there are only 2 spinners
>spinning on the lock and the owner field, one from the OSQ and one from
>the wait list. That shouldn't put too much cacheline contention traffic
>on the system.

Similarly, I guess we should also wake up the next waiter in line after
releasing the wait_lock, via wake_q. This would allow the woken waiter a
slightly better chance of finding the wait_lock free when continuing to
take the mutex.

Thanks,
Davidlohr

^ permalink raw reply	[flat|nested] 24+ messages in thread
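[A rough illustration of that suggestion, not a tested patch; it assumes
the WAKE_Q/wake_q_add/wake_up_q API that kernels of this era provide.
The idea is that the unlock slowpath queues the wakeup under wait_lock
but issues it only after wait_lock is dropped:

	WAKE_Q(wake_q);		/* on-stack wake queue (v4.4-era API) */

	spin_lock_mutex(&lock->wait_lock, flags);
	if (!list_empty(&lock->wait_list)) {
		/* pick the first waiter in line, but don't wake it yet */
		struct mutex_waiter *waiter =
			list_entry(lock->wait_list.next,
				   struct mutex_waiter, list);
		wake_q_add(&wake_q, waiter->task);
	}
	spin_unlock_mutex(&lock->wait_lock, flags);

	/*
	 * Issue the wakeup only after wait_lock is released, so the
	 * woken waiter is more likely to find wait_lock uncontended.
	 */
	wake_up_q(&wake_q);
]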
* Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
  2016-01-21 23:02 ` Waiman Long
  2016-01-22  6:09   ` Davidlohr Bueso
@ 2016-01-22  8:54   ` Peter Zijlstra
  2016-01-22 10:20     ` Jason Low
  2016-01-22 13:41     ` Waiman Long
  1 sibling, 2 replies; 24+ messages in thread
From: Peter Zijlstra @ 2016-01-22  8:54 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ding Tianhong, Ingo Molnar, linux-kernel@vger.kernel.org,
	Davidlohr Bueso, Linus Torvalds, Paul E. McKenney, Thomas Gleixner,
	Will Deacon, Jason Low, Tim Chen, Waiman Long

On Thu, Jan 21, 2016 at 06:02:34PM -0500, Waiman Long wrote:
> This patch attempts to fix this live-lock condition by enabling a
> woken task in the wait list to enter the optimistic spinning loop
> itself, with precedence over the ones in the OSQ. This should prevent
> the live-lock condition from happening.

So I think having the top waiter go back in to contend on the OSQ is an
excellent idea, but I'm not sure the wlh_spinning thing is important.

The OSQ itself is FIFO-fair, and the waiters retain their wait_list
position. So having the top wait_list entry contend on the OSQ ensures
we cannot starve (I think).

Also, as Davidlohr said, we cannot copy/paste this much code.

^ permalink raw reply	[flat|nested] 24+ messages in thread
* Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
  2016-01-22  8:54   ` Peter Zijlstra
@ 2016-01-22 10:20     ` Jason Low
  2016-01-22 10:53       ` Peter Zijlstra
  0 siblings, 1 reply; 24+ messages in thread
From: Jason Low @ 2016-01-22 10:20 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Waiman Long, Ding Tianhong, Ingo Molnar,
	linux-kernel@vger.kernel.org, Davidlohr Bueso, Linus Torvalds,
	Paul E. McKenney, Thomas Gleixner, Will Deacon, Tim Chen,
	Waiman Long, jason.low2

On Fri, 2016-01-22 at 09:54 +0100, Peter Zijlstra wrote:
> On Thu, Jan 21, 2016 at 06:02:34PM -0500, Waiman Long wrote:
> > This patch attempts to fix this live-lock condition by enabling a
> > woken task in the wait list to enter the optimistic spinning loop
> > itself, with precedence over the ones in the OSQ. This should prevent
> > the live-lock condition from happening.
>
> So I think having the top waiter go back in to contend on the OSQ is an
> excellent idea, but I'm not sure the wlh_spinning thing is important.
>
> The OSQ itself is FIFO-fair, and the waiters retain their wait_list
> position. So having the top wait_list entry contend on the OSQ ensures
> we cannot starve (I think).

Right, and we can also avoid needing to add that extra field to the
mutex structure. Before calling optimistic spinning, we do want to
check if the lock is available to avoid unnecessary OSQ overhead
though. So maybe the following would be sufficient:

---
 kernel/locking/mutex.c | 7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 0551c21..ead0bd1 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -543,6 +543,8 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 	lock_contended(&lock->dep_map, ip);
 
 	for (;;) {
+		bool acquired = false;
+
 		/*
 		 * Lets try to take the lock again - this is needed even if
 		 * we get here for the first time (shortly after failing to
@@ -577,7 +579,12 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 		/* didn't get the lock, go to sleep: */
 		spin_unlock_mutex(&lock->wait_lock, flags);
 		schedule_preempt_disabled();
+
+		if (mutex_is_locked(lock))
+			acquired = mutex_optimistic_spin(lock, ww_ctx, use_ww_ctx);
 		spin_lock_mutex(&lock->wait_lock, flags);
+		if (acquired)
+			break;
 	}
 	__set_task_state(task, TASK_RUNNING);
-- 
1.7.2.5

^ permalink raw reply related	[flat|nested] 24+ messages in thread
* Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
  2016-01-22 10:20     ` Jason Low
@ 2016-01-22 10:53       ` Peter Zijlstra
  2016-01-22 10:56         ` Peter Zijlstra
  0 siblings, 1 reply; 24+ messages in thread
From: Peter Zijlstra @ 2016-01-22 10:53 UTC (permalink / raw)
  To: Jason Low
  Cc: Waiman Long, Ding Tianhong, Ingo Molnar,
	linux-kernel@vger.kernel.org, Davidlohr Bueso, Linus Torvalds,
	Paul E. McKenney, Thomas Gleixner, Will Deacon, Tim Chen,
	Waiman Long

On Fri, Jan 22, 2016 at 02:20:19AM -0800, Jason Low wrote:
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -543,6 +543,8 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  	lock_contended(&lock->dep_map, ip);
>  
>  	for (;;) {
> +		bool acquired = false;
> +
>  		/*
>  		 * Lets try to take the lock again - this is needed even if
>  		 * we get here for the first time (shortly after failing to
> @@ -577,7 +579,12 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  		/* didn't get the lock, go to sleep: */
>  		spin_unlock_mutex(&lock->wait_lock, flags);
>  		schedule_preempt_disabled();
> +
> +		if (mutex_is_locked(lock))
> +			acquired = mutex_optimistic_spin(lock, ww_ctx, use_ww_ctx);
>  		spin_lock_mutex(&lock->wait_lock, flags);
> +		if (acquired)
> +			break;
>  	}
>  	__set_task_state(task, TASK_RUNNING);

I think the problem here is that mutex_optimistic_spin() leaves the
mutex->count == 0, even though we have waiters (us at the very least).

But this should be easily fixed, since if we acquired, we should be the
one releasing, so there's no race. So something like so:

	if (acquired) {
		atomic_set(&mutex->count, -1);
		break;
	}

Should deal with that -- we'll set it to 0 again a little further down
if the list ends up empty.

There might be other details, but this is the one that stood out.

^ permalink raw reply	[flat|nested] 24+ messages in thread
* Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
  2016-01-22 10:53       ` Peter Zijlstra
@ 2016-01-22 10:56         ` Peter Zijlstra
  2016-01-22 11:06           ` Peter Zijlstra
  0 siblings, 1 reply; 24+ messages in thread
From: Peter Zijlstra @ 2016-01-22 10:56 UTC (permalink / raw)
  To: Jason Low
  Cc: Waiman Long, Ding Tianhong, Ingo Molnar,
	linux-kernel@vger.kernel.org, Davidlohr Bueso, Linus Torvalds,
	Paul E. McKenney, Thomas Gleixner, Will Deacon, Tim Chen,
	Waiman Long

On Fri, Jan 22, 2016 at 11:53:12AM +0100, Peter Zijlstra wrote:

> There might be other details, but this is the one that stood out.

I think this also does the wrong thing for use_ww_ctx.

^ permalink raw reply	[flat|nested] 24+ messages in thread
* Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
  2016-01-22 10:56         ` Peter Zijlstra
@ 2016-01-22 11:06           ` Peter Zijlstra
  2016-01-22 13:59             ` Waiman Long
  0 siblings, 1 reply; 24+ messages in thread
From: Peter Zijlstra @ 2016-01-22 11:06 UTC (permalink / raw)
  To: Jason Low
  Cc: Waiman Long, Ding Tianhong, Ingo Molnar,
	linux-kernel@vger.kernel.org, Davidlohr Bueso, Linus Torvalds,
	Paul E. McKenney, Thomas Gleixner, Will Deacon, Tim Chen,
	Waiman Long

On Fri, Jan 22, 2016 at 11:56:52AM +0100, Peter Zijlstra wrote:
> On Fri, Jan 22, 2016 at 11:53:12AM +0100, Peter Zijlstra wrote:
>
> > There might be other details, but this is the one that stood out.
>
> I think this also does the wrong thing for use_ww_ctx.

Something like so?

---
 kernel/locking/mutex.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 0551c219c40e..070a0ac34aa7 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -512,6 +512,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 	struct task_struct *task = current;
 	struct mutex_waiter waiter;
 	unsigned long flags;
+	bool acquired;
 	int ret;
 
 	preempt_disable();
@@ -543,6 +544,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 	lock_contended(&lock->dep_map, ip);
 
 	for (;;) {
+		acquired = false;
 		/*
 		 * Lets try to take the lock again - this is needed even if
 		 * we get here for the first time (shortly after failing to
@@ -577,7 +579,16 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 		/* didn't get the lock, go to sleep: */
 		spin_unlock_mutex(&lock->wait_lock, flags);
 		schedule_preempt_disabled();
+
+		if (mutex_is_locked(lock))
+			acquired = mutex_optimistic_spin(lock, ww_ctx, use_ww_ctx);
+
 		spin_lock_mutex(&lock->wait_lock, flags);
+
+		if (acquired) {
+			atomic_set(&lock->count, -1);
+			break;
+		}
 	}
 	__set_task_state(task, TASK_RUNNING);
 
@@ -587,6 +598,9 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 		atomic_set(&lock->count, 0);
 	debug_mutex_free_waiter(&waiter);
 
+	if (acquired)
+		goto unlock;
+
 skip_wait:
 	/* got the lock - cleanup and rejoice! */
 	lock_acquired(&lock->dep_map, ip);
@@ -597,6 +611,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 		ww_mutex_set_context_slowpath(ww, ww_ctx);
 	}
 
+unlock:
 	spin_unlock_mutex(&lock->wait_lock, flags);
 	preempt_enable();
 	return 0;

^ permalink raw reply related	[flat|nested] 24+ messages in thread
* Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
  2016-01-22 11:06           ` Peter Zijlstra
@ 2016-01-22 13:59             ` Waiman Long
  2016-01-24  8:03               ` Ding Tianhong
  0 siblings, 1 reply; 24+ messages in thread
From: Waiman Long @ 2016-01-22 13:59 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jason Low, Ding Tianhong, Ingo Molnar,
	linux-kernel@vger.kernel.org, Davidlohr Bueso, Linus Torvalds,
	Paul E. McKenney, Thomas Gleixner, Will Deacon, Tim Chen,
	Waiman Long

On 01/22/2016 06:06 AM, Peter Zijlstra wrote:
> On Fri, Jan 22, 2016 at 11:56:52AM +0100, Peter Zijlstra wrote:
>> On Fri, Jan 22, 2016 at 11:53:12AM +0100, Peter Zijlstra wrote:
>>
>>> There might be other details, but this is the one that stood out.
>> I think this also does the wrong thing for use_ww_ctx.
> Something like so?

I think that should work. My only minor concern is that putting the
waiter spinner at the end of the OSQ will make it take longer to get
the lock, but that shouldn't be a big issue.

> ---
>  kernel/locking/mutex.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
>
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index 0551c219c40e..070a0ac34aa7 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -512,6 +512,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  	struct task_struct *task = current;
>  	struct mutex_waiter waiter;
>  	unsigned long flags;
> +	bool acquired;
>  	int ret;
>  
>  	preempt_disable();
> @@ -543,6 +544,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  	lock_contended(&lock->dep_map, ip);
>  
>  	for (;;) {
> +		acquired = false;
>  		/*
>  		 * Lets try to take the lock again - this is needed even if
>  		 * we get here for the first time (shortly after failing to
> @@ -577,7 +579,16 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  		/* didn't get the lock, go to sleep: */
>  		spin_unlock_mutex(&lock->wait_lock, flags);
>  		schedule_preempt_disabled();
> +
> +		if (mutex_is_locked(lock))
> +			acquired = mutex_optimistic_spin(lock, ww_ctx, use_ww_ctx);
> +
>  	spin_lock_mutex(&lock->wait_lock, flags);
> +
> +		if (acquired) {
> +			atomic_set(&lock->count, -1);
> +			break;
> +		}
>  	}
>  	__set_task_state(task, TASK_RUNNING);
>  
> @@ -587,6 +598,9 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  		atomic_set(&lock->count, 0);
>  	debug_mutex_free_waiter(&waiter);
>  
> +	if (acquired)
> +		goto unlock;
> +
>  skip_wait:
>  	/* got the lock - cleanup and rejoice! */
>  	lock_acquired(&lock->dep_map, ip);
> @@ -597,6 +611,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  		ww_mutex_set_context_slowpath(ww, ww_ctx);
>  	}
>  
> +unlock:
>  	spin_unlock_mutex(&lock->wait_lock, flags);
>  	preempt_enable();
>  	return 0;

Cheers,
Longman

^ permalink raw reply	[flat|nested] 24+ messages in thread
* Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
  2016-01-22 13:59             ` Waiman Long
@ 2016-01-24  8:03               ` Ding Tianhong
  2016-01-29  9:53                 ` Peter Zijlstra
  0 siblings, 1 reply; 24+ messages in thread
From: Ding Tianhong @ 2016-01-24  8:03 UTC (permalink / raw)
  To: Waiman Long, Peter Zijlstra
  Cc: Jason Low, Ingo Molnar, linux-kernel@vger.kernel.org,
	Davidlohr Bueso, Linus Torvalds, Paul E. McKenney, Thomas Gleixner,
	Will Deacon, Tim Chen, Waiman Long

On 2016/1/22 21:59, Waiman Long wrote:
> On 01/22/2016 06:06 AM, Peter Zijlstra wrote:
>> On Fri, Jan 22, 2016 at 11:56:52AM +0100, Peter Zijlstra wrote:
>>> On Fri, Jan 22, 2016 at 11:53:12AM +0100, Peter Zijlstra wrote:
>>>
>>>> There might be other details, but this is the one that stood out.
>>> I think this also does the wrong thing for use_ww_ctx.
>> Something like so?
>
> I think that should work. My only minor concern is that putting the
> waiter spinner at the end of the OSQ will make it take longer to get
> the lock, but that shouldn't be a big issue.
>
>> ---
>>  kernel/locking/mutex.c | 15 +++++++++++++++
>>  1 file changed, 15 insertions(+)
>>
>> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
>> index 0551c219c40e..070a0ac34aa7 100644
>> --- a/kernel/locking/mutex.c
>> +++ b/kernel/locking/mutex.c
>> @@ -512,6 +512,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>>  	struct task_struct *task = current;
>>  	struct mutex_waiter waiter;
>>  	unsigned long flags;
>> +	bool acquired;
>>  	int ret;
>>  
>>  	preempt_disable();
>> @@ -543,6 +544,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>>  	lock_contended(&lock->dep_map, ip);
>>  
>>  	for (;;) {
>> +		acquired = false;
>>  		/*
>>  		 * Lets try to take the lock again - this is needed even if
>>  		 * we get here for the first time (shortly after failing to
>> @@ -577,7 +579,16 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>>  		/* didn't get the lock, go to sleep: */
>>  		spin_unlock_mutex(&lock->wait_lock, flags);
>>  		schedule_preempt_disabled();
>> +
>> +		if (mutex_is_locked(lock))
>> +			acquired = mutex_optimistic_spin(lock, ww_ctx, use_ww_ctx);
>> +
>>  		spin_lock_mutex(&lock->wait_lock, flags);
>> +
>> +		if (acquired) {
>> +			atomic_set(&lock->count, -1);
>> +			break;
>> +		}
>>  	}
>>  	__set_task_state(task, TASK_RUNNING);
>>  
>> @@ -587,6 +598,9 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>>  		atomic_set(&lock->count, 0);
>>  	debug_mutex_free_waiter(&waiter);
>>  
>> +	if (acquired)
>> +		goto unlock;
>> +
>>  skip_wait:
>>  	/* got the lock - cleanup and rejoice! */
>>  	lock_acquired(&lock->dep_map, ip);
>> @@ -597,6 +611,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>>  		ww_mutex_set_context_slowpath(ww, ww_ctx);
>>  	}
>>  
>> +unlock:
>>  	spin_unlock_mutex(&lock->wait_lock, flags);
>>  	preempt_enable();
>>  	return 0;
>
> Cheers,
> Longman
>

Looks good to me. I will try this solution and report the result.
Thanks, everyone.

Ding

> .
>

^ permalink raw reply	[flat|nested] 24+ messages in thread
* Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
  2016-01-24  8:03               ` Ding Tianhong
@ 2016-01-29  9:53                 ` Peter Zijlstra
  2016-01-30  1:18                   ` Ding Tianhong
  0 siblings, 1 reply; 24+ messages in thread
From: Peter Zijlstra @ 2016-01-29  9:53 UTC (permalink / raw)
  To: Ding Tianhong
  Cc: Waiman Long, Jason Low, Ingo Molnar,
	linux-kernel@vger.kernel.org, Davidlohr Bueso, Linus Torvalds,
	Paul E. McKenney, Thomas Gleixner, Will Deacon, Tim Chen,
	Waiman Long

On Sun, Jan 24, 2016 at 04:03:50PM +0800, Ding Tianhong wrote:

>> Looks good to me. I will try this solution and report the result.
>> Thanks, everyone.

Did you get a chance to run with this?

^ permalink raw reply	[flat|nested] 24+ messages in thread
* Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
  2016-01-29  9:53                 ` Peter Zijlstra
@ 2016-01-30  1:18                   ` Ding Tianhong
  2016-02-01  3:29                     ` huang ying
  0 siblings, 1 reply; 24+ messages in thread
From: Ding Tianhong @ 2016-01-30  1:18 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Waiman Long, Jason Low, Ingo Molnar,
	linux-kernel@vger.kernel.org, Davidlohr Bueso, Linus Torvalds,
	Paul E. McKenney, Thomas Gleixner, Will Deacon, Tim Chen,
	Waiman Long

On 2016/1/29 17:53, Peter Zijlstra wrote:
> On Sun, Jan 24, 2016 at 04:03:50PM +0800, Ding Tianhong wrote:
>
>> Looks good to me. I will try this solution and report the result.
>> Thanks, everyone.
>
> Did you get a chance to run with this?
>
> .
>

I backported this patch to the 3.10 LTS kernel without changing any
logic. So far the patch works fine for me and nothing needed to be
changed, so I think this patch is fine. Could you formally submit this
patch to the latest kernel? :)

Thanks.
Ding

^ permalink raw reply	[flat|nested] 24+ messages in thread
* Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
  2016-01-30  1:18                   ` Ding Tianhong
@ 2016-02-01  3:29                     ` huang ying
  2016-02-01  3:35                       ` Huang, Ying
  0 siblings, 1 reply; 24+ messages in thread
From: huang ying @ 2016-02-01  3:29 UTC (permalink / raw)
  To: Ding Tianhong
  Cc: Peter Zijlstra, Waiman Long, Jason Low, Ingo Molnar,
      linux-kernel@vger.kernel.org, Davidlohr Bueso, Linus Torvalds,
      Paul E. McKenney, Thomas Gleixner, Will Deacon, Tim Chen,
      Waiman Long, Huang Ying

On Sat, Jan 30, 2016 at 9:18 AM, Ding Tianhong <dingtianhong@huawei.com> wrote:
> On 2016/1/29 17:53, Peter Zijlstra wrote:
>> On Sun, Jan 24, 2016 at 04:03:50PM +0800, Ding Tianhong wrote:
>>
>>> Looks good to me. I will try this solution and report the result.
>>> Thanks, everyone.
>>
>> Did you get a chance to run with this?
>
> I backported this patch to the 3.10 LTS kernel without changing any of
> the logic. So far the patch has worked fine for me and nothing needed
> to be changed, so I think the patch is good. Could you formally submit
> it to the latest kernel? :)
>
> Thanks.
> Ding

The original patch from Tianhong triggered a performance regression
because it effectively turned off optimistic spinning. I tested
Peter's patch with the same configuration and it shows no regression,
so I think the patch keeps optimistic spinning intact. Detailed test
results will follow in the next email.

Best Regards,
Huang, Ying
* Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
  2016-02-01  3:29                     ` huang ying
@ 2016-02-01  3:35                       ` Huang, Ying
  0 siblings, 0 replies; 24+ messages in thread
From: Huang, Ying @ 2016-02-01  3:35 UTC (permalink / raw)
  To: huang ying
  Cc: Ding Tianhong, Peter Zijlstra, Waiman Long, Jason Low,
      Ingo Molnar, linux-kernel@vger.kernel.org, Davidlohr Bueso,
      Linus Torvalds, Paul E. McKenney, Thomas Gleixner, Will Deacon,
      Tim Chen, Waiman Long, Huang Ying

huang ying <huang.ying.caritas@gmail.com> writes:

> On Sat, Jan 30, 2016 at 9:18 AM, Ding Tianhong <dingtianhong@huawei.com> wrote:
>> On 2016/1/29 17:53, Peter Zijlstra wrote:
>>> On Sun, Jan 24, 2016 at 04:03:50PM +0800, Ding Tianhong wrote:
>>>
>>>> Looks good to me. I will try this solution and report the result.
>>>> Thanks, everyone.
>>>
>>> Did you get a chance to run with this?
>>
>> I backported this patch to the 3.10 LTS kernel without changing any of
>> the logic. So far the patch has worked fine for me and nothing needed
>> to be changed, so I think the patch is good. Could you formally submit
>> it to the latest kernel? :)
>>
>> Thanks.
>> Ding
>
> The original patch from Tianhong triggered a performance regression
> because it effectively turned off optimistic spinning. I tested
> Peter's patch with the same configuration and it shows no regression,
> so I think the patch keeps optimistic spinning intact. Detailed test
> results will follow in the next email.

Here is the detailed test result:

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_task/rootfs/tbox_group/test/testcase:
  gcc-4.9/performance/x86_64-rhel/100%/debian-x86_64-2015-02-07.cgz/lkp-ivb-d01/fstime/unixbench

commit:
  v4.4
  1db66c17114d5437c0757d6792c0d8923192ecd6

            v4.4  1db66c17114d5437c0757d6792
----------------  --------------------------
     %stddev      %change          %stddev
         \           |                 \
    371269 ± 10%     -93.2%      25080 ±  4%  unixbench.time.voluntary_context_switches
    371269 ± 10%     -93.2%      25080 ±  4%  time.voluntary_context_switches
      6189 ±  8%     -76.4%       1463 ±  6%  vmstat.system.cs
      5706 ±  0%      -1.7%       5608 ±  0%  vmstat.system.in
    113680 ± 12%     -73.5%      30086 ±  1%  cpuidle.C1-IVB.usage
   1515925 ± 20%     +68.3%    2552001 ± 11%  cpuidle.C1E-IVB.time
   1227221 ± 20%     -51.7%     592695 ± 17%  cpuidle.C3-IVB.time
      2697 ± 10%     -77.8%     598.33 ± 10%  cpuidle.C3-IVB.usage
     15173 ±  6%     -23.1%      11663 ±  1%  cpuidle.C6-IVB.usage
     34.38 ± 27%     -35.0%      22.33 ±  2%  cpuidle.POLL.usage
     61.92 ±  9%     +14.3%      70.78 ±  7%  sched_debug.cfs_rq:/.load_avg.min
     40.85 ± 29%     +27.3%      52.00 ± 10%  sched_debug.cfs_rq:/.runnable_load_avg.2
     -1949 ±-37%     -64.6%    -690.19 ±-134%  sched_debug.cfs_rq:/.spread0.4
     -1773 ±-29%     -85.6%    -256.00 ±-388%  sched_debug.cfs_rq:/.spread0.7
     -2478 ±-26%     -49.5%      -1251 ±-66%  sched_debug.cfs_rq:/.spread0.min
     61.95 ±  9%     +14.8%      71.11 ±  7%  sched_debug.cfs_rq:/.tg_load_avg_contrib.min
    396962 ± 12%     +27.9%     507573 ± 10%  sched_debug.cpu.avg_idle.0
    432973 ± 18%     +45.3%     629147 ± 12%  sched_debug.cpu.avg_idle.6
    448566 ±  3%     +11.5%     499990 ±  0%  sched_debug.cpu.avg_idle.avg
     45.31 ±  5%      -9.5%      41.00 ±  3%  sched_debug.cpu.cpu_load[3].7
     52204 ± 10%     -49.9%      26173 ± 34%  sched_debug.cpu.nr_switches.0
     50383 ± 12%     -57.6%      21353 ± 15%  sched_debug.cpu.nr_switches.1
     45425 ± 16%     -68.5%      14325 ± 28%  sched_debug.cpu.nr_switches.2
     43069 ± 20%     -65.5%      14852 ± 41%  sched_debug.cpu.nr_switches.3
     40285 ± 16%     -70.4%      11905 ± 47%  sched_debug.cpu.nr_switches.4
     40732 ± 13%     -75.8%       9872 ± 39%  sched_debug.cpu.nr_switches.5
     43011 ± 19%     -80.0%       8607 ± 42%  sched_debug.cpu.nr_switches.6
     38076 ± 12%     -75.9%       9167 ± 40%  sched_debug.cpu.nr_switches.7
     44148 ±  7%     -67.1%      14532 ±  6%  sched_debug.cpu.nr_switches.avg
     59877 ±  8%     -51.3%      29146 ± 21%  sched_debug.cpu.nr_switches.max
     33672 ±  9%     -83.0%       5718 ±  4%  sched_debug.cpu.nr_switches.min
     -0.62 ±-1411%  -2212.5%     13.00 ± 16%  sched_debug.cpu.nr_uninterruptible.0
     -1.23 ±-582%   -1318.8%     15.00 ± 35%  sched_debug.cpu.nr_uninterruptible.1
      2.54 ±263%     +267.7%      9.33 ± 91%  sched_debug.cpu.nr_uninterruptible.2
      0.31 ±2966%   -5841.7%    -17.67 ±-19%  sched_debug.cpu.nr_uninterruptible.5
      9.84 ± 19%     +70.4%      16.76 ±  5%  sched_debug.cpu.nr_uninterruptible.stddev
    116287 ±  4%     -20.9%      91972 ±  9%  sched_debug.cpu.sched_count.0
     50411 ± 12%     -57.6%      21382 ± 15%  sched_debug.cpu.sched_count.1
     45453 ± 16%     -68.4%      14356 ± 28%  sched_debug.cpu.sched_count.2
     43098 ± 20%     -65.5%      14888 ± 41%  sched_debug.cpu.sched_count.3
     40314 ± 16%     -70.4%      11934 ± 47%  sched_debug.cpu.sched_count.4
     40761 ± 13%     -75.7%       9896 ± 39%  sched_debug.cpu.sched_count.5
     43041 ± 19%     -79.9%       8636 ± 42%  sched_debug.cpu.sched_count.6
     38105 ± 12%     -75.9%       9193 ± 40%  sched_debug.cpu.sched_count.7
     52184 ±  6%     -56.3%      22782 ±  4%  sched_debug.cpu.sched_count.avg
    116288 ±  4%     -20.9%      91972 ±  9%  sched_debug.cpu.sched_count.max
     33701 ±  9%     -82.9%       5746 ±  4%  sched_debug.cpu.sched_count.min
     22760 ± 10%     -63.0%       8418 ± 40%  sched_debug.cpu.sched_goidle.0
     23319 ± 13%     -60.9%       9114 ± 22%  sched_debug.cpu.sched_goidle.1
     21273 ± 17%     -80.4%       4169 ± 13%  sched_debug.cpu.sched_goidle.2
     19993 ± 19%     -67.8%       6429 ± 45%  sched_debug.cpu.sched_goidle.3
     18614 ± 17%     -85.0%       2788 ± 29%  sched_debug.cpu.sched_goidle.4
     18921 ± 12%     -86.7%       2520 ± 15%  sched_debug.cpu.sched_goidle.5
     20131 ± 17%     -82.1%       3596 ± 52%  sched_debug.cpu.sched_goidle.6
     17861 ± 12%     -86.9%       2334 ± 14%  sched_debug.cpu.sched_goidle.7
     20359 ±  8%     -75.8%       4921 ±  5%  sched_debug.cpu.sched_goidle.avg
     26477 ± 10%     -60.2%      10539 ± 21%  sched_debug.cpu.sched_goidle.max
     15845 ± 10%     -87.2%       2033 ±  4%  sched_debug.cpu.sched_goidle.min
     29043 ± 15%     -58.8%      11958 ± 26%  sched_debug.cpu.ttwu_count.0
     24191 ± 10%     -68.8%       7547 ± 27%  sched_debug.cpu.ttwu_count.1
     21313 ± 11%     -72.7%       5819 ± 24%  sched_debug.cpu.ttwu_count.2
     21487 ± 13%     -61.4%       8296 ± 43%  sched_debug.cpu.ttwu_count.3
     19644 ± 15%     -54.4%       8967 ± 79%  sched_debug.cpu.ttwu_count.4
     20786 ± 15%     -69.2%       6409 ± 58%  sched_debug.cpu.ttwu_count.5
     20435 ± 17%     -79.3%       4231 ± 58%  sched_debug.cpu.ttwu_count.6
     19293 ± 17%     -77.0%       4432 ± 55%  sched_debug.cpu.ttwu_count.7
     22024 ±  7%     -67.3%       7207 ±  6%  sched_debug.cpu.ttwu_count.avg
     31009 ±  9%     -45.2%      17008 ± 17%  sched_debug.cpu.ttwu_count.max
     16791 ± 10%     -85.1%       2494 ±  5%  sched_debug.cpu.ttwu_count.min
      3084 ±  8%     +25.5%       3870 ±  9%  sched_debug.cpu.ttwu_local.avg

The URL of the original regression report email:

  https://lists.01.org/pipermail/lkp/2016-January/003442.html

Best Regards,
Huang, Ying
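The headline metric above is unixbench.time.voluntary_context_switches: the 93.2% drop means waiters now mostly spin for the lock instead of repeatedly sleeping and being woken, which is exactly the behaviour optimistic spinning exists to provide. The matching drops in vmstat.system.cs, sched_goidle and ttwu_count tell the same story. For readers who want to observe the same signal in their own workloads, here is a minimal user-space sketch (not part of the lkp suite) that reads the counters these comparisons are built on:

	#include <stdio.h>
	#include <sys/resource.h>

	/*
	 * Print the calling process's context-switch counters.  A lock
	 * that puts waiters to sleep drives ru_nvcsw (voluntary
	 * switches) up; a lock that lets waiters spin keeps it low.
	 */
	int main(void)
	{
		struct rusage ru;

		if (getrusage(RUSAGE_SELF, &ru) != 0) {
			perror("getrusage");
			return 1;
		}
		printf("voluntary context switches:   %ld\n", ru.ru_nvcsw);
		printf("involuntary context switches: %ld\n", ru.ru_nivcsw);
		return 0;
	}

Sampling these counters before and after a lock-heavy phase gives a rough per-process view of the system-wide numbers lkp reports.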
* Re: [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL.
  2016-01-22  8:54 ` Peter Zijlstra
  2016-01-22 10:20   ` Jason Low
@ 2016-01-22 13:41   ` Waiman Long
  1 sibling, 0 replies; 24+ messages in thread
From: Waiman Long @ 2016-01-22 13:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ding Tianhong, Ingo Molnar, linux-kernel@vger.kernel.org,
      Davidlohr Bueso, Linus Torvalds, Paul E. McKenney,
      Thomas Gleixner, Will Deacon, Jason Low, Tim Chen, Waiman Long

On 01/22/2016 03:54 AM, Peter Zijlstra wrote:
> On Thu, Jan 21, 2016 at 06:02:34PM -0500, Waiman Long wrote:
>> This patch attempts to fix this live-lock condition by enabling a
>> woken task in the wait list to enter the optimistic spinning loop
>> itself, with precedence over the ones in the OSQ. This should
>> prevent the live-lock condition from happening.
>
> So I think having the top waiter going back in to contend on the OSQ is
> an excellent idea, but I'm not sure the wlh_spinning thing is important.

Yes, that is optional. I put it there just to make it more likely for
the waiter spinner to get the lock. Without it, the chance will be
50/50 on average. I can certainly take that out.

> The OSQ itself is FIFO fair, and the waiters retain the wait_list
> position. So having the top wait_list entry contending on the OSQ
> ensures we cannot starve (I think).
>
> Also, as Davidlohr said, we cannot copy/paste this much code.

As I said in the previous mail, I do intend to refactor it before
sending out the official patch.

Cheers,
Longman
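For context, the wlh_spinning mechanism being discussed would let OSQ spinners notice that the woken wait-list head is itself spinning and back off in its favour. A hypothetical sketch of that shape follows; the struct, field and helper names here are invented for illustration and are not what Waiman's RFC patch actually added:

	#include <linux/atomic.h>
	#include <linux/compiler.h>
	#include <linux/types.h>

	/* Reduced view of a mutex for illustration; the real struct differs. */
	struct mutex_sketch {
		atomic_t count;			/* lock word, as in the real mutex */
		bool     waiter_spinning;	/* hypothetical precedence flag */
	};

	/* The woken wait-list head announces that it is spinning. */
	static void waiter_spin_begin(struct mutex_sketch *lock)
	{
		WRITE_ONCE(lock->waiter_spinning, true);
	}

	static void waiter_spin_end(struct mutex_sketch *lock)
	{
		WRITE_ONCE(lock->waiter_spinning, false);
	}

	/* Polled from the OSQ spin loop: back off while a waiter has priority. */
	static bool osq_should_yield(struct mutex_sketch *lock)
	{
		return READ_ONCE(lock->waiter_spinning);
	}

Dropping the flag, as Waiman offers to do, simply leaves the woken waiter and the OSQ head racing for the lock on equal terms, which is the 50/50 average he mentions.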
end of thread, other threads:[~2016-02-01  3:35 UTC | newest]

Thread overview: 24+ messages:
2016-01-21  6:53 [PATCH RFC ] locking/mutexes: don't spin on owner when wait list is not NULL Ding Tianhong
2016-01-21  7:29 ` Ingo Molnar
2016-01-21  9:04   ` Ding Tianhong
  -- strict thread matches above, loose matches on Subject: below --
2016-01-21  9:29 [PATCH RFC] locking/mutexes: don't spin on owner when wait list is not NULL Ding Tianhong
2016-01-21 21:23 ` Tim Chen
2016-01-22  2:41   ` Paul E. McKenney
2016-01-22  2:48     ` Davidlohr Bueso
2016-01-22  3:13       ` Paul E. McKenney
2016-01-21 23:02 ` Waiman Long
2016-01-22  6:09   ` Davidlohr Bueso
2016-01-22 13:38     ` Waiman Long
2016-01-22 16:46       ` Davidlohr Bueso
2016-01-22  8:54 ` Peter Zijlstra
2016-01-22 10:20   ` Jason Low
2016-01-22 10:53     ` Peter Zijlstra
2016-01-22 10:56       ` Peter Zijlstra
2016-01-22 11:06         ` Peter Zijlstra
2016-01-22 13:59           ` Waiman Long
2016-01-24  8:03             ` Ding Tianhong
2016-01-29  9:53               ` Peter Zijlstra
2016-01-30  1:18                 ` Ding Tianhong
2016-02-01  3:29                   ` huang ying
2016-02-01  3:35                     ` Huang, Ying
2016-01-22 13:41   ` Waiman Long