* Re: [PATCH 3/3] sched: Add cond_resched_rwlock
2020-10-27 16:49 ` [PATCH 3/3] sched: Add cond_resched_rwlock Ben Gardon
@ 2020-10-27 17:56 ` Sean Christopherson
2020-10-27 19:17 ` Peter Zijlstra
2020-10-27 20:17 ` Davidlohr Bueso
2020-10-27 18:44 ` Peter Zijlstra
2020-10-30 17:09 ` Waiman Long
2 siblings, 2 replies; 9+ messages in thread
From: Sean Christopherson @ 2020-10-27 17:56 UTC (permalink / raw)
To: Ben Gardon
Cc: linux-kernel, kvm, Paolo Bonzini, Peter Shier, Peter Zijlstra,
Ingo Molnar, Will Deacon, Waiman Long, Davidlohr Bueso
On Tue, Oct 27, 2020 at 09:49:50AM -0700, Ben Gardon wrote:
> Rescheduling while holding a spin lock is essential for keeping long
> running kernel operations running smoothly. Add the facility to
> cond_resched rwlocks.
This adds two new exports and two new macros without any in-tree users, which
is generally frowned upon. You and I know these will be used by KVM's new
TDP MMU, but the non-KVM folks, and more importantly the maintainers of this
code, are undoubtedly going to ask "why". I.e. these patches probably belong
in the KVM series to switch to a rwlock for the TDP MMU.
Regarding the code, it's all copy-pasted from the spinlock code and darn near
identical. It might be worth adding builder macros for these.
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
> include/linux/sched.h | 12 ++++++++++++
> kernel/sched/core.c | 40 ++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 52 insertions(+)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 77179160ec3ab..2eb0c53fce115 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1841,12 +1841,24 @@ static inline int _cond_resched(void) { return 0; }
> })
>
> extern int __cond_resched_lock(spinlock_t *lock);
> +extern int __cond_resched_rwlock_read(rwlock_t *lock);
> +extern int __cond_resched_rwlock_write(rwlock_t *lock);
>
> #define cond_resched_lock(lock) ({ \
> ___might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET);\
> __cond_resched_lock(lock); \
> })
>
> +#define cond_resched_rwlock_read(lock) ({ \
> + __might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET); \
> + __cond_resched_rwlock_read(lock); \
> +})
> +
> +#define cond_resched_rwlock_write(lock) ({ \
> + __might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET); \
> + __cond_resched_rwlock_write(lock); \
> +})
> +
> static inline void cond_resched_rcu(void)
> {
> #if defined(CONFIG_DEBUG_ATOMIC_SLEEP) || !defined(CONFIG_PREEMPT_RCU)
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index d2003a7d5ab55..ac58e7829a063 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6152,6 +6152,46 @@ int __cond_resched_lock(spinlock_t *lock)
> }
> EXPORT_SYMBOL(__cond_resched_lock);
>
> +int __cond_resched_rwlock_read(rwlock_t *lock)
> +{
> + int resched = should_resched(PREEMPT_LOCK_OFFSET);
> + int ret = 0;
> +
> + lockdep_assert_held(lock);
> +
> + if (rwlock_needbreak(lock) || resched) {
> + read_unlock(lock);
> + if (resched)
> + preempt_schedule_common();
> + else
> + cpu_relax();
> + ret = 1;
AFAICT, this rather odd code flow from __cond_resched_lock() is an artifact of
code changes over the years and not intentionally weird. IMO, it would be
cleaner and easier to read as:
int resched = should_resched(PREEMPT_LOCK_OFFSET);
lockdep_assert_held(lock);
if (!rwlock_needbreak(lock) && !resched)
return 0;
read_unlock(lock);
if (resched)
preempt_schedule_common();
else
cpu_relax();
read_lock(lock)
return 1;
> + read_lock(lock);
> + }
> + return ret;
> +}
> +EXPORT_SYMBOL(__cond_resched_rwlock_read);
> +
> +int __cond_resched_rwlock_write(rwlock_t *lock)
> +{
> + int resched = should_resched(PREEMPT_LOCK_OFFSET);
> + int ret = 0;
> +
> + lockdep_assert_held(lock);
This shoulid be lockdep_assert_held_write.
> +
> + if (rwlock_needbreak(lock) || resched) {
> + write_unlock(lock);
> + if (resched)
> + preempt_schedule_common();
> + else
> + cpu_relax();
> + ret = 1;
> + write_lock(lock);
> + }
> + return ret;
> +}
> +EXPORT_SYMBOL(__cond_resched_rwlock_write);
> +
> /**
> * yield - yield the current processor to other threads.
> *
> --
> 2.29.0.rc2.309.g374f81d7ae-goog
>
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH 3/3] sched: Add cond_resched_rwlock
2020-10-27 17:56 ` Sean Christopherson
@ 2020-10-27 19:17 ` Peter Zijlstra
2020-10-27 20:17 ` Davidlohr Bueso
1 sibling, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2020-10-27 19:17 UTC (permalink / raw)
To: Sean Christopherson
Cc: Ben Gardon, linux-kernel, kvm, Paolo Bonzini, Peter Shier,
Ingo Molnar, Will Deacon, Waiman Long, Davidlohr Bueso
On Tue, Oct 27, 2020 at 10:56:36AM -0700, Sean Christopherson wrote:
> On Tue, Oct 27, 2020 at 09:49:50AM -0700, Ben Gardon wrote:
> > Rescheduling while holding a spin lock is essential for keeping long
> > running kernel operations running smoothly. Add the facility to
> > cond_resched rwlocks.
>
> This adds two new exports and two new macros without any in-tree users, which
> is generally frowned upon. You and I know these will be used by KVM's new
> TDP MMU, but the non-KVM folks, and more importantly the maintainers of this
> code, are undoubtedly going to ask "why". I.e. these patches probably belong
> in the KVM series to switch to a rwlock for the TDP MMU.
I was informed about this ;-)
> Regarding the code, it's all copy-pasted from the spinlock code and darn near
> identical. It might be worth adding builder macros for these.
I considered mentioning them; I'm typically a fan of them, but I'm not
quite sure it's worth the effort here.
> > +int __cond_resched_rwlock_read(rwlock_t *lock)
> > +{
> > + int resched = should_resched(PREEMPT_LOCK_OFFSET);
> > + int ret = 0;
> > +
> > + lockdep_assert_held(lock);
> > +
> > + if (rwlock_needbreak(lock) || resched) {
> > + read_unlock(lock);
> > + if (resched)
> > + preempt_schedule_common();
> > + else
> > + cpu_relax();
> > + ret = 1;
>
> AFAICT, this rather odd code flow from __cond_resched_lock() is an artifact of
> code changes over the years and not intentionally weird. IMO, it would be
> cleaner and easier to read as:
>
> int resched = should_resched(PREEMPT_LOCK_OFFSET);
>
> lockdep_assert_held(lock);
lockdep_assert_held_read() :-)
>
> if (!rwlock_needbreak(lock) && !resched)
> return 0;
>
> read_unlock(lock);
> if (resched)
> preempt_schedule_common();
> else
> cpu_relax();
> read_lock(lock)
> return 1;
>
I suppose that works, but then also change the existing one.
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH 3/3] sched: Add cond_resched_rwlock
2020-10-27 17:56 ` Sean Christopherson
2020-10-27 19:17 ` Peter Zijlstra
@ 2020-10-27 20:17 ` Davidlohr Bueso
1 sibling, 0 replies; 9+ messages in thread
From: Davidlohr Bueso @ 2020-10-27 20:17 UTC (permalink / raw)
To: Sean Christopherson
Cc: Ben Gardon, linux-kernel, kvm, Paolo Bonzini, Peter Shier,
Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long
On Tue, 27 Oct 2020, Sean Christopherson wrote:
>On Tue, Oct 27, 2020 at 09:49:50AM -0700, Ben Gardon wrote:
>> Rescheduling while holding a spin lock is essential for keeping long
>> running kernel operations running smoothly. Add the facility to
>> cond_resched rwlocks.
Nit: I would start the paragraph with 'Safely rescheduling ...'
While obvious when reading the code, 'Rescheduling while holding
a spin lock' can throw the reader off.
>
>This adds two new exports and two new macros without any in-tree users, which
>is generally frowned upon. You and I know these will be used by KVM's new
>TDP MMU, but the non-KVM folks, and more importantly the maintainers of this
>code, are undoubtedly going to ask "why". I.e. these patches probably belong
>in the KVM series to switch to a rwlock for the TDP MMU.
>
>Regarding the code, it's all copy-pasted from the spinlock code and darn near
>identical. It might be worth adding builder macros for these.
Agreed, all three could be nicely consolidated. Otherwise this series looks
sane, feel free to add my:
Acked-by: Davidlohr Bueso <dbueso@suse.de>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 3/3] sched: Add cond_resched_rwlock
2020-10-27 16:49 ` [PATCH 3/3] sched: Add cond_resched_rwlock Ben Gardon
2020-10-27 17:56 ` Sean Christopherson
@ 2020-10-27 18:44 ` Peter Zijlstra
2020-10-27 18:50 ` Paolo Bonzini
2020-10-30 17:09 ` Waiman Long
2 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2020-10-27 18:44 UTC (permalink / raw)
To: Ben Gardon
Cc: linux-kernel, kvm, Paolo Bonzini, Sean Christopherson,
Peter Shier, Ingo Molnar, Will Deacon, Waiman Long,
Davidlohr Bueso
On Tue, Oct 27, 2020 at 09:49:50AM -0700, Ben Gardon wrote:
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index d2003a7d5ab55..ac58e7829a063 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6152,6 +6152,46 @@ int __cond_resched_lock(spinlock_t *lock)
> }
> EXPORT_SYMBOL(__cond_resched_lock);
>
> +int __cond_resched_rwlock_read(rwlock_t *lock)
> +{
> + int resched = should_resched(PREEMPT_LOCK_OFFSET);
> + int ret = 0;
> +
> + lockdep_assert_held(lock);
lockdep_assert_held_read(lock);
> +
> + if (rwlock_needbreak(lock) || resched) {
> + read_unlock(lock);
> + if (resched)
> + preempt_schedule_common();
> + else
> + cpu_relax();
> + ret = 1;
> + read_lock(lock);
> + }
> + return ret;
> +}
> +EXPORT_SYMBOL(__cond_resched_rwlock_read);
> +
> +int __cond_resched_rwlock_write(rwlock_t *lock)
> +{
> + int resched = should_resched(PREEMPT_LOCK_OFFSET);
> + int ret = 0;
> +
> + lockdep_assert_held(lock);
lockdep_assert_held_write(lock);
> +
> + if (rwlock_needbreak(lock) || resched) {
> + write_unlock(lock);
> + if (resched)
> + preempt_schedule_common();
> + else
> + cpu_relax();
> + ret = 1;
> + write_lock(lock);
> + }
> + return ret;
> +}
> +EXPORT_SYMBOL(__cond_resched_rwlock_write);
If this is the only feedback (the patches look fine to me), don't bother
resending, I'll edit them when applying.
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH 3/3] sched: Add cond_resched_rwlock
2020-10-27 18:44 ` Peter Zijlstra
@ 2020-10-27 18:50 ` Paolo Bonzini
0 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2020-10-27 18:50 UTC (permalink / raw)
To: Peter Zijlstra, Ben Gardon
Cc: linux-kernel, kvm, Sean Christopherson, Peter Shier, Ingo Molnar,
Will Deacon, Waiman Long, Davidlohr Bueso
On 27/10/20 19:44, Peter Zijlstra wrote:
> On Tue, Oct 27, 2020 at 09:49:50AM -0700, Ben Gardon wrote:
>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index d2003a7d5ab55..ac58e7829a063 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -6152,6 +6152,46 @@ int __cond_resched_lock(spinlock_t *lock)
>> }
>> EXPORT_SYMBOL(__cond_resched_lock);
>>
>> +int __cond_resched_rwlock_read(rwlock_t *lock)
>> +{
>> + int resched = should_resched(PREEMPT_LOCK_OFFSET);
>> + int ret = 0;
>> +
>> + lockdep_assert_held(lock);
>
> lockdep_assert_held_read(lock);
>
>> +
>> + if (rwlock_needbreak(lock) || resched) {
>> + read_unlock(lock);
>> + if (resched)
>> + preempt_schedule_common();
>> + else
>> + cpu_relax();
>> + ret = 1;
>> + read_lock(lock);
>> + }
>> + return ret;
>> +}
>> +EXPORT_SYMBOL(__cond_resched_rwlock_read);
>> +
>> +int __cond_resched_rwlock_write(rwlock_t *lock)
>> +{
>> + int resched = should_resched(PREEMPT_LOCK_OFFSET);
>> + int ret = 0;
>> +
>> + lockdep_assert_held(lock);
>
> lockdep_assert_held_write(lock);
>
>> +
>> + if (rwlock_needbreak(lock) || resched) {
>> + write_unlock(lock);
>> + if (resched)
>> + preempt_schedule_common();
>> + else
>> + cpu_relax();
>> + ret = 1;
>> + write_lock(lock);
>> + }
>> + return ret;
>> +}
>> +EXPORT_SYMBOL(__cond_resched_rwlock_write);
>
> If this is the only feedback (the patches look fine to me), don't bother
> resending, I'll edit them when applying.
>
If that is an Acked-by, I'll merge them through the KVM tree when time
comes.
Paolo
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 3/3] sched: Add cond_resched_rwlock
2020-10-27 16:49 ` [PATCH 3/3] sched: Add cond_resched_rwlock Ben Gardon
2020-10-27 17:56 ` Sean Christopherson
2020-10-27 18:44 ` Peter Zijlstra
@ 2020-10-30 17:09 ` Waiman Long
2 siblings, 0 replies; 9+ messages in thread
From: Waiman Long @ 2020-10-30 17:09 UTC (permalink / raw)
To: Ben Gardon, linux-kernel, kvm
Cc: Paolo Bonzini, Sean Christopherson, Peter Shier, Peter Zijlstra,
Ingo Molnar, Will Deacon, Davidlohr Bueso
On 10/27/20 12:49 PM, Ben Gardon wrote:
> Rescheduling while holding a spin lock is essential for keeping long
> running kernel operations running smoothly. Add the facility to
> cond_resched rwlocks.
>
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
> include/linux/sched.h | 12 ++++++++++++
> kernel/sched/core.c | 40 ++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 52 insertions(+)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 77179160ec3ab..2eb0c53fce115 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1841,12 +1841,24 @@ static inline int _cond_resched(void) { return 0; }
> })
>
> extern int __cond_resched_lock(spinlock_t *lock);
> +extern int __cond_resched_rwlock_read(rwlock_t *lock);
> +extern int __cond_resched_rwlock_write(rwlock_t *lock);
>
> #define cond_resched_lock(lock) ({ \
> ___might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET);\
> __cond_resched_lock(lock); \
> })
>
> +#define cond_resched_rwlock_read(lock) ({ \
> + __might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET); \
> + __cond_resched_rwlock_read(lock); \
> +})
> +
> +#define cond_resched_rwlock_write(lock) ({ \
> + __might_sleep(__FILE__, __LINE__, PREEMPT_LOCK_OFFSET); \
> + __cond_resched_rwlock_write(lock); \
> +})
> +
> static inline void cond_resched_rcu(void)
> {
> #if defined(CONFIG_DEBUG_ATOMIC_SLEEP) || !defined(CONFIG_PREEMPT_RCU)
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index d2003a7d5ab55..ac58e7829a063 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6152,6 +6152,46 @@ int __cond_resched_lock(spinlock_t *lock)
> }
> EXPORT_SYMBOL(__cond_resched_lock);
>
> +int __cond_resched_rwlock_read(rwlock_t *lock)
> +{
> + int resched = should_resched(PREEMPT_LOCK_OFFSET);
> + int ret = 0;
> +
> + lockdep_assert_held(lock);
> +
> + if (rwlock_needbreak(lock) || resched) {
> + read_unlock(lock);
> + if (resched)
> + preempt_schedule_common();
> + else
> + cpu_relax();
> + ret = 1;
> + read_lock(lock);
> + }
> + return ret;
> +}
> +EXPORT_SYMBOL(__cond_resched_rwlock_read);
> +
> +int __cond_resched_rwlock_write(rwlock_t *lock)
> +{
> + int resched = should_resched(PREEMPT_LOCK_OFFSET);
> + int ret = 0;
> +
> + lockdep_assert_held(lock);
> +
> + if (rwlock_needbreak(lock) || resched) {
> + write_unlock(lock);
> + if (resched)
> + preempt_schedule_common();
> + else
> + cpu_relax();
> + ret = 1;
> + write_lock(lock);
> + }
> + return ret;
> +}
> +EXPORT_SYMBOL(__cond_resched_rwlock_write);
> +
> /**
> * yield - yield the current processor to other threads.
> *
Other than the lockdep_assert_held() changes spotted by others, this
patch looks good to me.
Acked-by: Waiman Long <longman@redhat.com>
^ permalink raw reply [flat|nested] 9+ messages in thread