* Re: [ofa-general] [PATCH] [WORKAROUND] CONFIG_PREEMPT_RT and ib_umad_close() issue
2007-09-17 15:22 John Blackwood
@ 2007-09-17 15:56 ` Roland Dreier
2007-09-17 17:07 ` Daniel Walker
0 siblings, 1 reply; 5+ messages in thread
From: Roland Dreier @ 2007-09-17 15:56 UTC (permalink / raw)
To: John Blackwood
Cc: Sven-Thorsten Dietrich, linux-rt-users, general, linux-kernel
> When using OFED-1.2.5 based infiniband kernel modules on 2.6.22 based
> kernels with the Ingo Molnar CONFIG_PREEMPT_RT applied, then commands
> such as ibnetdiscvoer, smpquery, sminfo, etc. will hang. The problem
> is with the downgrade_write() rw semaphore usage in the
> ib_umad_close() routine.
Can you give a few more details on how PREEMPT_RT changes locking
rules (or just exposes existing bugs maybe?) so that the
downgrade_write() causes the issue? I would like to fix this cleanly
but I don't really understand what the problem is.
- R.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [ofa-general] [PATCH] [WORKAROUND] CONFIG_PREEMPT_RT and ib_umad_close() issue
2007-09-17 15:56 ` [ofa-general] " Roland Dreier
@ 2007-09-17 17:07 ` Daniel Walker
0 siblings, 0 replies; 5+ messages in thread
From: Daniel Walker @ 2007-09-17 17:07 UTC (permalink / raw)
To: Roland Dreier
Cc: John Blackwood, linux-rt-users, linux-kernel, general,
Sven-Thorsten Dietrich
On Mon, 2007-09-17 at 08:56 -0700, Roland Dreier wrote:
> > When using OFED-1.2.5 based infiniband kernel modules on 2.6.22 based
> > kernels with the Ingo Molnar CONFIG_PREEMPT_RT applied, then commands
> > such as ibnetdiscvoer, smpquery, sminfo, etc. will hang. The problem
> > is with the downgrade_write() rw semaphore usage in the
> > ib_umad_close() routine.
>
> Can you give a few more details on how PREEMPT_RT changes locking
> rules (or just exposes existing bugs maybe?) so that the
> downgrade_write() causes the issue? I would like to fix this cleanly
> but I don't really understand what the problem is.
the read/write semaphore functionality is basically reduced to just a
binary semaphore , i.e. one reader, or one writer . I think the BUG();
in downgrade_write() is likely part of a removal plan for
downgrade_write() (that's just a guess tho)
Daniel
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [ofa-general] [PATCH] [WORKAROUND] CONFIG_PREEMPT_RT and ib_umad_close() issue
@ 2007-09-17 17:19 John Blackwood
2007-09-17 21:40 ` Roland Dreier
0 siblings, 1 reply; 5+ messages in thread
From: John Blackwood @ 2007-09-17 17:19 UTC (permalink / raw)
To: Roland Dreier
Cc: linux-rt-users, linux-kernel, Sven-Thorsten Dietrich, general
> Subject: Re: [ofa-general] [PATCH] [WORKAROUND] CONFIG_PREEMPT_RT and
ib_umad_close() issue
> From: Roland Dreier <rdreier@cisco.com>
> Date: Mon, 17 Sep 2007 08:56:01 -0700
> To: John Blackwood <john.blackwood@ccur.com>
> CC: linux-rt-users@vger.kernel.org, linux-kernel@vger.kernel.org,
general@lists.openfabrics.org, Sven-Thorsten Dietrich <sdietrich@novell.com>
>
> > When using OFED-1.2.5 based infiniband kernel modules on 2.6.22 based
> > kernels with the Ingo Molnar CONFIG_PREEMPT_RT applied, then commands
> > such as ibnetdiscvoer, smpquery, sminfo, etc. will hang. The problem
> > is with the downgrade_write() rw semaphore usage in the
> > ib_umad_close() routine.
>
> Can you give a few more details on how PREEMPT_RT changes locking
> rules (or just exposes existing bugs maybe?) so that the
> downgrade_write() causes the issue? I would like to fix this cleanly
> but I don't really understand what the problem is.
>
> - R.
Hi Roland,
Thanks for your interest in this matter.
I'm not one of the preempt rt experts, so others may want to speak up ...
(thanks Daniel...)
But basically, with CONFIG_PREEMPT_RT enabled, the lock points, such as
aqcuiring a spinlock, potentially become places where the current task
may be context switched out / preempted.
Therefore, when a call is made to lock a spinlock for example, the
caller should not currently have irqs disabled, or preemption disabled,
since a context switch may occur.
I believe that in the case of rw_semaphores, the comments
in include/linux/rt_lock.h with the rt preempt patch applied say:
/*
* RW-semaphores are a spinlock plus a reader-depth count.
*
* Note that the semantics are different from the usual
* Linux rw-sems, in PREEMPT_RT mode we do not allow
* multiple readers to hold the lock at once, we only allow
* a read-lock owner to read-lock recursively. This is
* better for latency, makes the implementation inherently
* fair and makes it simpler as well:
*/
So I believe that a read lock on a rw_semaphore is just as
exclusive as the old write lock, except that the read locks
may nest.
And with the preempt patch enabled, the downgrade_write() becomes:
void fastcall rt_downgrade_write(struct rw_semaphore *rwsem)
{
BUG();
}
EXPORT_SYMBOL(rt_downgrade_write);
So I think code such as:
ib_umad_close()
{
...
down_write(&file->port->mutex);
... do exclusive stuff
downgrade_write(&file->port->mutex);
... do potentially recursive stuff
up_read(&file->port->mutex);
...
}
Could probably become (only when CONFIG_PREEMPT_RT is enabled):
ib_umad_close()
{
...
down_read(&file->port->mutex);
... do exclusive stuff
... do potentially recursive stuff
up_read(&file->port->mutex);
...
}
since the down_read will not allow other readers at the same time,
but will allow nesting.
I'm not aware of any tools that find these issues, other than
just running through the code.
I do know that Ingo's preempt rt patch can be found at
http://www.kernel.org/pub/linux/kernel/projects/rt
and applied to an infiniband kernel.
If you enabled CONFIG_PREEMPT_RT, and maybe also enable
parameters such as
CONFIG_DEBUG_PREEMPT, CONFIG_DEBUG_SPINLOCK, etc. you should
see the issue with something like a ibnetdiscover invocation.
Thanks.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [ofa-general] [PATCH] [WORKAROUND] CONFIG_PREEMPT_RT and ib_umad_close() issue
2007-09-17 17:19 [ofa-general] [PATCH] [WORKAROUND] CONFIG_PREEMPT_RT and ib_umad_close() issue John Blackwood
@ 2007-09-17 21:40 ` Roland Dreier
2007-09-17 23:41 ` John Blackwood
0 siblings, 1 reply; 5+ messages in thread
From: Roland Dreier @ 2007-09-17 21:40 UTC (permalink / raw)
To: john.blackwood
Cc: general, linux-rt-users, Sven-Thorsten Dietrich, linux-kernel
Thanks for the explanation...
> But basically, with CONFIG_PREEMPT_RT enabled, the lock points, such as
> aqcuiring a spinlock, potentially become places where the current task
> may be context switched out / preempted.
>
> Therefore, when a call is made to lock a spinlock for example, the
> caller should not currently have irqs disabled, or preemption disabled,
> since a context switch may occur.
this doesn't seem relevant here...
> void fastcall rt_downgrade_write(struct rw_semaphore *rwsem)
> {
> BUG();
> }
this seems to be the problem... the -rt patch turns downgrade_write()
into a BUG().
I need to look at the locking in user_mad.c again, but I think it may
be possible to replace both places that do downgrade_write() with
up_write() followed by down_read().
- R.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [ofa-general] [PATCH] [WORKAROUND] CONFIG_PREEMPT_RT and ib_umad_close() issue
2007-09-17 21:40 ` Roland Dreier
@ 2007-09-17 23:41 ` John Blackwood
0 siblings, 0 replies; 5+ messages in thread
From: John Blackwood @ 2007-09-17 23:41 UTC (permalink / raw)
To: Roland Dreier
Cc: general, linux-rt-users, Sven-Thorsten Dietrich, linux-kernel
Roland Dreier wrote:
> Thanks for the explanation...
>
> > But basically, with CONFIG_PREEMPT_RT enabled, the lock points, such as
> > aqcuiring a spinlock, potentially become places where the current task
> > may be context switched out / preempted.
> >
> > Therefore, when a call is made to lock a spinlock for example, the
> > caller should not currently have irqs disabled, or preemption disabled,
> > since a context switch may occur.
>
> this doesn't seem relevant here...
Hi Roland,
right. just some background info.
> > void fastcall rt_downgrade_write(struct rw_semaphore *rwsem)
> > {
> > BUG();
> > }
>
> this seems to be the problem... the -rt patch turns downgrade_write()
> into a BUG().
>
> I need to look at the locking in user_mad.c again, but I think it may
> be possible to replace both places that do downgrade_write() with
> up_write() followed by down_read().
>
> - R.
that sounds like it would be a good solution for both preempt rt and
non-preempt rt kernels.
thanks again for looking at this for us.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2007-09-17 23:41 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-09-17 17:19 [ofa-general] [PATCH] [WORKAROUND] CONFIG_PREEMPT_RT and ib_umad_close() issue John Blackwood
2007-09-17 21:40 ` Roland Dreier
2007-09-17 23:41 ` John Blackwood
-- strict thread matches above, loose matches on Subject: below --
2007-09-17 15:22 John Blackwood
2007-09-17 15:56 ` [ofa-general] " Roland Dreier
2007-09-17 17:07 ` Daniel Walker
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).