* Re: BUG: ib_mad ftrace event unsupported migration
[not found] ` <Y2JqX3vC1mG/JDex@ziepe.ca>
@ 2022-11-02 14:02 ` Leonid Ravich
2022-11-02 14:04 ` Jason Gunthorpe
2022-11-02 14:20 ` Steven Rostedt
0 siblings, 2 replies; 11+ messages in thread
From: Leonid Ravich @ 2022-11-02 14:02 UTC (permalink / raw)
To: Jason Gunthorpe, Steven Rostedt
Cc: mingo@redhat.com, linux-rdma@vger.kernel.org,
linux-kernel@vger.kernel.org, Yigal Korman,
linux-trace-kernel@vger.kernel.org, Leon Ravich
> > > before starting throwing some patch into the the air I would like to align with you the approach we should take here.
> > >
> > > my suggestion here :
> > >- ftrace infra should verify no migration happen (end and start happens on same CPU) in case not we will throw warning for the issue .
> >
> >The scheduler should have. On entering the ring buffer code
> >ring_buffer_lock_reserver() it disables preemption and does not
> >re-enable it until ring_buffer_unlock_commit().
> >
> >The only way to migrate is if you re-enable preemption. WHICH IS A
> >BUG!
>So what on earth did that?
>I'm guessing some driver's query_pkey op, but AFAIK we don't have any
>explicit pre-emption reenablements in the code - unless it is sneaky..
trace infra uses preempt_disable_notrace/preempt_enable_notrace to disable/enable preemtion but my kernel compiled without CONFIG_PREEMPTION so this functions are only barriers - looks like the idea behind was to avoid involuntary preemtion but in our case it is a voluntary (there is a wait_for_completion in the query_pkey rabbit hole).
so no scheduler here to warn about illegal migration.
>Leonid what driver are you testing?
mlx5 - (MLNX_OFED-5.3-1.0.0.1)
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: BUG: ib_mad ftrace event unsupported migration
2022-11-02 14:02 ` BUG: ib_mad ftrace event unsupported migration Leonid Ravich
@ 2022-11-02 14:04 ` Jason Gunthorpe
2022-11-02 14:17 ` Steven Rostedt
2022-11-02 14:20 ` Steven Rostedt
1 sibling, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2022-11-02 14:04 UTC (permalink / raw)
To: Leonid Ravich
Cc: Steven Rostedt, mingo@redhat.com, linux-rdma@vger.kernel.org,
linux-kernel@vger.kernel.org, Yigal Korman,
linux-trace-kernel@vger.kernel.org, Leon Ravich
On Wed, Nov 02, 2022 at 02:02:49PM +0000, Leonid Ravich wrote:
> > > > before starting throwing some patch into the the air I would like to align with you the approach we should take here.
> > > >
> > > > my suggestion here :
> > > >- ftrace infra should verify no migration happen (end and start happens on same CPU) in case not we will throw warning for the issue .
> > >
> > >The scheduler should have. On entering the ring buffer code
> > >ring_buffer_lock_reserver() it disables preemption and does not
> > >re-enable it until ring_buffer_unlock_commit().
> > >
> > >The only way to migrate is if you re-enable preemption. WHICH IS A
> > >BUG!
>
> >So what on earth did that?
>
> >I'm guessing some driver's query_pkey op, but AFAIK we don't have any
> >explicit pre-emption reenablements in the code - unless it is sneaky..
> trace infra uses preempt_disable_notrace/preempt_enable_notrace to disable/enable preemtion but my kernel compiled without CONFIG_PREEMPTION so this functions are only barriers - looks like the idea behind was to avoid involuntary preemtion but in our case it is a voluntary (there is a wait_for_completion in the query_pkey rabbit hole).
So this tracepoint is just wrong, you can't call a sleepable function
from a tracepoint like that?
Presumably lockdep would/should warn about this?
Delete the pkey logging from the tracepoint, it can't work, I guess.
Jason
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: BUG: ib_mad ftrace event unsupported migration
2022-11-02 14:04 ` Jason Gunthorpe
@ 2022-11-02 14:17 ` Steven Rostedt
2022-11-02 14:24 ` Jason Gunthorpe
0 siblings, 1 reply; 11+ messages in thread
From: Steven Rostedt @ 2022-11-02 14:17 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Leonid Ravich, mingo@redhat.com, linux-rdma@vger.kernel.org,
linux-kernel@vger.kernel.org, Yigal Korman,
linux-trace-kernel@vger.kernel.org, Leon Ravich
On Wed, 2 Nov 2022 11:04:44 -0300
Jason Gunthorpe <jgg@ziepe.ca> wrote:
> So this tracepoint is just wrong, you can't call a sleepable function
> from a tracepoint like that?
>
> Presumably lockdep would/should warn about this?
Why didn't it trigger a "scheduling while atomic" bug? That should
happen if you call a sleeping function while preemption is disabled. Or
does this function explicitly enable preemption? Which nothing checks
if you enable preemption while recording to the ring buffer. I guess we
could add that check, but this is not something that commonly happens
enough to bother.
-- Steve
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: BUG: ib_mad ftrace event unsupported migration
2022-11-02 14:02 ` BUG: ib_mad ftrace event unsupported migration Leonid Ravich
2022-11-02 14:04 ` Jason Gunthorpe
@ 2022-11-02 14:20 ` Steven Rostedt
1 sibling, 0 replies; 11+ messages in thread
From: Steven Rostedt @ 2022-11-02 14:20 UTC (permalink / raw)
To: Leonid Ravich
Cc: Jason Gunthorpe, mingo@redhat.com, linux-rdma@vger.kernel.org,
linux-kernel@vger.kernel.org, Yigal Korman,
linux-trace-kernel@vger.kernel.org, Leon Ravich
On Wed, 2 Nov 2022 14:02:49 +0000
Leonid Ravich <leonid.ravich@toganetworks.com> wrote:
> >I'm guessing some driver's query_pkey op, but AFAIK we don't have any
> >explicit pre-emption reenablements in the code - unless it is sneaky..
> trace infra uses preempt_disable_notrace/preempt_enable_notrace to disable/enable preemtion but my kernel compiled without CONFIG_PREEMPTION so this functions are only barriers - looks like the idea behind was to avoid involuntary preemtion but in our case it is a voluntary (there is a wait_for_completion in the query_pkey rabbit hole).
>
> so no scheduler here to warn about illegal migration.
Well, you should be testing under different configs ;-)
It would have given you the reason for the real bug.
-- Steve
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: BUG: ib_mad ftrace event unsupported migration
2022-11-02 14:17 ` Steven Rostedt
@ 2022-11-02 14:24 ` Jason Gunthorpe
2022-11-02 15:59 ` Steven Rostedt
0 siblings, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2022-11-02 14:24 UTC (permalink / raw)
To: Steven Rostedt
Cc: Leonid Ravich, mingo@redhat.com, linux-rdma@vger.kernel.org,
linux-kernel@vger.kernel.org, Yigal Korman,
linux-trace-kernel@vger.kernel.org, Leon Ravich
On Wed, Nov 02, 2022 at 10:17:19AM -0400, Steven Rostedt wrote:
> On Wed, 2 Nov 2022 11:04:44 -0300
> Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> > So this tracepoint is just wrong, you can't call a sleepable function
> > from a tracepoint like that?
> >
> > Presumably lockdep would/should warn about this?
>
> Why didn't it trigger a "scheduling while atomic" bug? That should
> happen if you call a sleeping function while preemption is disabled. Or
> does this function explicitly enable preemption? Which nothing checks
> if you enable preemption while recording to the ring buffer. I guess we
> could add that check, but this is not something that commonly happens
> enough to bother.
No, it doesn't muck with preemption, it will have some sleeping lock,
eg mlx5_ib_query_pkey() does a memory allocation as the first thing
It seems like a bug that calling kmalloc(GFP_KERNEL)/might_sleep()
from within a tracepoint doesn't trigger a warning?
Jason
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: BUG: ib_mad ftrace event unsupported migration
2022-11-02 14:24 ` Jason Gunthorpe
@ 2022-11-02 15:59 ` Steven Rostedt
2022-11-02 16:01 ` Jason Gunthorpe
2022-11-02 20:01 ` Leonid Ravich
0 siblings, 2 replies; 11+ messages in thread
From: Steven Rostedt @ 2022-11-02 15:59 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Leonid Ravich, mingo@redhat.com, linux-rdma@vger.kernel.org,
linux-kernel@vger.kernel.org, Yigal Korman,
linux-trace-kernel@vger.kernel.org, Leon Ravich
On Wed, 2 Nov 2022 11:24:20 -0300
Jason Gunthorpe <jgg@ziepe.ca> wrote:
> No, it doesn't muck with preemption, it will have some sleeping lock,
> eg mlx5_ib_query_pkey() does a memory allocation as the first thing
>
> It seems like a bug that calling kmalloc(GFP_KERNEL)/might_sleep()
> from within a tracepoint doesn't trigger a warning?
Has nothing to do with tracepoints. You could call it a bug that it
doesn't trigger a warning when preemption is disabled. But then again,
it would if you enabled DEBUG_PREEMPT and possibly LOCKDEP too. So, I chalk
this up to a lack of proper testing.
-- Steve
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: BUG: ib_mad ftrace event unsupported migration
2022-11-02 15:59 ` Steven Rostedt
@ 2022-11-02 16:01 ` Jason Gunthorpe
2022-11-02 20:01 ` Leonid Ravich
1 sibling, 0 replies; 11+ messages in thread
From: Jason Gunthorpe @ 2022-11-02 16:01 UTC (permalink / raw)
To: Steven Rostedt
Cc: Leonid Ravich, mingo@redhat.com, linux-rdma@vger.kernel.org,
linux-kernel@vger.kernel.org, Yigal Korman,
linux-trace-kernel@vger.kernel.org, Leon Ravich
On Wed, Nov 02, 2022 at 11:59:47AM -0400, Steven Rostedt wrote:
> On Wed, 2 Nov 2022 11:24:20 -0300
> Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> > No, it doesn't muck with preemption, it will have some sleeping lock,
> > eg mlx5_ib_query_pkey() does a memory allocation as the first thing
> >
> > It seems like a bug that calling kmalloc(GFP_KERNEL)/might_sleep()
> > from within a tracepoint doesn't trigger a warning?
>
> Has nothing to do with tracepoints. You could call it a bug that it
> doesn't trigger a warning when preemption is disabled. But then again,
> it would if you enabled DEBUG_PREEMPT and possibly LOCKDEP too. So, I chalk
> this up to a lack of proper testing.
That makes sense, assuming it does trigger in those cases.
It is interesting nobody has run those tracepoints on a debug kernel.
Jason
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: BUG: ib_mad ftrace event unsupported migration
2022-11-02 15:59 ` Steven Rostedt
2022-11-02 16:01 ` Jason Gunthorpe
@ 2022-11-02 20:01 ` Leonid Ravich
2022-11-02 22:19 ` Steven Rostedt
1 sibling, 1 reply; 11+ messages in thread
From: Leonid Ravich @ 2022-11-02 20:01 UTC (permalink / raw)
To: Steven Rostedt
Cc: Jason Gunthorpe, Leonid Ravich, mingo@redhat.com,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
Yigal Korman, linux-trace-kernel@vger.kernel.org
> On Wed, 2 Nov 2022 11:24:20 -0300
> Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> > No, it doesn't muck with preemption, it will have some sleeping lock,
> > eg mlx5_ib_query_pkey() does a memory allocation as the first thing
> >
> > It seems like a bug that calling kmalloc(GFP_KERNEL)/might_sleep()
> > from within a tracepoint doesn't trigger a warning?
>
> Has nothing to do with tracepoints. You could call it a bug that it
> doesn't trigger a warning when preemption is disabled. But then again,
> it would if you enabled DEBUG_PREEMPT and possibly LOCKDEP too. So, I chalk
> this up to a lack of proper testing.
>
disagree, without CONFIG_PREEMPTION (which is the default case in some
destros) we will not get any warning, because there will not be
preamption disable.
second issue I see and maybe it is only me, is that the assuption of
atomicity in trace is not a common knowledge for trace users.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: BUG: ib_mad ftrace event unsupported migration
2022-11-02 20:01 ` Leonid Ravich
@ 2022-11-02 22:19 ` Steven Rostedt
2022-11-03 12:22 ` Leonid Ravich
0 siblings, 1 reply; 11+ messages in thread
From: Steven Rostedt @ 2022-11-02 22:19 UTC (permalink / raw)
To: Leonid Ravich
Cc: Jason Gunthorpe, Leonid Ravich, mingo@redhat.com,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
Yigal Korman, linux-trace-kernel@vger.kernel.org
On Wed, 2 Nov 2022 22:01:17 +0200
Leonid Ravich <lravich@gmail.com> wrote:
> disagree, without CONFIG_PREEMPTION (which is the default case in some
> destros) we will not get any warning, because there will not be
> preamption disable.
I test all for my code (NON_PREEMPT, VOLUNTEER_PREEMPT, PREEMPT) and
with and without lockdep enabled.
This would be a bug if you called kmalloc(X, GFP_KERNEL) in *any* non
preempt section.
>
> second issue I see and maybe it is only me, is that the assuption of
> atomicity in trace is not a common knowledge for trace users.
Well, I suppose we could add more documentation. Would that help? Where
would you see it? In the sample code?
I advise not even grabbing locks in trace events, because in most cases
lockdep will not catch any issues with them (it will be hidden unless
the trace event is enabled).
-- Steve
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: BUG: ib_mad ftrace event unsupported migration
2022-11-02 22:19 ` Steven Rostedt
@ 2022-11-03 12:22 ` Leonid Ravich
2022-11-03 16:32 ` Steven Rostedt
0 siblings, 1 reply; 11+ messages in thread
From: Leonid Ravich @ 2022-11-03 12:22 UTC (permalink / raw)
To: Steven Rostedt
Cc: Jason Gunthorpe, Leonid Ravich, mingo@redhat.com,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
Yigal Korman, linux-trace-kernel@vger.kernel.org
On Wed, Nov 02, 2022 at 06:19:00PM -0400, Steven Rostedt wrote:
> On Wed, 2 Nov 2022 22:01:17 +0200
> Leonid Ravich <lravich@gmail.com> wrote:
>
> > disagree, without CONFIG_PREEMPTION (which is the default case in some
> > destros) we will not get any warning, because there will not be
> > preamption disable.
>
> I test all for my code (NON_PREEMPT, VOLUNTEER_PREEMPT, PREEMPT) and
> with and without lockdep enabled.
>
> This would be a bug if you called kmalloc(X, GFP_KERNEL) in *any* non
> preempt section.
yes, but for NON_PREEMPT trace is not non preempt section,
actualy the problem is with CONFIG_PREEMPT_COUNT not set.
ftrace uses preemot_enable/disable_notrace macro to "mark" it as non preempt section
which do it only for CONFIG_PREEMPT_COUNT.
from include/linux/preempt.h
if !CONFIG_PREEMPT_COUNT
#define preempt_enable_notrace() barrier()
this is why there is no any warning on my system.
>
> >
> > second issue I see and maybe it is only me, is that the assuption of
> > atomicity in trace is not a common knowledge for trace users.
>
> Well, I suppose we could add more documentation. Would that help? Where
> would you see it? In the sample code?
>
I think if we fix the first issue and make kernel cry for any miss
behave it we do the job.
> I advise not even grabbing locks in trace events, because in most cases
> lockdep will not catch any issues with them (it will be hidden unless
> the trace event is enabled).
>
-- Leonid
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: BUG: ib_mad ftrace event unsupported migration
2022-11-03 12:22 ` Leonid Ravich
@ 2022-11-03 16:32 ` Steven Rostedt
0 siblings, 0 replies; 11+ messages in thread
From: Steven Rostedt @ 2022-11-03 16:32 UTC (permalink / raw)
To: Leonid Ravich
Cc: Jason Gunthorpe, Leonid Ravich, mingo@redhat.com,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
Yigal Korman, linux-trace-kernel@vger.kernel.org
On Thu, 3 Nov 2022 14:22:01 +0200
Leonid Ravich <lravich@gmail.com> wrote:
> On Wed, Nov 02, 2022 at 06:19:00PM -0400, Steven Rostedt wrote:
> > On Wed, 2 Nov 2022 22:01:17 +0200
> > Leonid Ravich <lravich@gmail.com> wrote:
> >
> > > disagree, without CONFIG_PREEMPTION (which is the default case in some
> > > destros) we will not get any warning, because there will not be
> > > preamption disable.
> >
> > I test all for my code (NON_PREEMPT, VOLUNTEER_PREEMPT, PREEMPT) and
> > with and without lockdep enabled.
> >
> > This would be a bug if you called kmalloc(X, GFP_KERNEL) in *any* non
> > preempt section.
> yes, but for NON_PREEMPT trace is not non preempt section,
> actualy the problem is with CONFIG_PREEMPT_COUNT not set.
>
> ftrace uses preemot_enable/disable_notrace macro to "mark" it as non preempt section
> which do it only for CONFIG_PREEMPT_COUNT.
Correct.
>
> from include/linux/preempt.h
> if !CONFIG_PREEMPT_COUNT
> #define preempt_enable_notrace() barrier()
>
> this is why there is no any warning on my system.
Correct.
> >
> > >
> > > second issue I see and maybe it is only me, is that the assuption of
> > > atomicity in trace is not a common knowledge for trace users.
> >
> > Well, I suppose we could add more documentation. Would that help? Where
> > would you see it? In the sample code?
> >
> I think if we fix the first issue and make kernel cry for any miss
> behave it we do the job.
It does with the right configs enable. I will NACK any change to add
more checks here for the production case. This is an extremely fast
path where every nanosecond counts (trace events can be under 100ns).
The fact that it doesn't do anything is a feature not a bug.
It's up to the developer to run their code with lockdep and other debug
settings (and possibly even PREEMPT config tests) while developing
their code. Not to expect the production use case to do it for them.
-- Steve
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2022-11-03 16:33 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <VI1PR02MB623706DA8A01842834FC191089399@VI1PR02MB6237.eurprd02.prod.outlook.com>
[not found] ` <20221102074457.08f538a8@rorschach.local.home>
[not found] ` <Y2JqX3vC1mG/JDex@ziepe.ca>
2022-11-02 14:02 ` BUG: ib_mad ftrace event unsupported migration Leonid Ravich
2022-11-02 14:04 ` Jason Gunthorpe
2022-11-02 14:17 ` Steven Rostedt
2022-11-02 14:24 ` Jason Gunthorpe
2022-11-02 15:59 ` Steven Rostedt
2022-11-02 16:01 ` Jason Gunthorpe
2022-11-02 20:01 ` Leonid Ravich
2022-11-02 22:19 ` Steven Rostedt
2022-11-03 12:22 ` Leonid Ravich
2022-11-03 16:32 ` Steven Rostedt
2022-11-02 14:20 ` Steven Rostedt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).