Runtime verification for instances

linux-trace-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Runtime verification for instances
@ 2022-10-03 20:35 Joel Fernandes
  2022-10-05 16:07 ` Daniel Bristot de Oliveira
  0 siblings, 1 reply; 5+ messages in thread
From: Joel Fernandes @ 2022-10-03 20:35 UTC (permalink / raw)
  To: Linux Trace Devel, Daniel Bristot de Oliveira

Hi Daniel,
I was thinking about your RV for a recent problem.

Basically, we want to check for state transitions on a per-RCU
callback basis. For example, we want to verify that for certain types
of RCU callback, there is no chance of a scheduler wake up happening
(provided such wake up happens for non-hardIRQ context).

This means every RCU callback in flight needs its own automaton which
is created when the callback is queued and destroyed when the callback
is executed. There cannot be a single global automaton. And we have to
make transitions for different automaton depending on their state.

I bet you have already thought of something like this for tasks where
you need a per-task automaton instead of a global one for some use
case which requires tracking each task separately.

We can implement this in RCU itself, however it will be ugly due to
CONFIG options. And also we cannot do it easily in production. On the
other hand, with RV, we could simply have RV watch the tracepoints and
verify them. It will be clean.

This is just 1 usecase (wakeup) , we also want to extend something
like this to more things, like making sure callbacks execute in a time
bounded fashion. I know verifying temporal properties doesn't directly
fall under RV, but hey we can extend it somehow :) . i.e. we reach the
final state in bounded time.

Let me know your thoughts and thanks!

 - Joel

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Runtime verification for instances
  2022-10-03 20:35 Runtime verification for instances Joel Fernandes
@ 2022-10-05 16:07 ` Daniel Bristot de Oliveira
  2022-10-06 18:57   ` Joel Fernandes
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel Bristot de Oliveira @ 2022-10-05 16:07 UTC (permalink / raw)
  To: Joel Fernandes, Linux Trace Devel

On 10/3/22 22:35, Joel Fernandes wrote:
> Hi Daniel,
> I was thinking about your RV for a recent problem.
> 
> Basically, we want to check for state transitions on a per-RCU
> callback basis. For example, we want to verify that for certain types
> of RCU callback, there is no chance of a scheduler wake up happening
> (provided such wake up happens for non-hardIRQ context).

ack.

> 
> This means every RCU callback in flight needs its own automaton which
> is created when the callback is queued and destroyed when the callback
> is executed.

do you have a "structure callback?" or could attach some extra info into it?

There cannot be a single global automaton. And we have to
> make transitions for different automaton depending on their state.
> 
> I bet you have already thought of something like this for tasks where
> you need a per-task automaton instead of a global one for some use
> case which requires tracking each task separately.

Yep! we do have per task monitors (already).
 
> We can implement this in RCU itself, however it will be ugly due to
> CONFIG options. And also we cannot do it easily in production. On the
> other hand, with RV, we could simply have RV watch the tracepoints and
> verify them. It will be clean.

So, I am not sure if I got the problem here. So, you are saying that it will
be easier with RV. Right?

If you need one monitor (automata) per task, we have support for it already.
If we need "another type of object," we can add.

> 
> This is just 1 usecase (wakeup) , we also want to extend something
> like this to more things, like making sure callbacks execute in a time
> bounded fashion.


I know verifying temporal properties doesn't directly
> fall under RV,

temporal or order or temporal of time? For temporal order yep we have it (I
se an overload of this term here... haha). For time, we can add additional
structure to transform them into logical events.

-- Daniel

but hey we can extend it somehow :) . i.e. we reach the
> final state in bounded time.
> 
> Let me know your thoughts and thanks!
> 
>  - Joel
> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Runtime verification for instances
  2022-10-05 16:07 ` Daniel Bristot de Oliveira
@ 2022-10-06 18:57   ` Joel Fernandes
  2022-10-11 17:01     ` Daniel Bristot de Oliveira
  0 siblings, 1 reply; 5+ messages in thread
From: Joel Fernandes @ 2022-10-06 18:57 UTC (permalink / raw)
  To: Daniel Bristot de Oliveira; +Cc: Linux Trace Devel

On Wed, Oct 5, 2022 at 12:07 PM Daniel Bristot de Oliveira
<bristot@redhat.com> wrote:
>
> On 10/3/22 22:35, Joel Fernandes wrote:
> > Hi Daniel,
> > I was thinking about your RV for a recent problem.
> >
> > Basically, we want to check for state transitions on a per-RCU
> > callback basis. For example, we want to verify that for certain types
> > of RCU callback, there is no chance of a scheduler wake up happening
> > (provided such wake up happens for non-hardIRQ context).
>
> ack.
>
> >
> > This means every RCU callback in flight needs its own automaton which
> > is created when the callback is queued and destroyed when the callback
> > is executed.
>
> do you have a "structure callback?" or could attach some extra info into it?

Unfortunately no, we have rcu_head but the idea is to do this without
having to grow it.  In theory, you can do this with a hashtable
maintained by RV and use the callback function as key, with value
being automata.

> There cannot be a single global automaton. And we have to
> > make transitions for different automaton depending on their state.
> >
> > I bet you have already thought of something like this for tasks where
> > you need a per-task automaton instead of a global one for some use
> > case which requires tracking each task separately.
>
> Yep! we do have per task monitors (already).

Nice. This depends on growing task_struct though right?

> > We can implement this in RCU itself, however it will be ugly due to
> > CONFIG options. And also we cannot do it easily in production. On the
> > other hand, with RV, we could simply have RV watch the tracepoints and
> > verify them. It will be clean.
>
> So, I am not sure if I got the problem here. So, you are saying that it will
> be easier with RV. Right?
>
> If you need one monitor (automata) per task, we have support for it already.
> If we need "another type of object," we can add.

Yeah I was hoping that with RV, you could already allocate a new
automata dynamically or something for a new task/callback/whatever,
and then drive that automata using tracepoints. So RV can be a nice
place to do this kind of stuff, without adding debugging code to other
subsystems.

> > This is just 1 usecase (wakeup) , we also want to extend something
> > like this to more things, like making sure callbacks execute in a time
> > bounded fashion.
>
>
> I know verifying temporal properties doesn't directly
> > fall under RV,
>
> temporal or order or temporal of time? For temporal order yep we have it (I
> se an overload of this term here... haha). For time, we can add additional
> structure to transform them into logical events.

Yes, sorry I mean time. So time from when callback queued to when it
is invoked falls within a bounded time period.

But more urgently what we need is:

For a certain type of callback that is invoked, check if the callback
does a scheduler wake up. That is - doing a wake up when this type of
callback is called is a BAD state.

This can be done purely using tracepoints:
- We have to add a new tracepoint to call_rcu() for the new callback
type (the type being Lazy).
- We have to add a new tracepoint for when callback invocation ends
(for invocation start, we already have trace_rcu_invoke_callback).

Then hook up RV to all of these, and verify that no scheduler_wakeup
can happen during the invocation of a Lazy callback. But it has to be
done on a per-callback basis.

On the other hand, we can simplify it to a single automata by setting
a per-cpu state for whenever a Lazy callback is invoked. But the
problem is we have no idea that a callback is lazy when we are
invoking it.

Thanks.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Runtime verification for instances
  2022-10-06 18:57   ` Joel Fernandes
@ 2022-10-11 17:01     ` Daniel Bristot de Oliveira
  2022-10-13 16:19       ` Joel Fernandes
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel Bristot de Oliveira @ 2022-10-11 17:01 UTC (permalink / raw)
  To: Joel Fernandes; +Cc: Linux Trace Devel

On 10/6/22 20:57, Joel Fernandes wrote:
> On Wed, Oct 5, 2022 at 12:07 PM Daniel Bristot de Oliveira
> <bristot@redhat.com> wrote:
>>
>> On 10/3/22 22:35, Joel Fernandes wrote:
>>> Hi Daniel,
>>> I was thinking about your RV for a recent problem.
>>>
>>> Basically, we want to check for state transitions on a per-RCU
>>> callback basis. For example, we want to verify that for certain types
>>> of RCU callback, there is no chance of a scheduler wake up happening
>>> (provided such wake up happens for non-hardIRQ context).
>>
>> ack.
>>
>>>
>>> This means every RCU callback in flight needs its own automaton which
>>> is created when the callback is queued and destroyed when the callback
>>> is executed.
>>
>> do you have a "structure callback?" or could attach some extra info into it?
> 
> Unfortunately no, we have rcu_head but the idea is to do this without
> having to grow it.  In theory, you can do this with a hashtable
> maintained by RV and use the callback function as key, with value
> being automata.

right, a global hash table? per-cpu? per task?

>> There cannot be a single global automaton. And we have to
>>> make transitions for different automaton depending on their state.
>>>
>>> I bet you have already thought of something like this for tasks where
>>> you need a per-task automaton instead of a global one for some use
>>> case which requires tracking each task separately.
>>
>> Yep! we do have per task monitors (already).
> 
> Nice. This depends on growing task_struct though right?

We have it there already. If we need a per-task "aux" data we can justify adding
a pointer to a priv data. Note that I was already foreseeing this.

For example I will likely use it to monitor sched deadline tasks (I need
additional data to track the task from the monitor point of view).

Note that this will be available for *any* monitor so, not something
only for your use-case.

>>> We can implement this in RCU itself, however it will be ugly due to
>>> CONFIG options. And also we cannot do it easily in production. On the
>>> other hand, with RV, we could simply have RV watch the tracepoints and
>>> verify them. It will be clean.
>>
>> So, I am not sure if I got the problem here. So, you are saying that it will
>> be easier with RV. Right?
>>
>> If you need one monitor (automata) per task, we have support for it already.
>> If we need "another type of object," we can add.
> 
> Yeah I was hoping that with RV, you could already allocate a new
> automata dynamically or something for a new task/callback/whatever,
> and then drive that automata using tracepoints. So RV can be a nice
> place to do this kind of stuff, without adding debugging code to other
> subsystems.

We can make a dynamic as well, as long as you can allocate memory in the point where
you identify a new "instance," it would work.

Monitors are pretty self-contained, they can do their magic.

We can figure out a possible way.

>>> This is just 1 usecase (wakeup) , we also want to extend something
>>> like this to more things, like making sure callbacks execute in a time
>>> bounded fashion.
>>
>>
>> I know verifying temporal properties doesn't directly
>>> fall under RV,
>>
>> temporal or order or temporal of time? For temporal order yep we have it (I
>> se an overload of this term here... haha). For time, we can add additional
>> structure to transform them into logical events.
> 
> Yes, sorry I mean time. So time from when callback queued to when it
> is invoked falls within a bounded time period.
> 
> But more urgently what we need is:
> 
> For a certain type of callback that is invoked, check if the callback
> does a scheduler wake up. That is - doing a wake up when this type of
> callback is called is a BAD state.
> 
> This can be done purely using tracepoints:
> - We have to add a new tracepoint to call_rcu() for the new callback
> type (the type being Lazy).
> - We have to add a new tracepoint for when callback invocation ends
> (for invocation start, we already have trace_rcu_invoke_callback).
> 
> Then hook up RV to all of these, and verify that no scheduler_wakeup
> can happen during the invocation of a Lazy callback. But it has to be
> done on a per-callback basis.

So, let me see if I understood:

You have a Lazy callback, and you know it is lazy only at the call_rcu time.

Then when you start running the callback:
	no wakeup if it is of the type lazy
until you finish the execution?

> On the other hand, we can simplify it to a single automata by setting
> a per-cpu state for whenever a Lazy callback is invoked. But the
> problem is we have no idea that a callback is lazy when we are
> invoking it.

Wouldn't a global hash help for this?

per-cpu monitor + a global structure that is feed at call_rcu time?

> Thanks.
> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Runtime verification for instances
  2022-10-11 17:01     ` Daniel Bristot de Oliveira
@ 2022-10-13 16:19       ` Joel Fernandes
  0 siblings, 0 replies; 5+ messages in thread
From: Joel Fernandes @ 2022-10-13 16:19 UTC (permalink / raw)
  To: Daniel Bristot de Oliveira; +Cc: Linux Trace Devel

On Tue, Oct 11, 2022 at 1:01 PM Daniel Bristot de Oliveira
<bristot@redhat.com> wrote:
[...]
> >>> This means every RCU callback in flight needs its own automaton which
> >>> is created when the callback is queued and destroyed when the callback
> >>> is executed.
> >>
> >> do you have a "structure callback?" or could attach some extra info into it?
> >
> > Unfortunately no, we have rcu_head but the idea is to do this without
> > having to grow it.  In theory, you can do this with a hashtable
> > maintained by RV and use the callback function as key, with value
> > being automata.
>
> right, a global hash table? per-cpu? per task?

Yes, global hash table.

> >> There cannot be a single global automaton. And we have to
> >>> make transitions for different automaton depending on their state.
> >>>
> >>> I bet you have already thought of something like this for tasks where
> >>> you need a per-task automaton instead of a global one for some use
> >>> case which requires tracking each task separately.
> >>
> >> Yep! we do have per task monitors (already).
> >
> > Nice. This depends on growing task_struct though right?
>
> We have it there already. If we need a per-task "aux" data we can justify adding
> a pointer to a priv data. Note that I was already foreseeing this.
>
> For example I will likely use it to monitor sched deadline tasks (I need
> additional data to track the task from the monitor point of view).
>
> Note that this will be available for *any* monitor so, not something
> only for your use-case.

Yes, I was suggesting though that if you have a global hashtable, then
you do not need to grow data structures, you can map the "task_struct"
pointer to a data structure directly. I shared a patch [1] where I do
this and use tracepoints, I would like to convert this patch to RV.
Can you try converting it (or convert it to pseudocode first before
real code) ? You can see the trace point probes where I am checking
for the state. This is a great RV usecase :)

[1] https://lore.kernel.org/rcu/20221011180142.2742289-1-joel@joelfernandes.org/T/#m74a3a95ae288e79e5455001f6dd5510a591992df

> >>> We can implement this in RCU itself, however it will be ugly due to
> >>> CONFIG options. And also we cannot do it easily in production. On the
> >>> other hand, with RV, we could simply have RV watch the tracepoints and
> >>> verify them. It will be clean.
> >>
> >> So, I am not sure if I got the problem here. So, you are saying that it will
> >> be easier with RV. Right?
> >>
> >> If you need one monitor (automata) per task, we have support for it already.
> >> If we need "another type of object," we can add.
> >
> > Yeah I was hoping that with RV, you could already allocate a new
> > automata dynamically or something for a new task/callback/whatever,
> > and then drive that automata using tracepoints. So RV can be a nice
> > place to do this kind of stuff, without adding debugging code to other
> > subsystems.
>
> We can make a dynamic as well, as long as you can allocate memory in the point where
> you identify a new "instance," it would work.
>
> Monitors are pretty self-contained, they can do their magic.
>
> We can figure out a possible way.

Cool!

> >>> This is just 1 usecase (wakeup) , we also want to extend something
> >>> like this to more things, like making sure callbacks execute in a time
> >>> bounded fashion.
> >>
> >>
> >> I know verifying temporal properties doesn't directly
> >>> fall under RV,
> >>
> >> temporal or order or temporal of time? For temporal order yep we have it (I
> >> se an overload of this term here... haha). For time, we can add additional
> >> structure to transform them into logical events.
> >
> > Yes, sorry I mean time. So time from when callback queued to when it
> > is invoked falls within a bounded time period.
> >
> > But more urgently what we need is:
> >
> > For a certain type of callback that is invoked, check if the callback
> > does a scheduler wake up. That is - doing a wake up when this type of
> > callback is called is a BAD state.
> >
> > This can be done purely using tracepoints:
> > - We have to add a new tracepoint to call_rcu() for the new callback
> > type (the type being Lazy).
> > - We have to add a new tracepoint for when callback invocation ends
> > (for invocation start, we already have trace_rcu_invoke_callback).
> >
> > Then hook up RV to all of these, and verify that no scheduler_wakeup
> > can happen during the invocation of a Lazy callback. But it has to be
> > done on a per-callback basis.
>
> So, let me see if I understood:
>
> You have a Lazy callback, and you know it is lazy only at the call_rcu time.
>
> Then when you start running the callback:
>         no wakeup if it is of the type lazy
> until you finish the execution?
>
> > On the other hand, we can simplify it to a single automata by setting
> > a per-cpu state for whenever a Lazy callback is invoked. But the
> > problem is we have no idea that a callback is lazy when we are
> > invoking it.
>
> Wouldn't a global hash help for this?
>
> per-cpu monitor + a global structure that is feed at call_rcu time?

You can have multiple callbacks in flight, and others getting invoked.
So automata have to be per-callback AFAICS. The callback can be queued
and invoked on different CPUs, and there can be multiple such ones.

Thanks!

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-10-13 16:19 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-10-03 20:35 Runtime verification for instances Joel Fernandes
2022-10-05 16:07 ` Daniel Bristot de Oliveira
2022-10-06 18:57   ` Joel Fernandes
2022-10-11 17:01     ` Daniel Bristot de Oliveira
2022-10-13 16:19       ` Joel Fernandes

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).