* Reproducible run and binary logs (ideas)
@ 2013-09-18 19:48 Ildar Muslukhov
2013-09-19 15:07 ` Dave Jones
0 siblings, 1 reply; 7+ messages in thread
From: Ildar Muslukhov @ 2013-09-18 19:48 UTC (permalink / raw)
To: trinity
Hello everyone,
Decided to share some ideas, and how they align with the TODO list, before jumping into them.
Let's assume for now that we have something similar to the sockets
cache for FDs (i.e., they are also reproducible to some extent). I am
planning to work on that sooner rather than later, but it is not crucial
for the current discussion.
Here are my thoughts:
One runs trinity either with a single child process or with multiple
child processes.
With a single child process we don't catch cases where a race condition
occurs; however, we do catch everything else (the usual bugs, where a
param's value is handled incorrectly). [Q: Am I missing something here?]
Now assume that we caught a bug: how can we reproduce it? What is the
minimal set of details we need? I think, in general, we only need the
random seed. On rerun we use that number to reseed the PRNG, and trinity
should follow the same execution path (i.e., take the same branch at
every place where rand() is called) and generate the same input
parameters. Of course, having the iteration number where the kernel
oopsed would help speed up the replay, like a fast forward.
The way it is implemented right now is a bit risky, though, since
the rand functions are called in place, and we cannot guarantee that
no other code (like gcc libraries) has made a call to rand(),
thus advancing the random stream. If such calls repeat
the same way every time, it is not a big deal; it is a problem, however,
if the behavior is stochastic. The idea I have is to keep a buffer of
random numbers, filled by the parent process, and to record in the logs
what is at the beginning and at the end of that buffer. Then later, when
we use the "replay" function, we can be sure the same rand buffer is
used. If we need several buffers, we can always use those logged details
to fast-forward the rand function whenever it does not match the buffer
that the logs say should be in use.
Now for the multiple child approach; it gets even more interesting. When
we fork a child, we reseed it. This approach is very good from a
simplicity point of view, but there is a small caveat: we do not explore
any seed very deep into its random stream. Also, we now have to keep
track of all the child seeds, which makes the replay function a bit more
complicated. The idea I had is to keep, in the SHM struct, a buffer of
random numbers generated before we fork a child; each child then uses
its own buffer, making it predictable. Once a buffer is empty, the
parent process regenerates it. Basically, we avoid calling the rand
function from a child completely.
This makes the handling of randomness uniform whether you run with a
single child or with multiple child processes.
For the replay function to be more successful with multiple child
processes, it should follow the exact call sequence of the original run
in which the bug was discovered; e.g., if child A executed syscalls
1,2,3,4,4,4 and only then child B executed syscalls 5 and 6, the replay
has to do exactly the same. Replaying without that ordering is also OK,
but in my view such an approach has less chance of reproducing
race-condition types of bugs. Of course, exactly repeating the sequence
does not guarantee repeating the bug, but it should still have somewhat
higher chances.
Summarizing: for replaying multiple child processes we need a
"call script" which stores the sequence of syscall execution. Writing it
into a text file, like the current logs, is bad, since it will
definitely become an IO problem. Binary logs, like you mentioned in the
TODO, seem to be the only option. Then we can add a parameter such as
--parselog=mylog.bin to rerun the exact same things. Of course this
log should also store all the config params.
Another benefit of having such a call script is that we can store each
syscall's result in it as an int, and then use that data to analyze
which syscalls fail consistently. This is in line with the TODO item
about gathering more stats on syscall success/failure rates. Having
these stats might reveal syscalls that need better call handling (i.e.,
input parameter guessing and other related parts).
So that's it for now. My gut feeling is that I should first look into
the use of the rand function, then work on FDs, and only then move on to
binary logs. What do you think?
Thanks,
Ildar
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Reproducible run and binary logs (ideas)
From: Dave Jones @ 2013-09-19 15:07 UTC (permalink / raw)
To: Ildar Muslukhov; +Cc: trinity
On Wed, Sep 18, 2013 at 12:48:40PM -0700, Ildar Muslukhov wrote:
> Although, the way it is implemented right now is a bit risky, since
> the rand functions are called in place, and we cannot guarantee that
> no other code (like gcc libraries) has made a call to rand(),
> thus advancing the random stream.
This whole idea seems to hang on this statement, and it bothers me.
Can this even happen? I'm not sure it can.
And even if it can, surely it's going to happen again when we re-run
with the same seed, so we don't need to compensate for it.
Before going too far down this rabbit hole, I want to be sure we're not
over-engineering for a problem that actually isn't.
Dave
* Re: Reproducible run and binary logs (ideas)
From: Ildar Muslukhov @ 2013-09-19 16:03 UTC (permalink / raw)
To: Dave Jones; +Cc: trinity
OK, let me just run some tests on that. But what about synchronizing
multiple child processes? And what do you think of a binary log that
allows us to do that?
Ildar
On Thu, Sep 19, 2013 at 8:07 AM, Dave Jones <davej@redhat.com> wrote:
> On Wed, Sep 18, 2013 at 12:48:40PM -0700, Ildar Muslukhov wrote:
>
> > Although, the way it is implemented right now is a bit risky, since
> > the rand functions are called in place, and we cannot guarantee that
> > no other code (like gcc libraries) has made a call to rand(),
> > thus advancing the random stream.
>
> This whole idea seems to hang on this statement, and it bothers me.
>
> Can this even happen ? I'm not sure it can.
> And even if it can, surely it's going to happen again when we re-run
> with the same seed, so we don't need to compensate for it.
>
> Before going too far down this rabbit hole, I want to be sure we're not
> over-engineering for a problem that actually isn't.
>
> Dave
>
* Re: Reproducible run and binary logs (ideas)
From: Dave Jones @ 2013-09-19 16:21 UTC (permalink / raw)
To: Ildar Muslukhov; +Cc: trinity
On Thu, Sep 19, 2013 at 09:03:41AM -0700, Ildar Muslukhov wrote:
> Ok, lets me just make some tests on that. But what about synching
> multiple children processes? And what do you make of a binary log,
> that allows to do that.
>
> Ildar
So while things like synchronising threads are a nice idea, I'm not
really sweating it, because I don't think it's a major factor in
the lack of reproducibility right now. There are probably a whole bunch
of other factors that have a higher impact. Also, it's not as simple
a problem as just starting threads at a certain time, because they
can go out of sync later. Think about what happens if one of the
syscalls we run does a write() of a gigabyte of data. There's no
deterministic time in which that has to complete, and depending on the
filesystem/mount options in use, we could go away for a while until
that write() completes while the filesystem does tree rebalancing or
whatever. On a subsequent run, that same write() might return almost
instantly because the filesystem doesn't need to do as much
book-keeping. And that's just write(). Other syscalls have similar
non-deterministic runtimes.
It's a tricky problem, and one I think we're probably better off not
trying to solve completely, because I don't think we'll ever recreate
the exact conditions across two runs. I'm not saying it's useless to
even try to recreate conditions, but perfection isn't really
attainable, and some of those ideas may be more realistic than others.
Of all the bugs that trinity has found, I can only recall one that
actually had multi-thread interaction (involving ecryptfs fd passing,
which was actually a problem between the main process & a child, rather
than child<->child).
We might be better off just treating the individual trinity child
processes as standalone instances, concentrating on taking a logfile
from each of them, and working on strategies to extract the useful info
so we can re-run it as a single process, because 99.9% of the time,
what the other child processes are doing is irrelevant to the bug.
It might be worth looking at Vince's perf fuzzer to see what he's done there
for bisecting/generating test cases.
Dave
* Re: Reproducible run and binary logs (ideas)
From: Vince Weaver @ 2013-09-19 16:49 UTC (permalink / raw)
To: Dave Jones; +Cc: Ildar Muslukhov, trinity
On Thu, 19 Sep 2013, Dave Jones wrote:
> It might be worth looking at Vince's perf fuzzer to see what he's done there
> for bisecting/generating test cases.
my perf_fuzzer doesn't do anything that complicated.
In its current setup it only has one thread, which occasionally forks off
a child but it always kills the child before forking off another.
I find the run-to-run reproducibility of traces to be surprisingly
consistent when given the same random seed. Even hundreds of thousands
of syscalls in I can usually hit the same bugs at roughly the same
syscall count.
I find this surprising because when perf events are involved nothing is
deterministic; if you are measuring things like cycles with overflow,
the cycle count is going to depend on what else is going on in the
system, and if your overflow handler consumes randomness (hard because
rand() isn't signal-handler safe, but possible with a few hacks) you'd
expect things to diverge.
As for bisecting, my major problem is dependent syscalls, something that
might not be as big an issue in trinity. The way perf_fuzzer is set up,
you open an fd with perf_event_open(), then lots of other syscalls act
on the fd. So if your bisect happens to cut off the open of the fd,
all subsequent dependent syscalls fail. I don't have a good automatic
solution to that; so far I've been placing the bisect points by hand and
just making the replayer robust in cases where invalid dependent
syscalls appear in the trace.
perf_fuzzer also has a somewhat human-readable log format, which wastes
disk space but makes it easier to do somewhat manual bisects using a text
editor.
Vince
* Re: Reproducible run and binary logs (ideas)
From: Ildar Muslukhov @ 2013-09-19 17:04 UTC (permalink / raw)
To: Vince Weaver; +Cc: Dave Jones, trinity
Thanks Vince,
Will take a look at it.
On Thu, Sep 19, 2013 at 9:49 AM, Vince Weaver <vincent.weaver@maine.edu> wrote:
> On Thu, 19 Sep 2013, Dave Jones wrote:
>
>> It might be worth looking at Vince's perf fuzzer to see what he's done there
>> for bisecting/generating test cases.
>
> my perf_fuzzer doesn't do anything that complicated.
>
> In its current setup it only has one thread, which occasionally forks off
> a child but it always kills the child before forking off another.
>
> I find the run-to-run reproducibility of traces to be surprisingly
> consistent when given the same random seed. Even hundreds of thousands
> of syscalls in I can usually hit the same bugs at roughly the same
> syscall count.
>
> I find this surprising because when perf events are involved nothing is
> deterministic; if you are measuring things like cycles with overflow, the
> cycle count is going to depend on what else is going on in the system, and
> if your overflow handler consumes randomness (hard because rand() isn't
> signal-handler safe, but possible with a few hacks) you'd expect things to
> diverge.
>
> As for bisecting, my major problem is dependent syscalls, something that
> might not be as big an issue on trinity. The way perf_fuzzer is set up
> you open an fd with perf_event_open(), then lots of other syscalls act on
> the fd. So if your bisect happens to cut off the open of the fd, then
> all subsequent dependent syscalls fail. I don't have a good automatic
> solution to that, I've so far been placing the bisect points by hand and
> just having the replayer be robust in cases where invalid dependent
> syscalls appear in the trace.
>
> perf_fuzzer also has a somewhat human-readable log format, which wastes
> disk space but makes it easier to do somewhat manual bisects using a text
> editor.
>
> Vince
* Re: Reproducible run and binary logs (ideas)
From: Ildar Muslukhov @ 2013-09-19 17:15 UTC (permalink / raw)
To: Dave Jones, trinity
Dave, I will run some tests today and let you know once I have the results.
-Ildar
On Thu, Sep 19, 2013 at 10:04 AM, Ildar Muslukhov <ildarm@google.com> wrote:
> Thanks Vince,
>
> Will take a look at it.
>
>
> On Thu, Sep 19, 2013 at 9:49 AM, Vince Weaver <vincent.weaver@maine.edu> wrote:
>> On Thu, 19 Sep 2013, Dave Jones wrote:
>>
>>> It might be worth looking at Vince's perf fuzzer to see what he's done there
>>> for bisecting/generating test cases.
>>
>> my perf_fuzzer doesn't do anything that complicated.
>>
>> In its current setup it only has one thread, which occasionally forks off
>> a child but it always kills the child before forking off another.
>>
>> I find the run-to-run reproducibility of traces to be surprisingly
>> consistent when given the same random seed. Even hundreds of thousands
>> of syscalls in I can usually hit the same bugs at roughly the same
>> syscall count.
>>
>> I find this surprising because when perf events are involved nothing is
>> deterministic; if you are measuring things like cycles with overflow, the
>> cycle count is going to depend on what else is going on in the system, and
>> if your overflow handler consumes randomness (hard because rand() isn't
>> signal-handler safe, but possible with a few hacks) you'd expect things to
>> diverge.
>>
>> As for bisecting, my major problem is dependent syscalls, something that
>> might not be as big an issue on trinity. The way perf_fuzzer is set up
>> you open an fd with perf_event_open(), then lots of other syscalls act on
>> the fd. So if your bisect happens to cut off the open of the fd, then
>> all subsequent dependent syscalls fail. I don't have a good automatic
>> solution to that, I've so far been placing the bisect points by hand and
>> just having the replayer be robust in cases where invalid dependent
>> syscalls appear in the trace.
>>
>> perf_fuzzer also has a somewhat human-readable log format, which wastes
>> disk space but makes it easier to do somewhat manual bisects using a text
>> editor.
>>
>> Vince