Re: [PATCH v2 10/13] fuzz/x86_emulate: Make input more compact

xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed

From: George Dunlap <george.dunlap@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
	Wei Liu <wei.liu2@citrix.com>,
	xen-devel@lists.xenproject.org,
	Ian Jackson <ian.jackson@citrix.com>
Subject: Re: [PATCH v2 10/13] fuzz/x86_emulate: Make input more compact
Date: Thu, 5 Oct 2017 16:04:44 +0100	[thread overview]
Message-ID: <94e4d279-f3db-5db4-4fea-f209d140b17c@citrix.com> (raw)
In-Reply-To: <59D4B74D0200007800181E60@prv-mh.provo.novell.com>

On 10/04/2017 09:26 AM, Jan Beulich wrote:
>>>> On 25.09.17 at 16:26, <george.dunlap@citrix.com> wrote:
>> @@ -22,13 +25,17 @@ int main(int argc, char **argv)
>>      setbuf(stdin, NULL);
>>      setbuf(stdout, NULL);
>>  
>> +    opt_compact = true;
> 
> How about giving the variable an initializer instead?

Actually, if we want fuzz-emul.c to be usable by itself (e.g., for the
Google ossfuz project), we *must* use a static initializer from within
fuzz-emul.c for it to have the correct defaults.  I'll change that...

> 
>> --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
>> +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
>> @@ -53,6 +53,15 @@ struct fuzz_state
>>  };
>>  #define DATA_OFFSET offsetof(struct fuzz_state, corpus)
>>  
>> +bool opt_compact;
>> +
>> +unsigned int fuzz_minimal_input_size(void)
>> +{
>> +    if ( opt_compact )
>> +        return sizeof(unsigned long) + 1;
> 
> What is this value choice based on / derived from? Oh, judging from
> code further down it may be one more than the size of the options
> field, in which case it should be sizeof(...->options) here.

What about renaming DATA_OFFSET to DATA_SIZE_FULL, and adding
DATA_SIZE_COMPACT?

Then is could be:

    return (opt_compact ? DATA_SIZE_COMPACT : DATA_SIZE_FULL) + 1;

>> @@ -647,9 +656,81 @@ static void setup_state(struct x86_emulate_ctxt *ctxt)
>>  {
>>      struct fuzz_state *s = ctxt->data;
>>  
>> -    /* Fuzz all of the state in one go */
>> -    if (!input_read(s, s, DATA_OFFSET))
>> -        exit(-1);
>> +    if ( !opt_compact )
>> +    {
>> +        /* Fuzz all of the state in one go */
>> +        if (!input_read(s, s, DATA_OFFSET))
> 
> Missing blanks.

Ack

> 
>> +            exit(-1);
>> +        return;
>> +    }
>> +
>> +    /* Modify only select bits of state */
>> +
>> +    /* Always read 'options' */
>> +    if ( !input_read(s, &s->options, sizeof(s->options)) )
>> +        return;
>> +    
>> +    while(1) {
> 
> Style. And for compatibility (read: no warnings) with as wide a range
> of compilers as possible, generally for ( ; ; ) is better to use.

I can do that; but would you mind explaining?  What kinds of compilers
don't like while(1)?

>> +        uint16_t offset;
>> +
>> +        /* Read 16 bits to decide what bit of state to modify */
>> +        if ( !input_read(s, &offset, sizeof(offset)) )
>> +            return;
> 
> Doesn't this suggest minimal input size wants to be one higher than
> what you currently enforce? And isn't the use of uint16_t here in
> conflict with the description talking about reading a byte every time?

Hmm, actually it rather implies that it should be one less... with the
new format there's no way to guarantee that the very first insn_fetch
will have any data to read.

>> +        /* 
>> +         * Then decide if it's "pointing to" different bits of the
>> +         * state 
>> +         */
>> +
>> +        /* cr[]? */
>> +        if ( offset < 5 )
> 
> ARRAY_SIZE()

Ack

>> +        {
>> +            if ( !input_read(s, s->cr + offset, sizeof(*s->cr)) )
>> +                return;
>> +            printf("Setting CR %d to %lx\n", offset, s->cr[offset]);
>> +            continue;
>> +        }
>> +        
>> +        offset -= 5;
> 
> Same here then.
> 
>> +        /* msr[]? */
>> +        if ( offset < MSR_INDEX_MAX )
> 
> Even here (and below) use of ARRAY_SIZE() may be better.
> 
>> +        {
>> +            if ( !input_read(s, s->msr + offset, sizeof(*s->msr)) )
>> +                return;
>> +            printf("Setting MSR i%d (%x) to %lx\n", offset,
>> +                   msr_index[offset], s->msr[offset]);
>> +            continue;
>> +        }
>> +
>> +        offset -= MSR_INDEX_MAX;
>> +
>> +        /* segments[]? */
>> +        if ( offset < SEG_NUM )
>> +        {
>> +            if ( !input_read(s, s->segments + offset, sizeof(*s->segments)) )
>> +                return;
>> +            printf("Setting Segment %d\n", offset);
>> +            continue;
>> +            
>> +        }
>> +
>> +        offset -= SEG_NUM;
>> +
>> +        /* regs? */
>> +        if ( offset < sizeof(struct cpu_user_regs)
>> +             && offset + sizeof(uint64_t) <= sizeof(struct cpu_user_regs) )
>> +        {
>> +            if ( !input_read(s, ((char *)ctxt->regs) + offset, sizeof(uint64_t)) )
>> +                return;
>> +            printf("Setting cpu_user_regs offset %x\n", offset);
>> +            continue;
>> +        }
>> +
>> +        /* None of the above -- take that as "start emulating" */
>> +        
>> +        return;
>> +    }
> 
> Having come here I wonder whether the use of "byte" in the description
> is right, and you mean "uint8_t offset" above, as you're far from
> consuming the entire 256 value range.

Isn't cpu_user_regs larger than 256 bytes?  And in any case, the offset
will become larger than 256 bytes one we include the FPU state.

> Additionally, was the order of elements here chosen for any specific
> reason? It would seem to me that elements having a more significant
> effect on emulation may be worth filling first, and I'm not convinced
> the "all CRs, all MSRs, all SREGs, all GPRs" order matches that.

I'm not aware of any particular order; it's probably some combination of
"the order they were in the cpu_regs struct" and "the order in which I
found it useful to add them".  Given that the input will be more or less
random, I don't think the order in the struct will have too much of an
impact on the order in which AFL explores them.

If you have an alternative suggestion for an order you think would be
more logical I'm happy to rearrange the structure.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

next prev parent reply	other threads:[~2017-10-05 15:06 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-25 14:26 [PATCH v2 01/13] x86emul/fuzz: add rudimentary limit checking George Dunlap
2017-09-25 14:26 ` [PATCH v2 02/13] fuzz/x86_emulate: Actually use cpu_regs input George Dunlap
2017-10-04  8:21   ` Jan Beulich
2017-09-25 14:26 ` [PATCH v2 03/13] fuzz/x86_emulate: Clear errors after each iteration George Dunlap
2017-10-04  8:22   ` Jan Beulich
2017-09-25 14:26 ` [PATCH v2 04/13] fuzz/x86_emulate: Improve failure descriptions in x86_emulate harness George Dunlap
2017-10-04  8:22   ` Jan Beulich
2017-09-25 14:26 ` [PATCH v2 05/13] fuzz/x86_emulate: Implement input_read() and input_avail() George Dunlap
2017-10-04  8:22   ` Jan Beulich
2017-10-04 16:23     ` George Dunlap
2017-09-25 14:26 ` [PATCH v2 06/13] fuzz/x86_emulate: Rename the file containing the wrapper code George Dunlap
2017-10-04  8:23   ` Jan Beulich
2017-10-04 16:34     ` George Dunlap
2017-10-05  9:01       ` Jan Beulich
2017-10-05 13:50         ` George Dunlap
2017-09-25 14:26 ` [PATCH v2 07/13] fuzz/x86_emulate: Add 'afl-cov' target George Dunlap
2017-10-04  8:23   ` Jan Beulich
2017-10-04 16:48     ` George Dunlap
2017-10-05  9:06       ` Jan Beulich
2017-09-25 14:26 ` [PATCH v2 08/13] fuzz/x86_emulate: Take multiple test files for inputs George Dunlap
2017-10-04  8:24   ` Jan Beulich
2017-10-04 16:58     ` George Dunlap
2017-10-05  9:08       ` Jan Beulich
2017-09-25 14:26 ` [PATCH v2 09/13] fuzz/x86_emulate: Move all state into fuzz_state George Dunlap
2017-10-04  8:25   ` Jan Beulich
2017-10-04 16:51     ` George Dunlap
2017-09-25 14:26 ` [PATCH v2 10/13] fuzz/x86_emulate: Make input more compact George Dunlap
2017-10-04  8:26   ` Jan Beulich
2017-10-05 15:04     ` George Dunlap [this message]
2017-10-05 15:37       ` Jan Beulich
2017-09-25 14:26 ` [PATCH v2 11/13] fuzz/x86_emulate: Add --rerun option to try to track down instability George Dunlap
2017-10-04  8:27   ` Jan Beulich
2017-10-05 16:12     ` George Dunlap
2017-10-06  5:53       ` Jan Beulich
2017-09-25 14:26 ` [PATCH v2 12/13] fuzz/x86_emulate: Set and fuzz more CPU state George Dunlap
2017-10-04  8:28   ` Jan Beulich
2017-10-05 17:08     ` George Dunlap
2017-10-06  6:10       ` Jan Beulich
2017-10-06 10:53         ` George Dunlap
2017-10-06  9:57       ` Jan Beulich
2017-10-06 10:50         ` George Dunlap
2017-10-06 11:53           ` Jan Beulich
2017-10-06 11:56   ` Jan Beulich
2017-10-10 15:45     ` George Dunlap
2017-09-25 14:26 ` [PATCH v2 13/13] fuzz/x86_emulate: Add an option to limit the number of instructions executed George Dunlap
2017-10-04  8:28   ` Jan Beulich
2017-10-06 10:40     ` George Dunlap
2017-10-06 12:12       ` Jan Beulich
2017-10-06 13:02         ` George Dunlap
2017-10-06 15:21 ` [PATCH v2 01/13] x86emul/fuzz: add rudimentary limit checking George Dunlap
2017-10-06 15:54   ` Jan Beulich
2017-10-06 17:06     ` George Dunlap
2017-10-09  7:17       ` Jan Beulich
2017-10-09 12:54 ` Andrew Cooper
2017-10-09 13:26   ` Jan Beulich
2017-10-09 13:35     ` Andrew Cooper

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=94e4d279-f3db-5db4-4fea-f209d140b17c@citrix.com \
    --to=george.dunlap@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=ian.jackson@citrix.com \
    --cc=wei.liu2@citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).