From: Alexander Graf <graf@amazon.com>
To: <samcacc@amazon.com>, Sam Caccavale <samcacc@amazon.de>
Cc: <samcaccavale@gmail.com>, <nmanthey@amazon.de>,
<wipawel@amazon.de>, <dwmw@amazon.co.uk>, <mpohlack@amazon.de>,
<graf@amazon.de>, <karahmed@amazon.de>,
<andrew.cooper3@citrix.com>, <JBeulich@suse.com>,
<pbonzini@redhat.com>, <rkrcmar@redhat.com>, <tglx@linutronix.de>,
<mingo@redhat.com>, <bp@alien8.de>, <hpa@zytor.com>,
<paullangton4@gmail.com>, <anirudhkaushik@google.com>,
<x86@kernel.org>, <kvm@vger.kernel.org>,
<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/3] Emulate simple x86 instructions in userspace
Date: Fri, 21 Jun 2019 15:28:34 +0200
Message-ID: <c5fbed80-5933-eca3-001e-0e2aaccfcd1d@amazon.com>
In-Reply-To: <7e0188fa-351f-157b-2815-ab19222f44b4@amazon.com>
On 12.06.19 17:19, samcacc@amazon.com wrote:
> On 5/31/19 10:38 AM, Alexander Graf wrote:
>> On 21.05.19 17:39, Sam Caccavale wrote:
>>
>>> +static void dump_state_after(const char *desc, struct state *state)
>>> +{
>>> + debug(" -- State after %s --\n", desc);
>>> + debug("mode: %s\n", x86emul_mode_string[state->ctxt.mode]);
>>> + debug(" cr0: %lx\n", state->vcpu.cr[0]);
>>> + debug(" cr3: %lx\n", state->vcpu.cr[3]);
>>> + debug(" cr4: %lx\n", state->vcpu.cr[4]);
>>> +
>>> + debug("Decode _eip: %lu\n", state->ctxt._eip);
>>> + debug("Emulate eip: %lu\n", state->ctxt.eip);
>>> +
>>> + debug("\n");
>>> }
>>> int step_emulator(struct state *state)
>>> {
>>> - return 0;
>>> + int rc, prev_eip = state->ctxt.eip;
>>> + int decode_size = state->data_available - decode_offset;
>>> +
>>> + if (decode_size < 15) {
>>> + rc = x86_decode_insn(&state->ctxt, &state->data[decode_offset],
>>> + decode_size);
>>> + } else {
>>> + rc = x86_decode_insn(&state->ctxt, NULL, 0);
>>
>> Isn't this going to fetch instructions from data as well? Why do we need
>> the < 15 special case at all?
>>
> I've changed the method of acquiring data in v2, but the 15-byte limit
> is still relevant. If x86_decode_insn is called with a NULL pointer and
> instruction size 0, the bytes are fetched via the emulator_ops.fetch
> function. This would be nice, but there is no way of limiting how many
> bytes it will try to fetch, and it usually grabs 15 since that is the
> maximum length of an x86 instruction (as of yet?). When there are fewer
> than 15 bytes left, limiting the fetch size to the remaining bytes is
> important.

You want to at least add a comment here, detailing where the magic 15
comes from and that you want to exercise the normal prefetch path while
still allowing the buffer to shrink below 15 bytes :).
Maybe move MAX_INST_SIZE from svm.c into a .h file and reuse that while
at it.
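
To illustrate the kind of comment and named constant meant here, a rough
sketch follows. It is not taken from the patch: FUZZ_MAX_INSN_SIZE,
decode_bytes_remaining() and use_prefetch_path() are hypothetical names
made up for this example, and the real code would presumably reuse
MAX_INST_SIZE from a shared header instead.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * An x86 instruction is at most 15 bytes long, so the decoder's prefetch
 * path may try to read up to that many bytes at once.
 * Hypothetical local name; stands in for a shared MAX_INST_SIZE.
 */
#define FUZZ_MAX_INSN_SIZE 15

/* Number of unconsumed input bytes; 0 if the offset is at or past the end. */
static inline size_t decode_bytes_remaining(size_t data_available,
					    size_t decode_offset)
{
	if (decode_offset >= data_available)
		return 0;
	return data_available - decode_offset;
}

/*
 * True if enough input is left to let the decoder fetch on its own (the
 * normal prefetch path). False means the caller should pass the remaining
 * bytes explicitly so the decoder cannot read past the end of the buffer.
 */
static inline bool use_prefetch_path(size_t data_available,
				     size_t decode_offset)
{
	return decode_bytes_remaining(data_available, decode_offset) >=
	       FUZZ_MAX_INSN_SIZE;
}
```

With helpers along these lines, step_emulator() would call
x86_decode_insn(&state->ctxt, NULL, 0) when use_prefetch_path() is true,
and pass the remaining bytes and their count otherwise.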
[...]
>>> diff --git a/tools/fuzz/x86_instruction_emulation/scripts/bin_fuzz b/tools/fuzz/x86_instruction_emulation/scripts/bin_fuzz
>>> new file mode 100755
>>> index 000000000000..e570b17f9404
>>> --- /dev/null
>>> +++ b/tools/fuzz/x86_instruction_emulation/scripts/bin_fuzz
>>> @@ -0,0 +1,23 @@
>>> +#!/bin/bash
>>> +# SPDX-License-Identifier: GPL-2.0+
>>> +# This runs the afl-harness at $1, $2 times (or 100)
>>> +# It runs uniq and sorts the output to give an idea of what is causing the
>>> +# most crashes. Useful for deciding what to implement next.
>>> +
>>> +if [ "$#" -lt 1 ]; then
>>> + echo "Usage: './bin_fuzz path_to_afl-harness [number of times to run]"
>>> + exit
>>> +fi
>>> +
>>> +mkdir -p fuzz
>>> +rm -f fuzz/*.in fuzz/*.out
>>> +
>>> +for i in $(seq 1 1 ${2:-100})
>>> +do
>>> + {
>>> + head -c 500 /dev/urandom | tee fuzz/$i.in | ./$1
>>> + } > fuzz/$i.out 2>&1
>>> +
>>> +done
>>> +
>>> +find ./fuzz -name '*.out' -exec tail -1 {} \; | sed 's/.* Segmen/Segman/' | sed -r 's/^(\s[0-9a-f]{2})+$/misc instruction output/' | sort | uniq -c | sort -rn
>>
>> What is that Segman thing about?
>>
> This was for binning crashes-- check `tools/fuzz/x86ie/scripts/bin.sh`
> in v2 for the updated version. Basically, it checks whether a
> segmentation fault has happened, and if so, launches a gdb session to
> see whether it was caused by an unimplemented x86_emulator_op. This is
> useful in development for prioritizing the unimplemented features which
> are causing the most fake crashes.

I can see why you want to combine them, but I don't understand where
"Segman" comes from. Where is there a man here?

Alex