Re: [RFC] Have insn decoder functions return success/failure

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Masami Hiramatsu <mhiramat@kernel.org>
To: Borislav Petkov <bp@alien8.de>
Cc: x86-ml <x86@kernel.org>, Joerg Roedel <jroedel@suse.de>,
	lkml <linux-kernel@vger.kernel.org>
Subject: Re: [RFC] Have insn decoder functions return success/failure
Date: Thu, 22 Oct 2020 16:31:00 +0900	[thread overview]
Message-ID: <20201022163100.1139b28220da4eafb5e70fcc@kernel.org> (raw)
In-Reply-To: <20201021164558.GB4050@zn.tnic>

On Wed, 21 Oct 2020 18:45:58 +0200
Borislav Petkov <bp@alien8.de> wrote:

> On Wed, Oct 21, 2020 at 11:26:13PM +0900, Masami Hiramatsu wrote:
> > Hmm, I meant someone might think it can be used for filtering the
> > instruction something like,
> > 
> > insn_init(insn, buf, buflen, 1);
> > ret = insn_get_length(insn);
> > if (!ret) {
> > 	/* OK, this is safe */
> > 	patch_text(buf, trampoline);
> > }
> > 
> > No, we need another validator for such usage.
> 
> Well, I think calling insn_get_length() should give you only the
> *length* of the insn and nothing else - I mean that is what the function
> is called. And I believe current use is wrong.
> 
> Examples:
> 
> arch/x86/kernel/kprobes/core.c:
>                 insn_get_length(&insn);
> 
>                 /*
>                  * Another debugging subsystem might insert this breakpoint.
>                  * In that case, we can't recover it.
>                  */
>                 if (insn.opcode.bytes[0] == INT3_INSN_OPCODE)
> 
> So this has called get_length but it is far from clear that after that
> call, the opcode bytes in insn.opcode.bytes are there.

No, insn_get_length() implies it decodes whole of the instruction.
(yeah, we need an alias of that, something like insn_get_complete())

> 
> What that should do instead IMO is this:
> 
> 	insn_get_opcode(&insn);
> 

No, you've cut the last lines of that loop.

                /*
                 * Another debugging subsystem might insert this breakpoint.
                 * In that case, we can't recover it.
                 */
                if (insn.opcode.bytes[0] == INT3_INSN_OPCODE)
                        return 0;
                addr += insn.length;
        }

I need insn.length too. Of course we can split it into 2 calls. But
as I said, since the insn_get_length() implies it decodes all other
parts, I just called it once.

> and *then* the return value can tell you whether the opcode bytes were
> parsed properly or not. See what I mean?

I agreed to check the return value of insn_get_length() at that point
only for checking whether the instruction parsing was failed or not.

> 
> That's even documented that way:
> 
> /**
>  * insn_get_opcode - collect opcode(s)
>  * @insn:       &struct insn containing instruction
>  *
>  * Populates @insn->opcode, updates @insn->next_byte to point past the
>  * opcode byte(s), and set @insn->attr (except for groups).
> 
> 
> Similarly here:
> 
> static enum es_result vc_decode_insn(struct es_em_ctxt *ctxt)
> 
> 	...
> 
>         insn_get_length(&ctxt->insn);
> 
>         ret = ctxt->insn.immediate.got ? ES_OK : ES_DECODE_FAILED;
> 
> that thing wants to decode the insn but it is looking whether it parsed
> an *immediate*?!

Hm, it is better to call insn_get_immediate() if it doesn't use length later.

> 
> I'm not saying this is necessarily wrong - just the naming nomenclature
> and the API should be properly defined when you call a function of the
> insn decoder, what you are guaranteed to get and what a caller can
> assume after that. And then the proper functions be called.

Would you mean we'd better have something like insn_get_until_immediate() ? 

Since the x86 instruction is CISC, we can not decode intermediate
parts. The APIs follows that. If you are confused, I'm sorry about that.

> 
> In the kprobes/core.c example above, it does a little further:
> 
> 	ddr += insn.length;	
> 
> which, IMO, it should be either preceeded by a call to insn_get_length()
> - yes, this time we want the insn length or, the code should call a
> decoding function which gives you *both* length* and opcode bytes.
> insn_decode_insn() or whatever. And *that* should be documented in that
> function's kernel-doc section. And so on...

Actually, there is a historical reason too. INT3 check was added afterwards.
At first, I just calculated the instruction length in the loop...

Thank you,

> 
> Does that make more sense?
> 
> Thx.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette


-- 
Masami Hiramatsu <mhiramat@kernel.org>

next prev parent reply	other threads:[~2020-10-22  7:31 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-20 12:02 [RFC] Have insn decoder functions return success/failure Borislav Petkov
2020-10-20 14:27 ` Masami Hiramatsu
2020-10-20 14:37   ` Borislav Petkov
2020-10-21  0:50     ` Masami Hiramatsu
2020-10-21  9:27       ` Borislav Petkov
2020-10-21 14:26         ` Masami Hiramatsu
2020-10-21 16:45           ` Borislav Petkov
2020-10-22  7:31             ` Masami Hiramatsu [this message]
2020-10-22  9:30               ` Borislav Petkov
2020-10-22 13:21                 ` Masami Hiramatsu
2020-10-22 17:58                   ` Andy Lutomirski
2020-10-23  9:20                     ` Borislav Petkov
2020-10-23  9:28                     ` Masami Hiramatsu
2020-10-23  9:32                       ` Borislav Petkov
2020-10-23 10:47                         ` Masami Hiramatsu
2020-10-23 23:27                           ` Borislav Petkov
2020-10-24  0:12                             ` Andy Lutomirski
2020-10-24  7:21                               ` Masami Hiramatsu
2020-10-24  8:23                               ` Borislav Petkov
2020-10-24 16:10                                 ` Andy Lutomirski
2020-10-27 13:42                                   ` Borislav Petkov
2020-10-28 11:36                                     ` Masami Hiramatsu
2020-10-24  7:13                             ` Masami Hiramatsu
2020-10-24  8:24                               ` Borislav Petkov
2020-10-29 12:42                             ` Borislav Petkov
2020-10-30  1:24                               ` Masami Hiramatsu
2020-10-30 13:07                                 ` Borislav Petkov
2020-10-23  9:17                   ` Borislav Petkov
2020-10-22  8:04             ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201022163100.1139b28220da4eafb5e70fcc@kernel.org \
    --to=mhiramat@kernel.org \
    --cc=bp@alien8.de \
    --cc=jroedel@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox