All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dan Carpenter <dan.carpenter@oracle.com>
To: Andrew Murray <andrew.murray@arm.com>
Cc: Catalin.Marinas@arm.com, smatch@vger.kernel.org
Subject: Re: [RFC PATCH 5/7] kernel_user_data: track parameter __untagged annotations
Date: Thu, 5 Dec 2019 18:04:43 +0300	[thread overview]
Message-ID: <20191205150443.GR1787@kadam> (raw)
In-Reply-To: <20191007155543.GF42880@e119886-lin.cambridge.arm.com>

On Mon, Oct 07, 2019 at 04:55:44PM +0100, Andrew Murray wrote:
> On Mon, Oct 07, 2019 at 04:35:43PM +0100, Andrew Murray wrote:
> > The kernel provides a number of annotations that can be used by smatch
> > for static analysis, for example __user and __kernel - the kernel relies
> > on the GCC attribute 'address_space(x)' to implement these annotations.
> > By adding a new annotation named __untagged with address space 5 we can
> > use this annotation to indicate function parameters which should not
> > be treated as tagged addresses.
> > 
> > Let's give Smatch an understanding of this annotation, when used by
> > function parameters, by adding a '[u]' tag to the end of their user value
> > ranges. We ensure that upon merges of range lists states, we only preserve
> > the tag if both states contain the tag.
> > 
> > In other words, if two functions A and B both call function C (let's assume
> > these functions all have a single parameter) - then the range list for
> > function C's parameter will only contain the tag if the parameters for both
> > A and B are also tagged. Therefore we know that if if we make a comparison
> > between a tagged and untagged address in C then we know it doesn't need
> > to flag a detection as the values concerned have all come from functions
> > that are marked as __untagged.
> > 
> > If only function A was tagged, then we could raise a detection in C - and
> > rely on parameter tracing to walk up both callers of C to build call trees
> > and exclude anything and its parents that are tagged.
> > 
> > In practice we will always raise a detection and let the smdb.py script
> > figure out which call trees to display.
> 
> This approach works well, however it sometimes falls over due to gaps in
> the Smatch flow-tracking/parameter-tracking. E.g.
> 
> void func1(unsigned long __untagged addr, unsigned long len)
> {
>     unsigned long tmp = 1;
>     tmp += addr;
>     tmp = func2(tmp, len, 4);
>     find_vma(NULL, tmp);
> }
> 
> The parameter addr is annotated as untagged, therefore its range list may
> look something like '0-u64max[u]'.

Ideally, it would never be 0-u64max and [u] at the same time, right?
The [u] is already a work around because Smatch is parsing it badly.

> 
> The first line of code results in a range list of tmp to be '1';
> 
> The second line of code assigns addr to tmp, resulting in tmp's range list
> changing to '1-u64max[u]'
> 
> The third line of code passes tmp to a function and assigns the result back
> to tmp. Just like any other function Smatch keeps track of all the callers
> to func2, and when it parses the code in func2 it will consider 'tmp' to
> have a range list of all those callers combined - if there is a caller of
> func2 anywhere in the kernel that doesn't provide tmp with the '[u]' tag
> then the tag will be dropped. This results in the return value of func2 not
> containing a tag and resulting in 'tmp' inside func1 loosing it's '[u]' tag.

Yeah.  Gotcha.  We don't know if the return value is tagged untagged,
but maybe we could store something to say that it inherits the tag from
param $0.

> Sometimes Smatch is clever and can describe the range of values returned by a
> function in terms relative to the input arguments, in that case the tag would
> be preserved.

Yeah.

> 
> This means that find_vma, if it detects a tagged pointer will flag it seeing as
> 'tmp' does not contain the '[u]' tag. Fortunately, in this case we rely on
> smdb.py to walk the tree from the detection, therefore it will see that func1
> calls find_vma, and func1 'addr' is tagged and thus it can suppress the
> detection. Unfortunately this relies on the ability of Smatch to detect that
> 'tmp' as passed to find_vma is the 'addr' in the function arguments - this
> parameter tracking doesn't always work (for the same reasons as described). It is
> for these reasons that using __untagged annotations is hit or miss.

I feel like my temptation is always to do a hack around.  Like we could
say func2() never calls get_user() or copy_from_user(), it just takes
user data from the parameters and returns it.  Smatch already
differentiates these situations with USER_DATA and USER_DATA_SET so we
could say if a function returns USER_DATA set and only passed untagged
to the function then the return is also untagged.

Would that work?

regards,
dan carpenter

  reply	other threads:[~2019-12-05 15:04 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-07 15:35 [RFC PATCH 0/7] Tagged Pointer Detection Andrew Murray
2019-10-07 15:35 ` [RFC PATCH 1/7] build: Add '-lm' build flag Andrew Murray
2019-10-07 15:35 ` [RFC PATCH 2/7] smdb.py: remove undocumented test command Andrew Murray
2019-10-07 15:35 ` [RFC PATCH 3/7] arm64: add check for comparison against tagged address Andrew Murray
2019-10-07 15:49   ` Andrew Murray
2019-12-04 15:41     ` Andrew Murray
2019-12-05 13:27     ` Dan Carpenter
2019-12-05 14:28       ` Dan Carpenter
2019-12-05 14:35         ` Dan Carpenter
2019-12-05 17:21           ` Luc Van Oostenryck
2019-12-05 17:24         ` Luc Van Oostenryck
2019-10-07 15:35 ` [RFC PATCH 4/7] smdb.py: add find_tagged and parse_warns_tagged commands Andrew Murray
2019-10-07 15:52   ` Andrew Murray
2019-10-07 15:35 ` [RFC PATCH 5/7] kernel_user_data: track parameter __untagged annotations Andrew Murray
2019-10-07 15:55   ` Andrew Murray
2019-12-05 15:04     ` Dan Carpenter [this message]
2019-10-08  8:24   ` Dan Carpenter
2019-10-08  8:41     ` Andrew Murray
2019-10-07 15:35 ` [RFC PATCH 6/7] smdb.py: filter out __untagged from find_tagged results Andrew Murray
2019-10-07 15:35 ` [RFC PATCH 7/7] Documentation: add guide for tagged addresses Andrew Murray
2019-10-08  8:58 ` [RFC PATCH 0/7] Tagged Pointer Detection Dan Carpenter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191205150443.GR1787@kadam \
    --to=dan.carpenter@oracle.com \
    --cc=Catalin.Marinas@arm.com \
    --cc=andrew.murray@arm.com \
    --cc=smatch@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.