public inbox for smatch@vger.kernel.org
 help / color / mirror / Atom feed
From: Dan Carpenter <dan.carpenter@oracle.com>
To: Andrew Murray <andrew.murray@arm.com>
Cc: Catalin.Marinas@arm.com, smatch@vger.kernel.org
Subject: Re: [RFC PATCH 5/7] kernel_user_data: track parameter __untagged annotations
Date: Thu, 5 Dec 2019 18:04:43 +0300	[thread overview]
Message-ID: <20191205150443.GR1787@kadam> (raw)
In-Reply-To: <20191007155543.GF42880@e119886-lin.cambridge.arm.com>

On Mon, Oct 07, 2019 at 04:55:44PM +0100, Andrew Murray wrote:
> On Mon, Oct 07, 2019 at 04:35:43PM +0100, Andrew Murray wrote:
> > The kernel provides a number of annotations that can be used by smatch
> > for static analysis, for example __user and __kernel - the kernel relies
> > on the GCC attribute 'address_space(x)' to implement these annotations.
> > By adding a new annotation named __untagged with address space 5 we can
> > use this annotation to indicate function parameters which should not
> > be treated as tagged addresses.
> > 
> > Let's give Smatch an understanding of this annotation, when used by
> > function parameters, by adding a '[u]' tag to the end of their user value
> > ranges. We ensure that upon merges of range lists states, we only preserve
> > the tag if both states contain the tag.
> > 
> > In other words, if two functions A and B both call function C (let's assume
> > these functions all have a single parameter) - then the range list for
> > function C's parameter will only contain the tag if the parameters for both
> > A and B are also tagged. Therefore we know that if if we make a comparison
> > between a tagged and untagged address in C then we know it doesn't need
> > to flag a detection as the values concerned have all come from functions
> > that are marked as __untagged.
> > 
> > If only function A was tagged, then we could raise a detection in C - and
> > rely on parameter tracing to walk up both callers of C to build call trees
> > and exclude anything and its parents that are tagged.
> > 
> > In practice we will always raise a detection and let the smdb.py script
> > figure out which call trees to display.
> 
> This approach works well, however it sometimes falls over due to gaps in
> the Smatch flow-tracking/parameter-tracking. E.g.
> 
> void func1(unsigned long __untagged addr, unsigned long len)
> {
>     unsigned long tmp = 1;
>     tmp += addr;
>     tmp = func2(tmp, len, 4);
>     find_vma(NULL, tmp);
> }
> 
> The parameter addr is annotated as untagged, therefore its range list may
> look something like '0-u64max[u]'.

Ideally, it would never be 0-u64max and [u] at the same time, right?
The [u] is already a work around because Smatch is parsing it badly.

> 
> The first line of code results in a range list of tmp to be '1';
> 
> The second line of code assigns addr to tmp, resulting in tmp's range list
> changing to '1-u64max[u]'
> 
> The third line of code passes tmp to a function and assigns the result back
> to tmp. Just like any other function Smatch keeps track of all the callers
> to func2, and when it parses the code in func2 it will consider 'tmp' to
> have a range list of all those callers combined - if there is a caller of
> func2 anywhere in the kernel that doesn't provide tmp with the '[u]' tag
> then the tag will be dropped. This results in the return value of func2 not
> containing a tag and resulting in 'tmp' inside func1 loosing it's '[u]' tag.

Yeah.  Gotcha.  We don't know if the return value is tagged untagged,
but maybe we could store something to say that it inherits the tag from
param $0.

> Sometimes Smatch is clever and can describe the range of values returned by a
> function in terms relative to the input arguments, in that case the tag would
> be preserved.

Yeah.

> 
> This means that find_vma, if it detects a tagged pointer will flag it seeing as
> 'tmp' does not contain the '[u]' tag. Fortunately, in this case we rely on
> smdb.py to walk the tree from the detection, therefore it will see that func1
> calls find_vma, and func1 'addr' is tagged and thus it can suppress the
> detection. Unfortunately this relies on the ability of Smatch to detect that
> 'tmp' as passed to find_vma is the 'addr' in the function arguments - this
> parameter tracking doesn't always work (for the same reasons as described). It is
> for these reasons that using __untagged annotations is hit or miss.

I feel like my temptation is always to do a hack around.  Like we could
say func2() never calls get_user() or copy_from_user(), it just takes
user data from the parameters and returns it.  Smatch already
differentiates these situations with USER_DATA and USER_DATA_SET so we
could say if a function returns USER_DATA set and only passed untagged
to the function then the return is also untagged.

Would that work?

regards,
dan carpenter

  reply	other threads:[~2019-12-05 15:04 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-07 15:35 [RFC PATCH 0/7] Tagged Pointer Detection Andrew Murray
2019-10-07 15:35 ` [RFC PATCH 1/7] build: Add '-lm' build flag Andrew Murray
2019-10-07 15:35 ` [RFC PATCH 2/7] smdb.py: remove undocumented test command Andrew Murray
2019-10-07 15:35 ` [RFC PATCH 3/7] arm64: add check for comparison against tagged address Andrew Murray
2019-10-07 15:49   ` Andrew Murray
2019-12-04 15:41     ` Andrew Murray
2019-12-05 13:27     ` Dan Carpenter
2019-12-05 14:28       ` Dan Carpenter
2019-12-05 14:35         ` Dan Carpenter
2019-12-05 17:21           ` Luc Van Oostenryck
2019-12-05 17:24         ` Luc Van Oostenryck
2019-10-07 15:35 ` [RFC PATCH 4/7] smdb.py: add find_tagged and parse_warns_tagged commands Andrew Murray
2019-10-07 15:52   ` Andrew Murray
2019-10-07 15:35 ` [RFC PATCH 5/7] kernel_user_data: track parameter __untagged annotations Andrew Murray
2019-10-07 15:55   ` Andrew Murray
2019-12-05 15:04     ` Dan Carpenter [this message]
2019-10-08  8:24   ` Dan Carpenter
2019-10-08  8:41     ` Andrew Murray
2019-10-07 15:35 ` [RFC PATCH 6/7] smdb.py: filter out __untagged from find_tagged results Andrew Murray
2019-10-07 15:35 ` [RFC PATCH 7/7] Documentation: add guide for tagged addresses Andrew Murray
2019-10-08  8:58 ` [RFC PATCH 0/7] Tagged Pointer Detection Dan Carpenter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191205150443.GR1787@kadam \
    --to=dan.carpenter@oracle.com \
    --cc=Catalin.Marinas@arm.com \
    --cc=andrew.murray@arm.com \
    --cc=smatch@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox