From: Dan Carpenter <dan.carpenter@oracle.com>
To: Andrew Murray <andrew.murray@arm.com>
Cc: Catalin.Marinas@arm.com, smatch@vger.kernel.org
Subject: Re: [RFC PATCH 5/7] kernel_user_data: track parameter __untagged annotations
Date: Thu, 5 Dec 2019 18:04:43 +0300 [thread overview]
Message-ID: <20191205150443.GR1787@kadam> (raw)
In-Reply-To: <20191007155543.GF42880@e119886-lin.cambridge.arm.com>
On Mon, Oct 07, 2019 at 04:55:44PM +0100, Andrew Murray wrote:
> On Mon, Oct 07, 2019 at 04:35:43PM +0100, Andrew Murray wrote:
> > The kernel provides a number of annotations that can be used by smatch
> > for static analysis, for example __user and __kernel - the kernel relies
> > on the GCC attribute 'address_space(x)' to implement these annotations.
> > By adding a new annotation named __untagged with address space 5 we can
> > use this annotation to indicate function parameters which should not
> > be treated as tagged addresses.
> >
> > Let's give Smatch an understanding of this annotation, when used by
> > function parameters, by adding a '[u]' tag to the end of their user value
> > ranges. We ensure that upon merges of range lists states, we only preserve
> > the tag if both states contain the tag.
> >
> > In other words, if two functions A and B both call function C (let's assume
> > these functions all have a single parameter) - then the range list for
> > function C's parameter will only contain the tag if the parameters for both
> > A and B are also tagged. Therefore we know that if if we make a comparison
> > between a tagged and untagged address in C then we know it doesn't need
> > to flag a detection as the values concerned have all come from functions
> > that are marked as __untagged.
> >
> > If only function A was tagged, then we could raise a detection in C - and
> > rely on parameter tracing to walk up both callers of C to build call trees
> > and exclude anything and its parents that are tagged.
> >
> > In practice we will always raise a detection and let the smdb.py script
> > figure out which call trees to display.
>
> This approach works well, however it sometimes falls over due to gaps in
> the Smatch flow-tracking/parameter-tracking. E.g.
>
> void func1(unsigned long __untagged addr, unsigned long len)
> {
> unsigned long tmp = 1;
> tmp += addr;
> tmp = func2(tmp, len, 4);
> find_vma(NULL, tmp);
> }
>
> The parameter addr is annotated as untagged, therefore its range list may
> look something like '0-u64max[u]'.
Ideally, it would never be 0-u64max and [u] at the same time, right?
The [u] is already a work around because Smatch is parsing it badly.
>
> The first line of code results in a range list of tmp to be '1';
>
> The second line of code assigns addr to tmp, resulting in tmp's range list
> changing to '1-u64max[u]'
>
> The third line of code passes tmp to a function and assigns the result back
> to tmp. Just like any other function Smatch keeps track of all the callers
> to func2, and when it parses the code in func2 it will consider 'tmp' to
> have a range list of all those callers combined - if there is a caller of
> func2 anywhere in the kernel that doesn't provide tmp with the '[u]' tag
> then the tag will be dropped. This results in the return value of func2 not
> containing a tag and resulting in 'tmp' inside func1 loosing it's '[u]' tag.
Yeah. Gotcha. We don't know if the return value is tagged untagged,
but maybe we could store something to say that it inherits the tag from
param $0.
> Sometimes Smatch is clever and can describe the range of values returned by a
> function in terms relative to the input arguments, in that case the tag would
> be preserved.
Yeah.
>
> This means that find_vma, if it detects a tagged pointer will flag it seeing as
> 'tmp' does not contain the '[u]' tag. Fortunately, in this case we rely on
> smdb.py to walk the tree from the detection, therefore it will see that func1
> calls find_vma, and func1 'addr' is tagged and thus it can suppress the
> detection. Unfortunately this relies on the ability of Smatch to detect that
> 'tmp' as passed to find_vma is the 'addr' in the function arguments - this
> parameter tracking doesn't always work (for the same reasons as described). It is
> for these reasons that using __untagged annotations is hit or miss.
I feel like my temptation is always to do a hack around. Like we could
say func2() never calls get_user() or copy_from_user(), it just takes
user data from the parameters and returns it. Smatch already
differentiates these situations with USER_DATA and USER_DATA_SET so we
could say if a function returns USER_DATA set and only passed untagged
to the function then the return is also untagged.
Would that work?
regards,
dan carpenter
next prev parent reply other threads:[~2019-12-05 15:04 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-07 15:35 [RFC PATCH 0/7] Tagged Pointer Detection Andrew Murray
2019-10-07 15:35 ` [RFC PATCH 1/7] build: Add '-lm' build flag Andrew Murray
2019-10-07 15:35 ` [RFC PATCH 2/7] smdb.py: remove undocumented test command Andrew Murray
2019-10-07 15:35 ` [RFC PATCH 3/7] arm64: add check for comparison against tagged address Andrew Murray
2019-10-07 15:49 ` Andrew Murray
2019-12-04 15:41 ` Andrew Murray
2019-12-05 13:27 ` Dan Carpenter
2019-12-05 14:28 ` Dan Carpenter
2019-12-05 14:35 ` Dan Carpenter
2019-12-05 17:21 ` Luc Van Oostenryck
2019-12-05 17:24 ` Luc Van Oostenryck
2019-10-07 15:35 ` [RFC PATCH 4/7] smdb.py: add find_tagged and parse_warns_tagged commands Andrew Murray
2019-10-07 15:52 ` Andrew Murray
2019-10-07 15:35 ` [RFC PATCH 5/7] kernel_user_data: track parameter __untagged annotations Andrew Murray
2019-10-07 15:55 ` Andrew Murray
2019-12-05 15:04 ` Dan Carpenter [this message]
2019-10-08 8:24 ` Dan Carpenter
2019-10-08 8:41 ` Andrew Murray
2019-10-07 15:35 ` [RFC PATCH 6/7] smdb.py: filter out __untagged from find_tagged results Andrew Murray
2019-10-07 15:35 ` [RFC PATCH 7/7] Documentation: add guide for tagged addresses Andrew Murray
2019-10-08 8:58 ` [RFC PATCH 0/7] Tagged Pointer Detection Dan Carpenter
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191205150443.GR1787@kadam \
--to=dan.carpenter@oracle.com \
--cc=Catalin.Marinas@arm.com \
--cc=andrew.murray@arm.com \
--cc=smatch@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox