From: Nicholas Mc Guire <der.herr@hofr.at>
To: Julia Lawall <julia.lawall@lip6.fr>
Cc: Nicholas Mc Guire <hofrat@osadl.org>,
Gilles Muller <Gilles.Muller@lip6.fr>,
Nicolas Palix <nicolas.palix@imag.fr>,
Michal Marek <mmarek@suse.cz>,
cocci@systeme.lip6.fr, linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC] Coccinelle: Check for return not matching function signature
Date: Fri, 8 May 2015 08:59:16 +0200 [thread overview]
Message-ID: <20150508065916.GA3886@opentech.at> (raw)
In-Reply-To: <alpine.DEB.2.10.1505052323270.2915@hadrien>
On Tue, 05 May 2015, Julia Lawall wrote:
>
>
> On Tue, 5 May 2015, Nicholas Mc Guire wrote:
>
> > On Tue, 05 May 2015, Julia Lawall wrote:
> >
> > > > +@match@
> > > > +identifier f,ret;
> > > > +position p;
> > > > +type T1,T2;
> > > > +@@
> > > > +
> > > > +T1 f(...) {
> > > > + T2 ret;
> > > > +<+...
> > > > +* return@p ret
> > > > +;
> > > > +...+>
> > > > +}
> > >
> > > Given the number of results, it may seem surprising, but I think that you
> > > are actually missing a lot of results. Becaue you require that ret be the
> > > first variable that is declared in the function. Also, you require that
> > > ret be an identifier. If you want to keep the restriction about being an
> > > identifier, you could put:
> > >
> > > @match exists@
> > > type T1,T2;
> > > idexpression T2 ret;
>
> I was think ing that you don't want expression in general, because for all
> contansts that will give you int.
>
> You can of course put return C; for constant metavariable C in the
> disjunction to avoid that possibility.
>
looks a lot better and removed a lot of false positives - the main problem
now is managing classification of the kernels type "system" - seems like there
are atleast 5 ways to describe every type (except for enum) and infinitely
many possible assignments for ssize_t ...
here a little summary of the outputs - might be motivation to put some
quite simple scanners into mainline to catch such issues.
comparison of return types in functions to the functions signature for
kernel 4.1-rc2, glibc-2.9 and busybox 1.2.2.1 - no particular reason for
that glibc/busybox versions they just happend to be on my harddrive.
This is using the version that was fixed by Julia Lawal
<snip>
@match exists@
type T1,T2;
idexpression T1 ok;
idexpression T2 ret;
identifier f;
constant C;
position p;
@@
T1 f(...) {
<+...
(
return ok;
|
return C;
|
return@p ret;
)
...+>
}
<snip>
component Nr funcs != type %
kernel : 374600 10727 2.85
glibc : 9184 268 2.92
busybox : 3645 43 1.18
kernel glibc busybox criticality
wrong ? : 8 4 0 not sure
sign missmatch : 2279 30 9 critical
down sized : 435 49 4 critical
up sized : 910 20 3 ugly
declaration missmatch : 7095 165 27 wishlist
wrong: seems plain wrong like float assigned to int (did not check details yet)
sign missmatch: assigning signed types to unsiggned or vice versa
down sized: some form of possible truncation like u64 being assigned u32
up sized: non-critical as it was correct type and it fit
declaration missmatch: means that they were named differently s32/int
Some limitations:
The glibc runs produced some error cases (spatch level) that were ignored
for now e.g.:
<snip>
EXN:Failure("match: node 194: return ...[1,2,90] in rec_dirsearch reachable
by inconsistent control-flow paths")
<snip>
The kernel numbers are a bit inaccurate because not all types can be
checked reliably - e.g. when they are config dependent also due to the
enourmous type-"system" in the kernel not all assignments are sure
but that does not change the overall result.
I did not yet manage to automate the classification - just too many types
where its hard to say due to config dependencies - probably need to put
thos into a "don't know" category. Also all assignments of pointers of
any type on one side to void * on the other side was counted as legitimate.
Some results were mangled probably because of inacurate filtering resulting
in things like "_EXTERN_INLINE != mp_limb_t" just dropped those for now.
Conclusion:
* atleast the sign missmatch cases (2279) and potentially truncating
assignments (435) are problematic.
* the scripts needs a lot more cleanup in the classification of the reported
types to be useful
* probably not realistic to cleanup all currently present tupe mismatches
but scanning continuously and reporting before it goes into mainline or
integrating such a check in the routine submission process seems
worthwhile
Once the classifier is working properly I'll post the next version.
thx!
hofrat
next prev parent reply other threads:[~2015-05-08 6:59 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-05-05 10:12 [PATCH RFC] Coccinelle: Check for return not matching function signature Nicholas Mc Guire
2015-05-05 12:32 ` SF Markus Elfring
2015-05-05 13:04 ` Nicholas Mc Guire
2015-05-05 13:29 ` [Cocci] " Julia Lawall
2015-05-05 13:45 ` SF Markus Elfring
2015-05-05 14:40 ` Julia Lawall
2015-05-05 16:00 ` Nicholas Mc Guire
2015-05-05 21:24 ` Julia Lawall
2015-05-08 6:59 ` Nicholas Mc Guire [this message]
2015-05-05 16:25 ` SF Markus Elfring
2015-05-05 21:46 ` Julia Lawall
2015-05-06 7:15 ` SF Markus Elfring
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150508065916.GA3886@opentech.at \
--to=der.herr@hofr.at \
--cc=Gilles.Muller@lip6.fr \
--cc=cocci@systeme.lip6.fr \
--cc=hofrat@osadl.org \
--cc=julia.lawall@lip6.fr \
--cc=linux-kernel@vger.kernel.org \
--cc=mmarek@suse.cz \
--cc=nicolas.palix@imag.fr \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox