From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from userp2120.oracle.com ([156.151.31.85]:47974 "EHLO
	userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S229684AbhEJHYu (ORCPT );
	Mon, 10 May 2021 03:24:50 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com;
	h=date : from : to : cc : subject : message-id : references :
	mime-version : content-type : in-reply-to; s=corp-2020-01-29;
	bh=tUyhrmOz5vtkmY4fUSV2Ig3dosOBfaDenjxEr+3O6WQ=;
	b=E9N4BjMQIFyfUE1YZC5mI+67+qxNGtYB09IvW65/JFk+W5h8kr/kJgES4bPC4vMQNwaa
	ei3WYaxiyp5SlL9AYbhDVlam4k/QzEr4cs3x7qn3mm+6k8JEqzjLiizf9TDIdom8xQod
	Hn+ob0XE4s0k0kZ21DzTtgKSmWhX6TJnDnlamUL5XInzYQdq3HjB3H6Ljq8q317mZ1Am
	TAh9bVNjGLsQ3rXcNTT2ma+YIA8A2SXuO/0dkzxIGj847cSjZuooPxpNUVOjFupGZhst
	F9MeCm8UEp8NqoRbuSzrilc0Ql/te+ATA7uEgl1ainbBr9tLFey1+fo8swKQ+iTLRs1A
	Lg==
Date: Mon, 10 May 2021 10:20:22 +0300
From: Dan Carpenter
Subject: Re: Smatch mailing list archives
Message-ID: <20210510072022.GS1955@kadam>
References: <20210508051626.GI1955@kadam>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
List-ID:
To: "Reshetova, Elena"
Cc: "smatch@vger.kernel.org" , "Kleen, Andi"

On Mon, May 10, 2021 at 06:17:27AM +0000, Reshetova, Elena wrote:
> > On Fri, May 07, 2021 at 02:22:37PM +0000, Reshetova, Elena wrote:
> > > Hi,
> > >
> > > I have been working for a while now on a new smatch pattern, but
> > > would really appreciate additional information points such as past
> > > email discussions, etc.
> > >
> > > So I am wondering if there is a way to browse through
> > > the archives of this mailing list in order to try to find the
> > > information I need?
> >
> > Sorry, I don't think it's archived anywhere.  There isn't a lot of
> > traffic on the list.  About three times a year someone reports that
> > Smatch is crashing for them.
> >
> > I'm always happy to answer questions if there is any way I can help?
>
> Thank you Dan!
> I am pretty new to smatch, so I was hoping to browse through the
> existing mails to see if my simple questions are already answered,
> but here is my current issue.
>
> What is the best way to create identifiers for the findings that a
> given smatch pattern produces in the kernel? Let's say I have a new
> pattern that is able to find different problematic places and report
> them in the usual smatch way: errors and warnings with file name,
> line number, function name, etc.
> Now for our pattern, in order to be sure that the reported issue
> does or does not exist, somebody needs to go and look at the code
> manually and make a call.
> After this, it would be nice to mark this place as safe/concern in
> the report and be able to transfer these results across kernel
> version bumps (5.11->5.12, etc.) as long as the code in the function
> where the finding was reported has not changed (and there might be
> multiple findings per function).
>
> What is the best way of doing it?
> I was first thinking of using some simple hash of the reported line
> (lines around it, relative position within the reported function),
> but now I think I also need to hash the whole function in addition
> to the finding itself.
>
> Then the logic of transferring the result would be:
>
> For each finding calculate:
> 1. finding_line_hash: the hash of the line that resulted in the
>    finding (becomes a unique id within the function).
> 2. finding_function_hash: the hash of the function that produced the
>    finding (becomes a unique global id within the kernel) and helps
>    to determine whether the function has changed between the kernel
>    versions.
>
> Logic for the result transfer:
>
> If both finding_line_hash and finding_function_hash match between
> the two smatch reports for two different versions, then it is
> relatively safe to transfer this concrete smatch finding and its
> manual audit result automatically.
>
> Does it make sense overall?
> If yes, what is the easiest way in smatch to get hash data for 1 and
> 2? I.e. get the full reported line as a string and the full function
> content as a string?

I use a script, smatch_scripts/new_bugs.pl.  It strips out the variable
names from the single quotes, any numbers, and the parentheses, so it
looks like this:

Original warning:
fs/fuse/virtio_fs.c:1468 virtio_fs_get_tree() error: double free of 'fm'

Stripped:
fs.fuse.virtio_fs.c.virtio_fs_get_tree_error:_double_free_of_''

You could hash the stripped string.  Looking at it now, the variable
name is actually useful and shouldn't be stripped out.  Doh...

I don't know what the zero day bot does to mark warnings as dealt with
or not.  There is also the Aiaiai project
(https://www.openhub.net/p/aiaiai), which probably has a feature for
marking warnings as reviewed.

regards,
dan carpenter
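
For reference, here is a rough sketch of the hashing scheme discussed
in this thread, written in Python. It is not a port of new_bugs.pl:
the normalization rules only approximate what the message describes
(line number and other digits dropped, quoted variable name kept), and
every name in it (warning_key, finding_id, function_hash, carry_over)
is made up for illustration. It also assumes the warning layout from
the single example above, and that the function source is obtained
some other way and passed in as a string.

```python
import hashlib
import re

# Assumed warning layout, taken from the one example in this thread:
#   <file>:<line> <function>() <message>
WARNING_RE = re.compile(r"^(\S+):(\d+) (\w+)\(\) (.*)$")


def warning_key(warning):
    """Build a key for one warning that survives line-number shifts."""
    m = WARNING_RE.match(warning.strip())
    if m is None:
        raise ValueError("unrecognized warning format: %r" % warning)
    path, _line, func, msg = m.groups()
    # Split on 'quoted' chunks; with a capturing group, re.split() puts
    # them at the odd indices. Digits are dropped only outside quotes,
    # so the variable name is kept intact.
    parts = re.split(r"('[^']*')", msg)
    msg = "".join(p if i % 2 else re.sub(r"\d+", "", p)
                  for i, p in enumerate(parts))
    return "%s %s %s" % (path, func, msg)


def finding_id(warning):
    """Hash of the normalized warning (hypothetical finding identifier)."""
    return hashlib.sha256(warning_key(warning).encode("utf-8")).hexdigest()


def function_hash(source):
    """Hash a function body, ignoring whitespace-only changes."""
    normalized = " ".join(source.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def carry_over(old_verdicts, new_findings):
    """Keep verdicts whose (finding_id, function_hash) pair reappears.

    old_verdicts: dict mapping (finding_id, function_hash) -> verdict
    new_findings: iterable of (finding_id, function_hash) pairs
    """
    return {k: old_verdicts[k] for k in new_findings if k in old_verdicts}
```

One limitation of this sketch: two identical warnings about the same
variable in the same function get the same finding_id, so telling them
apart would need something extra, such as an occurrence counter within
the function.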