Date: Tue, 13 Dec 2022 01:44:08 +0000
From: Al Viro
To: Marco Elver
Cc: Theodore Ts'o, Hillf Danton, Matthew Wilcox, syzbot,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	syzkaller-bugs@googlegroups.com, Aleksandr Nogikh
Subject: Re: [syzbot] WARNING in do_mkdirat

On Mon, Dec 12, 2022 at 08:29:10PM +0100, Marco Elver wrote:

> > > Given the call trace above, how do you know the ntfs3 guys should be also
> > > Cced in addition to AV?
> > > What if it would take more than three months for
> > > syzbot to learn the skills in your mind?

Depends.  If you really are talking about the *BOT* learning to do that
on its own, it certainly would take more than 3 months; strong AI is hard.

If, OTOH, it is not an AI research project and intervention of somebody
capable of passing the Turing test does not violate the purity of
experiment...  Surely converting "if it mounts an image as filesystem of
type $T, grep the tree for "MODULE_ALIAS_FS($T)" and treat that as if
a function from the resulting file had been found in stack trace" into
something usable for the bot should not take more than 3 months, should it?

If expressing that rule really takes "more than three months", I would
suggest that something is very wrong with the bot architecture...

> Teaching a bot the pattern matching skills of a human is non-trivial.
> The current design will likely do the simplest thing: regex match
> reproducers and map a match to some kernel source dir, for which the
> maintainers are Cc'd. If you have better suggestions on how to
> mechanize subsystem selection based on a reproducer, please shout.

Er...  Yes?  Look, it's really that simple -

	for i in `sed -ne 's/.*syz_mount_image$\([_[:alnum:]]*\).*/\1/p' <$REPRO`; do
		git grep -l "MODULE_ALIAS_FS(\"$i\")"
	done | sort | uniq

gets you the list of files.  No, I'm not suggesting to go for that kind
of shell use, but it's clearly doable with regex and search over
the source for fixed strings.  Unless something's drastically wrong with
the way the bot is written, it should be capable of something as basic
as that...

If it can't do that kind of mapping, precalculating it for given tree
is also not hard:

	git grep 'MODULE_ALIAS_FS("'|sed -ne 's/\(.*\):.*MODULE_ALIAS_FS("\([_[:alnum:]]*\)".*/syz_mount_image$\2:\1/p'

will yield lines like

	syz_mount_image$ext2:fs/ext2/super.c
	syz_mount_image$ext2:fs/ext4/super.c
	syz_mount_image$ext3:fs/ext4/super.c
	syz_mount_image$ext4:fs/ext4/super.c

etc.

Surely turning *that* into whatever form the bot wants can't be
terribly hard? [*]

All of that assumes that pattern-matching in syzkaller reproducer is
expressible; if "we must do everything by call trace alone" is a real
limitation, we are SOL; stack trace simply doesn't have that information.

Is there such an architectural limitation?

[*] depending upon config, ext2 could be mounted by ext2.ko and ext4.ko;
both have the same maillist for bug reports, so this ambiguity doesn't
matter - either match would do.
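
The precalculated variant described in the message can be sketched as a lookup step. This is only an illustration, not how syzbot actually works: the function name repro_to_files and the idea of keeping the mapping in a flat file are assumptions made for the example. It takes a reproducer file and a mapping file made of lines like "syz_mount_image$ext2:fs/ext2/super.c" and prints the source files for every filesystem type the reproducer mounts.

```shell
# Hypothetical helper (name and file layout are assumptions, not syzbot code).
#   $1 - reproducer file (syzkaller program text)
#   $2 - mapping file, one "syz_mount_image$TYPE:path/to/file.c" per line,
#        e.g. as produced by the git grep | sed pipeline in the message
repro_to_files() {
	# Pull out each "syz_mount_image$TYPE" token found in the reproducer.
	sed -ne 's/.*\(syz_mount_image\$[_[:alnum:]]*\).*/\1/p' "$1" | sort -u |
	while IFS= read -r call; do
		# Exact comparison on the first ':'-separated field, so "ext2"
		# does not accidentally match "ext2foo".
		awk -F: -v c="$call" '$1 == c { print $2 }' "$2"
	done | sort -u
}
```

With the four sample mapping lines from the message, a reproducer containing a syz_mount_image$ext2 call would yield fs/ext2/super.c and fs/ext4/super.c. Each resulting path could then be handed to scripts/get_maintainer.pl -f in the kernel tree to obtain the addresses to Cc; that last step is a suggestion of this example rather than something stated in the thread.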