Date: Tue, 3 Sep 2019 16:40:07 +0100
From: Al Viro
To: "zhengbin (A)"
Cc: jack@suse.cz, akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org, "zhangyi (F)"
Subject: Re: Possible FS race condition between iterate_dir and d_alloc_parallel
Message-ID: <20190903154007.GJ1131@ZenIV.linux.org.uk>

On Tue, Sep 03, 2019 at 10:44:32PM +0800, zhengbin (A) wrote:
> We recently encountered an oops(the filesystem is tmpfs)
> crash> bt
>  #9 [ffff0000ae77bd60] dcache_readdir at ffff0000672954bc
>
> The reason is as follows:
> Process 1 cat test which is not exist in directory A, process 2 cat test in directory A too.
> process 3 create new file in directory B, process 4 ls directory A.

good grief, what screen width do you have to make the table below readable?

What I do not understand is how the hell does your dtry2 manage to get actually
freed and reused without an RCU delay between its removal from parent's
->d_subdirs and freeing its memory.  What should've happened in that
scenario is
	* process 4, in next_positive() grabs rcu_read_lock().
	* it walks into your dtry2, which might very well be
just a chunk of memory waiting to be freed; it sure as hell is
not positive.  skipped is set to true, 'i' is not decremented.
Note that ->d_child.next points to the next non-cursor sibling
(if any) or to the ->d_subdirs of parent, so we can keep walking.
	* we keep walking for a while; eventually we run out of
counter and leave the loop.

Only after that we do rcu_read_unlock() and only then anything
observed in that loop might be freed and reused.

Confused...  OTOH, I might be misreading that table of yours -
it's about 30% wider than the widest xterm I can get while still
being able to read the font...
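
For readers following along, the scenario described above can be illustrated with a small single-threaded userspace model. This is only a sketch under stated assumptions, not the kernel code: struct node, struct toy_rcu and every toy_*() function are hypothetical stand-ins invented for illustration, the "concurrent" removal is simulated by unlinking a chosen victim at the moment the walker stands on it, and whatever the caller does with skipped afterwards is left out. The only point it makes is the ordering: an entry unlinked during the walk keeps its next pointer, is skipped without consuming the counter, and is not freed until the read-side section ends.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8

/* Hypothetical stand-in for a dentry on its parent's list of children. */
struct node {
	struct node *next;	/* circular list; the head is a node too */
	bool positive;		/* models "this entry is positive" */
	bool freed;		/* set once the deferred free has run */
};

/* Toy RCU: frees are held back while any reader is active. */
struct toy_rcu {
	int readers;
	struct node *pending[MAX_PENDING];
	int npending;
};

static void toy_read_lock(struct toy_rcu *r)
{
	r->readers++;
}

static void toy_read_unlock(struct toy_rcu *r)
{
	assert(r->readers > 0);
	if (--r->readers > 0)
		return;
	/* Last reader is gone: the modelled grace period ends here. */
	for (int i = 0; i < r->npending; i++)
		r->pending[i]->freed = true;
	r->npending = 0;
}

/* Request a free; it happens at once only if no reader is active. */
static void toy_call_rcu(struct toy_rcu *r, struct node *n)
{
	if (r->readers == 0) {
		n->freed = true;
		return;
	}
	assert(r->npending < MAX_PENDING);
	r->pending[r->npending++] = n;
}

/* Unlink victim, leaving victim->next intact so a reader can walk on. */
static void toy_unlink(struct node *head, struct node *victim)
{
	for (struct node *p = head; p->next != head; p = p->next) {
		if (p->next == victim) {
			p->next = victim->next;
			return;
		}
	}
}

/*
 * Return the count-th positive node after 'from', or NULL if the list
 * runs out first.  When the walker reaches 'victim', a concurrent
 * removal of that node is simulated.
 */
static struct node *toy_next_positive(struct toy_rcu *r, struct node *head,
				      struct node *from, int count,
				      struct node *victim, bool *skipped)
{
	struct node *res = NULL;
	int i = count;

	*skipped = false;
	toy_read_lock(r);
	for (struct node *p = from->next; p != head; p = p->next) {
		if (p == victim) {
			toy_unlink(head, p);
			toy_call_rcu(r, p);
		}
		assert(!p->freed);	/* nothing seen mid-walk is freed yet */
		if (!p->positive) {
			*skipped = true;	/* counter is not decremented */
		} else if (--i == 0) {
			res = p;
			break;
		}
	}
	toy_read_unlock(r);	/* only now may the victim be freed */
	return res;
}
```

With a list head -> a (positive) -> v (negative) -> b (positive), calling toy_next_positive() with count 2 and v as the victim walks through v while it is being removed, reports it as skipped, and still reaches b; v is marked freed only after the read-side section has been left.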