From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 68AF0612E for ; Sun, 27 Aug 2023 13:30:59 +0000 (UTC) Received: from out-249.mta1.migadu.com (out-249.mta1.migadu.com [95.215.58.249]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 02DF5C6 for ; Sun, 27 Aug 2023 06:30:57 -0700 (PDT) X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1693143055; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=hTZJGzowdwZotMVwWddSwVfC1FzrP/Lkz2Y0ayQQQXA=; b=L3dZ2wnhLl3MpFT8Fx3GtXFd2RCwphC3uDYko8Q0nKz7ONNiGRoAamaCC5cUw2SONEU+W+ FNti/TRpT0IYzrVdjkthri+O6/3aV24bEu1xrJpC4BygFoLKHcj9ftj4kgaNqPcbLz1Ibc FihN2zv2jXx86v+rCxyuW2cfhV75bQE= From: Hao Xu To: io-uring@vger.kernel.org, Jens Axboe Cc: Dominique Martinet , Pavel Begunkov , Christian Brauner , Alexander Viro , Stefan Roesch , Clay Harris , Dave Chinner , "Darrick J . Wong" , linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-cachefs@redhat.com, ecryptfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-unionfs@vger.kernel.org, bpf@vger.kernel.org, netdev@vger.kernel.org, linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org, codalist@coda.cs.cmu.edu, linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, linux-mm@kvack.org, linux-nilfs@vger.kernel.org, devel@lists.orangefs.org, linux-cifs@vger.kernel.org, samba-technical@lists.samba.org, linux-mtd@lists.infradead.org, Wanpeng Li Subject: [PATCH 02/11] xfs: add NOWAIT semantics for readdir Date: Sun, 27 Aug 2023 21:28:26 +0800 Message-Id: <20230827132835.1373581-3-hao.xu@linux.dev> In-Reply-To: <20230827132835.1373581-1-hao.xu@linux.dev> References: <20230827132835.1373581-1-hao.xu@linux.dev> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Migadu-Flow: FLOW_OUT X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED, SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net From: Hao Xu Implement NOWAIT semantics for readdir. Return EAGAIN error to the caller if it would block, like failing to get locks, or going to do IO. Co-developed-by: Dave Chinner Signed-off-by: Dave Chinner Signed-off-by: Hao Xu [fixes deadlock issue, tweak code style] --- fs/xfs/libxfs/xfs_da_btree.c | 16 +++++++++++ fs/xfs/libxfs/xfs_da_btree.h | 1 + fs/xfs/libxfs/xfs_dir2_block.c | 7 ++--- fs/xfs/libxfs/xfs_dir2_priv.h | 2 +- fs/xfs/scrub/dir.c | 2 +- fs/xfs/scrub/readdir.c | 2 +- fs/xfs/xfs_dir2_readdir.c | 49 ++++++++++++++++++++++++++-------- fs/xfs/xfs_inode.c | 27 +++++++++++++++++++ fs/xfs/xfs_inode.h | 17 +++++++----- 9 files changed, 99 insertions(+), 24 deletions(-) diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c index e576560b46e9..7a1a0af24197 100644 --- a/fs/xfs/libxfs/xfs_da_btree.c +++ b/fs/xfs/libxfs/xfs_da_btree.c @@ -2643,16 +2643,32 @@ xfs_da_read_buf( struct xfs_buf_map map, *mapp = ↦ int nmap = 1; int error; + int buf_flags = 0; *bpp = NULL; error = xfs_dabuf_map(dp, bno, flags, whichfork, &mapp, &nmap); if (error || !nmap) goto out_free; + /* + * NOWAIT semantics mean we don't wait on the buffer lock nor do we + * issue IO for this buffer if it is not already in memory. Caller will + * retry. This will return -EAGAIN if the buffer is in memory and cannot + * be locked, and no buffer and no error if it isn't in memory. We + * translate both of those into a return state of -EAGAIN and *bpp = + * NULL. + */ + if (flags & XFS_DABUF_NOWAIT) + buf_flags |= XBF_TRYLOCK | XBF_INCORE; error = xfs_trans_read_buf_map(mp, tp, mp->m_ddev_targp, mapp, nmap, 0, &bp, ops); if (error) goto out_free; + if (!bp) { + ASSERT(flags & XFS_DABUF_NOWAIT); + error = -EAGAIN; + goto out_free; + } if (whichfork == XFS_ATTR_FORK) xfs_buf_set_ref(bp, XFS_ATTR_BTREE_REF); diff --git a/fs/xfs/libxfs/xfs_da_btree.h b/fs/xfs/libxfs/xfs_da_btree.h index ffa3df5b2893..32e7b1cca402 100644 --- a/fs/xfs/libxfs/xfs_da_btree.h +++ b/fs/xfs/libxfs/xfs_da_btree.h @@ -205,6 +205,7 @@ int xfs_da3_node_read_mapped(struct xfs_trans *tp, struct xfs_inode *dp, */ #define XFS_DABUF_MAP_HOLE_OK (1u << 0) +#define XFS_DABUF_NOWAIT (1u << 1) int xfs_da_grow_inode(xfs_da_args_t *args, xfs_dablk_t *new_blkno); int xfs_da_grow_inode_int(struct xfs_da_args *args, xfs_fileoff_t *bno, diff --git a/fs/xfs/libxfs/xfs_dir2_block.c b/fs/xfs/libxfs/xfs_dir2_block.c index 00f960a703b2..59b24a594add 100644 --- a/fs/xfs/libxfs/xfs_dir2_block.c +++ b/fs/xfs/libxfs/xfs_dir2_block.c @@ -135,13 +135,14 @@ int xfs_dir3_block_read( struct xfs_trans *tp, struct xfs_inode *dp, + unsigned int flags, struct xfs_buf **bpp) { struct xfs_mount *mp = dp->i_mount; xfs_failaddr_t fa; int err; - err = xfs_da_read_buf(tp, dp, mp->m_dir_geo->datablk, 0, bpp, + err = xfs_da_read_buf(tp, dp, mp->m_dir_geo->datablk, flags, bpp, XFS_DATA_FORK, &xfs_dir3_block_buf_ops); if (err || !*bpp) return err; @@ -380,7 +381,7 @@ xfs_dir2_block_addname( tp = args->trans; /* Read the (one and only) directory block into bp. */ - error = xfs_dir3_block_read(tp, dp, &bp); + error = xfs_dir3_block_read(tp, dp, 0, &bp); if (error) return error; @@ -695,7 +696,7 @@ xfs_dir2_block_lookup_int( dp = args->dp; tp = args->trans; - error = xfs_dir3_block_read(tp, dp, &bp); + error = xfs_dir3_block_read(tp, dp, 0, &bp); if (error) return error; diff --git a/fs/xfs/libxfs/xfs_dir2_priv.h b/fs/xfs/libxfs/xfs_dir2_priv.h index 7404a9ff1a92..7d4cf8a0f15b 100644 --- a/fs/xfs/libxfs/xfs_dir2_priv.h +++ b/fs/xfs/libxfs/xfs_dir2_priv.h @@ -51,7 +51,7 @@ extern int xfs_dir_cilookup_result(struct xfs_da_args *args, /* xfs_dir2_block.c */ extern int xfs_dir3_block_read(struct xfs_trans *tp, struct xfs_inode *dp, - struct xfs_buf **bpp); + unsigned int flags, struct xfs_buf **bpp); extern int xfs_dir2_block_addname(struct xfs_da_args *args); extern int xfs_dir2_block_lookup(struct xfs_da_args *args); extern int xfs_dir2_block_removename(struct xfs_da_args *args); diff --git a/fs/xfs/scrub/dir.c b/fs/xfs/scrub/dir.c index 0b491784b759..5cc51f201bd7 100644 --- a/fs/xfs/scrub/dir.c +++ b/fs/xfs/scrub/dir.c @@ -313,7 +313,7 @@ xchk_directory_data_bestfree( /* dir block format */ if (lblk != XFS_B_TO_FSBT(mp, XFS_DIR2_DATA_OFFSET)) xchk_fblock_set_corrupt(sc, XFS_DATA_FORK, lblk); - error = xfs_dir3_block_read(sc->tp, sc->ip, &bp); + error = xfs_dir3_block_read(sc->tp, sc->ip, 0, &bp); } else { /* dir data format */ error = xfs_dir3_data_read(sc->tp, sc->ip, lblk, 0, &bp); diff --git a/fs/xfs/scrub/readdir.c b/fs/xfs/scrub/readdir.c index e51c1544be63..f0a727311632 100644 --- a/fs/xfs/scrub/readdir.c +++ b/fs/xfs/scrub/readdir.c @@ -101,7 +101,7 @@ xchk_dir_walk_block( unsigned int off, next_off, end; int error; - error = xfs_dir3_block_read(sc->tp, dp, &bp); + error = xfs_dir3_block_read(sc->tp, dp, 0, &bp); if (error) return error; diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c index 9f3ceb461515..dcdbd26e0402 100644 --- a/fs/xfs/xfs_dir2_readdir.c +++ b/fs/xfs/xfs_dir2_readdir.c @@ -149,6 +149,7 @@ xfs_dir2_block_getdents( struct xfs_da_geometry *geo = args->geo; unsigned int offset, next_offset; unsigned int end; + unsigned int flags = 0; /* * If the block number in the offset is out of range, we're done. @@ -156,7 +157,9 @@ xfs_dir2_block_getdents( if (xfs_dir2_dataptr_to_db(geo, ctx->pos) > geo->datablk) return 0; - error = xfs_dir3_block_read(args->trans, dp, &bp); + if (ctx->flags & DIR_CONTEXT_F_NOWAIT) + flags |= XFS_DABUF_NOWAIT; + error = xfs_dir3_block_read(args->trans, dp, flags, &bp); if (error) return error; @@ -240,6 +243,7 @@ xfs_dir2_block_getdents( STATIC int xfs_dir2_leaf_readbuf( struct xfs_da_args *args, + struct dir_context *ctx, size_t bufsize, xfs_dir2_off_t *cur_off, xfs_dablk_t *ra_blk, @@ -258,10 +262,15 @@ xfs_dir2_leaf_readbuf( struct xfs_iext_cursor icur; int ra_want; int error = 0; - - error = xfs_iread_extents(args->trans, dp, XFS_DATA_FORK); - if (error) - goto out; + unsigned int flags = 0; + + if (ctx->flags & DIR_CONTEXT_F_NOWAIT) { + flags |= XFS_DABUF_NOWAIT; + } else { + error = xfs_iread_extents(args->trans, dp, XFS_DATA_FORK); + if (error) + goto out; + } /* * Look for mapped directory blocks at or above the current offset. @@ -280,7 +289,7 @@ xfs_dir2_leaf_readbuf( new_off = xfs_dir2_da_to_byte(geo, map.br_startoff); if (new_off > *cur_off) *cur_off = new_off; - error = xfs_dir3_data_read(args->trans, dp, map.br_startoff, 0, &bp); + error = xfs_dir3_data_read(args->trans, dp, map.br_startoff, flags, &bp); if (error) goto out; @@ -360,6 +369,7 @@ xfs_dir2_leaf_getdents( int byteoff; /* offset in current block */ unsigned int offset = 0; int error = 0; /* error return value */ + int written = 0; /* * If the offset is at or past the largest allowed value, @@ -391,10 +401,17 @@ xfs_dir2_leaf_getdents( bp = NULL; } - if (*lock_mode == 0) - *lock_mode = xfs_ilock_data_map_shared(dp); - error = xfs_dir2_leaf_readbuf(args, bufsize, &curoff, - &rablk, &bp); + if (*lock_mode == 0) { + *lock_mode = + xfs_ilock_data_map_shared_generic(dp, + ctx->flags & DIR_CONTEXT_F_NOWAIT); + if (!*lock_mode) { + error = -EAGAIN; + break; + } + } + error = xfs_dir2_leaf_readbuf(args, ctx, bufsize, + &curoff, &rablk, &bp); if (error || !bp) break; @@ -479,6 +496,7 @@ xfs_dir2_leaf_getdents( */ offset += length; curoff += length; + written += length; /* bufsize may have just been a guess; don't go negative */ bufsize = bufsize > length ? bufsize - length : 0; } @@ -492,6 +510,8 @@ xfs_dir2_leaf_getdents( ctx->pos = xfs_dir2_byte_to_dataptr(curoff) & 0x7fffffff; if (bp) xfs_trans_brelse(args->trans, bp); + if (error == -EAGAIN && written > 0) + error = 0; return error; } @@ -514,6 +534,7 @@ xfs_readdir( unsigned int lock_mode; bool isblock; int error; + bool nowait; trace_xfs_readdir(dp); @@ -531,7 +552,11 @@ xfs_readdir( if (dp->i_df.if_format == XFS_DINODE_FMT_LOCAL) return xfs_dir2_sf_getdents(&args, ctx); - lock_mode = xfs_ilock_data_map_shared(dp); + nowait = ctx->flags & DIR_CONTEXT_F_NOWAIT; + lock_mode = xfs_ilock_data_map_shared_generic(dp, nowait); + if (!lock_mode) + return -EAGAIN; + error = xfs_dir2_isblock(&args, &isblock); if (error) goto out_unlock; @@ -546,5 +571,7 @@ xfs_readdir( out_unlock: if (lock_mode) xfs_iunlock(dp, lock_mode); + if (error == -EAGAIN) + ASSERT(nowait); return error; } diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 9e62cc500140..d088f7d0c23a 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -120,6 +120,33 @@ xfs_ilock_data_map_shared( return lock_mode; } +/* + * Similar to xfs_ilock_data_map_shared(), except that it will only try to lock + * the inode in shared mode if the extents are already in memory. If it fails to + * get the lock or has to do IO to read the extent list, fail the operation by + * returning 0 as the lock mode. + */ +uint +xfs_ilock_data_map_shared_nowait( + struct xfs_inode *ip) +{ + if (xfs_need_iread_extents(&ip->i_df)) + return 0; + if (!xfs_ilock_nowait(ip, XFS_ILOCK_SHARED)) + return 0; + return XFS_ILOCK_SHARED; +} + +int +xfs_ilock_data_map_shared_generic( + struct xfs_inode *dp, + bool nowait) +{ + if (nowait) + return xfs_ilock_data_map_shared_nowait(dp); + return xfs_ilock_data_map_shared(dp); +} + uint xfs_ilock_attr_map_shared( struct xfs_inode *ip) diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h index 7547caf2f2ab..ea206a5a27df 100644 --- a/fs/xfs/xfs_inode.h +++ b/fs/xfs/xfs_inode.h @@ -490,13 +490,16 @@ int xfs_rename(struct mnt_idmap *idmap, struct xfs_name *target_name, struct xfs_inode *target_ip, unsigned int flags); -void xfs_ilock(xfs_inode_t *, uint); -int xfs_ilock_nowait(xfs_inode_t *, uint); -void xfs_iunlock(xfs_inode_t *, uint); -void xfs_ilock_demote(xfs_inode_t *, uint); -bool xfs_isilocked(struct xfs_inode *, uint); -uint xfs_ilock_data_map_shared(struct xfs_inode *); -uint xfs_ilock_attr_map_shared(struct xfs_inode *); +void xfs_ilock(struct xfs_inode *ip, uint lockmode); +int xfs_ilock_nowait(struct xfs_inode *ip, uint lockmode); +void xfs_iunlock(struct xfs_inode *ip, uint lockmode); +void xfs_ilock_demote(struct xfs_inode *ip, uint lockmode); +bool xfs_isilocked(struct xfs_inode *ip, uint lockmode); +uint xfs_ilock_data_map_shared(struct xfs_inode *ip); +uint xfs_ilock_data_map_shared_nowait(struct xfs_inode *ip); +int xfs_ilock_data_map_shared_generic(struct xfs_inode *ip, + bool nowait); +uint xfs_ilock_attr_map_shared(struct xfs_inode *ip); uint xfs_ip2xflags(struct xfs_inode *); int xfs_ifree(struct xfs_trans *, struct xfs_inode *); -- 2.25.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7F976C83F31 for ; Tue, 29 Aug 2023 08:41:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1693298477; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:list-id:list-help: list-unsubscribe:list-subscribe:list-post; bh=CdGEQETDwE+Uoi7ezMKKcl/TFxh2iy7JvcCWfpqxv/Y=; b=g3vOkJuHcsOQ+pB0vifHLbVjk1j9UWDkx7Oyhpv5m7/0UAo17/hmsF1lPGvfvLGuDg2pvP Ad0l62ruHZ2o7iexf/fSLTyLYnAMyKNgFhjz3o/q0ige0gSc0sFq/wJErPACWB6Rccs5Ib 62iq2dUsNzF3g5WRxxMawDTC95NwEuA= Received: from mimecast-mx02.redhat.com (66.187.233.73 [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-498-N_mXK3oAPTuFy3HTqbcnug-1; Tue, 29 Aug 2023 04:41:16 -0400 X-MC-Unique: N_mXK3oAPTuFy3HTqbcnug-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 8CAF61C05EB7; Tue, 29 Aug 2023 08:41:15 +0000 (UTC) Received: from mm-prod-listman-01.mail-001.prod.us-east-1.aws.redhat.com (mm-prod-listman-01.mail-001.prod.us-east-1.aws.redhat.com [10.30.29.100]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7C7062166B25; Tue, 29 Aug 2023 08:41:15 +0000 (UTC) Received: from mm-prod-listman-01.mail-001.prod.us-east-1.aws.redhat.com (localhost [IPv6:::1]) by mm-prod-listman-01.mail-001.prod.us-east-1.aws.redhat.com (Postfix) with ESMTP id 5355719465BB; Tue, 29 Aug 2023 08:41:15 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) by mm-prod-listman-01.mail-001.prod.us-east-1.aws.redhat.com (Postfix) with ESMTP id 6360019465A8 for ; Sun, 27 Aug 2023 13:30:59 +0000 (UTC) Received: by smtp.corp.redhat.com (Postfix) id 4405B40C2070; Sun, 27 Aug 2023 13:30:59 +0000 (UTC) Received: from mimecast-mx02.redhat.com (mimecast10.extmail.prod.ext.rdu2.redhat.com [10.11.55.26]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 3CC6D40C2063 for ; Sun, 27 Aug 2023 13:30:59 +0000 (UTC) Received: from us-smtp-1.mimecast.com (us-smtp-1.mimecast.com [207.211.31.81]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 1CFBE1C0514D for ; Sun, 27 Aug 2023 13:30:59 +0000 (UTC) Received: from out-253.mta1.migadu.com (out-253.mta1.migadu.com [95.215.58.253]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-371-2Ym2YxnUORO3nBL8R23sng-1; Sun, 27 Aug 2023 09:30:56 -0400 X-MC-Unique: 2Ym2YxnUORO3nBL8R23sng-1 X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Hao Xu To: io-uring@vger.kernel.org, Jens Axboe Date: Sun, 27 Aug 2023 21:28:26 +0800 Message-Id: <20230827132835.1373581-3-hao.xu@linux.dev> In-Reply-To: <20230827132835.1373581-1-hao.xu@linux.dev> References: <20230827132835.1373581-1-hao.xu@linux.dev> MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Mimecast-Impersonation-Protect: Policy=CLT - Impersonation Protection Definition; Similar Internal Domain=false; Similar Monitored External Domain=false; Custom External Domain=false; Mimecast External Domain=false; Newly Observed Domain=false; Internal User Name=false; Custom Display Name List=false; Reply-to Address Mismatch=false; Targeted Threat Dictionary=false; Mimecast Threat Dictionary=false; Custom Threat Dictionary=false X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 X-Mailman-Approved-At: Tue, 29 Aug 2023 08:41:07 +0000 Subject: [Cluster-devel] [PATCH 02/11] xfs: add NOWAIT semantics for readdir X-BeenThere: cluster-devel@redhat.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: "\[Cluster devel\]" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Wanpeng Li , "Darrick J . Wong" , Dominique Martinet , Dave Chinner , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Stefan Roesch , Clay Harris , linux-s390@vger.kernel.org, linux-nilfs@vger.kernel.org, codalist@coda.cs.cmu.edu, cluster-devel@redhat.com, linux-cachefs@redhat.com, linux-ext4@vger.kernel.org, devel@lists.orangefs.org, linux-cifs@vger.kernel.org, ecryptfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-block@vger.kernel.org, Alexander Viro , Christian Brauner , netdev@vger.kernel.org, samba-technical@lists.samba.org, linux-unionfs@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mtd@lists.infradead.org, bpf@vger.kernel.org, Pavel Begunkov , linux-btrfs@vger.kernel.org Errors-To: cluster-devel-bounces@redhat.com Sender: "Cluster-devel" X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: linux.dev Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="US-ASCII"; x-default=true From: Hao Xu Implement NOWAIT semantics for readdir. Return EAGAIN error to the caller if it would block, like failing to get locks, or going to do IO. Co-developed-by: Dave Chinner Signed-off-by: Dave Chinner Signed-off-by: Hao Xu [fixes deadlock issue, tweak code style] --- fs/xfs/libxfs/xfs_da_btree.c | 16 +++++++++++ fs/xfs/libxfs/xfs_da_btree.h | 1 + fs/xfs/libxfs/xfs_dir2_block.c | 7 ++--- fs/xfs/libxfs/xfs_dir2_priv.h | 2 +- fs/xfs/scrub/dir.c | 2 +- fs/xfs/scrub/readdir.c | 2 +- fs/xfs/xfs_dir2_readdir.c | 49 ++++++++++++++++++++++++++-------- fs/xfs/xfs_inode.c | 27 +++++++++++++++++++ fs/xfs/xfs_inode.h | 17 +++++++----- 9 files changed, 99 insertions(+), 24 deletions(-) diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c index e576560b46e9..7a1a0af24197 100644 --- a/fs/xfs/libxfs/xfs_da_btree.c +++ b/fs/xfs/libxfs/xfs_da_btree.c @@ -2643,16 +2643,32 @@ xfs_da_read_buf( =09struct xfs_buf_map=09map, *mapp =3D ↦ =09int=09=09=09nmap =3D 1; =09int=09=09=09error; +=09int=09=09=09buf_flags =3D 0; =20 =09*bpp =3D NULL; =09error =3D xfs_dabuf_map(dp, bno, flags, whichfork, &mapp, &nmap); =09if (error || !nmap) =09=09goto out_free; =20 +=09/* +=09 * NOWAIT semantics mean we don't wait on the buffer lock nor do we +=09 * issue IO for this buffer if it is not already in memory. Caller will +=09 * retry. This will return -EAGAIN if the buffer is in memory and canno= t +=09 * be locked, and no buffer and no error if it isn't in memory. We +=09 * translate both of those into a return state of -EAGAIN and *bpp =3D +=09 * NULL. +=09 */ +=09if (flags & XFS_DABUF_NOWAIT) +=09=09buf_flags |=3D XBF_TRYLOCK | XBF_INCORE; =09error =3D xfs_trans_read_buf_map(mp, tp, mp->m_ddev_targp, mapp, nmap, = 0, =09=09=09&bp, ops); =09if (error) =09=09goto out_free; +=09if (!bp) { +=09=09ASSERT(flags & XFS_DABUF_NOWAIT); +=09=09error =3D -EAGAIN; +=09=09goto out_free; +=09} =20 =09if (whichfork =3D=3D XFS_ATTR_FORK) =09=09xfs_buf_set_ref(bp, XFS_ATTR_BTREE_REF); diff --git a/fs/xfs/libxfs/xfs_da_btree.h b/fs/xfs/libxfs/xfs_da_btree.h index ffa3df5b2893..32e7b1cca402 100644 --- a/fs/xfs/libxfs/xfs_da_btree.h +++ b/fs/xfs/libxfs/xfs_da_btree.h @@ -205,6 +205,7 @@ int=09xfs_da3_node_read_mapped(struct xfs_trans *tp, st= ruct xfs_inode *dp, */ =20 #define XFS_DABUF_MAP_HOLE_OK=09(1u << 0) +#define XFS_DABUF_NOWAIT=09(1u << 1) =20 int=09xfs_da_grow_inode(xfs_da_args_t *args, xfs_dablk_t *new_blkno); int=09xfs_da_grow_inode_int(struct xfs_da_args *args, xfs_fileoff_t *bno, diff --git a/fs/xfs/libxfs/xfs_dir2_block.c b/fs/xfs/libxfs/xfs_dir2_block.= c index 00f960a703b2..59b24a594add 100644 --- a/fs/xfs/libxfs/xfs_dir2_block.c +++ b/fs/xfs/libxfs/xfs_dir2_block.c @@ -135,13 +135,14 @@ int xfs_dir3_block_read( =09struct xfs_trans=09*tp, =09struct xfs_inode=09*dp, +=09unsigned int=09=09flags, =09struct xfs_buf=09=09**bpp) { =09struct xfs_mount=09*mp =3D dp->i_mount; =09xfs_failaddr_t=09=09fa; =09int=09=09=09err; =20 -=09err =3D xfs_da_read_buf(tp, dp, mp->m_dir_geo->datablk, 0, bpp, +=09err =3D xfs_da_read_buf(tp, dp, mp->m_dir_geo->datablk, flags, bpp, =09=09=09=09XFS_DATA_FORK, &xfs_dir3_block_buf_ops); =09if (err || !*bpp) =09=09return err; @@ -380,7 +381,7 @@ xfs_dir2_block_addname( =09tp =3D args->trans; =20 =09/* Read the (one and only) directory block into bp. */ -=09error =3D xfs_dir3_block_read(tp, dp, &bp); +=09error =3D xfs_dir3_block_read(tp, dp, 0, &bp); =09if (error) =09=09return error; =20 @@ -695,7 +696,7 @@ xfs_dir2_block_lookup_int( =09dp =3D args->dp; =09tp =3D args->trans; =20 -=09error =3D xfs_dir3_block_read(tp, dp, &bp); +=09error =3D xfs_dir3_block_read(tp, dp, 0, &bp); =09if (error) =09=09return error; =20 diff --git a/fs/xfs/libxfs/xfs_dir2_priv.h b/fs/xfs/libxfs/xfs_dir2_priv.h index 7404a9ff1a92..7d4cf8a0f15b 100644 --- a/fs/xfs/libxfs/xfs_dir2_priv.h +++ b/fs/xfs/libxfs/xfs_dir2_priv.h @@ -51,7 +51,7 @@ extern int xfs_dir_cilookup_result(struct xfs_da_args *ar= gs, =20 /* xfs_dir2_block.c */ extern int xfs_dir3_block_read(struct xfs_trans *tp, struct xfs_inode *dp, -=09=09=09 struct xfs_buf **bpp); +=09=09=09 unsigned int flags, struct xfs_buf **bpp); extern int xfs_dir2_block_addname(struct xfs_da_args *args); extern int xfs_dir2_block_lookup(struct xfs_da_args *args); extern int xfs_dir2_block_removename(struct xfs_da_args *args); diff --git a/fs/xfs/scrub/dir.c b/fs/xfs/scrub/dir.c index 0b491784b759..5cc51f201bd7 100644 --- a/fs/xfs/scrub/dir.c +++ b/fs/xfs/scrub/dir.c @@ -313,7 +313,7 @@ xchk_directory_data_bestfree( =09=09/* dir block format */ =09=09if (lblk !=3D XFS_B_TO_FSBT(mp, XFS_DIR2_DATA_OFFSET)) =09=09=09xchk_fblock_set_corrupt(sc, XFS_DATA_FORK, lblk); -=09=09error =3D xfs_dir3_block_read(sc->tp, sc->ip, &bp); +=09=09error =3D xfs_dir3_block_read(sc->tp, sc->ip, 0, &bp); =09} else { =09=09/* dir data format */ =09=09error =3D xfs_dir3_data_read(sc->tp, sc->ip, lblk, 0, &bp); diff --git a/fs/xfs/scrub/readdir.c b/fs/xfs/scrub/readdir.c index e51c1544be63..f0a727311632 100644 --- a/fs/xfs/scrub/readdir.c +++ b/fs/xfs/scrub/readdir.c @@ -101,7 +101,7 @@ xchk_dir_walk_block( =09unsigned int=09=09off, next_off, end; =09int=09=09=09error; =20 -=09error =3D xfs_dir3_block_read(sc->tp, dp, &bp); +=09error =3D xfs_dir3_block_read(sc->tp, dp, 0, &bp); =09if (error) =09=09return error; =20 diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c index 9f3ceb461515..dcdbd26e0402 100644 --- a/fs/xfs/xfs_dir2_readdir.c +++ b/fs/xfs/xfs_dir2_readdir.c @@ -149,6 +149,7 @@ xfs_dir2_block_getdents( =09struct xfs_da_geometry=09*geo =3D args->geo; =09unsigned int=09=09offset, next_offset; =09unsigned int=09=09end; +=09unsigned int=09=09flags =3D 0; =20 =09/* =09 * If the block number in the offset is out of range, we're done. @@ -156,7 +157,9 @@ xfs_dir2_block_getdents( =09if (xfs_dir2_dataptr_to_db(geo, ctx->pos) > geo->datablk) =09=09return 0; =20 -=09error =3D xfs_dir3_block_read(args->trans, dp, &bp); +=09if (ctx->flags & DIR_CONTEXT_F_NOWAIT) +=09=09flags |=3D XFS_DABUF_NOWAIT; +=09error =3D xfs_dir3_block_read(args->trans, dp, flags, &bp); =09if (error) =09=09return error; =20 @@ -240,6 +243,7 @@ xfs_dir2_block_getdents( STATIC int xfs_dir2_leaf_readbuf( =09struct xfs_da_args=09*args, +=09struct dir_context=09*ctx, =09size_t=09=09=09bufsize, =09xfs_dir2_off_t=09=09*cur_off, =09xfs_dablk_t=09=09*ra_blk, @@ -258,10 +262,15 @@ xfs_dir2_leaf_readbuf( =09struct xfs_iext_cursor=09icur; =09int=09=09=09ra_want; =09int=09=09=09error =3D 0; - -=09error =3D xfs_iread_extents(args->trans, dp, XFS_DATA_FORK); -=09if (error) -=09=09goto out; +=09unsigned int=09=09flags =3D 0; + +=09if (ctx->flags & DIR_CONTEXT_F_NOWAIT) { +=09=09flags |=3D XFS_DABUF_NOWAIT; +=09} else { +=09=09error =3D xfs_iread_extents(args->trans, dp, XFS_DATA_FORK); +=09=09if (error) +=09=09=09goto out; +=09} =20 =09/* =09 * Look for mapped directory blocks at or above the current offset. @@ -280,7 +289,7 @@ xfs_dir2_leaf_readbuf( =09new_off =3D xfs_dir2_da_to_byte(geo, map.br_startoff); =09if (new_off > *cur_off) =09=09*cur_off =3D new_off; -=09error =3D xfs_dir3_data_read(args->trans, dp, map.br_startoff, 0, &bp); +=09error =3D xfs_dir3_data_read(args->trans, dp, map.br_startoff, flags, &= bp); =09if (error) =09=09goto out; =20 @@ -360,6 +369,7 @@ xfs_dir2_leaf_getdents( =09int=09=09=09byteoff;=09/* offset in current block */ =09unsigned int=09=09offset =3D 0; =09int=09=09=09error =3D 0;=09/* error return value */ +=09int=09=09=09written =3D 0; =20 =09/* =09 * If the offset is at or past the largest allowed value, @@ -391,10 +401,17 @@ xfs_dir2_leaf_getdents( =09=09=09=09bp =3D NULL; =09=09=09} =20 -=09=09=09if (*lock_mode =3D=3D 0) -=09=09=09=09*lock_mode =3D xfs_ilock_data_map_shared(dp); -=09=09=09error =3D xfs_dir2_leaf_readbuf(args, bufsize, &curoff, -=09=09=09=09=09&rablk, &bp); +=09=09=09if (*lock_mode =3D=3D 0) { +=09=09=09=09*lock_mode =3D +=09=09=09=09=09xfs_ilock_data_map_shared_generic(dp, +=09=09=09=09=09ctx->flags & DIR_CONTEXT_F_NOWAIT); +=09=09=09=09if (!*lock_mode) { +=09=09=09=09=09error =3D -EAGAIN; +=09=09=09=09=09break; +=09=09=09=09} +=09=09=09} +=09=09=09error =3D xfs_dir2_leaf_readbuf(args, ctx, bufsize, +=09=09=09=09=09&curoff, &rablk, &bp); =09=09=09if (error || !bp) =09=09=09=09break; =20 @@ -479,6 +496,7 @@ xfs_dir2_leaf_getdents( =09=09 */ =09=09offset +=3D length; =09=09curoff +=3D length; +=09=09written +=3D length; =09=09/* bufsize may have just been a guess; don't go negative */ =09=09bufsize =3D bufsize > length ? bufsize - length : 0; =09} @@ -492,6 +510,8 @@ xfs_dir2_leaf_getdents( =09=09ctx->pos =3D xfs_dir2_byte_to_dataptr(curoff) & 0x7fffffff; =09if (bp) =09=09xfs_trans_brelse(args->trans, bp); +=09if (error =3D=3D -EAGAIN && written > 0) +=09=09error =3D 0; =09return error; } =20 @@ -514,6 +534,7 @@ xfs_readdir( =09unsigned int=09=09lock_mode; =09bool=09=09=09isblock; =09int=09=09=09error; +=09bool=09=09=09nowait; =20 =09trace_xfs_readdir(dp); =20 @@ -531,7 +552,11 @@ xfs_readdir( =09if (dp->i_df.if_format =3D=3D XFS_DINODE_FMT_LOCAL) =09=09return xfs_dir2_sf_getdents(&args, ctx); =20 -=09lock_mode =3D xfs_ilock_data_map_shared(dp); +=09nowait =3D ctx->flags & DIR_CONTEXT_F_NOWAIT; +=09lock_mode =3D xfs_ilock_data_map_shared_generic(dp, nowait); +=09if (!lock_mode) +=09=09return -EAGAIN; + =09error =3D xfs_dir2_isblock(&args, &isblock); =09if (error) =09=09goto out_unlock; @@ -546,5 +571,7 @@ xfs_readdir( out_unlock: =09if (lock_mode) =09=09xfs_iunlock(dp, lock_mode); +=09if (error =3D=3D -EAGAIN) +=09=09ASSERT(nowait); =09return error; } diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 9e62cc500140..d088f7d0c23a 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -120,6 +120,33 @@ xfs_ilock_data_map_shared( =09return lock_mode; } =20 +/* + * Similar to xfs_ilock_data_map_shared(), except that it will only try to= lock + * the inode in shared mode if the extents are already in memory. If it fa= ils to + * get the lock or has to do IO to read the extent list, fail the operatio= n by + * returning 0 as the lock mode. + */ +uint +xfs_ilock_data_map_shared_nowait( +=09struct xfs_inode=09*ip) +{ +=09if (xfs_need_iread_extents(&ip->i_df)) +=09=09return 0; +=09if (!xfs_ilock_nowait(ip, XFS_ILOCK_SHARED)) +=09=09return 0; +=09return XFS_ILOCK_SHARED; +} + +int +xfs_ilock_data_map_shared_generic( +=09struct xfs_inode=09*dp, +=09bool=09=09=09nowait) +{ +=09if (nowait) +=09=09return xfs_ilock_data_map_shared_nowait(dp); +=09return xfs_ilock_data_map_shared(dp); +} + uint xfs_ilock_attr_map_shared( =09struct xfs_inode=09*ip) diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h index 7547caf2f2ab..ea206a5a27df 100644 --- a/fs/xfs/xfs_inode.h +++ b/fs/xfs/xfs_inode.h @@ -490,13 +490,16 @@ int=09=09xfs_rename(struct mnt_idmap *idmap, =09=09=09 struct xfs_name *target_name, =09=09=09 struct xfs_inode *target_ip, unsigned int flags); =20 -void=09=09xfs_ilock(xfs_inode_t *, uint); -int=09=09xfs_ilock_nowait(xfs_inode_t *, uint); -void=09=09xfs_iunlock(xfs_inode_t *, uint); -void=09=09xfs_ilock_demote(xfs_inode_t *, uint); -bool=09=09xfs_isilocked(struct xfs_inode *, uint); -uint=09=09xfs_ilock_data_map_shared(struct xfs_inode *); -uint=09=09xfs_ilock_attr_map_shared(struct xfs_inode *); +void=09=09xfs_ilock(struct xfs_inode *ip, uint lockmode); +int=09=09xfs_ilock_nowait(struct xfs_inode *ip, uint lockmode); +void=09=09xfs_iunlock(struct xfs_inode *ip, uint lockmode); +void=09=09xfs_ilock_demote(struct xfs_inode *ip, uint lockmode); +bool=09=09xfs_isilocked(struct xfs_inode *ip, uint lockmode); +uint=09=09xfs_ilock_data_map_shared(struct xfs_inode *ip); +uint=09=09xfs_ilock_data_map_shared_nowait(struct xfs_inode *ip); +int=09=09xfs_ilock_data_map_shared_generic(struct xfs_inode *ip, +=09=09=09=09=09=09 bool nowait); +uint=09=09xfs_ilock_attr_map_shared(struct xfs_inode *ip); =20 uint=09=09xfs_ip2xflags(struct xfs_inode *); int=09=09xfs_ifree(struct xfs_trans *, struct xfs_inode *); --=20 2.25.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.sourceforge.net (lists.sourceforge.net [216.105.38.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 69D0AC83F01 for ; Sun, 27 Aug 2023 13:31:05 +0000 (UTC) Received: from [127.0.0.1] (helo=sfs-ml-4.v29.lw.sourceforge.com) by sfs-ml-4.v29.lw.sourceforge.com with esmtp (Exim 4.95) (envelope-from ) id 1qaFr6-0008Sl-7R; Sun, 27 Aug 2023 13:31:04 +0000 Received: from [172.30.20.202] (helo=mx.sourceforge.net) by sfs-ml-4.v29.lw.sourceforge.com with esmtps (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.95) (envelope-from ) id 1qaFr4-0008Se-Rp for linux-f2fs-devel@lists.sourceforge.net; Sun, 27 Aug 2023 13:31:03 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=sourceforge.net; s=x; h=Content-Transfer-Encoding:MIME-Version:References: In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To:Content-Type: Content-ID:Content-Description:Resent-Date:Resent-From:Resent-Sender: Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help:List-Unsubscribe: List-Subscribe:List-Post:List-Owner:List-Archive; bh=hTZJGzowdwZotMVwWddSwVfC1FzrP/Lkz2Y0ayQQQXA=; b=gQlAqdwpwXVrA4nulu6i666kwX /UsL44xrpiOcOUkJuBxtjPx2ujK/+DTonGw3/jBkQtiuGdDkvZsZfpiRvcAzk1ZByCSBcWkhmQAhB WpkUIA2R7720jhfbqmANa8M6tKWvXdgqvgkJq8FeZFaZ+H18v1pmRArtRx4zZ4lw/XX0=; DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=sf.net; s=x ; h=Content-Transfer-Encoding:MIME-Version:References:In-Reply-To:Message-Id: Date:Subject:Cc:To:From:Sender:Reply-To:Content-Type:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=hTZJGzowdwZotMVwWddSwVfC1FzrP/Lkz2Y0ayQQQXA=; b=PvH3jbtCTcf0RgPl6Q/HzqlQ2V ruQMUa8jyHuCqBDLcfgt1WM3L9UePl2JrbaoTnPeUSL7HphGSFI2wnoSrjjl4KSRaWkn808to/xCm PDqZTWFtw3YdFBHxXK0OJiDMCh83seDweZr15VYg093kDJKxX+Ze+0mfRCtJjmdA6O4k=; Received: from out-249.mta1.migadu.com ([95.215.58.249]) by sfi-mx-2.v28.lw.sourceforge.com with esmtps (TLS1.2:ECDHE-RSA-AES256-GCM-SHA384:256) (Exim 4.95) id 1qaFr4-0005co-3r for linux-f2fs-devel@lists.sourceforge.net; Sun, 27 Aug 2023 13:31:03 +0000 X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1693143055; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=hTZJGzowdwZotMVwWddSwVfC1FzrP/Lkz2Y0ayQQQXA=; b=L3dZ2wnhLl3MpFT8Fx3GtXFd2RCwphC3uDYko8Q0nKz7ONNiGRoAamaCC5cUw2SONEU+W+ FNti/TRpT0IYzrVdjkthri+O6/3aV24bEu1xrJpC4BygFoLKHcj9ftj4kgaNqPcbLz1Ibc FihN2zv2jXx86v+rCxyuW2cfhV75bQE= From: Hao Xu To: io-uring@vger.kernel.org, Jens Axboe Date: Sun, 27 Aug 2023 21:28:26 +0800 Message-Id: <20230827132835.1373581-3-hao.xu@linux.dev> In-Reply-To: <20230827132835.1373581-1-hao.xu@linux.dev> References: <20230827132835.1373581-1-hao.xu@linux.dev> MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Headers-End: 1qaFr4-0005co-3r Subject: [f2fs-dev] [PATCH 02/11] xfs: add NOWAIT semantics for readdir X-BeenThere: linux-f2fs-devel@lists.sourceforge.net X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Wanpeng Li , "Darrick J . Wong" , Dominique Martinet , Dave Chinner , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Stefan Roesch , Clay Harris , linux-s390@vger.kernel.org, linux-nilfs@vger.kernel.org, codalist@coda.cs.cmu.edu, cluster-devel@redhat.com, linux-cachefs@redhat.com, linux-ext4@vger.kernel.org, devel@lists.orangefs.org, linux-cifs@vger.kernel.org, ecryptfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-block@vger.kernel.org, Alexander Viro , Christian Brauner , netdev@vger.kernel.org, samba-technical@lists.samba.org, linux-unionfs@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mtd@lists.infradead.org, bpf@vger.kernel.org, Pavel Begunkov , linux-btrfs@vger.kernel.org Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: linux-f2fs-devel-bounces@lists.sourceforge.net From: Hao Xu Implement NOWAIT semantics for readdir. Return EAGAIN error to the caller if it would block, like failing to get locks, or going to do IO. Co-developed-by: Dave Chinner Signed-off-by: Dave Chinner Signed-off-by: Hao Xu [fixes deadlock issue, tweak code style] --- fs/xfs/libxfs/xfs_da_btree.c | 16 +++++++++++ fs/xfs/libxfs/xfs_da_btree.h | 1 + fs/xfs/libxfs/xfs_dir2_block.c | 7 ++--- fs/xfs/libxfs/xfs_dir2_priv.h | 2 +- fs/xfs/scrub/dir.c | 2 +- fs/xfs/scrub/readdir.c | 2 +- fs/xfs/xfs_dir2_readdir.c | 49 ++++++++++++++++++++++++++-------- fs/xfs/xfs_inode.c | 27 +++++++++++++++++++ fs/xfs/xfs_inode.h | 17 +++++++----- 9 files changed, 99 insertions(+), 24 deletions(-) diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c index e576560b46e9..7a1a0af24197 100644 --- a/fs/xfs/libxfs/xfs_da_btree.c +++ b/fs/xfs/libxfs/xfs_da_btree.c @@ -2643,16 +2643,32 @@ xfs_da_read_buf( struct xfs_buf_map map, *mapp = ↦ int nmap = 1; int error; + int buf_flags = 0; *bpp = NULL; error = xfs_dabuf_map(dp, bno, flags, whichfork, &mapp, &nmap); if (error || !nmap) goto out_free; + /* + * NOWAIT semantics mean we don't wait on the buffer lock nor do we + * issue IO for this buffer if it is not already in memory. Caller will + * retry. This will return -EAGAIN if the buffer is in memory and cannot + * be locked, and no buffer and no error if it isn't in memory. We + * translate both of those into a return state of -EAGAIN and *bpp = + * NULL. + */ + if (flags & XFS_DABUF_NOWAIT) + buf_flags |= XBF_TRYLOCK | XBF_INCORE; error = xfs_trans_read_buf_map(mp, tp, mp->m_ddev_targp, mapp, nmap, 0, &bp, ops); if (error) goto out_free; + if (!bp) { + ASSERT(flags & XFS_DABUF_NOWAIT); + error = -EAGAIN; + goto out_free; + } if (whichfork == XFS_ATTR_FORK) xfs_buf_set_ref(bp, XFS_ATTR_BTREE_REF); diff --git a/fs/xfs/libxfs/xfs_da_btree.h b/fs/xfs/libxfs/xfs_da_btree.h index ffa3df5b2893..32e7b1cca402 100644 --- a/fs/xfs/libxfs/xfs_da_btree.h +++ b/fs/xfs/libxfs/xfs_da_btree.h @@ -205,6 +205,7 @@ int xfs_da3_node_read_mapped(struct xfs_trans *tp, struct xfs_inode *dp, */ #define XFS_DABUF_MAP_HOLE_OK (1u << 0) +#define XFS_DABUF_NOWAIT (1u << 1) int xfs_da_grow_inode(xfs_da_args_t *args, xfs_dablk_t *new_blkno); int xfs_da_grow_inode_int(struct xfs_da_args *args, xfs_fileoff_t *bno, diff --git a/fs/xfs/libxfs/xfs_dir2_block.c b/fs/xfs/libxfs/xfs_dir2_block.c index 00f960a703b2..59b24a594add 100644 --- a/fs/xfs/libxfs/xfs_dir2_block.c +++ b/fs/xfs/libxfs/xfs_dir2_block.c @@ -135,13 +135,14 @@ int xfs_dir3_block_read( struct xfs_trans *tp, struct xfs_inode *dp, + unsigned int flags, struct xfs_buf **bpp) { struct xfs_mount *mp = dp->i_mount; xfs_failaddr_t fa; int err; - err = xfs_da_read_buf(tp, dp, mp->m_dir_geo->datablk, 0, bpp, + err = xfs_da_read_buf(tp, dp, mp->m_dir_geo->datablk, flags, bpp, XFS_DATA_FORK, &xfs_dir3_block_buf_ops); if (err || !*bpp) return err; @@ -380,7 +381,7 @@ xfs_dir2_block_addname( tp = args->trans; /* Read the (one and only) directory block into bp. */ - error = xfs_dir3_block_read(tp, dp, &bp); + error = xfs_dir3_block_read(tp, dp, 0, &bp); if (error) return error; @@ -695,7 +696,7 @@ xfs_dir2_block_lookup_int( dp = args->dp; tp = args->trans; - error = xfs_dir3_block_read(tp, dp, &bp); + error = xfs_dir3_block_read(tp, dp, 0, &bp); if (error) return error; diff --git a/fs/xfs/libxfs/xfs_dir2_priv.h b/fs/xfs/libxfs/xfs_dir2_priv.h index 7404a9ff1a92..7d4cf8a0f15b 100644 --- a/fs/xfs/libxfs/xfs_dir2_priv.h +++ b/fs/xfs/libxfs/xfs_dir2_priv.h @@ -51,7 +51,7 @@ extern int xfs_dir_cilookup_result(struct xfs_da_args *args, /* xfs_dir2_block.c */ extern int xfs_dir3_block_read(struct xfs_trans *tp, struct xfs_inode *dp, - struct xfs_buf **bpp); + unsigned int flags, struct xfs_buf **bpp); extern int xfs_dir2_block_addname(struct xfs_da_args *args); extern int xfs_dir2_block_lookup(struct xfs_da_args *args); extern int xfs_dir2_block_removename(struct xfs_da_args *args); diff --git a/fs/xfs/scrub/dir.c b/fs/xfs/scrub/dir.c index 0b491784b759..5cc51f201bd7 100644 --- a/fs/xfs/scrub/dir.c +++ b/fs/xfs/scrub/dir.c @@ -313,7 +313,7 @@ xchk_directory_data_bestfree( /* dir block format */ if (lblk != XFS_B_TO_FSBT(mp, XFS_DIR2_DATA_OFFSET)) xchk_fblock_set_corrupt(sc, XFS_DATA_FORK, lblk); - error = xfs_dir3_block_read(sc->tp, sc->ip, &bp); + error = xfs_dir3_block_read(sc->tp, sc->ip, 0, &bp); } else { /* dir data format */ error = xfs_dir3_data_read(sc->tp, sc->ip, lblk, 0, &bp); diff --git a/fs/xfs/scrub/readdir.c b/fs/xfs/scrub/readdir.c index e51c1544be63..f0a727311632 100644 --- a/fs/xfs/scrub/readdir.c +++ b/fs/xfs/scrub/readdir.c @@ -101,7 +101,7 @@ xchk_dir_walk_block( unsigned int off, next_off, end; int error; - error = xfs_dir3_block_read(sc->tp, dp, &bp); + error = xfs_dir3_block_read(sc->tp, dp, 0, &bp); if (error) return error; diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c index 9f3ceb461515..dcdbd26e0402 100644 --- a/fs/xfs/xfs_dir2_readdir.c +++ b/fs/xfs/xfs_dir2_readdir.c @@ -149,6 +149,7 @@ xfs_dir2_block_getdents( struct xfs_da_geometry *geo = args->geo; unsigned int offset, next_offset; unsigned int end; + unsigned int flags = 0; /* * If the block number in the offset is out of range, we're done. @@ -156,7 +157,9 @@ xfs_dir2_block_getdents( if (xfs_dir2_dataptr_to_db(geo, ctx->pos) > geo->datablk) return 0; - error = xfs_dir3_block_read(args->trans, dp, &bp); + if (ctx->flags & DIR_CONTEXT_F_NOWAIT) + flags |= XFS_DABUF_NOWAIT; + error = xfs_dir3_block_read(args->trans, dp, flags, &bp); if (error) return error; @@ -240,6 +243,7 @@ xfs_dir2_block_getdents( STATIC int xfs_dir2_leaf_readbuf( struct xfs_da_args *args, + struct dir_context *ctx, size_t bufsize, xfs_dir2_off_t *cur_off, xfs_dablk_t *ra_blk, @@ -258,10 +262,15 @@ xfs_dir2_leaf_readbuf( struct xfs_iext_cursor icur; int ra_want; int error = 0; - - error = xfs_iread_extents(args->trans, dp, XFS_DATA_FORK); - if (error) - goto out; + unsigned int flags = 0; + + if (ctx->flags & DIR_CONTEXT_F_NOWAIT) { + flags |= XFS_DABUF_NOWAIT; + } else { + error = xfs_iread_extents(args->trans, dp, XFS_DATA_FORK); + if (error) + goto out; + } /* * Look for mapped directory blocks at or above the current offset. @@ -280,7 +289,7 @@ xfs_dir2_leaf_readbuf( new_off = xfs_dir2_da_to_byte(geo, map.br_startoff); if (new_off > *cur_off) *cur_off = new_off; - error = xfs_dir3_data_read(args->trans, dp, map.br_startoff, 0, &bp); + error = xfs_dir3_data_read(args->trans, dp, map.br_startoff, flags, &bp); if (error) goto out; @@ -360,6 +369,7 @@ xfs_dir2_leaf_getdents( int byteoff; /* offset in current block */ unsigned int offset = 0; int error = 0; /* error return value */ + int written = 0; /* * If the offset is at or past the largest allowed value, @@ -391,10 +401,17 @@ xfs_dir2_leaf_getdents( bp = NULL; } - if (*lock_mode == 0) - *lock_mode = xfs_ilock_data_map_shared(dp); - error = xfs_dir2_leaf_readbuf(args, bufsize, &curoff, - &rablk, &bp); + if (*lock_mode == 0) { + *lock_mode = + xfs_ilock_data_map_shared_generic(dp, + ctx->flags & DIR_CONTEXT_F_NOWAIT); + if (!*lock_mode) { + error = -EAGAIN; + break; + } + } + error = xfs_dir2_leaf_readbuf(args, ctx, bufsize, + &curoff, &rablk, &bp); if (error || !bp) break; @@ -479,6 +496,7 @@ xfs_dir2_leaf_getdents( */ offset += length; curoff += length; + written += length; /* bufsize may have just been a guess; don't go negative */ bufsize = bufsize > length ? bufsize - length : 0; } @@ -492,6 +510,8 @@ xfs_dir2_leaf_getdents( ctx->pos = xfs_dir2_byte_to_dataptr(curoff) & 0x7fffffff; if (bp) xfs_trans_brelse(args->trans, bp); + if (error == -EAGAIN && written > 0) + error = 0; return error; } @@ -514,6 +534,7 @@ xfs_readdir( unsigned int lock_mode; bool isblock; int error; + bool nowait; trace_xfs_readdir(dp); @@ -531,7 +552,11 @@ xfs_readdir( if (dp->i_df.if_format == XFS_DINODE_FMT_LOCAL) return xfs_dir2_sf_getdents(&args, ctx); - lock_mode = xfs_ilock_data_map_shared(dp); + nowait = ctx->flags & DIR_CONTEXT_F_NOWAIT; + lock_mode = xfs_ilock_data_map_shared_generic(dp, nowait); + if (!lock_mode) + return -EAGAIN; + error = xfs_dir2_isblock(&args, &isblock); if (error) goto out_unlock; @@ -546,5 +571,7 @@ xfs_readdir( out_unlock: if (lock_mode) xfs_iunlock(dp, lock_mode); + if (error == -EAGAIN) + ASSERT(nowait); return error; } diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 9e62cc500140..d088f7d0c23a 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -120,6 +120,33 @@ xfs_ilock_data_map_shared( return lock_mode; } +/* + * Similar to xfs_ilock_data_map_shared(), except that it will only try to lock + * the inode in shared mode if the extents are already in memory. If it fails to + * get the lock or has to do IO to read the extent list, fail the operation by + * returning 0 as the lock mode. + */ +uint +xfs_ilock_data_map_shared_nowait( + struct xfs_inode *ip) +{ + if (xfs_need_iread_extents(&ip->i_df)) + return 0; + if (!xfs_ilock_nowait(ip, XFS_ILOCK_SHARED)) + return 0; + return XFS_ILOCK_SHARED; +} + +int +xfs_ilock_data_map_shared_generic( + struct xfs_inode *dp, + bool nowait) +{ + if (nowait) + return xfs_ilock_data_map_shared_nowait(dp); + return xfs_ilock_data_map_shared(dp); +} + uint xfs_ilock_attr_map_shared( struct xfs_inode *ip) diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h index 7547caf2f2ab..ea206a5a27df 100644 --- a/fs/xfs/xfs_inode.h +++ b/fs/xfs/xfs_inode.h @@ -490,13 +490,16 @@ int xfs_rename(struct mnt_idmap *idmap, struct xfs_name *target_name, struct xfs_inode *target_ip, unsigned int flags); -void xfs_ilock(xfs_inode_t *, uint); -int xfs_ilock_nowait(xfs_inode_t *, uint); -void xfs_iunlock(xfs_inode_t *, uint); -void xfs_ilock_demote(xfs_inode_t *, uint); -bool xfs_isilocked(struct xfs_inode *, uint); -uint xfs_ilock_data_map_shared(struct xfs_inode *); -uint xfs_ilock_attr_map_shared(struct xfs_inode *); +void xfs_ilock(struct xfs_inode *ip, uint lockmode); +int xfs_ilock_nowait(struct xfs_inode *ip, uint lockmode); +void xfs_iunlock(struct xfs_inode *ip, uint lockmode); +void xfs_ilock_demote(struct xfs_inode *ip, uint lockmode); +bool xfs_isilocked(struct xfs_inode *ip, uint lockmode); +uint xfs_ilock_data_map_shared(struct xfs_inode *ip); +uint xfs_ilock_data_map_shared_nowait(struct xfs_inode *ip); +int xfs_ilock_data_map_shared_generic(struct xfs_inode *ip, + bool nowait); +uint xfs_ilock_attr_map_shared(struct xfs_inode *ip); uint xfs_ip2xflags(struct xfs_inode *); int xfs_ifree(struct xfs_trans *, struct xfs_inode *); -- 2.25.1 _______________________________________________ Linux-f2fs-devel mailing list Linux-f2fs-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel From mboxrd@z Thu Jan 1 00:00:00 1970 From: Hao Xu Subject: [PATCH 02/11] xfs: add NOWAIT semantics for readdir Date: Sun, 27 Aug 2023 21:28:26 +0800 Message-ID: <20230827132835.1373581-3-hao.xu@linux.dev> References: <20230827132835.1373581-1-hao.xu@linux.dev> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=sourceforge.net; s=x; h=Content-Transfer-Encoding:MIME-Version:References: In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To:Content-Type: Content-ID:Content-Description:Resent-Date:Resent-From:Resent-Sender: Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help:List-Unsubscribe: List-Subscribe:List-Post:List-Owner:List-Archive; bh=hTZJGzowdwZotMVwWddSwVfC1FzrP/Lkz2Y0ayQQQXA=; b=gQlAqdwpwXVrA4nulu6i666kwX /UsL44xrpiOcOUkJuBxtjPx2ujK/+DTonGw3/jBkQtiuGdDkvZsZfpiRvcAzk1ZByCSBcWkhmQAhB WpkUIA2R7720jhfbqmANa8M6tKWvXdgqvgkJq8FeZFaZ+H18v1pmRArtRx4zZ4lw/XX0=; DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=sf.net; s=x ; h=Content-Transfer-Encoding:MIME-Version:References:In-Reply-To:Message-Id: Date:Subject:Cc:To:From:Sender:Reply-To:Content-Type:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=hTZJGzowdwZotMVwWddSwVfC1FzrP/Lkz2Y0ayQQQXA=; b=PvH3jbtCTcf0RgPl6Q/HzqlQ2V ruQMUa8jyHuCqBDLcfgt1WM3L9UePl2JrbaoTnPeUSL7HphGSFI2wnoSrjjl4KSRaWkn808to/xCm PDqZTWFtw3YdFBHxXK0OJiDMCh83seDweZr15VYg093kDJKxX+Ze+0mfRCtJjmdA6O4k=; DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1693143055; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=hTZJGzowdwZotMVwWddSwVfC1FzrP/Lkz2Y0ayQQQXA=; b=L3dZ2wnhLl3MpFT8Fx3GtXFd2RCwphC3uDYko8Q0nKz7ONNiGRoAamaCC5cUw2SONEU+W+ FNti/TRpT0IYzrVdjkthri+O6/3aV24bEu1xrJpC4BygFoLKHcj9ftj4kgaNqPcbLz1Ibc FihN2zv2jXx86v+rCxyuW2cfhV75bQE= In-Reply-To: <20230827132835.1373581-1-hao.xu@linux.dev> List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: linux-f2fs-devel-bounces@lists.sourceforge.net To: io-uring@vger.kernel.org, Jens Axboe Cc: Wanpeng Li , "Darrick J . Wong" , Dominique Martinet , Dave Chinner , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Stefan Roesch , Clay Harris , linux-s390@vger.kernel.org, linux-nilfs@vger.kernel.org, codalist@coda.cs.cmu.edu, cluster-devel@redhat.com, linux-cachefs@redhat.com, linux-ext4@vger.kernel.org, devel@lists.orangefs.org, linux-cifs@vger.kernel.org, ecryptfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-block@vger.kernel.org, Alexander Viro , Christian Brauner , netdev@vger.kernel.org, samba-technical@lists.samba.org, linux-unionfs@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mtd@lists.infradead.org, bpf@vger.kernel.o From: Hao Xu Implement NOWAIT semantics for readdir. Return EAGAIN error to the caller if it would block, like failing to get locks, or going to do IO. Co-developed-by: Dave Chinner Signed-off-by: Dave Chinner Signed-off-by: Hao Xu [fixes deadlock issue, tweak code style] --- fs/xfs/libxfs/xfs_da_btree.c | 16 +++++++++++ fs/xfs/libxfs/xfs_da_btree.h | 1 + fs/xfs/libxfs/xfs_dir2_block.c | 7 ++--- fs/xfs/libxfs/xfs_dir2_priv.h | 2 +- fs/xfs/scrub/dir.c | 2 +- fs/xfs/scrub/readdir.c | 2 +- fs/xfs/xfs_dir2_readdir.c | 49 ++++++++++++++++++++++++++-------- fs/xfs/xfs_inode.c | 27 +++++++++++++++++++ fs/xfs/xfs_inode.h | 17 +++++++----- 9 files changed, 99 insertions(+), 24 deletions(-) diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c index e576560b46e9..7a1a0af24197 100644 --- a/fs/xfs/libxfs/xfs_da_btree.c +++ b/fs/xfs/libxfs/xfs_da_btree.c @@ -2643,16 +2643,32 @@ xfs_da_read_buf( struct xfs_buf_map map, *mapp = ↦ int nmap = 1; int error; + int buf_flags = 0; *bpp = NULL; error = xfs_dabuf_map(dp, bno, flags, whichfork, &mapp, &nmap); if (error || !nmap) goto out_free; + /* + * NOWAIT semantics mean we don't wait on the buffer lock nor do we + * issue IO for this buffer if it is not already in memory. Caller will + * retry. This will return -EAGAIN if the buffer is in memory and cannot + * be locked, and no buffer and no error if it isn't in memory. We + * translate both of those into a return state of -EAGAIN and *bpp = + * NULL. + */ + if (flags & XFS_DABUF_NOWAIT) + buf_flags |= XBF_TRYLOCK | XBF_INCORE; error = xfs_trans_read_buf_map(mp, tp, mp->m_ddev_targp, mapp, nmap, 0, &bp, ops); if (error) goto out_free; + if (!bp) { + ASSERT(flags & XFS_DABUF_NOWAIT); + error = -EAGAIN; + goto out_free; + } if (whichfork == XFS_ATTR_FORK) xfs_buf_set_ref(bp, XFS_ATTR_BTREE_REF); diff --git a/fs/xfs/libxfs/xfs_da_btree.h b/fs/xfs/libxfs/xfs_da_btree.h index ffa3df5b2893..32e7b1cca402 100644 --- a/fs/xfs/libxfs/xfs_da_btree.h +++ b/fs/xfs/libxfs/xfs_da_btree.h @@ -205,6 +205,7 @@ int xfs_da3_node_read_mapped(struct xfs_trans *tp, struct xfs_inode *dp, */ #define XFS_DABUF_MAP_HOLE_OK (1u << 0) +#define XFS_DABUF_NOWAIT (1u << 1) int xfs_da_grow_inode(xfs_da_args_t *args, xfs_dablk_t *new_blkno); int xfs_da_grow_inode_int(struct xfs_da_args *args, xfs_fileoff_t *bno, diff --git a/fs/xfs/libxfs/xfs_dir2_block.c b/fs/xfs/libxfs/xfs_dir2_block.c index 00f960a703b2..59b24a594add 100644 --- a/fs/xfs/libxfs/xfs_dir2_block.c +++ b/fs/xfs/libxfs/xfs_dir2_block.c @@ -135,13 +135,14 @@ int xfs_dir3_block_read( struct xfs_trans *tp, struct xfs_inode *dp, + unsigned int flags, struct xfs_buf **bpp) { struct xfs_mount *mp = dp->i_mount; xfs_failaddr_t fa; int err; - err = xfs_da_read_buf(tp, dp, mp->m_dir_geo->datablk, 0, bpp, + err = xfs_da_read_buf(tp, dp, mp->m_dir_geo->datablk, flags, bpp, XFS_DATA_FORK, &xfs_dir3_block_buf_ops); if (err || !*bpp) return err; @@ -380,7 +381,7 @@ xfs_dir2_block_addname( tp = args->trans; /* Read the (one and only) directory block into bp. */ - error = xfs_dir3_block_read(tp, dp, &bp); + error = xfs_dir3_block_read(tp, dp, 0, &bp); if (error) return error; @@ -695,7 +696,7 @@ xfs_dir2_block_lookup_int( dp = args->dp; tp = args->trans; - error = xfs_dir3_block_read(tp, dp, &bp); + error = xfs_dir3_block_read(tp, dp, 0, &bp); if (error) return error; diff --git a/fs/xfs/libxfs/xfs_dir2_priv.h b/fs/xfs/libxfs/xfs_dir2_priv.h index 7404a9ff1a92..7d4cf8a0f15b 100644 --- a/fs/xfs/libxfs/xfs_dir2_priv.h +++ b/fs/xfs/libxfs/xfs_dir2_priv.h @@ -51,7 +51,7 @@ extern int xfs_dir_cilookup_result(struct xfs_da_args *args, /* xfs_dir2_block.c */ extern int xfs_dir3_block_read(struct xfs_trans *tp, struct xfs_inode *dp, - struct xfs_buf **bpp); + unsigned int flags, struct xfs_buf **bpp); extern int xfs_dir2_block_addname(struct xfs_da_args *args); extern int xfs_dir2_block_lookup(struct xfs_da_args *args); extern int xfs_dir2_block_removename(struct xfs_da_args *args); diff --git a/fs/xfs/scrub/dir.c b/fs/xfs/scrub/dir.c index 0b491784b759..5cc51f201bd7 100644 --- a/fs/xfs/scrub/dir.c +++ b/fs/xfs/scrub/dir.c @@ -313,7 +313,7 @@ xchk_directory_data_bestfree( /* dir block format */ if (lblk != XFS_B_TO_FSBT(mp, XFS_DIR2_DATA_OFFSET)) xchk_fblock_set_corrupt(sc, XFS_DATA_FORK, lblk); - error = xfs_dir3_block_read(sc->tp, sc->ip, &bp); + error = xfs_dir3_block_read(sc->tp, sc->ip, 0, &bp); } else { /* dir data format */ error = xfs_dir3_data_read(sc->tp, sc->ip, lblk, 0, &bp); diff --git a/fs/xfs/scrub/readdir.c b/fs/xfs/scrub/readdir.c index e51c1544be63..f0a727311632 100644 --- a/fs/xfs/scrub/readdir.c +++ b/fs/xfs/scrub/readdir.c @@ -101,7 +101,7 @@ xchk_dir_walk_block( unsigned int off, next_off, end; int error; - error = xfs_dir3_block_read(sc->tp, dp, &bp); + error = xfs_dir3_block_read(sc->tp, dp, 0, &bp); if (error) return error; diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c index 9f3ceb461515..dcdbd26e0402 100644 --- a/fs/xfs/xfs_dir2_readdir.c +++ b/fs/xfs/xfs_dir2_readdir.c @@ -149,6 +149,7 @@ xfs_dir2_block_getdents( struct xfs_da_geometry *geo = args->geo; unsigned int offset, next_offset; unsigned int end; + unsigned int flags = 0; /* * If the block number in the offset is out of range, we're done. @@ -156,7 +157,9 @@ xfs_dir2_block_getdents( if (xfs_dir2_dataptr_to_db(geo, ctx->pos) > geo->datablk) return 0; - error = xfs_dir3_block_read(args->trans, dp, &bp); + if (ctx->flags & DIR_CONTEXT_F_NOWAIT) + flags |= XFS_DABUF_NOWAIT; + error = xfs_dir3_block_read(args->trans, dp, flags, &bp); if (error) return error; @@ -240,6 +243,7 @@ xfs_dir2_block_getdents( STATIC int xfs_dir2_leaf_readbuf( struct xfs_da_args *args, + struct dir_context *ctx, size_t bufsize, xfs_dir2_off_t *cur_off, xfs_dablk_t *ra_blk, @@ -258,10 +262,15 @@ xfs_dir2_leaf_readbuf( struct xfs_iext_cursor icur; int ra_want; int error = 0; - - error = xfs_iread_extents(args->trans, dp, XFS_DATA_FORK); - if (error) - goto out; + unsigned int flags = 0; + + if (ctx->flags & DIR_CONTEXT_F_NOWAIT) { + flags |= XFS_DABUF_NOWAIT; + } else { + error = xfs_iread_extents(args->trans, dp, XFS_DATA_FORK); + if (error) + goto out; + } /* * Look for mapped directory blocks at or above the current offset. @@ -280,7 +289,7 @@ xfs_dir2_leaf_readbuf( new_off = xfs_dir2_da_to_byte(geo, map.br_startoff); if (new_off > *cur_off) *cur_off = new_off; - error = xfs_dir3_data_read(args->trans, dp, map.br_startoff, 0, &bp); + error = xfs_dir3_data_read(args->trans, dp, map.br_startoff, flags, &bp); if (error) goto out; @@ -360,6 +369,7 @@ xfs_dir2_leaf_getdents( int byteoff; /* offset in current block */ unsigned int offset = 0; int error = 0; /* error return value */ + int written = 0; /* * If the offset is at or past the largest allowed value, @@ -391,10 +401,17 @@ xfs_dir2_leaf_getdents( bp = NULL; } - if (*lock_mode == 0) - *lock_mode = xfs_ilock_data_map_shared(dp); - error = xfs_dir2_leaf_readbuf(args, bufsize, &curoff, - &rablk, &bp); + if (*lock_mode == 0) { + *lock_mode = + xfs_ilock_data_map_shared_generic(dp, + ctx->flags & DIR_CONTEXT_F_NOWAIT); + if (!*lock_mode) { + error = -EAGAIN; + break; + } + } + error = xfs_dir2_leaf_readbuf(args, ctx, bufsize, + &curoff, &rablk, &bp); if (error || !bp) break; @@ -479,6 +496,7 @@ xfs_dir2_leaf_getdents( */ offset += length; curoff += length; + written += length; /* bufsize may have just been a guess; don't go negative */ bufsize = bufsize > length ? bufsize - length : 0; } @@ -492,6 +510,8 @@ xfs_dir2_leaf_getdents( ctx->pos = xfs_dir2_byte_to_dataptr(curoff) & 0x7fffffff; if (bp) xfs_trans_brelse(args->trans, bp); + if (error == -EAGAIN && written > 0) + error = 0; return error; } @@ -514,6 +534,7 @@ xfs_readdir( unsigned int lock_mode; bool isblock; int error; + bool nowait; trace_xfs_readdir(dp); @@ -531,7 +552,11 @@ xfs_readdir( if (dp->i_df.if_format == XFS_DINODE_FMT_LOCAL) return xfs_dir2_sf_getdents(&args, ctx); - lock_mode = xfs_ilock_data_map_shared(dp); + nowait = ctx->flags & DIR_CONTEXT_F_NOWAIT; + lock_mode = xfs_ilock_data_map_shared_generic(dp, nowait); + if (!lock_mode) + return -EAGAIN; + error = xfs_dir2_isblock(&args, &isblock); if (error) goto out_unlock; @@ -546,5 +571,7 @@ xfs_readdir( out_unlock: if (lock_mode) xfs_iunlock(dp, lock_mode); + if (error == -EAGAIN) + ASSERT(nowait); return error; } diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 9e62cc500140..d088f7d0c23a 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -120,6 +120,33 @@ xfs_ilock_data_map_shared( return lock_mode; } +/* + * Similar to xfs_ilock_data_map_shared(), except that it will only try to lock + * the inode in shared mode if the extents are already in memory. If it fails to + * get the lock or has to do IO to read the extent list, fail the operation by + * returning 0 as the lock mode. + */ +uint +xfs_ilock_data_map_shared_nowait( + struct xfs_inode *ip) +{ + if (xfs_need_iread_extents(&ip->i_df)) + return 0; + if (!xfs_ilock_nowait(ip, XFS_ILOCK_SHARED)) + return 0; + return XFS_ILOCK_SHARED; +} + +int +xfs_ilock_data_map_shared_generic( + struct xfs_inode *dp, + bool nowait) +{ + if (nowait) + return xfs_ilock_data_map_shared_nowait(dp); + return xfs_ilock_data_map_shared(dp); +} + uint xfs_ilock_attr_map_shared( struct xfs_inode *ip) diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h index 7547caf2f2ab..ea206a5a27df 100644 --- a/fs/xfs/xfs_inode.h +++ b/fs/xfs/xfs_inode.h @@ -490,13 +490,16 @@ int xfs_rename(struct mnt_idmap *idmap, struct xfs_name *target_name, struct xfs_inode *target_ip, unsigned int flags); -void xfs_ilock(xfs_inode_t *, uint); -int xfs_ilock_nowait(xfs_inode_t *, uint); -void xfs_iunlock(xfs_inode_t *, uint); -void xfs_ilock_demote(xfs_inode_t *, uint); -bool xfs_isilocked(struct xfs_inode *, uint); -uint xfs_ilock_data_map_shared(struct xfs_inode *); -uint xfs_ilock_attr_map_shared(struct xfs_inode *); +void xfs_ilock(struct xfs_inode *ip, uint lockmode); +int xfs_ilock_nowait(struct xfs_inode *ip, uint lockmode); +void xfs_iunlock(struct xfs_inode *ip, uint lockmode); +void xfs_ilock_demote(struct xfs_inode *ip, uint lockmode); +bool xfs_isilocked(struct xfs_inode *ip, uint lockmode); +uint xfs_ilock_data_map_shared(struct xfs_inode *ip); +uint xfs_ilock_data_map_shared_nowait(struct xfs_inode *ip); +int xfs_ilock_data_map_shared_generic(struct xfs_inode *ip, + bool nowait); +uint xfs_ilock_attr_map_shared(struct xfs_inode *ip); uint xfs_ip2xflags(struct xfs_inode *); int xfs_ifree(struct xfs_trans *, struct xfs_inode *); -- 2.25.1 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A9F64C83F11 for ; Sun, 27 Aug 2023 13:31:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=veI1UPhcrWFU87J9PFZartHNacQHPEv9kZNVGyDIzlk=; b=zpq3DgKjcqgJsY jKIN+McJ/Vkpmo1vW7FooHgRLErfvDtPgnt/nenzc8p3xRqdKYHpb0L3O81NagMlQGaHGXV38IJiT SUjzmfTJ9jE6XptG49qMkkezNz51JK937ubL6vvtvXCOfW1ZfWhdfq81X7W+kt8D/3KQkfewn5wLX gkVuD6H1PQVxRkTVM2g1sr6AxBdtOJmSs8FcqWLq+WX6rHdoAh/8L0hFaLJlG9k8Mkcns/G4CVuJg etLLtvseKvEFqhdChuiTCyBG8tmqSQNcIaJtOxUkOgaD7eJz6QiAd1Z39zOyPDkbIK/2Vw4szQk2t bhd5X0riHNwaA0DBcA3Q==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qaFr3-0085Ow-1i; Sun, 27 Aug 2023 13:31:01 +0000 Received: from out-250.mta1.migadu.com ([2001:41d0:203:375::fa]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1qaFr0-0085Lu-1h for linux-mtd@lists.infradead.org; Sun, 27 Aug 2023 13:31:00 +0000 X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1693143055; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=hTZJGzowdwZotMVwWddSwVfC1FzrP/Lkz2Y0ayQQQXA=; b=L3dZ2wnhLl3MpFT8Fx3GtXFd2RCwphC3uDYko8Q0nKz7ONNiGRoAamaCC5cUw2SONEU+W+ FNti/TRpT0IYzrVdjkthri+O6/3aV24bEu1xrJpC4BygFoLKHcj9ftj4kgaNqPcbLz1Ibc FihN2zv2jXx86v+rCxyuW2cfhV75bQE= From: Hao Xu To: io-uring@vger.kernel.org, Jens Axboe Cc: Dominique Martinet , Pavel Begunkov , Christian Brauner , Alexander Viro , Stefan Roesch , Clay Harris , Dave Chinner , "Darrick J . Wong" , linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-cachefs@redhat.com, ecryptfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-unionfs@vger.kernel.org, bpf@vger.kernel.org, netdev@vger.kernel.org, linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org, codalist@coda.cs.cmu.edu, linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, linux-mm@kvack.org, linux-nilfs@vger.kernel.org, devel@lists.orangefs.org, linux-cifs@vger.kernel.org, samba-technical@lists.samba.org, linux-mtd@lists.infradead.org, Wanpeng Li Subject: [PATCH 02/11] xfs: add NOWAIT semantics for readdir Date: Sun, 27 Aug 2023 21:28:26 +0800 Message-Id: <20230827132835.1373581-3-hao.xu@linux.dev> In-Reply-To: <20230827132835.1373581-1-hao.xu@linux.dev> References: <20230827132835.1373581-1-hao.xu@linux.dev> MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230827_063058_708125_89E9B330 X-CRM114-Status: GOOD ( 29.31 ) X-BeenThere: linux-mtd@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Linux MTD discussion mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-mtd" Errors-To: linux-mtd-bounces+linux-mtd=archiver.kernel.org@lists.infradead.org From: Hao Xu Implement NOWAIT semantics for readdir. Return EAGAIN error to the caller if it would block, like failing to get locks, or going to do IO. Co-developed-by: Dave Chinner Signed-off-by: Dave Chinner Signed-off-by: Hao Xu [fixes deadlock issue, tweak code style] --- fs/xfs/libxfs/xfs_da_btree.c | 16 +++++++++++ fs/xfs/libxfs/xfs_da_btree.h | 1 + fs/xfs/libxfs/xfs_dir2_block.c | 7 ++--- fs/xfs/libxfs/xfs_dir2_priv.h | 2 +- fs/xfs/scrub/dir.c | 2 +- fs/xfs/scrub/readdir.c | 2 +- fs/xfs/xfs_dir2_readdir.c | 49 ++++++++++++++++++++++++++-------- fs/xfs/xfs_inode.c | 27 +++++++++++++++++++ fs/xfs/xfs_inode.h | 17 +++++++----- 9 files changed, 99 insertions(+), 24 deletions(-) diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c index e576560b46e9..7a1a0af24197 100644 --- a/fs/xfs/libxfs/xfs_da_btree.c +++ b/fs/xfs/libxfs/xfs_da_btree.c @@ -2643,16 +2643,32 @@ xfs_da_read_buf( struct xfs_buf_map map, *mapp = ↦ int nmap = 1; int error; + int buf_flags = 0; *bpp = NULL; error = xfs_dabuf_map(dp, bno, flags, whichfork, &mapp, &nmap); if (error || !nmap) goto out_free; + /* + * NOWAIT semantics mean we don't wait on the buffer lock nor do we + * issue IO for this buffer if it is not already in memory. Caller will + * retry. This will return -EAGAIN if the buffer is in memory and cannot + * be locked, and no buffer and no error if it isn't in memory. We + * translate both of those into a return state of -EAGAIN and *bpp = + * NULL. + */ + if (flags & XFS_DABUF_NOWAIT) + buf_flags |= XBF_TRYLOCK | XBF_INCORE; error = xfs_trans_read_buf_map(mp, tp, mp->m_ddev_targp, mapp, nmap, 0, &bp, ops); if (error) goto out_free; + if (!bp) { + ASSERT(flags & XFS_DABUF_NOWAIT); + error = -EAGAIN; + goto out_free; + } if (whichfork == XFS_ATTR_FORK) xfs_buf_set_ref(bp, XFS_ATTR_BTREE_REF); diff --git a/fs/xfs/libxfs/xfs_da_btree.h b/fs/xfs/libxfs/xfs_da_btree.h index ffa3df5b2893..32e7b1cca402 100644 --- a/fs/xfs/libxfs/xfs_da_btree.h +++ b/fs/xfs/libxfs/xfs_da_btree.h @@ -205,6 +205,7 @@ int xfs_da3_node_read_mapped(struct xfs_trans *tp, struct xfs_inode *dp, */ #define XFS_DABUF_MAP_HOLE_OK (1u << 0) +#define XFS_DABUF_NOWAIT (1u << 1) int xfs_da_grow_inode(xfs_da_args_t *args, xfs_dablk_t *new_blkno); int xfs_da_grow_inode_int(struct xfs_da_args *args, xfs_fileoff_t *bno, diff --git a/fs/xfs/libxfs/xfs_dir2_block.c b/fs/xfs/libxfs/xfs_dir2_block.c index 00f960a703b2..59b24a594add 100644 --- a/fs/xfs/libxfs/xfs_dir2_block.c +++ b/fs/xfs/libxfs/xfs_dir2_block.c @@ -135,13 +135,14 @@ int xfs_dir3_block_read( struct xfs_trans *tp, struct xfs_inode *dp, + unsigned int flags, struct xfs_buf **bpp) { struct xfs_mount *mp = dp->i_mount; xfs_failaddr_t fa; int err; - err = xfs_da_read_buf(tp, dp, mp->m_dir_geo->datablk, 0, bpp, + err = xfs_da_read_buf(tp, dp, mp->m_dir_geo->datablk, flags, bpp, XFS_DATA_FORK, &xfs_dir3_block_buf_ops); if (err || !*bpp) return err; @@ -380,7 +381,7 @@ xfs_dir2_block_addname( tp = args->trans; /* Read the (one and only) directory block into bp. */ - error = xfs_dir3_block_read(tp, dp, &bp); + error = xfs_dir3_block_read(tp, dp, 0, &bp); if (error) return error; @@ -695,7 +696,7 @@ xfs_dir2_block_lookup_int( dp = args->dp; tp = args->trans; - error = xfs_dir3_block_read(tp, dp, &bp); + error = xfs_dir3_block_read(tp, dp, 0, &bp); if (error) return error; diff --git a/fs/xfs/libxfs/xfs_dir2_priv.h b/fs/xfs/libxfs/xfs_dir2_priv.h index 7404a9ff1a92..7d4cf8a0f15b 100644 --- a/fs/xfs/libxfs/xfs_dir2_priv.h +++ b/fs/xfs/libxfs/xfs_dir2_priv.h @@ -51,7 +51,7 @@ extern int xfs_dir_cilookup_result(struct xfs_da_args *args, /* xfs_dir2_block.c */ extern int xfs_dir3_block_read(struct xfs_trans *tp, struct xfs_inode *dp, - struct xfs_buf **bpp); + unsigned int flags, struct xfs_buf **bpp); extern int xfs_dir2_block_addname(struct xfs_da_args *args); extern int xfs_dir2_block_lookup(struct xfs_da_args *args); extern int xfs_dir2_block_removename(struct xfs_da_args *args); diff --git a/fs/xfs/scrub/dir.c b/fs/xfs/scrub/dir.c index 0b491784b759..5cc51f201bd7 100644 --- a/fs/xfs/scrub/dir.c +++ b/fs/xfs/scrub/dir.c @@ -313,7 +313,7 @@ xchk_directory_data_bestfree( /* dir block format */ if (lblk != XFS_B_TO_FSBT(mp, XFS_DIR2_DATA_OFFSET)) xchk_fblock_set_corrupt(sc, XFS_DATA_FORK, lblk); - error = xfs_dir3_block_read(sc->tp, sc->ip, &bp); + error = xfs_dir3_block_read(sc->tp, sc->ip, 0, &bp); } else { /* dir data format */ error = xfs_dir3_data_read(sc->tp, sc->ip, lblk, 0, &bp); diff --git a/fs/xfs/scrub/readdir.c b/fs/xfs/scrub/readdir.c index e51c1544be63..f0a727311632 100644 --- a/fs/xfs/scrub/readdir.c +++ b/fs/xfs/scrub/readdir.c @@ -101,7 +101,7 @@ xchk_dir_walk_block( unsigned int off, next_off, end; int error; - error = xfs_dir3_block_read(sc->tp, dp, &bp); + error = xfs_dir3_block_read(sc->tp, dp, 0, &bp); if (error) return error; diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c index 9f3ceb461515..dcdbd26e0402 100644 --- a/fs/xfs/xfs_dir2_readdir.c +++ b/fs/xfs/xfs_dir2_readdir.c @@ -149,6 +149,7 @@ xfs_dir2_block_getdents( struct xfs_da_geometry *geo = args->geo; unsigned int offset, next_offset; unsigned int end; + unsigned int flags = 0; /* * If the block number in the offset is out of range, we're done. @@ -156,7 +157,9 @@ xfs_dir2_block_getdents( if (xfs_dir2_dataptr_to_db(geo, ctx->pos) > geo->datablk) return 0; - error = xfs_dir3_block_read(args->trans, dp, &bp); + if (ctx->flags & DIR_CONTEXT_F_NOWAIT) + flags |= XFS_DABUF_NOWAIT; + error = xfs_dir3_block_read(args->trans, dp, flags, &bp); if (error) return error; @@ -240,6 +243,7 @@ xfs_dir2_block_getdents( STATIC int xfs_dir2_leaf_readbuf( struct xfs_da_args *args, + struct dir_context *ctx, size_t bufsize, xfs_dir2_off_t *cur_off, xfs_dablk_t *ra_blk, @@ -258,10 +262,15 @@ xfs_dir2_leaf_readbuf( struct xfs_iext_cursor icur; int ra_want; int error = 0; - - error = xfs_iread_extents(args->trans, dp, XFS_DATA_FORK); - if (error) - goto out; + unsigned int flags = 0; + + if (ctx->flags & DIR_CONTEXT_F_NOWAIT) { + flags |= XFS_DABUF_NOWAIT; + } else { + error = xfs_iread_extents(args->trans, dp, XFS_DATA_FORK); + if (error) + goto out; + } /* * Look for mapped directory blocks at or above the current offset. @@ -280,7 +289,7 @@ xfs_dir2_leaf_readbuf( new_off = xfs_dir2_da_to_byte(geo, map.br_startoff); if (new_off > *cur_off) *cur_off = new_off; - error = xfs_dir3_data_read(args->trans, dp, map.br_startoff, 0, &bp); + error = xfs_dir3_data_read(args->trans, dp, map.br_startoff, flags, &bp); if (error) goto out; @@ -360,6 +369,7 @@ xfs_dir2_leaf_getdents( int byteoff; /* offset in current block */ unsigned int offset = 0; int error = 0; /* error return value */ + int written = 0; /* * If the offset is at or past the largest allowed value, @@ -391,10 +401,17 @@ xfs_dir2_leaf_getdents( bp = NULL; } - if (*lock_mode == 0) - *lock_mode = xfs_ilock_data_map_shared(dp); - error = xfs_dir2_leaf_readbuf(args, bufsize, &curoff, - &rablk, &bp); + if (*lock_mode == 0) { + *lock_mode = + xfs_ilock_data_map_shared_generic(dp, + ctx->flags & DIR_CONTEXT_F_NOWAIT); + if (!*lock_mode) { + error = -EAGAIN; + break; + } + } + error = xfs_dir2_leaf_readbuf(args, ctx, bufsize, + &curoff, &rablk, &bp); if (error || !bp) break; @@ -479,6 +496,7 @@ xfs_dir2_leaf_getdents( */ offset += length; curoff += length; + written += length; /* bufsize may have just been a guess; don't go negative */ bufsize = bufsize > length ? bufsize - length : 0; } @@ -492,6 +510,8 @@ xfs_dir2_leaf_getdents( ctx->pos = xfs_dir2_byte_to_dataptr(curoff) & 0x7fffffff; if (bp) xfs_trans_brelse(args->trans, bp); + if (error == -EAGAIN && written > 0) + error = 0; return error; } @@ -514,6 +534,7 @@ xfs_readdir( unsigned int lock_mode; bool isblock; int error; + bool nowait; trace_xfs_readdir(dp); @@ -531,7 +552,11 @@ xfs_readdir( if (dp->i_df.if_format == XFS_DINODE_FMT_LOCAL) return xfs_dir2_sf_getdents(&args, ctx); - lock_mode = xfs_ilock_data_map_shared(dp); + nowait = ctx->flags & DIR_CONTEXT_F_NOWAIT; + lock_mode = xfs_ilock_data_map_shared_generic(dp, nowait); + if (!lock_mode) + return -EAGAIN; + error = xfs_dir2_isblock(&args, &isblock); if (error) goto out_unlock; @@ -546,5 +571,7 @@ xfs_readdir( out_unlock: if (lock_mode) xfs_iunlock(dp, lock_mode); + if (error == -EAGAIN) + ASSERT(nowait); return error; } diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 9e62cc500140..d088f7d0c23a 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -120,6 +120,33 @@ xfs_ilock_data_map_shared( return lock_mode; } +/* + * Similar to xfs_ilock_data_map_shared(), except that it will only try to lock + * the inode in shared mode if the extents are already in memory. If it fails to + * get the lock or has to do IO to read the extent list, fail the operation by + * returning 0 as the lock mode. + */ +uint +xfs_ilock_data_map_shared_nowait( + struct xfs_inode *ip) +{ + if (xfs_need_iread_extents(&ip->i_df)) + return 0; + if (!xfs_ilock_nowait(ip, XFS_ILOCK_SHARED)) + return 0; + return XFS_ILOCK_SHARED; +} + +int +xfs_ilock_data_map_shared_generic( + struct xfs_inode *dp, + bool nowait) +{ + if (nowait) + return xfs_ilock_data_map_shared_nowait(dp); + return xfs_ilock_data_map_shared(dp); +} + uint xfs_ilock_attr_map_shared( struct xfs_inode *ip) diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h index 7547caf2f2ab..ea206a5a27df 100644 --- a/fs/xfs/xfs_inode.h +++ b/fs/xfs/xfs_inode.h @@ -490,13 +490,16 @@ int xfs_rename(struct mnt_idmap *idmap, struct xfs_name *target_name, struct xfs_inode *target_ip, unsigned int flags); -void xfs_ilock(xfs_inode_t *, uint); -int xfs_ilock_nowait(xfs_inode_t *, uint); -void xfs_iunlock(xfs_inode_t *, uint); -void xfs_ilock_demote(xfs_inode_t *, uint); -bool xfs_isilocked(struct xfs_inode *, uint); -uint xfs_ilock_data_map_shared(struct xfs_inode *); -uint xfs_ilock_attr_map_shared(struct xfs_inode *); +void xfs_ilock(struct xfs_inode *ip, uint lockmode); +int xfs_ilock_nowait(struct xfs_inode *ip, uint lockmode); +void xfs_iunlock(struct xfs_inode *ip, uint lockmode); +void xfs_ilock_demote(struct xfs_inode *ip, uint lockmode); +bool xfs_isilocked(struct xfs_inode *ip, uint lockmode); +uint xfs_ilock_data_map_shared(struct xfs_inode *ip); +uint xfs_ilock_data_map_shared_nowait(struct xfs_inode *ip); +int xfs_ilock_data_map_shared_generic(struct xfs_inode *ip, + bool nowait); +uint xfs_ilock_attr_map_shared(struct xfs_inode *ip); uint xfs_ip2xflags(struct xfs_inode *); int xfs_ifree(struct xfs_trans *, struct xfs_inode *); -- 2.25.1 ______________________________________________________ Linux MTD discussion mailing list http://lists.infradead.org/mailman/listinfo/linux-mtd/