* Re: [PATCH] [RFC] xfs: reduce AGF hold times during fstrim operations
2023-08-29 6:57 [PATCH] [RFC] xfs: reduce AGF hold times during fstrim operations Dave Chinner
@ 2023-08-29 9:27 ` kernel test robot
2023-08-29 15:02 ` kernel test robot
2023-08-29 15:50 ` Darrick J. Wong
2 siblings, 0 replies; 5+ messages in thread
From: kernel test robot @ 2023-08-29 9:27 UTC (permalink / raw)
To: Dave Chinner; +Cc: oe-kbuild-all
Hi Dave,
[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:
[auto build test WARNING on xfs-linux/for-next]
[also build test WARNING on linus/master v6.5 next-20230828]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Dave-Chinner/xfs-reduce-AGF-hold-times-during-fstrim-operations/20230829-145931
base: https://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git for-next
patch link: https://lore.kernel.org/r/20230829065710.938039-1-david%40fromorbit.com
patch subject: [PATCH] [RFC] xfs: reduce AGF hold times during fstrim operations
config: s390-defconfig (https://download.01.org/0day-ci/archive/20230829/202308291718.fmi9ZE7N-lkp@intel.com/config)
compiler: s390-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230829/202308291718.fmi9ZE7N-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202308291718.fmi9ZE7N-lkp@intel.com/
All warnings (new ones prefixed by >>):
fs/xfs/xfs_discard.c: In function 'xfs_trim_gather_extents':
>> fs/xfs/xfs_discard.c:93:34: warning: variable 'agf' set but not used [-Wunused-but-set-variable]
93 | struct xfs_agf *agf;
| ^~~
vim +/agf +93 fs/xfs/xfs_discard.c
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 80
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 81 static int
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 82 xfs_trim_gather_extents(
35bf2b1abc9a753 fs/xfs/xfs_discard.c Dave Chinner 2023-02-13 83 struct xfs_perag *pag,
a66d636385d621e fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 84 xfs_daddr_t start,
a66d636385d621e fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 85 xfs_daddr_t end,
a66d636385d621e fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 86 xfs_daddr_t minlen,
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 87 xfs_extlen_t *longest,
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 88 struct xfs_trim_work *twork)
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 89 {
35bf2b1abc9a753 fs/xfs/xfs_discard.c Dave Chinner 2023-02-13 90 struct xfs_mount *mp = pag->pag_mount;
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 91 struct xfs_btree_cur *cur;
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 92 struct xfs_buf *agbp;
9798f615ad2be48 fs/xfs/xfs_discard.c Christoph Hellwig 2020-03-10 @93 struct xfs_agf *agf;
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 94 int error;
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 95 int i;
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 96 int batch = 100;
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 97
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 98 /*
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 99 * Force out the log. This means any transactions that might have freed
8c81dd46ef3c416 fs/xfs/xfs_discard.c Carlos Maiolino 2018-04-10 100 * space before we take the AGF buffer lock are now on disk, and the
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 101 * volatile disk cache is flushed.
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 102 */
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 103 xfs_log_force(mp, XFS_LOG_SYNC);
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 104
08d3e84feeb8cb8 fs/xfs/xfs_discard.c Dave Chinner 2022-07-07 105 error = xfs_alloc_read_agf(pag, NULL, 0, &agbp);
706b8c5bc70391b fs/xfs/xfs_discard.c Darrick J. Wong 2020-01-23 106 if (error)
35bf2b1abc9a753 fs/xfs/xfs_discard.c Dave Chinner 2023-02-13 107 return error;
9798f615ad2be48 fs/xfs/xfs_discard.c Christoph Hellwig 2020-03-10 108 agf = agbp->b_addr;
8c81dd46ef3c416 fs/xfs/xfs_discard.c Carlos Maiolino 2018-04-10 109
289d38d22cd8896 fs/xfs/xfs_discard.c Dave Chinner 2021-06-02 110 cur = xfs_allocbt_init_cursor(mp, NULL, agbp, pag, XFS_BTNUM_CNT);
8c81dd46ef3c416 fs/xfs/xfs_discard.c Carlos Maiolino 2018-04-10 111
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 112 /*
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 113 * Look up the extent length requested in the AGF and start with it.
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 114 *
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 115 * XXX: continuations really want a lt lookup here, so we get the
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 116 * largest extent adjacent to the size finished off in the last batch.
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 117 * The ge search here results in the extent discarded in the last batch
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 118 * being discarded again before we move on to the smaller size...
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 119 */
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 120 error = xfs_alloc_lookup_ge(cur, 0, *longest, &i);
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 121 if (error)
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 122 goto out_del_cursor;
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 123 if (i == 0) {
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 124 /* nothing of that length left in the AG, we are done */
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 125 *longest = 0;
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 126 goto out_del_cursor;
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 127 }
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 128
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 129 /*
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 130 * Loop until we are done with all extents that are large
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 131 * enough to be worth discarding or we hit batch limits.
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 132 */
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 133 while (i && batch-- > 0) {
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 134 xfs_agblock_t fbno;
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 135 xfs_extlen_t flen;
a66d636385d621e fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 136 xfs_daddr_t dbno;
a66d636385d621e fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 137 xfs_extlen_t dlen;
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 138
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 139 error = xfs_alloc_get_rec(cur, &fbno, &flen, &i);
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 140 if (error)
35bf2b1abc9a753 fs/xfs/xfs_discard.c Dave Chinner 2023-02-13 141 break;
f9e0370648b9f99 fs/xfs/xfs_discard.c Darrick J. Wong 2019-11-11 142 if (XFS_IS_CORRUPT(mp, i != 1)) {
f9e0370648b9f99 fs/xfs/xfs_discard.c Darrick J. Wong 2019-11-11 143 error = -EFSCORRUPTED;
35bf2b1abc9a753 fs/xfs/xfs_discard.c Dave Chinner 2023-02-13 144 break;
f9e0370648b9f99 fs/xfs/xfs_discard.c Darrick J. Wong 2019-11-11 145 }
9798f615ad2be48 fs/xfs/xfs_discard.c Christoph Hellwig 2020-03-10 146 ASSERT(flen <= be32_to_cpu(agf->agf_longest));
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 147
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 148 /*
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 149 * Keep going on this batch until we hit the record size
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 150 * changes. That way we will start the next batch with the new
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 151 * extent size and we don't get stuck on an extent size when
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 152 * there are more extents of that size than the batch size.
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 153 */
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 154 if (batch == 0) {
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 155 if (flen != *longest)
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 156 break;
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 157 batch++;
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 158 } else {
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 159 *longest = flen;
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 160 }
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 161
a66d636385d621e fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 162 /*
a66d636385d621e fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 163 * use daddr format for all range/len calculations as that is
a66d636385d621e fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 164 * the format the range/len variables are supplied in by
a66d636385d621e fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 165 * userspace.
a66d636385d621e fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 166 */
35bf2b1abc9a753 fs/xfs/xfs_discard.c Dave Chinner 2023-02-13 167 dbno = XFS_AGB_TO_DADDR(mp, pag->pag_agno, fbno);
a66d636385d621e fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 168 dlen = XFS_FSB_TO_BB(mp, flen);
a66d636385d621e fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 169
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 170 /*
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 171 * Too small? Give up.
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 172 */
a66d636385d621e fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 173 if (dlen < minlen) {
35bf2b1abc9a753 fs/xfs/xfs_discard.c Dave Chinner 2023-02-13 174 trace_xfs_discard_toosmall(mp, pag->pag_agno, fbno, flen);
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 175 *longest = 0;
35bf2b1abc9a753 fs/xfs/xfs_discard.c Dave Chinner 2023-02-13 176 break;
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 177 }
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 178
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 179 /*
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 180 * If the extent is entirely outside of the range we are
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 181 * supposed to discard skip it. Do not bother to trim
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 182 * down partially overlapping ranges for now.
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 183 */
a66d636385d621e fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 184 if (dbno + dlen < start || dbno > end) {
35bf2b1abc9a753 fs/xfs/xfs_discard.c Dave Chinner 2023-02-13 185 trace_xfs_discard_exclude(mp, pag->pag_agno, fbno, flen);
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 186 goto next_extent;
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 187 }
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 188
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 189 /*
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 190 * If any blocks in the range are still busy, skip the
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 191 * discard and try again the next time.
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 192 */
45d0662117565e6 fs/xfs/xfs_discard.c Dave Chinner 2021-06-02 193 if (xfs_extent_busy_search(mp, pag, fbno, flen)) {
35bf2b1abc9a753 fs/xfs/xfs_discard.c Dave Chinner 2023-02-13 194 trace_xfs_discard_busy(mp, pag->pag_agno, fbno, flen);
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 195 goto next_extent;
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 196 }
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 197
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 198 xfs_extent_busy_insert_discard(pag, fbno, flen,
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 199 &twork->busy_extents);
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 200 twork->blocks_trimmed += flen;
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 201 next_extent:
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 202 error = xfs_btree_decrement(cur, 0, &i);
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 203 if (error)
35bf2b1abc9a753 fs/xfs/xfs_discard.c Dave Chinner 2023-02-13 204 break;
a46db60834883c1 fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 205
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 206 /*
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 207 * If there's no more records in the tree, we are done. Set
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 208 * longest to 0 to indicate to the caller that there is no more
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 209 * extents to search.
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 210 */
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 211 if (i == 0)
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 212 *longest = 0;
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 213 }
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 214
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 215 /*
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 216 * If there was an error, release all the gathered busy extents because
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 217 * we aren't going to issue a discard on them any more. If we ran out of
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 218 * records, set *longest to zero to tell the caller there is nothing
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 219 * left in this AG.
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 220 */
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 221 if (error)
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 222 xfs_extent_busy_clear(pag->pag_mount, &twork->busy_extents,
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 223 false);
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 224 out_del_cursor:
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 225 xfs_btree_del_cursor(cur, error);
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 226 xfs_buf_relse(agbp);
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 227 return error;
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 228 }
99affa6f39376f6 fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 229
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH] [RFC] xfs: reduce AGF hold times during fstrim operations
2023-08-29 6:57 [PATCH] [RFC] xfs: reduce AGF hold times during fstrim operations Dave Chinner
2023-08-29 9:27 ` kernel test robot
@ 2023-08-29 15:02 ` kernel test robot
2023-08-29 15:50 ` Darrick J. Wong
2 siblings, 0 replies; 5+ messages in thread
From: kernel test robot @ 2023-08-29 15:02 UTC (permalink / raw)
To: Dave Chinner; +Cc: llvm, oe-kbuild-all
Hi Dave,
[This is a private test report for your RFC patch.]
kernel test robot noticed the following build warnings:
[auto build test WARNING on xfs-linux/for-next]
[also build test WARNING on linus/master v6.5 next-20230829]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Dave-Chinner/xfs-reduce-AGF-hold-times-during-fstrim-operations/20230829-145931
base: https://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git for-next
patch link: https://lore.kernel.org/r/20230829065710.938039-1-david%40fromorbit.com
patch subject: [PATCH] [RFC] xfs: reduce AGF hold times during fstrim operations
config: i386-buildonly-randconfig-003-20230829 (https://download.01.org/0day-ci/archive/20230829/202308292252.LIRc5GoQ-lkp@intel.com/config)
compiler: clang version 16.0.4 (https://github.com/llvm/llvm-project.git ae42196bc493ffe877a7e3dff8be32035dea4d07)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230829/202308292252.LIRc5GoQ-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202308292252.LIRc5GoQ-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> fs/xfs/xfs_discard.c:93:19: warning: variable 'agf' set but not used [-Wunused-but-set-variable]
struct xfs_agf *agf;
^
1 warning generated.
vim +/agf +93 fs/xfs/xfs_discard.c
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 80
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 81 static int
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 82 xfs_trim_gather_extents(
35bf2b1abc9a75 fs/xfs/xfs_discard.c Dave Chinner 2023-02-13 83 struct xfs_perag *pag,
a66d636385d621 fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 84 xfs_daddr_t start,
a66d636385d621 fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 85 xfs_daddr_t end,
a66d636385d621 fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 86 xfs_daddr_t minlen,
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 87 xfs_extlen_t *longest,
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 88 struct xfs_trim_work *twork)
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 89 {
35bf2b1abc9a75 fs/xfs/xfs_discard.c Dave Chinner 2023-02-13 90 struct xfs_mount *mp = pag->pag_mount;
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 91 struct xfs_btree_cur *cur;
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 92 struct xfs_buf *agbp;
9798f615ad2be4 fs/xfs/xfs_discard.c Christoph Hellwig 2020-03-10 @93 struct xfs_agf *agf;
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 94 int error;
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 95 int i;
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 96 int batch = 100;
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 97
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 98 /*
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 99 * Force out the log. This means any transactions that might have freed
8c81dd46ef3c41 fs/xfs/xfs_discard.c Carlos Maiolino 2018-04-10 100 * space before we take the AGF buffer lock are now on disk, and the
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 101 * volatile disk cache is flushed.
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 102 */
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 103 xfs_log_force(mp, XFS_LOG_SYNC);
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 104
08d3e84feeb8cb fs/xfs/xfs_discard.c Dave Chinner 2022-07-07 105 error = xfs_alloc_read_agf(pag, NULL, 0, &agbp);
706b8c5bc70391 fs/xfs/xfs_discard.c Darrick J. Wong 2020-01-23 106 if (error)
35bf2b1abc9a75 fs/xfs/xfs_discard.c Dave Chinner 2023-02-13 107 return error;
9798f615ad2be4 fs/xfs/xfs_discard.c Christoph Hellwig 2020-03-10 108 agf = agbp->b_addr;
8c81dd46ef3c41 fs/xfs/xfs_discard.c Carlos Maiolino 2018-04-10 109
289d38d22cd889 fs/xfs/xfs_discard.c Dave Chinner 2021-06-02 110 cur = xfs_allocbt_init_cursor(mp, NULL, agbp, pag, XFS_BTNUM_CNT);
8c81dd46ef3c41 fs/xfs/xfs_discard.c Carlos Maiolino 2018-04-10 111
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 112 /*
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 113 * Look up the extent length requested in the AGF and start with it.
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 114 *
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 115 * XXX: continuations really want a lt lookup here, so we get the
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 116 * largest extent adjacent to the size finished off in the last batch.
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 117 * The ge search here results in the extent discarded in the last batch
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 118 * being discarded again before we move on to the smaller size...
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 119 */
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 120 error = xfs_alloc_lookup_ge(cur, 0, *longest, &i);
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 121 if (error)
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 122 goto out_del_cursor;
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 123 if (i == 0) {
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 124 /* nothing of that length left in the AG, we are done */
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 125 *longest = 0;
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 126 goto out_del_cursor;
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 127 }
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 128
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 129 /*
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 130 * Loop until we are done with all extents that are large
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 131 * enough to be worth discarding or we hit batch limits.
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 132 */
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 133 while (i && batch-- > 0) {
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 134 xfs_agblock_t fbno;
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 135 xfs_extlen_t flen;
a66d636385d621 fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 136 xfs_daddr_t dbno;
a66d636385d621 fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 137 xfs_extlen_t dlen;
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 138
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 139 error = xfs_alloc_get_rec(cur, &fbno, &flen, &i);
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 140 if (error)
35bf2b1abc9a75 fs/xfs/xfs_discard.c Dave Chinner 2023-02-13 141 break;
f9e0370648b9f9 fs/xfs/xfs_discard.c Darrick J. Wong 2019-11-11 142 if (XFS_IS_CORRUPT(mp, i != 1)) {
f9e0370648b9f9 fs/xfs/xfs_discard.c Darrick J. Wong 2019-11-11 143 error = -EFSCORRUPTED;
35bf2b1abc9a75 fs/xfs/xfs_discard.c Dave Chinner 2023-02-13 144 break;
f9e0370648b9f9 fs/xfs/xfs_discard.c Darrick J. Wong 2019-11-11 145 }
9798f615ad2be4 fs/xfs/xfs_discard.c Christoph Hellwig 2020-03-10 146 ASSERT(flen <= be32_to_cpu(agf->agf_longest));
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 147
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 148 /*
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 149 * Keep going on this batch until we hit the record size
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 150 * changes. That way we will start the next batch with the new
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 151 * extent size and we don't get stuck on an extent size when
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 152 * there are more extents of that size than the batch size.
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 153 */
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 154 if (batch == 0) {
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 155 if (flen != *longest)
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 156 break;
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 157 batch++;
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 158 } else {
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 159 *longest = flen;
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 160 }
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 161
a66d636385d621 fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 162 /*
a66d636385d621 fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 163 * use daddr format for all range/len calculations as that is
a66d636385d621 fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 164 * the format the range/len variables are supplied in by
a66d636385d621 fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 165 * userspace.
a66d636385d621 fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 166 */
35bf2b1abc9a75 fs/xfs/xfs_discard.c Dave Chinner 2023-02-13 167 dbno = XFS_AGB_TO_DADDR(mp, pag->pag_agno, fbno);
a66d636385d621 fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 168 dlen = XFS_FSB_TO_BB(mp, flen);
a66d636385d621 fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 169
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 170 /*
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 171 * Too small? Give up.
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 172 */
a66d636385d621 fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 173 if (dlen < minlen) {
35bf2b1abc9a75 fs/xfs/xfs_discard.c Dave Chinner 2023-02-13 174 trace_xfs_discard_toosmall(mp, pag->pag_agno, fbno, flen);
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 175 *longest = 0;
35bf2b1abc9a75 fs/xfs/xfs_discard.c Dave Chinner 2023-02-13 176 break;
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 177 }
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 178
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 179 /*
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 180 * If the extent is entirely outside of the range we are
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 181 * supposed to discard skip it. Do not bother to trim
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 182 * down partially overlapping ranges for now.
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 183 */
a66d636385d621 fs/xfs/xfs_discard.c Dave Chinner 2012-03-22 184 if (dbno + dlen < start || dbno > end) {
35bf2b1abc9a75 fs/xfs/xfs_discard.c Dave Chinner 2023-02-13 185 trace_xfs_discard_exclude(mp, pag->pag_agno, fbno, flen);
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 186 goto next_extent;
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 187 }
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 188
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 189 /*
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 190 * If any blocks in the range are still busy, skip the
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 191 * discard and try again the next time.
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 192 */
45d0662117565e fs/xfs/xfs_discard.c Dave Chinner 2021-06-02 193 if (xfs_extent_busy_search(mp, pag, fbno, flen)) {
35bf2b1abc9a75 fs/xfs/xfs_discard.c Dave Chinner 2023-02-13 194 trace_xfs_discard_busy(mp, pag->pag_agno, fbno, flen);
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 195 goto next_extent;
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 196 }
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 197
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 198 xfs_extent_busy_insert_discard(pag, fbno, flen,
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 199 &twork->busy_extents);
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 200 twork->blocks_trimmed += flen;
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 201 next_extent:
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 202 error = xfs_btree_decrement(cur, 0, &i);
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 203 if (error)
35bf2b1abc9a75 fs/xfs/xfs_discard.c Dave Chinner 2023-02-13 204 break;
a46db60834883c fs/xfs/linux-2.6/xfs_discard.c Christoph Hellwig 2011-01-07 205
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 206 /*
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 207 * If there's no more records in the tree, we are done. Set
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 208 * longest to 0 to indicate to the caller that there is no more
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 209 * extents to search.
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 210 */
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 211 if (i == 0)
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 212 *longest = 0;
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 213 }
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 214
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 215 /*
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 216 * If there was an error, release all the gathered busy extents because
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 217 * we aren't going to issue a discard on them any more. If we ran out of
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 218 * records, set *longest to zero to tell the caller there is nothing
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 219 * left in this AG.
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 220 */
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 221 if (error)
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 222 xfs_extent_busy_clear(pag->pag_mount, &twork->busy_extents,
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 223 false);
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 224 out_del_cursor:
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 225 xfs_btree_del_cursor(cur, error);
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 226 xfs_buf_relse(agbp);
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 227 return error;
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 228 }
99affa6f39376f fs/xfs/xfs_discard.c Dave Chinner 2023-08-29 229
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH] [RFC] xfs: reduce AGF hold times during fstrim operations
2023-08-29 6:57 [PATCH] [RFC] xfs: reduce AGF hold times during fstrim operations Dave Chinner
2023-08-29 9:27 ` kernel test robot
2023-08-29 15:02 ` kernel test robot
@ 2023-08-29 15:50 ` Darrick J. Wong
2023-08-30 6:32 ` Dave Chinner
2 siblings, 1 reply; 5+ messages in thread
From: Darrick J. Wong @ 2023-08-29 15:50 UTC (permalink / raw)
To: Dave Chinner; +Cc: linux-xfs
On Tue, Aug 29, 2023 at 04:57:10PM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> fstrim will hold the AGF lock for as long as it takes to walk and
> discard all the free space in the AG that meets the userspace trim
> criteria.i For AGs with lots of free space extents (e.g. millions)
> or the underlying device is really slow at processing discard
> requests (e.g. Ceph RBD), this means the AGF hold time is often
> measured in minutes to hours, not a few milliseconds as we normal
> see with non-discard based operations.
>
> THis can result in the entire filesystem hanging whilst the
> long-running fstrim is in progress. We can have transactions get
> stuck waiting for the AGF lock (data or metadata extent allocation
> and freeing), and then more transactions get stuck waiting on the
> locks those transactions hold. We can get to the point where fstrim
> blocks an extent allocation or free operation long enough that it
> ends up pinning the tail of the log and the log then runs out of
> space. At this point, every modification in the filesystem gets
> blocked. This includes read operations, if atime updates need to be
> made.
>
> To fix this problem, we need to be able to discard free space
> extents safely without holding the AGF lock. Fortunately, we already
> do this with online discard via busy extents. We can makr free space
s/makr/mark/
> extents as "busy being discarded" under the AGF lock and then unlock
> the AGF, knowing that nobody will be able to allocate that free
> space extent until we remove it from the busy tree.
>
> Modify xfs_trim_extents to use the same asynchronous discard
> mechanism backed by busy extents as is used with online discard.
> This results in the AGF only needing to be held for short periods of
> time and it is never held while we issue discards. Hence if discard
> submission gets throttled because it is slow and/or there are lots
> of them, we aren't preventing other operations from being performed
> on AGF while we wait for discards to complete...
Oh good, because FSTRIM is dog slow even when you don't have other
threads impatiently waiting on the AGF. :)
(Hello, QVO...)
> This is an RFC because it's just the patch I've written to implement
> the functionality and smoke test it. It isn't polished, I haven't
> broken it up into fine-grained patches, etc. It's just a chunk of
> code that fixes the problem so people can comment on the approach
> and maybe spot something wrong that I haven't. IOWs, I'm looking for
> comments on the functionality change, not the code itself....
IOWs I don't have to wade through six patches of code golf just to
figure out what this does. Much easier to review, thank you.
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
> fs/xfs/xfs_discard.c | 274 ++++++++++++++++++++++++++++++++++++---
> fs/xfs/xfs_extent_busy.c | 33 ++++-
> fs/xfs/xfs_extent_busy.h | 4 +
> 3 files changed, 286 insertions(+), 25 deletions(-)
>
> diff --git a/fs/xfs/xfs_discard.c b/fs/xfs/xfs_discard.c
> index afc4c78b9eed..c2eec29c02d1 100644
> --- a/fs/xfs/xfs_discard.c
> +++ b/fs/xfs/xfs_discard.c
> @@ -1,6 +1,6 @@
> // SPDX-License-Identifier: GPL-2.0
> /*
> - * Copyright (C) 2010 Red Hat, Inc.
> + * Copyright (C) 2010, 2023 Red Hat, Inc.
> * All Rights Reserved.
> */
> #include "xfs.h"
> @@ -19,21 +19,81 @@
> #include "xfs_log.h"
> #include "xfs_ag.h"
>
> -STATIC int
> -xfs_trim_extents(
> +/*
> + * Notes on an efficient, low latency fstrim algorithm
> + *
> + * We need to walk the filesystem free space and issue discards on the free
> + * space that meet the search criteria (size and location). We cannot issue
> + * discards on extents that might be in use, or are so recently in use they are
> + * still marked as busy. To serialise against extent state changes whilst we are
> + * gathering extents to trim, we must hold the AGF lock to lock out other
> + * allocations and extent free operations that might change extent state.
> + *
> + * However, we cannot just hold the AGF for the entire AG free space walk whilst
> + * we issue discards on each free space that is found. Storage devices can have
> + * extremely slow discard implementations (e.g. ceph RBD) and so walking a
> + * couple of million free extents and issuing synchronous discards on each
> + * extent can take a *long* time. Whilst we are doing this walk, nothing else
> + * can access the AGF, and we can stall transactions and hence the log whilst
> + * modifications wait for the AGF lock to be released. This can lead hung tasks
> + * kicking the hung task timer and rebooting the system. This is bad.
> + *
> + * Hence we need to take a leaf from the bulkstat playbook. It takes the AGI
> + * lock, gathers a range of inode cluster buffers that are allocated, drops the
> + * AGI lock and then reads all the inode cluster buffers and processes them. It
> + * loops doing this, using a cursor to keep track of where it is up to in the AG
> + * for each iteration to restart the INOBT lookup from.
> + *
> + * We can't do this exactly with free space - once we drop the AGF lock, the
> + * state of the free extent is out of our control and we cannot run a discard
> + * safely on it in this situation. Unless, of course, we've marked the free
> + * extent as busy and undergoing a discard operation whilst we held the AGF
> + * locked.
> + *
> + * This is exactly how online discard works - free extents are marked busy when
> + * they are freed, and once the extent free has been committed to the journal,
> + * the busy extent record is marked as "undergoing discard" and the discard is
> + * then issued on the free extent. Once the discard completes, the busy extent
> + * record is removed and the extent is able to be allocated again.
> + *
> + * In the context of fstrim, if we find a free extent we need to discard, we
> + * don't have to discard it immediately. All we need to do it record that free
> + * extent as being busy and under discard, and all the allocation routines will
> + * now avoid trying to allocate it. Hence if we mark the extent as busy under
> + * the AGF lock, we can safely discard it without holding the AGF lock because
> + * nothing will attempt to allocate that free space until the discard completes.
...and if the cntbt search encounters a busy extent, it'll skip it. I
think that means that two FITRIM invocations running in lockstep can
miss the extents being discarded by the other, right?
I think this can happen with your patch?
T0: T1:
xfs_alloc_read_agf
walk cntbt,
add free space to busy list
relse agf
issue discards
xfs_alloc_read_agf
...still waiting...
walk cntbt,
see all free space on busy list
relse agf
...still waiting...
"done" (despite discard in progress)
...still waiting...
io completion
done
Whereas currently I think T1 will stubbornly wait for the AGF and then
re-discard everything again. I wonder, should FITRIM call
xfs_extent_busy_flush if it finds already-busy extents?
> + *
> + * Hence we can makr fstrim behave much more like bulkstat and online discard
s/makr/make/
> + * w.r.t. AG header locking. By keeping the batch size low, we can minimise the
> + * AGF lock holdoffs whilst still safely being able to issue discards similar to
> + * bulkstat. We can also issue discards asynchronously like we do with online
Can we rearrange that "Similar to bulkstat's inode batching behavior, we
can minimise the AGF lock holdoffs [hold times?] whilst safely issuing
discards."
My hot take on that sentence was "bulkstat doesn't do discards" :)
> + * discard, and so for fast devices fstrim will run much faster as we can
> + * pipeline the free extent search with in flight discard IO.
> + */
> +
> +struct xfs_trim_work {
> + struct xfs_perag *pag;
> + struct list_head busy_extents;
> + uint64_t blocks_trimmed;
> + struct work_struct discard_endio_work;
> +};
> +
> +static int
> +xfs_trim_gather_extents(
> struct xfs_perag *pag,
> xfs_daddr_t start,
> xfs_daddr_t end,
> xfs_daddr_t minlen,
> - uint64_t *blocks_trimmed)
> + xfs_extlen_t *longest,
What is this @longest value? Is that a cursor for however far we've
walked through the cntbt?
> + struct xfs_trim_work *twork)
> {
> struct xfs_mount *mp = pag->pag_mount;
> - struct block_device *bdev = mp->m_ddev_targp->bt_bdev;
> struct xfs_btree_cur *cur;
> struct xfs_buf *agbp;
> struct xfs_agf *agf;
> int error;
> int i;
> + int batch = 100;
>
> /*
> * Force out the log. This means any transactions that might have freed
> @@ -50,17 +110,27 @@ xfs_trim_extents(
> cur = xfs_allocbt_init_cursor(mp, NULL, agbp, pag, XFS_BTNUM_CNT);
>
> /*
> - * Look up the longest btree in the AGF and start with it.
> + * Look up the extent length requested in the AGF and start with it.
> + *
> + * XXX: continuations really want a lt lookup here, so we get the
> + * largest extent adjacent to the size finished off in the last batch.
> + * The ge search here results in the extent discarded in the last batch
> + * being discarded again before we move on to the smaller size...
> */
> - error = xfs_alloc_lookup_ge(cur, 0, be32_to_cpu(agf->agf_longest), &i);
> + error = xfs_alloc_lookup_ge(cur, 0, *longest, &i);
Aha, it /is/ a cursor for the cntbt walk. In that case, why not pass
around a xfs_alloc_rec_incore_t as the cursor, since cntbt lookups are
capable of searching by blockcount and startblock?
Then you'd initialize it with
struct xfs_alloc_rec_incore tcur = {
.ar_blockcount = pag->pagf_longest;
};
and the XXX above turns into:
if (!tcur->ar_startblock)
error = xfs_alloc_lookup_ge(cur, 0, tcur->ar_blockcount, &i);
else
error = xfs_alloc_lookup_lt(cur, tcur->ar_startblock,
tcur->ar_blockcount, &i);
> if (error)
> goto out_del_cursor;
> + if (i == 0) {
> + /* nothing of that length left in the AG, we are done */
> + *longest = 0;
> + goto out_del_cursor;
> + }
>
> /*
> * Loop until we are done with all extents that are large
> - * enough to be worth discarding.
> + * enough to be worth discarding or we hit batch limits.
> */
> - while (i) {
> + while (i && batch-- > 0) {
> xfs_agblock_t fbno;
> xfs_extlen_t flen;
> xfs_daddr_t dbno;
> @@ -75,6 +145,20 @@ xfs_trim_extents(
> }
> ASSERT(flen <= be32_to_cpu(agf->agf_longest));
>
> + /*
> + * Keep going on this batch until we hit the record size
> + * changes. That way we will start the next batch with the new
> + * extent size and we don't get stuck on an extent size when
> + * there are more extents of that size than the batch size.
> + */
> + if (batch == 0) {
> + if (flen != *longest)
> + break;
> + batch++;
Hmm. So if the cntbt records are:
[N1, 100000]
[N2, 1]
[N3, 1]
...
[N100001, 1]
Does that mean batch 1 is:
[N1, 100000]
and batch 2 is:
<100,000 single block extents>
(where presumably N1..N100001 do not overlap)?
That seems poorly balanced, especially for (bad) SSDs whose per-discard
runtime is y = mx + b with a small m and huge b.
(Yes I have an SSD like that...)
I think if you changed the cursor to a cntbt record, then you could
bound the batch size by number of blocks, or number of busy extents,
or both, right?
> + } else {
> + *longest = flen;
> + }
> +
> /*
> * use daddr format for all range/len calculations as that is
> * the format the range/len variables are supplied in by
> @@ -88,6 +172,7 @@ xfs_trim_extents(
> */
> if (dlen < minlen) {
> trace_xfs_discard_toosmall(mp, pag->pag_agno, fbno, flen);
> + *longest = 0;
> break;
> }
>
> @@ -110,29 +195,180 @@ xfs_trim_extents(
> goto next_extent;
> }
>
> - trace_xfs_discard_extent(mp, pag->pag_agno, fbno, flen);
> - error = blkdev_issue_discard(bdev, dbno, dlen, GFP_NOFS);
> - if (error)
> - break;
> - *blocks_trimmed += flen;
> -
> + xfs_extent_busy_insert_discard(pag, fbno, flen,
> + &twork->busy_extents);
> + twork->blocks_trimmed += flen;
> next_extent:
> error = xfs_btree_decrement(cur, 0, &i);
> if (error)
> break;
>
> - if (fatal_signal_pending(current)) {
> - error = -ERESTARTSYS;
> - break;
> - }
> + /*
> + * If there's no more records in the tree, we are done. Set
> + * longest to 0 to indicate to the caller that there is no more
> + * extents to search.
> + */
> + if (i == 0)
> + *longest = 0;
> }
>
> + /*
> + * If there was an error, release all the gathered busy extents because
> + * we aren't going to issue a discard on them any more. If we ran out of
> + * records, set *longest to zero to tell the caller there is nothing
> + * left in this AG.
> + */
> + if (error)
> + xfs_extent_busy_clear(pag->pag_mount, &twork->busy_extents,
> + false);
> out_del_cursor:
> xfs_btree_del_cursor(cur, error);
> xfs_buf_relse(agbp);
> return error;
> }
>
> +static void
> +xfs_trim_discard_endio_work(
> + struct work_struct *work)
> +{
> + struct xfs_trim_work *twork =
> + container_of(work, struct xfs_trim_work, discard_endio_work);
> + struct xfs_perag *pag = twork->pag;
> +
> + xfs_extent_busy_clear(pag->pag_mount, &twork->busy_extents, false);
> + kmem_free(twork);
> + xfs_perag_rele(pag);
> +}
> +
> +/*
> + * Queue up the actual completion to a thread to avoid IRQ-safe locking for
> + * pagb_lock.
> + */
> +static void
> +xfs_trim_discard_endio(
> + struct bio *bio)
> +{
> + struct xfs_trim_work *twork = bio->bi_private;
> +
> + INIT_WORK(&twork->discard_endio_work, xfs_trim_discard_endio_work);
> + queue_work(xfs_discard_wq, &twork->discard_endio_work);
> + bio_put(bio);
> +}
> +
> +/*
> + * Walk the discard list and issue discards on all the busy extents in the
> + * list. We plug and chain the bios so that we only need a single completion
> + * call to clear all the busy extents once the discards are complete.
> + *
> + * XXX: This is largely a copy of xlog_discard_busy_extents(), opportunity for
> + * a common implementation there.
Indeed. :)
> + */
> +static int
> +xfs_trim_discard_extents(
> + struct xfs_perag *pag,
> + struct xfs_trim_work *twork)
> +{
> + struct xfs_mount *mp = pag->pag_mount;
> + struct xfs_extent_busy *busyp;
> + struct bio *bio = NULL;
> + struct blk_plug plug;
> + int error = 0;
> +
> + blk_start_plug(&plug);
> + list_for_each_entry(busyp, &twork->busy_extents, list) {
> + trace_xfs_discard_extent(mp, busyp->agno, busyp->bno,
> + busyp->length);
> +
> + error = __blkdev_issue_discard(mp->m_ddev_targp->bt_bdev,
> + XFS_AGB_TO_DADDR(mp, busyp->agno, busyp->bno),
> + XFS_FSB_TO_BB(mp, busyp->length),
> + GFP_NOFS, &bio);
> + if (error && error != -EOPNOTSUPP) {
> + xfs_info(mp,
> + "discard failed for extent [0x%llx,%u], error %d",
> + (unsigned long long)busyp->bno,
> + busyp->length,
> + error);
> + break;
> + }
> + }
> +
> + if (bio) {
> + bio->bi_private = twork;
> + bio->bi_end_io = xfs_trim_discard_endio;
> + submit_bio(bio);
> + } else {
> + xfs_trim_discard_endio_work(&twork->discard_endio_work);
> + }
> + blk_finish_plug(&plug);
> +
> + return error;
> +}
> +
> +/*
> + * Iterate the free list gathering extents and discarding them. We need a cursor
> + * for the repeated iteration of gather/discard loop, so use the longest extent
> + * we found in the last batch as the key to start the next.
> + */
> +static int
> +xfs_trim_extents(
> + struct xfs_perag *pag,
> + xfs_daddr_t start,
> + xfs_daddr_t end,
> + xfs_daddr_t minlen,
> + uint64_t *blocks_trimmed)
> +{
> + struct xfs_trim_work *twork;
> + xfs_extlen_t longest = pag->pagf_longest;
> + int error = 0;
> +
> + do {
> + LIST_HEAD(extents);
> +
> + twork = kzalloc(sizeof(*twork), GFP_KERNEL);
> + if (!twork) {
> + error = -ENOMEM;
> + break;
> + }
> +
> + atomic_inc(&pag->pag_active_ref);
> + twork->pag = pag;
twork->pag = xfs_perag_hold(pag); ?
> + INIT_LIST_HEAD(&twork->busy_extents);
> +
> + error = xfs_trim_gather_extents(pag, start, end, minlen,
> + &longest, twork);
> + if (error) {
> + kfree(twork);
> + xfs_perag_rele(pag);
> + break;
> + }
> +
> + /*
> + * We hand the trim work to the discard function here so that
> + * the busy list can be walked when the discards complete and
> + * removed from the busy extent list. This allows the discards
> + * to run asynchronously with gathering the next round of
> + * extents to discard.
> + *
> + * However, we must ensure that we do not reference twork after
> + * this function call, as it may have been freed by the time it
> + * returns control to us.
> + */
> + *blocks_trimmed += twork->blocks_trimmed;
> + error = xfs_trim_discard_extents(pag, twork);
> + if (error)
> + break;
> +
> + if (fatal_signal_pending(current)) {
> + error = -ERESTARTSYS;
> + break;
> + }
> + } while (longest != 0);
> +
> + return error;
> +
> +}
> +
> /*
> * trim a range of the filesystem.
> *
> diff --git a/fs/xfs/xfs_extent_busy.c b/fs/xfs/xfs_extent_busy.c
> index 7c2fdc71e42d..53c49b47daca 100644
> --- a/fs/xfs/xfs_extent_busy.c
> +++ b/fs/xfs/xfs_extent_busy.c
> @@ -19,13 +19,13 @@
> #include "xfs_log.h"
> #include "xfs_ag.h"
>
> -void
> -xfs_extent_busy_insert(
> - struct xfs_trans *tp,
> +static void
> +xfs_extent_busy_insert_list(
> struct xfs_perag *pag,
> xfs_agblock_t bno,
> xfs_extlen_t len,
> - unsigned int flags)
> + unsigned int flags,
> + struct list_head *busy_list)
> {
> struct xfs_extent_busy *new;
> struct xfs_extent_busy *busyp;
> @@ -40,7 +40,7 @@ xfs_extent_busy_insert(
> new->flags = flags;
>
> /* trace before insert to be able to see failed inserts */
> - trace_xfs_extent_busy(tp->t_mountp, pag->pag_agno, bno, len);
> + trace_xfs_extent_busy(pag->pag_mount, pag->pag_agno, bno, len);
>
> spin_lock(&pag->pagb_lock);
> rbp = &pag->pagb_tree.rb_node;
> @@ -62,10 +62,31 @@ xfs_extent_busy_insert(
> rb_link_node(&new->rb_node, parent, rbp);
> rb_insert_color(&new->rb_node, &pag->pagb_tree);
>
> - list_add(&new->list, &tp->t_busy);
> + list_add(&new->list, busy_list);
> spin_unlock(&pag->pagb_lock);
> }
>
> +void
> +xfs_extent_busy_insert(
> + struct xfs_trans *tp,
> + struct xfs_perag *pag,
> + xfs_agblock_t bno,
> + xfs_extlen_t len,
> + unsigned int flags)
> +{
> + xfs_extent_busy_insert_list(pag, bno, len, flags, &tp->t_busy);
> +}
> +
> +void xfs_extent_busy_insert_discard(
Function name on newline.
> + struct xfs_perag *pag,
> + xfs_agblock_t bno,
> + xfs_extlen_t len,
> + struct list_head *busy_list)
> +{
> + xfs_extent_busy_insert_list(pag, bno, len, XFS_EXTENT_BUSY_DISCARDED,
> + busy_list);
> +}
> +
> /*
> * Search for a busy extent within the range of the extent we are about to
> * allocate. You need to be holding the busy extent tree lock when calling
> diff --git a/fs/xfs/xfs_extent_busy.h b/fs/xfs/xfs_extent_busy.h
> index c37bf87e6781..f99073208770 100644
> --- a/fs/xfs/xfs_extent_busy.h
> +++ b/fs/xfs/xfs_extent_busy.h
> @@ -35,6 +35,10 @@ void
> xfs_extent_busy_insert(struct xfs_trans *tp, struct xfs_perag *pag,
> xfs_agblock_t bno, xfs_extlen_t len, unsigned int flags);
>
> +void
> +xfs_extent_busy_insert_discard(struct xfs_perag *pag, xfs_agblock_t bno,
> + xfs_extlen_t len, struct list_head *busy_list);
> +
> void
> xfs_extent_busy_clear(struct xfs_mount *mp, struct list_head *list,
> bool do_discard);
> --
> 2.40.1
>
^ permalink raw reply [flat|nested] 5+ messages in thread