Message-ID: <949cb711-b20b-6bb5-6663-e12ca7d71cf2@gmail.com>
Date: Thu, 24 Apr 2025 01:27:42 -0400
X-Mailing-List: fio@vger.kernel.org
Subject: Re: [BUG] active zones exceeded error with max_open_zones
To: Damien Le Moal, Shin'ichiro Kawasaki, Jens Axboe, fio@vger.kernel.org
References: <2b55d2f4-a093-d944-3d36-6efb5fb271ef@gmail.com> <2a232db8-280a-4a76-aade-916499fd524d@kernel.org>
From: Sean Anderson
In-Reply-To: <2a232db8-280a-4a76-aade-916499fd524d@kernel.org>

On 4/23/25 23:10, Damien Le Moal wrote:
> On 4/24/25 02:11, Sean Anderson wrote:
>> Hi,
>>
>> I'm getting an "active zones exceeded" error when running fio with
>> --rw=randwrite mode:
>>
>> # fio --bs=4k --rw=randwrite --norandommap --fsync=1 --number_ios=16384 --name=flushes --direct=1 --zonemode=zbd --max_open_zones=1978 --filename=/dev/my_zone_dev
>
> --max_open_zones=1978 is an extremely large value that likely exceeds the drive
> capabilities, which is what fio is telling you.
> What are your drive maximum open and active zones limits ?
>
> cat /sys/block/my_zone_dev/queue/max_open_zones
> cat /sys/block/my_zone_dev/queue/max_active_zones

This is correct for the drive.
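
For reference, a minimal sketch of the check requested above, written in Python rather than shell. It reads the two sysfs limits and combines them into the smaller of the two values, treating 0 as "no limit", which is how Damien's reply describes fio's default. The helper names `min_not_zero` and `effective_zone_limit` and the `sysfs_root` parameter are illustrative assumptions, not fio code; only the two `queue/` attribute paths come from the message.

```python
from pathlib import Path


def min_not_zero(a: int, b: int) -> int:
    """Return the smaller of a and b, ignoring a value of 0 ("no limit")."""
    if a == 0:
        return b
    if b == 0:
        return a
    return min(a, b)


def effective_zone_limit(dev: str, sysfs_root: str = "/sys/block") -> int:
    """Combine max_open_zones and max_active_zones for a block device.

    A result of 0 means that neither limit is set. `sysfs_root` is a
    hypothetical parameter added only so the function can be tried out
    against a fake directory tree.
    """
    queue = Path(sysfs_root) / dev / "queue"
    max_open = int((queue / "max_open_zones").read_text().strip())
    max_active = int((queue / "max_active_zones").read_text().strip())
    return min_not_zero(max_open, max_active)
```

Calling `effective_zone_limit("my_zone_dev")` on the test machine would give the number to compare against the `--max_open_zones=` value passed to fio.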

> fio will use the min_not_zero of these 2 values as the maximum number of zones
> that can be written simultaneously. Especially if your drive has an active zone
> limit, you *cannot* write to more zones than that limit at the same time.
> fio will default to max_open_zones=min_not_zero(drive max open, drive max
> active) and for a random write workload, it will:
> - pick zones randomly up to max_open_zones
> - direct write IOs to a randomly chosen zone in the current set of open zones
>   and when an open zone becomes full, randomly pick another zone to replace it.

Well the issue is that it appears to just pick random zones, not random open zones.

> For your workload, if you want to measure the maximum "random" write performance
> of your disk, simply do NOT specify --max_open_zones=. fio will pick the best
> possible number for you.

Same issue.

--Sean

>> flushes: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
>> fio-3.39
>> Starting 1 process
>> active zones exceeded error, dev my_zone_dev, sector 189520 op 0x1:(WRITE) flags 0x8800 phys_seg 1 prio class 0
>> fio: io_u error on file /dev/my_zone_dev: Value too large for defined data type: write offset=97034240, buflen=4096
>> /dev/my_zone_dev: Exceeded max_active_zones limit. Check conditions of zones out of I/O ranges.
>> fio: pid=2549, err=75/file:io_u.c:1976, func=io_u error, error=Value too large for defined data type
>>
>> flushes: (groupid=0, jobs=1): err=75 (file:io_u.c:1976, func=io_u error, error=Value too large for defined data type): pid=2549: Wed Apr 23 17:01:03 2025
>>   write: IOPS=262, BW=1050KiB/s (1075kB/s)(9092KiB/8661msec); 0 zone resets
>>     clat (usec): min=983, max=20564, avg=3645.67, stdev=4347.94
>>      lat (usec): min=984, max=20564, avg=3645.75, stdev=4347.94
>>     clat percentiles (usec):
>>      | 1.00th=[ 996], 5.00th=[ 1012], 10.00th=[ 1029], 20.00th=[ 1418],
>>      | 30.00th=[ 1434], 40.00th=[ 1434], 50.00th=[ 1450], 60.00th=[ 1450],
>>      | 70.00th=[ 1467], 80.00th=[ 5669], 90.00th=[12256], 95.00th=[12780],
>>      | 99.00th=[15008], 99.50th=[15533], 99.90th=[16712], 99.95th=[17171],
>>      | 99.99th=[20579]
>>    bw ( KiB/s): min= 500, max= 1205, per=100.00%, avg=1052.88, stdev=195.04, samples=17
>>    iops : min= 125, max= 301, avg=262.88, stdev=48.79, samples=17
>>   lat (usec) : 1000=1.76%
>>   lat (msec) : 2=74.05%, 4=1.10%, 10=4.75%, 20=18.25%, 50=0.04%
>>   fsync/fdatasync/sync_file_range:
>>     sync (usec): min=50, max=11641, avg=160.03, stdev=798.31
>>     sync percentiles (usec):
>>      | 1.00th=[ 53], 5.00th=[ 57], 10.00th=[ 66], 20.00th=[ 73],
>>      | 30.00th=[ 81], 40.00th=[ 82], 50.00th=[ 83], 60.00th=[ 84],
>>      | 70.00th=[ 85], 80.00th=[ 87], 90.00th=[ 178], 95.00th=[ 208],
>>      | 99.00th=[ 603], 99.50th=[ 1549], 99.90th=[11600], 99.95th=[11600],
>>      | 99.99th=[11600]
>>   cpu : usr=0.00%, sys=49.31%, ctx=2823, majf=0, minf=181
>>   IO depths : 1=200.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>>      submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>>      complete : 0=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>>      issued rwts: total=0,2274,0,2273 short=0,0,0,0 dropped=0,0,0,0
>>      latency : target=0, window=0, percentile=100.00%, depth=1
>>
>> Run status group 0 (all jobs):
>>   WRITE: bw=1050KiB/s (1075kB/s), 1050KiB/s-1050KiB/s (1075kB/s-1075kB/s), io=9092KiB (9310kB), run=8661-8661msec
>>
>> Disk stats (read/write):
>>   my_zone_dev: ios=170/4498, sectors=1336/17992, merge=0/0, ticks=0/118, in_queue=230, util=47.80%
>>
>> The issue seems to be that fio writes to a bunch of zones but never
>> finishes them because they're not full yet:
>>
>> # blkzone report -c 16 /dev/my_block_dev
>>   start: 0x000000000, len 0x000020, cap 0x00001f, wptr 0x000000 reset:0 non-seq:0, zcond: 1(em) [type: 2(SEQ_WRITE_REQUIRED)]
>>   start: 0x000000020, len 0x000020, cap 0x00001f, wptr 0x000000 reset:0 non-seq:0, zcond: 1(em) [type: 2(SEQ_WRITE_REQUIRED)]
>>   start: 0x000000040, len 0x000020, cap 0x00001f, wptr 0x000008 reset:0 non-seq:0, zcond: 4(cl) [type: 2(SEQ_WRITE_REQUIRED)]
>>   start: 0x000000060, len 0x000020, cap 0x00001f, wptr 0x000000 reset:0 non-seq:0, zcond: 1(em) [type: 2(SEQ_WRITE_REQUIRED)]
>>   start: 0x000000080, len 0x000020, cap 0x00001f, wptr 0x000000 reset:0 non-seq:0, zcond: 1(em) [type: 2(SEQ_WRITE_REQUIRED)]
>>   start: 0x0000000a0, len 0x000020, cap 0x00001f, wptr 0x000000 reset:0 non-seq:0, zcond: 1(em) [type: 2(SEQ_WRITE_REQUIRED)]
>>   start: 0x0000000c0, len 0x000020, cap 0x00001f, wptr 0x000000 reset:0 non-seq:0, zcond: 1(em) [type: 2(SEQ_WRITE_REQUIRED)]
>>   start: 0x0000000e0, len 0x000020, cap 0x00001f, wptr 0x000000 reset:0 non-seq:0, zcond: 1(em) [type: 2(SEQ_WRITE_REQUIRED)]
>>   start: 0x000000100, len 0x000020, cap 0x00001f, wptr 0x000008 reset:0 non-seq:0, zcond: 4(cl) [type: 2(SEQ_WRITE_REQUIRED)]
>>   start: 0x000000120, len 0x000020, cap 0x00001f, wptr 0x000008 reset:0 non-seq:0, zcond: 4(cl) [type: 2(SEQ_WRITE_REQUIRED)]
>>   start: 0x000000140, len 0x000020, cap 0x00001f, wptr 0x000008 reset:0 non-seq:0, zcond: 4(cl) [type: 2(SEQ_WRITE_REQUIRED)]
>>   start: 0x000000160, len 0x000020, cap 0x00001f, wptr 0x000000 reset:0 non-seq:0, zcond: 1(em) [type: 2(SEQ_WRITE_REQUIRED)]
>>   start: 0x000000180, len 0x000020, cap 0x00001f, wptr 0x000000 reset:0 non-seq:0, zcond: 1(em) [type: 2(SEQ_WRITE_REQUIRED)]
>>   start: 0x0000001a0, len 0x000020, cap 0x00001f, wptr 0x000010 reset:0 non-seq:0, zcond: 4(cl) [type: 2(SEQ_WRITE_REQUIRED)]
>>   start: 0x0000001c0, len 0x000020, cap 0x00001f, wptr 0x000000 reset:0 non-seq:0, zcond: 1(em) [type: 2(SEQ_WRITE_REQUIRED)]
>>   start: 0x0000001e0, len 0x000020, cap 0x00001f, wptr 0x000000 reset:0 non-seq:0, zcond: 1(em) [type: 2(SEQ_WRITE_REQUIRED)]
>>
>> This issue doesn't seem to occur with --rw=write because sequential
>> writes fill up zones and they get finished automatically.
>>
>> --Sean
>
>
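
To make the zone report in the quoted message easier to check against the drive's limit, here is a rough Python sketch that tallies zone conditions from `blkzone report` text. It assumes the usual definition of an active zone (implicitly open, explicitly open, or closed) and the `zcond: N(xx)` field format seen in the report. The function names and the regular expression are illustrative, not part of util-linux or fio.

```python
import re
from collections import Counter

# Condition abbreviations as printed in the report above, e.g. "zcond: 4(cl)".
# Assumption: zones that are open (implicitly or explicitly) or closed are the
# ones that count against max_active_zones.
ACTIVE_CONDS = {"oi", "oe", "cl"}

# Captures the abbreviation inside the parentheses after "zcond:".
ZCOND_RE = re.compile(r"zcond:\s*\d+\((\w+)\)")


def count_zone_conditions(report: str) -> Counter:
    """Count zones per condition abbreviation in `blkzone report` text."""
    return Counter(ZCOND_RE.findall(report))


def count_active_zones(report: str) -> int:
    """Number of zones in an open or closed condition."""
    counts = count_zone_conditions(report)
    return sum(counts[cond] for cond in ACTIVE_CONDS)
```

Fed the 16 zones listed in the report, `count_active_zones` would count the five closed zones (the other eleven are empty). Running it over a full `blkzone report` and comparing the result with `max_active_zones` would show how close the device is to the limit when the write fails.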