Date: Thu, 31 Aug 2023 09:10:00 +1000
From: Dave Chinner
To: "Darrick J. Wong"
Cc: zlang@redhat.com, linux-xfs@vger.kernel.org, fstests@vger.kernel.org, guan@eryu.me
Subject: Re: [PATCH 2/2] generic: only enable io_uring in fsstress explicitly
References: <169335094811.3534600.13011878728080983620.stgit@frogsfrogsfrogs>
 <169335095953.3534600.16325849760213190849.stgit@frogsfrogsfrogs>
In-Reply-To: <169335095953.3534600.16325849760213190849.stgit@frogsfrogsfrogs>
X-Mailing-List: linux-xfs@vger.kernel.org

On Tue, Aug 29, 2023 at 04:15:59PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong
>
> Don't enable io_uring in fsstress unless someone asks for it explicitly,
> just like fsx. I think both tools should require explicit opt-in to
> facilitate A/B testing between the old IO paths and this new one.
>
> While I was playing with fstests+io_uring, I noticed quite a few
> regressions in fstests, which fell into two classes:
>
> The first class is umount failing with EBUSY. Apparently this is due to
> the kernel uring code hanging on to file references even after the
> userspace program exits. Tests that run fsstress and immediately
> unmount now fail sporadically due to the EBUSY. Unfortunately, the
> metadata update stress tests, the recovery loop tests, the xfs online
> fsck functional tests, and the xfs fuzz tests make heavy use of
> "fsstress; umount" and they fail all over the place now.
>
> Something's broken, Jens and Christian said it should get fixed, but in
> the meantime this is getting in the way of me testing my own code.

I'm not seeing regular problems with io_uring on my test machines.
Occasionally there will be a filesystem unmount issue, but that's
not causing anything but a single test here or there to fail. It's
not a big deal.

> The second problem I noticed is that fsstress now lodges complaints
> about sporadic heap corruption. I /think/ this is due to some kind of
> memory mishandling bug when uring is active but IO requests fail, but I
> haven't had the time to go figure out what's up with that.

Yes, I've seen that happen in ~6.4 kernels, but current TOT doesn't
seem to do that anymore on my test machines.

Regardless, I don't think turning off io_uring support by default is
the right thing to do. That's just shooting the messenger. We
really do need this code to be exercised as much as possible because
it is so full of bugs. Sure, add a flag to turn it off if you need
it off (and add it to FSSTRESS_AVOID for your test environments),
but otherwise we really should be exercising io_uring.
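
[A minimal sketch of the per-environment opt-out described above. It
assumes fsstress accepts "-f op=freq" to set an operation's frequency
and that the io_uring operations are named uring_read and uring_write;
neither detail is spelled out in this thread, so check them against
your fsstress build. The fragment would go in a local fstests config
file.]

```shell
# Hypothetical fstests local config fragment: keep fsstress from
# issuing io_uring ops in this test environment only.
# Assumption: the op names are uring_read and uring_write.
# Any FSSTRESS_AVOID value that is already set is preserved.
FSSTRESS_AVOID="${FSSTRESS_AVOID:+$FSSTRESS_AVOID }-f uring_read=0 -f uring_write=0"
export FSSTRESS_AVOID
```

[Zeroing the two frequencies only stops fsstress from selecting those
operations. Every other operation keeps its default weighting, and
environments without this fragment keep exercising io_uring.]
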
Ignorance doesn't prevent bugs or CVEs....

Realistically, what we actually need is to require io_uring
developers to focus on testing io_uring functionality with
filesystems and fsstress and *to fix the regressions* rather than
endlessly adding more features and complexity that create more bugs.
Turning the code off certainly won't help us achieve that....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com