From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sumit Saxena
To: "Martin K . Petersen", Jens Axboe
Cc: "James E . J . Bottomley", linux-scsi@vger.kernel.org, linux-block@vger.kernel.org, Adam Radford, Khalid Aziz, Adaptec OEM Raid Solutions, Matthew Wilcox, Hannes Reinecke, "Juergen E .
Fischer", Russell King, linux-arm-kernel@lists.infradead.org, Finn Thain, Michael Schmitz, Anil Gurumurthy, Sudarsana Kalluru, Oliver Neukum, Ali Akcaagac, Jamie Lenehan, Ram Vegesna, target-devel@vger.kernel.org, Bradley Grove, Satish Kharat, Sesidhar Baddela, Karan Tilak Kumar, Yihang Li, Don Brace, storagedev@microchip.com, HighPoint Linux Team, Tyrel Datwyler, Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin, Christophe Leroy, linuxppc-dev@lists.ozlabs.org, Brian King, Lee Duncan, Chris Leech, Mike Christie, open-iscsi@googlegroups.com, Justin Tee, Paul Ely, Kashyap Desai, Shivasharan S, Chandrakanth Patil, megaraidlinux.pdl@broadcom.com, Sathya Prakash Veerichetty, Sreekanth Reddy, mpi3mr-linuxdrv.pdl@broadcom.com, Suganath Prabu Subramani, Ranjan Kumar, MPT-FusionLinux.pdl@broadcom.com, Daniel Palmer, GOTO Masanori, YOKOTA Hiroshi, Jack Wang, Geoff Levand, Michael Reed, Nilesh Javali, GR-QLogic-Storage-Upstream@marvell.com, Narsimhulu Musini, "K . Y . Srinivasan", Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li, linux-hyperv@vger.kernel.org, "Michael S .
Tsirkin", Jason Wang, Paolo Bonzini, Stefan Hajnoczi, Eugenio Perez, virtualization@lists.linux.dev, Vishal Bhakta, bcm-kernel-feedback-list@broadcom.com, Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko, xen-devel@lists.xenproject.org, Sumit Saxena
Subject: [PATCH v3 0/4] scsi/block: NUMA-local scan allocations, shared-tag path cleanup, and SCSI I/O counters
Date: Tue, 9 Jun 2026 17:47:59 +0530
Message-ID: <20260609121806.2121755-1-sumit.saxena@broadcom.com>
X-Mailer: git-send-email 2.43.7
X-Mailing-List: linuxppc-dev@lists.ozlabs.org
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This series contains three performance improvements targeting the SCSI and block layers on multi-socket NUMA and heavily loaded SMP systems.

On multi-socket NUMA systems we observed extreme run-to-run I/O throughput variance of 50-60%. This series identifies and fixes two root causes: cross-node memory accesses due to NUMA-unaware allocations in the scan path, and false sharing between hot atomic counters in struct request_queue and struct scsi_device.

Performance notes:

Tested on a dual-socket NUMA system (2x 32-core, 256 GB/socket) with an mpi3mr HBA, running fio (random read, 4K, QD 64, 16 jobs, 60 s, direct I/O). Figures are in KIOPS (thousands of IOPS):

  Configuration            Avg KIOPS   Range (KIOPS)   Spread
  Baseline                 6,255       4,200 - 6,700   ~37%
  Baseline + all patches   7,350       7,000 - 7,700   ~10%

Key findings:

Combined, these patches reduce the observed 50-60% run-to-run variance to under 10%, significantly improving workload predictability, and improve IOPS by 16-18%. No functional regressions were observed.

Changes in v3
-------------
-Handled feedback from Bart Van Assche and John Garry.
-Added a patch for shost local NUMA allocation.
-Converted ioerr_cnt and iotmo_cnt atomic counters into per-cpu counters.

Changes in v2
-------------
Patch 1 - Same functional goal as v1 patch 1: NUMA-local scsi_device / scsi_target allocations in the scan path, so that steady-state I/O does not habitually touch remote memory when the host has a fixed DMA/NUMA affinity.

Patch 2 - Replaces v1’s ____cacheline_aligned_in_smp on nr_active_requests_shared_tags with removal of the shared-tag fairness throttling machinery (including hctx_may_queue(), blk_mq_hw_ctx.nr_active, and request_queue.nr_active_requests_shared_tags and their updates). This follows the earlier standalone proposal by Bart Van Assche [1], rebased onto the current tree. It removes the high-frequency atomic accounting that motivated the v1 false-sharing workaround and, in our testing, improves IOPS by roughly 16-18% for the shared-tag workload exercised.

Patch 3 - Replaces v1’s cache-line padding of iodone_cnt with percpu_counter for both iorequest_cnt and iodone_cnt, so that submission and completion paths mostly update CPU-local state instead of bouncing a single cache line, without inflating struct scsi_device for SMP alignment.

Merge / review hints
--------------------
Patch 3 touches the block layer and should have block maintainer review; the rest of the patches are SCSI-oriented. Please route or Ack as your subsystem workflow requires.
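
For readers less familiar with the idea behind the percpu_counter conversion described above, here is a minimal userspace sketch in C11. It is not the kernel's percpu_counter API and not code from this series: the names sharded_counter, counter_slot, NR_SLOTS and CACHE_LINE are hypothetical, and the 64-byte line size is an assumption. It only illustrates the pattern: each writer increments its own cache-line-sized slot, and a reader sums the slots on the rare slow path.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_SLOTS   8    /* hypothetical: one slot per CPU or thread */
#define CACHE_LINE 64   /* assumed cache-line size in bytes */

/* One slot per writer, aligned so that no two slots share a cache line. */
struct counter_slot {
	_Alignas(CACHE_LINE) atomic_long value;
};

struct sharded_counter {
	struct counter_slot slot[NR_SLOTS];
};

static void sharded_counter_init(struct sharded_counter *c)
{
	for (int i = 0; i < NR_SLOTS; i++)
		atomic_init(&c->slot[i].value, 0);
}

/* Hot path: the caller touches only its own slot. */
static void sharded_counter_inc(struct sharded_counter *c, unsigned int slot)
{
	assert(slot < NR_SLOTS);
	atomic_fetch_add_explicit(&c->slot[slot].value, 1,
				  memory_order_relaxed);
}

/* Slow path: fold all slots into one total, e.g. for a statistics read. */
static long sharded_counter_sum(struct sharded_counter *c)
{
	long sum = 0;

	for (int i = 0; i < NR_SLOTS; i++)
		sum += atomic_load_explicit(&c->slot[i].value,
					    memory_order_relaxed);
	return sum;
}
```

A caller would initialise the counter once, call sharded_counter_inc() with its own slot index on every submission or completion, and call sharded_counter_sum() only when the total is actually needed. The trade-off is a more expensive read in exchange for increments that do not contend on a shared cache line.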

Bart Van Assche (1):
  block: drop shared-tag fairness throttling

James Rizzo (1):
  scsi: scan: allocate sdev and starget on the NUMA node of the host adapter

Sumit Saxena (2):
  scsi: host: allocate struct Scsi_Host on the NUMA node of the host adapter
  scsi: use percpu counters for iostat counters in struct scsi_device

 block/blk-core.c | 2 -
 block/blk-mq-debugfs.c | 22 ++++-
 block/blk-mq-tag.c | 4 -
 block/blk-mq.c | 17 +---
 block/blk-mq.h | 100 ----------------------
 drivers/scsi/3w-9xxx.c | 2 +-
 drivers/scsi/3w-sas.c | 2 +-
 drivers/scsi/3w-xxxx.c | 2 +-
 drivers/scsi/53c700.c | 2 +-
 drivers/scsi/BusLogic.c | 2 +-
 drivers/scsi/a100u2w.c | 2 +-
 drivers/scsi/a2091.c | 2 +-
 drivers/scsi/a3000.c | 2 +-
 drivers/scsi/aacraid/linit.c | 2 +-
 drivers/scsi/advansys.c | 6 +-
 drivers/scsi/aha152x.c | 2 +-
 drivers/scsi/aha1542.c | 2 +-
 drivers/scsi/aha1740.c | 2 +-
 drivers/scsi/aic7xxx/aic79xx_osm.c | 2 +-
 drivers/scsi/aic7xxx/aic7xxx_osm.c | 2 +-
 drivers/scsi/aic94xx/aic94xx_init.c | 2 +-
 drivers/scsi/am53c974.c | 2 +-
 drivers/scsi/arcmsr/arcmsr_hba.c | 3 +-
 drivers/scsi/arm/acornscsi.c | 2 +-
 drivers/scsi/arm/arxescsi.c | 2 +-
 drivers/scsi/arm/cumana_1.c | 2 +-
 drivers/scsi/arm/cumana_2.c | 2 +-
 drivers/scsi/arm/eesox.c | 2 +-
 drivers/scsi/arm/oak.c | 2 +-
 drivers/scsi/arm/powertec.c | 2 +-
 drivers/scsi/atari_scsi.c | 2 +-
 drivers/scsi/atp870u.c | 2 +-
 drivers/scsi/bfa/bfad_im.c | 2 +-
 drivers/scsi/csiostor/csio_init.c | 4 +-
 drivers/scsi/dc395x.c | 2 +-
 drivers/scsi/dmx3191d.c | 2 +-
 drivers/scsi/elx/efct/efct_xport.c | 4 +-
 drivers/scsi/esas2r/esas2r_main.c | 2 +-
 drivers/scsi/fdomain.c | 2 +-
 drivers/scsi/fnic/fnic_main.c | 2 +-
 drivers/scsi/g_NCR5380.c | 2 +-
 drivers/scsi/gvp11.c | 2 +-
 drivers/scsi/hisi_sas/hisi_sas_main.c | 2 +-
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 2 +-
 drivers/scsi/hosts.c | 6 +-
 drivers/scsi/hpsa.c | 2 +-
 drivers/scsi/hptiop.c | 2 +-
 drivers/scsi/ibmvscsi/ibmvfc.c | 2 +-
 drivers/scsi/ibmvscsi/ibmvscsi.c | 2 +-
 drivers/scsi/imm.c | 2 +-
 drivers/scsi/initio.c | 2 +-
 drivers/scsi/ipr.c | 2 +-
 drivers/scsi/ips.c | 2 +-
 drivers/scsi/isci/init.c | 2 +-
 drivers/scsi/jazz_esp.c | 2 +-
 drivers/scsi/libiscsi.c | 2 +-
 drivers/scsi/lpfc/lpfc_init.c | 2 +-
 drivers/scsi/mac53c94.c | 2 +-
 drivers/scsi/mac_esp.c | 2 +-
 drivers/scsi/mac_scsi.c | 2 +-
 drivers/scsi/megaraid.c | 2 +-
 drivers/scsi/megaraid/megaraid_mbox.c | 2 +-
 drivers/scsi/megaraid/megaraid_sas_base.c | 2 +-
 drivers/scsi/mesh.c | 2 +-
 drivers/scsi/mpi3mr/mpi3mr_os.c | 2 +-
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 4 +-
 drivers/scsi/mvme147.c | 2 +-
 drivers/scsi/mvsas/mv_init.c | 2 +-
 drivers/scsi/mvumi.c | 2 +-
 drivers/scsi/myrb.c | 2 +-
 drivers/scsi/myrs.c | 2 +-
 drivers/scsi/ncr53c8xx.c | 2 +-
 drivers/scsi/nsp32.c | 2 +-
 drivers/scsi/pcmcia/nsp_cs.c | 2 +-
 drivers/scsi/pcmcia/qlogic_stub.c | 2 +-
 drivers/scsi/pcmcia/sym53c500_cs.c | 2 +-
 drivers/scsi/pm8001/pm8001_init.c | 2 +-
 drivers/scsi/pmcraid.c | 2 +-
 drivers/scsi/ppa.c | 2 +-
 drivers/scsi/ps3rom.c | 2 +-
 drivers/scsi/qla1280.c | 2 +-
 drivers/scsi/qla2xxx/qla_mid.c | 2 +-
 drivers/scsi/qla2xxx/qla_os.c | 2 +-
 drivers/scsi/qlogicfas.c | 2 +-
 drivers/scsi/qlogicpti.c | 2 +-
 drivers/scsi/scsi_debug.c | 2 +-
 drivers/scsi/scsi_error.c | 4 +-
 drivers/scsi/scsi_lib.c | 10 +--
 drivers/scsi/scsi_scan.c | 15 +++-
 drivers/scsi/scsi_sysfs.c | 23 +++--
 drivers/scsi/sd.c | 2 +-
 drivers/scsi/sgiwd93.c | 2 +-
 drivers/scsi/smartpqi/smartpqi_init.c | 2 +-
 drivers/scsi/snic/snic_main.c | 2 +-
 drivers/scsi/stex.c | 2 +-
 drivers/scsi/storvsc_drv.c | 2 +-
 drivers/scsi/sun3_scsi.c | 2 +-
 drivers/scsi/sun3x_esp.c | 2 +-
 drivers/scsi/sun_esp.c | 2 +-
 drivers/scsi/sym53c8xx_2/sym_glue.c | 2 +-
 drivers/scsi/virtio_scsi.c | 2 +-
 drivers/scsi/vmw_pvscsi.c | 2 +-
 drivers/scsi/wd719x.c | 2 +-
 drivers/scsi/xen-scsifront.c | 2 +-
 drivers/scsi/zorro_esp.c | 2 +-
 include/linux/blk-mq.h | 6 --
 include/linux/blkdev.h | 2 -
 include/scsi/libfc.h | 2 +-
 include/scsi/scsi_device.h | 9 +-
 include/scsi/scsi_host.h | 3 +-
 110 files changed, 168 insertions(+), 258 deletions(-)

--
2.43.7