* Re: [PATCH v7] block: propagate in_flight to whole disk on partition I/O
From: Yizhou Tang @ 2026-06-09 15:28 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Tang Yizhou, axboe, kbusch, yukuai, linux-block, linux-kernel,
Leon Hwang
In-Reply-To: <20260527141057.GA13345@lst.de>
On Wed, May 27, 2026 at 10:11 PM Christoph Hellwig <hch@lst.de> wrote:
>
> Looks good:
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
>
Hi Jens,
This patch has been on the list for over two weeks without any new
review comments. If you have no objections, please consider picking it
up for the next merge window.
Best regards,
Yi
>
^ permalink raw reply
* Re: [PATCH 0/5] blk-cgroup: fix blkg list and policy data races
From: Yizhou Tang @ 2026-06-09 15:15 UTC (permalink / raw)
To: Yu Kuai
Cc: Jens Axboe, Tejun Heo, Josef Bacik, Ming Lei, Nilay Shroff,
Bart Van Assche, linux-block, cgroups, linux-kernel
In-Reply-To: <cover.1780492756.git.yukuai@fygo.io>
On Wed, Jun 3, 2026 at 9:35 PM Yu Kuai <yukuai@fygo.io> wrote:
>
> This small series fixes races between blkg destruction, q->blkg_list
> iteration, and blkcg policy activation.
>
> The first two patches serialize q->blkg_list walks in blkg_destroy_all()
> and BFQ writeback weight-raising teardown with blkcg_mutex. The next two
> patches close policy activation races with concurrent blkg destruction,
> including skipping blkgs that are already dying. The final patch factors
> the common policy data teardown loop.
>
> This uses blkcg_mutex rather than extending queue_lock coverage because
> the races are about blkg list visibility and policy-data lifetime, not
> request-queue dispatch state. blkg_free_workfn() already uses
> blkcg_mutex to serialize policy-data freeing with policy deactivation
> and removes blkgs from q->blkg_list only after that teardown. Taking the
> same mutex around the remaining q->blkg_list walkers gives one sleepable
> serialization point for blkg lifetime, avoids adding more queue_lock
> nesting, and prepares the follow-up conversion that removes queue_lock
> from blkcg list protection entirely.
LGTM.
I had noticed some time ago that blkcg_activate_policy() did not take
q->blkcg_mutex, and this patchset nicely resolves my concern.
Reviewed-by: Tang Yizhou <yizhou.tang@shopee.com>
Best regards,
Yi
>
> Yu Kuai (2):
> blk-cgroup: protect q->blkg_list iteration in blkg_destroy_all() with
> blkcg_mutex
> bfq: protect q->blkg_list iteration in bfq_end_wr_async() with
> blkcg_mutex
>
> Zheng Qixing (3):
> blk-cgroup: fix race between policy activation and blkg destruction
> blk-cgroup: skip dying blkg in blkcg_activate_policy()
> blk-cgroup: factor policy pd teardown loop into helper
>
> block/bfq-cgroup.c | 3 ++-
> block/bfq-iosched.c | 6 +++++
> block/blk-cgroup.c | 65 ++++++++++++++++++++++++---------------------
> 3 files changed, 43 insertions(+), 31 deletions(-)
>
> --
> 2.51.0
>
^ permalink raw reply
* Re: [PATCH v2] virtio-blk: clamp zone report to the report buffer capacity
From: Stefan Hajnoczi @ 2026-06-09 15:09 UTC (permalink / raw)
To: Michael Bommarito
Cc: Michael S . Tsirkin, Jason Wang, Jens Axboe, Xuan Zhuo,
virtualization, linux-block, linux-kernel
In-Reply-To: <20260607124834.3059944-1-michael.bommarito@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 1826 bytes --]
On Sun, Jun 07, 2026 at 08:48:34AM -0400, Michael Bommarito wrote:
> virtblk_report_zones() trusts the device-reported number of zones when
> walking the report buffer:
>
> nz = min_t(u64, virtio64_to_cpu(vblk->vdev, report->nr_zones),
> nr_zones);
> ...
> for (i = 0; i < nz && zone_idx < nr_zones; i++) {
> ret = virtblk_parse_zone(vblk, &report->zones[i], ...);
>
> The buffer is allocated by virtblk_alloc_report_buffer(), whose size is
> capped by the queue's max hardware sectors and max segments and can
> therefore hold fewer descriptors than nr_zones. nz is bounded only by
> the device-supplied report->nr_zones and the requested nr_zones, never
> by the buffer's descriptor capacity. At probe time the request count is
> unbounded (blk_revalidate_disk_zones() calls report_zones() with
> nr_zones == UINT_MAX), so the device-supplied report->nr_zones is the
> sole gate: a device that reports more zones than fit in the buffer
> drives the loop to read report->zones[i] past the end of the allocation.
>
> A malicious or buggy virtio-blk device that reports an inflated nr_zones
> triggers this during zone revalidation at probe. KASAN reports a
> vmalloc-out-of-bounds read in virtblk_report_zones() against the report
> buffer allocated a few lines earlier.
>
> Clamp nz to the number of descriptors that actually fit in the report
> buffer.
>
> Fixes: 95bfec41bd3d ("virtio-blk: add support for zoned block devices")
> Assisted-by: Claude:claude-opus-4-8
> Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
> ---
> v2: drop the explanatory comment per Michael S. Tsirkin's review; the
> clamp itself is unchanged.
>
> drivers/block/virtio_blk.c | 2 ++
> 1 file changed, 2 insertions(+)
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply
* Re: [PATCH] block: fix arg type in `blk_mq_update_nr_hw_queues`
From: Jens Axboe @ 2026-06-09 15:05 UTC (permalink / raw)
To: Andreas Hindborg, Bart Van Assche; +Cc: linux-block, linux-kernel
In-Reply-To: <87qzmggn43.fsf@kernel.org>
On 6/9/26 1:59 AM, Andreas Hindborg wrote:
>> If this patch does not change the behavior of the code for any practical
>> use case that would be good to mention.
>
> The patch does not change behavior of the code as long as no overflows
> occur, and they are not likely to occur. I will be sure to add that to
> the commit message.
>
> I came across this while designing the Rust API for these functions.
> Rust is kind of particular about integer types, so it gave me some
> checks/casts on the Rust side that really should no be required. I was
> thinking we might as well change the type on the C side to unsigned when
> the data it carries really should be unsigned.
The change itself is fine imho, but I agree with Bart that the commit
messages should be phrased very differently. It's not a fix, it's a
cleanup that makes your life simpler on the rust side.
--
Jens Axboe
^ permalink raw reply
* Re: [PATCH] rust: block: require `Sync` for `Operations::QueueData`
From: Miguel Ojeda @ 2026-06-09 14:20 UTC (permalink / raw)
To: Andreas Hindborg
Cc: Boqun Feng, Miguel Ojeda, Gary Guo, Björn Roy Baron,
Benno Lossin, Alice Ryhl, Trevor Gross, Danilo Krummrich,
Jens Axboe, Daniel Almeida, linux-block, rust-for-linux,
linux-kernel
In-Reply-To: <20260608-queue-data-sync-v1-1-0efff051aaf3@kernel.org>
On Mon, Jun 8, 2026 at 10:25 AM Andreas Hindborg <a.hindborg@kernel.org> wrote:
>
> Fixes: 90d952fac8ac ("rust: block: add `GenDisk` private data support")
Should this be Cc: stable given the hash is old? i.e. 6.18.y+
Cheers,
Miguel
^ permalink raw reply
* Re: [PATCH v17 00/10] rust: add `Ownable` trait and `Owned` type
From: Miguel Ojeda @ 2026-06-09 14:11 UTC (permalink / raw)
To: Andreas Hindborg
Cc: Miguel Ojeda, Gary Guo, Björn Roy Baron, Benno Lossin,
Alice Ryhl, Trevor Gross, Danilo Krummrich, Greg Kroah-Hartman,
Dave Ertman, Ira Weiny, Leon Romanovsky, Paul Moore, Serge Hallyn,
Rafael J. Wysocki, David Airlie, Simona Vetter, Alexander Viro,
Christian Brauner, Jan Kara, Daniel Almeida, Viresh Kumar,
Nishanth Menon, Stephen Boyd, Bjorn Helgaas,
Krzysztof Wilczyński, Boqun Feng, Uladzislau Rezki,
Lorenzo Stoakes, Vlastimil Babka, Liam R. Howlett, Igor Korotin,
Pavel Tikhomirov, linux-kernel, rust-for-linux, linux-block,
linux-security-module, dri-devel, linux-fsdevel, linux-mm,
linux-pm, linux-pci, driver-core, Asahi Lina, Oliver Mangold,
Viresh Kumar
In-Reply-To: <20260604-unique-ref-v17-0-7b4c3d2930b9@kernel.org>
On Thu, Jun 4, 2026 at 10:13 PM Andreas Hindborg <a.hindborg@kernel.org> wrote:
>
> rust: page: update formatting of `use` statements
> rust: aref: update formatting of use statements
Applied these to `rust-next` to get them out of the way already since
I have another similar one for `str` I am applying too -- thanks!
[ Picked from larger series and reworded. - Miguel ]
Cheers,
Miguel
^ permalink raw reply
* Re: [PATCH v3 1/4] scsi: scan: allocate sdev and starget on the NUMA node of the host adapter
From: John Garry @ 2026-06-09 13:49 UTC (permalink / raw)
To: Sumit Saxena, Martin K . Petersen, Jens Axboe
Cc: James E . J . Bottomley, linux-scsi, linux-block, James Rizzo
In-Reply-To: <20260609121806.2121755-2-sumit.saxena@broadcom.com>
- huge trim
On 09/06/2026 13:18, Sumit Saxena wrote:
> From: James Rizzo<james.rizzo@broadcom.com>
>
> When a host adapter is attached to a specific NUMA node, allocating
> scsi_device and scsi_target via kzalloc() may place them on a remote
> node. All hot-path I/O accesses to these structures then cross the NUMA
> interconnect, adding latency and consuming inter-node bandwidth.
>
> Use kzalloc_node() with dev_to_node(shost->dma_dev) so allocations land
> on the same node as the HBA,
kzalloc_node() does not guarantee local NUMA node allocations, only tries
> reducing cross-node traffic and improving
> I/O performance on NUMA systems.
>
> Signed-off-by: James Rizzo<james.rizzo@broadcom.com>
> Signed-off-by: Sumit Saxena<sumit.saxena@broadcom.com>
FWIW,
Reviewed-by: John Garry <john.g.garry@oracle.com>
^ permalink raw reply
* Re: [PATCH v3 2/4] scsi: host: allocate struct Scsi_Host on the NUMA node of the host adapter
From: John Garry @ 2026-06-09 13:03 UTC (permalink / raw)
To: Sumit Saxena, Martin K . Petersen, Jens Axboe
Cc: James E . J . Bottomley, linux-scsi, linux-block, Adam Radford,
Khalid Aziz, Adaptec OEM Raid Solutions, Matthew Wilcox,
Hannes Reinecke, Juergen E . Fischer, Russell King,
linux-arm-kernel, Finn Thain, Michael Schmitz, Anil Gurumurthy,
Sudarsana Kalluru, Oliver Neukum, Ali Akcaagac, Jamie Lenehan,
Ram Vegesna, target-devel, Bradley Grove, Satish Kharat,
Sesidhar Baddela, Karan Tilak Kumar, Yihang Li, Don Brace,
storagedev, HighPoint Linux Team, Tyrel Datwyler,
Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, linuxppc-dev, Brian King, Lee Duncan,
Chris Leech, Mike Christie, open-iscsi, Justin Tee, Paul Ely,
Kashyap Desai, Shivasharan S, Chandrakanth Patil,
megaraidlinux.pdl, Sathya Prakash Veerichetty, Sreekanth Reddy,
mpi3mr-linuxdrv.pdl, Suganath Prabu Subramani, Ranjan Kumar,
MPT-FusionLinux.pdl, Daniel Palmer, GOTO Masanori, YOKOTA Hiroshi,
Jack Wang, Geoff Levand, Michael Reed, Nilesh Javali,
GR-QLogic-Storage-Upstream, Narsimhulu Musini, K . Y . Srinivasan,
Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li, linux-hyperv,
Michael S . Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
Eugenio Perez, virtualization, Vishal Bhakta,
bcm-kernel-feedback-list, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, xen-devel
In-Reply-To: <20260609121806.2121755-3-sumit.saxena@broadcom.com>
On 09/06/2026 13:18, Sumit Saxena wrote:
> scsi_host_alloc() used kzalloc(), which always picks an arbitrary node.
> Extend the function to accept a 'struct device *dev' parameter and use
> kzalloc_node() with dev_to_node(dev) so the Scsi_Host struct lands on
> the same NUMA node as the HBA, mirroring the treatment already applied
> to struct scsi_device, struct scsi_target, and shost_data.
>
> When dev is NULL (legacy ISA/platform drivers without a dma_dev) the
> allocation falls back to NUMA_NO_NODE, preserving existing behaviour.
>
> Update all in-tree callers:
> - PCI-based HBA drivers pass &pdev->dev (or the equivalent struct
> member such as &phba->pcidev->dev, &h->pdev->dev, &ha->pdev->dev)
> so their host struct is placed on the adapter's node.
> - Non-PCI drivers (ISA, Amiga, ARM PCMCIA, virtio, Hyper-V, PS3, …)
> pass NULL.
> - libfc's libfc_host_alloc() inline helper passes NULL; FC drivers
> that want NUMA awareness can open-code the call with their pdev.
>
> Suggested-by: John Garry <john.g.garry@oracle.com>
> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Wow ... I was not expecting such a large change, but admittedly I did
not consider the implementation.
I did mention that pci-based adapters should already be effectively
doing kzalloc_node() since the adapter driver is probed on the local
NUMA node (and kmalloc first tries local NUMA allocations).
> ---
> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
> index e047747d4ecf..e1f42be79729 100644
> --- a/drivers/scsi/hosts.c
> +++ b/drivers/scsi/hosts.c
> @@ -403,12 +403,14 @@ static const struct device_type scsi_host_type = {
> * Return value:
> * Pointer to a new Scsi_Host
> **/
> -struct Scsi_Host *scsi_host_alloc(const struct scsi_host_template *sht, int privsize)
> +struct Scsi_Host *scsi_host_alloc(const struct scsi_host_template *sht, int privsize,
> + struct device *dev)
> {
> struct Scsi_Host *shost;
> int index;
>
> - shost = kzalloc(sizeof(struct Scsi_Host) + privsize, GFP_KERNEL);
> + shost = kzalloc_node(sizeof(struct Scsi_Host) + privsize, GFP_KERNEL,
> + dev ? dev_to_node(dev) : NUMA_NO_NODE);
> if (!shost)
> return NULL;
>
> -extern struct Scsi_Host *scsi_host_alloc(const struct scsi_host_template *, int);
> +extern struct Scsi_Host *scsi_host_alloc(const struct scsi_host_template *sht,
> + int privsize, struct device *dev);
> extern int __must_check scsi_add_host_with_dma(struct Scsi_Host *,
> struct device *,
> struct device *);
scsi_add_host_with_dma() and scsi_add_host() do assignment of
shost->dma_dev, so I think that could be moved to scsi_host_alloc().
I can imagine that we always know dev and dma_dev at Scsi_Host alloc
time (and not just scsi_add_host()) time. However those would be very
intrusive changes.
Let me consider this more. Maybe we can have a platform device version
of shost alloc, as I can't imagine that we care about much more. Thanks!
^ permalink raw reply
* Re: [Kernel Bug] INFO: rcu detected stall in disk_check_events
From: Jens Axboe @ 2026-06-09 12:41 UTC (permalink / raw)
To: Longxing Li, syzkaller, linux-block, linux-kernel
In-Reply-To: <CAHPqNmzE6bcNb3T+nJavn-mg=w2B0F1YCv47qv-Tsbn4xOrGmw@mail.gmail.com>
On 6/9/26 5:52 AM, Longxing Li wrote:
> Dear Linux kernel developers and maintainers,
>
> We would like to report a new kernel bug found by our tool. INFO: rcu
> detected stall in disk_check_events. Details are as follows.
>
> Kernel commit: v7.0.6
> Kernel config: see attachment
> report: see attachment
>
> We are currently analyzing the root cause and working on a
> reproducible PoC. We will provide further updates in this thread as
> soon as we have more information.
Please include pertinent information in the email rather than some
shared drive somewhere.
--
Jens Axboe
^ permalink raw reply
* [PATCH v3 1/4] scsi: scan: allocate sdev and starget on the NUMA node of the host adapter
From: Sumit Saxena @ 2026-06-09 12:18 UTC (permalink / raw)
To: Martin K . Petersen, Jens Axboe
Cc: James E . J . Bottomley, linux-scsi, linux-block, Adam Radford,
Khalid Aziz, Adaptec OEM Raid Solutions, Matthew Wilcox,
Hannes Reinecke, Juergen E . Fischer, Russell King,
linux-arm-kernel, Finn Thain, Michael Schmitz, Anil Gurumurthy,
Sudarsana Kalluru, Oliver Neukum, Ali Akcaagac, Jamie Lenehan,
Ram Vegesna, target-devel, Bradley Grove, Satish Kharat,
Sesidhar Baddela, Karan Tilak Kumar, Yihang Li, Don Brace,
storagedev, HighPoint Linux Team, Tyrel Datwyler,
Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, linuxppc-dev, Brian King, Lee Duncan,
Chris Leech, Mike Christie, open-iscsi, Justin Tee, Paul Ely,
Kashyap Desai, Shivasharan S, Chandrakanth Patil,
megaraidlinux.pdl, Sathya Prakash Veerichetty, Sreekanth Reddy,
mpi3mr-linuxdrv.pdl, Suganath Prabu Subramani, Ranjan Kumar,
MPT-FusionLinux.pdl, Daniel Palmer, GOTO Masanori, YOKOTA Hiroshi,
Jack Wang, Geoff Levand, Michael Reed, Nilesh Javali,
GR-QLogic-Storage-Upstream, Narsimhulu Musini, K . Y . Srinivasan,
Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li, linux-hyperv,
Michael S . Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
Eugenio Perez, virtualization, Vishal Bhakta,
bcm-kernel-feedback-list, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, xen-devel, James Rizzo, Sumit Saxena
In-Reply-To: <20260609121806.2121755-1-sumit.saxena@broadcom.com>
From: James Rizzo <james.rizzo@broadcom.com>
When a host adapter is attached to a specific NUMA node, allocating
scsi_device and scsi_target via kzalloc() may place them on a remote
node. All hot-path I/O accesses to these structures then cross the NUMA
interconnect, adding latency and consuming inter-node bandwidth.
Use kzalloc_node() with dev_to_node(shost->dma_dev) so allocations land
on the same node as the HBA, reducing cross-node traffic and improving
I/O performance on NUMA systems.
Signed-off-by: James Rizzo <james.rizzo@broadcom.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
---
drivers/scsi/scsi_scan.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index e27da038603a..121a14d5fdb8 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -34,6 +34,7 @@
#include <linux/kthread.h>
#include <linux/spinlock.h>
#include <linux/async.h>
+#include <linux/topology.h>
#include <linux/slab.h>
#include <linux/unaligned.h>
@@ -287,8 +288,8 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
struct queue_limits lim;
- sdev = kzalloc(sizeof(*sdev) + shost->transportt->device_size,
- GFP_KERNEL);
+ sdev = kzalloc_node(sizeof(*sdev) + shost->transportt->device_size,
+ GFP_KERNEL, dev_to_node(shost->dma_dev));
if (!sdev)
goto out;
@@ -502,7 +503,7 @@ static struct scsi_target *scsi_alloc_target(struct device *parent,
struct scsi_target *found_target;
int error, ref_got;
- starget = kzalloc(size, GFP_KERNEL);
+ starget = kzalloc_node(size, GFP_KERNEL, dev_to_node(shost->dma_dev));
if (!starget) {
printk(KERN_ERR "%s: allocation failure\n", __func__);
return NULL;
--
2.43.7
^ permalink raw reply related
* [Kernel Bug] INFO: rcu detected stall in disk_check_events
From: Longxing Li @ 2026-06-09 11:52 UTC (permalink / raw)
To: syzkaller, axboe, linux-block, linux-kernel
Dear Linux kernel developers and maintainers,
We would like to report a new kernel bug found by our tool. INFO: rcu
detected stall in disk_check_events. Details are as follows.
Kernel commit: v7.0.6
Kernel config: see attachment
report: see attachment
We are currently analyzing the root cause and working on a
reproducible PoC. We will provide further updates in this thread as
soon as we have more information.
Best regards,
Longxing Li
==================================================================
https://drive.google.com/file/d/1Bx2unEf-QntjVi8g6Zw7QNO6OP4cjGO_/view?usp=drive_link
https://drive.google.com/file/d/1t8TOOI_sDqLxje1iolIJqeIxa-N6GUeI/view?usp=drive_link
^ permalink raw reply
* [PATCH v3 4/4] scsi: use percpu counters for iostat counters in struct scsi_device
From: Sumit Saxena @ 2026-06-09 12:18 UTC (permalink / raw)
To: Martin K . Petersen, Jens Axboe
Cc: James E . J . Bottomley, linux-scsi, linux-block, Adam Radford,
Khalid Aziz, Adaptec OEM Raid Solutions, Matthew Wilcox,
Hannes Reinecke, Juergen E . Fischer, Russell King,
linux-arm-kernel, Finn Thain, Michael Schmitz, Anil Gurumurthy,
Sudarsana Kalluru, Oliver Neukum, Ali Akcaagac, Jamie Lenehan,
Ram Vegesna, target-devel, Bradley Grove, Satish Kharat,
Sesidhar Baddela, Karan Tilak Kumar, Yihang Li, Don Brace,
storagedev, HighPoint Linux Team, Tyrel Datwyler,
Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, linuxppc-dev, Brian King, Lee Duncan,
Chris Leech, Mike Christie, open-iscsi, Justin Tee, Paul Ely,
Kashyap Desai, Shivasharan S, Chandrakanth Patil,
megaraidlinux.pdl, Sathya Prakash Veerichetty, Sreekanth Reddy,
mpi3mr-linuxdrv.pdl, Suganath Prabu Subramani, Ranjan Kumar,
MPT-FusionLinux.pdl, Daniel Palmer, GOTO Masanori, YOKOTA Hiroshi,
Jack Wang, Geoff Levand, Michael Reed, Nilesh Javali,
GR-QLogic-Storage-Upstream, Narsimhulu Musini, K . Y . Srinivasan,
Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li, linux-hyperv,
Michael S . Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
Eugenio Perez, virtualization, Vishal Bhakta,
bcm-kernel-feedback-list, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, xen-devel, Sumit Saxena, John Garry
In-Reply-To: <20260609121806.2121755-1-sumit.saxena@broadcom.com>
iorequest_cnt and iodone_cnt are updated on every command dispatch and
completion, often from different CPUs on high queue depth workloads.
Using adjacent atomic_t fields causes cache line contention between the
submission and completion paths.
Extend the same treatment to ioerr_cnt and iotmo_cnt so all four iostat
counters in struct scsi_device use struct percpu_counter.
Suggested-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
---
drivers/scsi/scsi_error.c | 4 ++--
drivers/scsi/scsi_lib.c | 10 +++++-----
drivers/scsi/scsi_scan.c | 8 ++++++++
drivers/scsi/scsi_sysfs.c | 23 ++++++++++++++---------
drivers/scsi/sd.c | 2 +-
include/scsi/scsi_device.h | 9 +++++----
6 files changed, 35 insertions(+), 21 deletions(-)
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 147127fb4db9..b1aa7da2ba7c 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -349,7 +349,7 @@ enum blk_eh_timer_return scsi_timeout(struct request *req)
trace_scsi_dispatch_cmd_timeout(scmd);
scsi_log_completion(scmd, TIMEOUT_ERROR);
- atomic_inc(&scmd->device->iotmo_cnt);
+ percpu_counter_inc(&scmd->device->iotmo_cnt);
if (host->eh_deadline != -1 && !host->last_reset)
host->last_reset = jiffies;
@@ -370,7 +370,7 @@ enum blk_eh_timer_return scsi_timeout(struct request *req)
*/
if (test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state))
return BLK_EH_DONE;
- atomic_inc(&scmd->device->iodone_cnt);
+ percpu_counter_inc(&scmd->device->iodone_cnt);
if (scsi_abort_command(scmd) != SUCCESS) {
set_host_byte(scmd, DID_TIME_OUT);
scsi_eh_scmd_add(scmd);
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 6e8c7a42603e..979fdace33ac 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1554,9 +1554,9 @@ static void scsi_complete(struct request *rq)
INIT_LIST_HEAD(&cmd->eh_entry);
- atomic_inc(&cmd->device->iodone_cnt);
+ percpu_counter_inc(&cmd->device->iodone_cnt);
if (cmd->result)
- atomic_inc(&cmd->device->ioerr_cnt);
+ percpu_counter_inc(&cmd->device->ioerr_cnt);
disposition = scsi_decide_disposition(cmd);
if (disposition != SUCCESS && scsi_cmd_runtime_exceeced(cmd))
@@ -1592,7 +1592,7 @@ static enum scsi_qc_status scsi_dispatch_cmd(struct scsi_cmnd *cmd)
struct Scsi_Host *host = cmd->device->host;
int rtn = 0;
- atomic_inc(&cmd->device->iorequest_cnt);
+ percpu_counter_inc(&cmd->device->iorequest_cnt);
/* check if the device is still usable */
if (unlikely(cmd->device->sdev_state == SDEV_DEL)) {
@@ -1614,7 +1614,7 @@ static enum scsi_qc_status scsi_dispatch_cmd(struct scsi_cmnd *cmd)
*/
SCSI_LOG_MLQUEUE(3, scmd_printk(KERN_INFO, cmd,
"queuecommand : device blocked\n"));
- atomic_dec(&cmd->device->iorequest_cnt);
+ percpu_counter_dec(&cmd->device->iorequest_cnt);
return SCSI_MLQUEUE_DEVICE_BUSY;
}
@@ -1647,7 +1647,7 @@ static enum scsi_qc_status scsi_dispatch_cmd(struct scsi_cmnd *cmd)
trace_scsi_dispatch_cmd_start(cmd);
rtn = host->hostt->queuecommand(host, cmd);
if (rtn) {
- atomic_dec(&cmd->device->iorequest_cnt);
+ percpu_counter_dec(&cmd->device->iorequest_cnt);
trace_scsi_dispatch_cmd_error(cmd, rtn);
if (rtn != SCSI_MLQUEUE_DEVICE_BUSY &&
rtn != SCSI_MLQUEUE_TARGET_BUSY)
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 121a14d5fdb8..bc885c72f01e 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -350,6 +350,14 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
scsi_sysfs_device_initialize(sdev);
+ if (percpu_counter_init(&sdev->iorequest_cnt, 0, GFP_KERNEL) ||
+ percpu_counter_init(&sdev->iodone_cnt, 0, GFP_KERNEL) ||
+ percpu_counter_init(&sdev->ioerr_cnt, 0, GFP_KERNEL) ||
+ percpu_counter_init(&sdev->iotmo_cnt, 0, GFP_KERNEL)) {
+ ret = -ENOMEM;
+ goto out_device_destroy;
+ }
+
if (scsi_device_is_pseudo_dev(sdev))
return sdev;
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index dfc3559e7e04..f652edd16497 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -516,6 +516,10 @@ static void scsi_device_dev_release(struct device *dev)
if (vpd_pgb7)
kfree_rcu(vpd_pgb7, rcu);
kfree(sdev->inquiry);
+ percpu_counter_destroy(&sdev->iotmo_cnt);
+ percpu_counter_destroy(&sdev->ioerr_cnt);
+ percpu_counter_destroy(&sdev->iodone_cnt);
+ percpu_counter_destroy(&sdev->iorequest_cnt);
kfree(sdev);
if (parent)
@@ -936,26 +940,27 @@ static ssize_t
show_iostat_counterbits(struct device *dev, struct device_attribute *attr,
char *buf)
{
- return snprintf(buf, 20, "%d\n", (int)sizeof(atomic_t) * 8);
+ /* iostat counters are per-CPU sums (s64). Report width for tools. */
+ return sysfs_emit(buf, "%zu\n", sizeof(s64) * 8);
}
static DEVICE_ATTR(iocounterbits, S_IRUGO, show_iostat_counterbits, NULL);
-#define show_sdev_iostat(field) \
+#define show_sdev_iostat_percpu(field) \
static ssize_t \
show_iostat_##field(struct device *dev, struct device_attribute *attr, \
char *buf) \
{ \
struct scsi_device *sdev = to_scsi_device(dev); \
- unsigned long long count = atomic_read(&sdev->field); \
- return snprintf(buf, 20, "0x%llx\n", count); \
+ unsigned long long count = percpu_counter_sum(&sdev->field); \
+ return sysfs_emit(buf, "0x%llx\n", count); \
} \
-static DEVICE_ATTR(field, S_IRUGO, show_iostat_##field, NULL)
+static DEVICE_ATTR(field, 0444, show_iostat_##field, NULL)
-show_sdev_iostat(iorequest_cnt);
-show_sdev_iostat(iodone_cnt);
-show_sdev_iostat(ioerr_cnt);
-show_sdev_iostat(iotmo_cnt);
+show_sdev_iostat_percpu(iorequest_cnt);
+show_sdev_iostat_percpu(iodone_cnt);
+show_sdev_iostat_percpu(ioerr_cnt);
+show_sdev_iostat_percpu(iotmo_cnt);
static ssize_t
sdev_show_modalias(struct device *dev, struct device_attribute *attr, char *buf)
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index adc3fa55ca2c..b7ce01de17b3 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -4043,7 +4043,7 @@ static int sd_probe(struct scsi_device *sdp)
sdkp->index = index;
sdkp->max_retries = SD_MAX_RETRIES;
atomic_set(&sdkp->openers, 0);
- atomic_set(&sdkp->device->ioerr_cnt, 0);
+ percpu_counter_set(&sdkp->device->ioerr_cnt, 0);
if (!sdp->request_queue->rq_timeout) {
if (sdp->type != TYPE_MOD)
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 029f5115b2ea..4be36bf2a475 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -9,6 +9,7 @@
#include <scsi/scsi.h>
#include <scsi/scsi_common.h>
#include <linux/atomic.h>
+#include <linux/percpu_counter.h>
#include <linux/sbitmap.h>
struct bsg_device;
@@ -272,10 +273,10 @@ struct scsi_device {
unsigned int max_device_blocked; /* what device_blocked counts down from */
#define SCSI_DEFAULT_DEVICE_BLOCKED 3
- atomic_t iorequest_cnt;
- atomic_t iodone_cnt;
- atomic_t ioerr_cnt;
- atomic_t iotmo_cnt;
+ struct percpu_counter iorequest_cnt;
+ struct percpu_counter iodone_cnt;
+ struct percpu_counter ioerr_cnt;
+ struct percpu_counter iotmo_cnt;
struct device sdev_gendev,
sdev_dev;
--
2.43.7
^ permalink raw reply related
* [PATCH v3 3/4] block: drop shared-tag fairness throttling
From: Sumit Saxena @ 2026-06-09 12:18 UTC (permalink / raw)
To: Martin K . Petersen, Jens Axboe
Cc: James E . J . Bottomley, linux-scsi, linux-block, Adam Radford,
Khalid Aziz, Adaptec OEM Raid Solutions, Matthew Wilcox,
Hannes Reinecke, Juergen E . Fischer, Russell King,
linux-arm-kernel, Finn Thain, Michael Schmitz, Anil Gurumurthy,
Sudarsana Kalluru, Oliver Neukum, Ali Akcaagac, Jamie Lenehan,
Ram Vegesna, target-devel, Bradley Grove, Satish Kharat,
Sesidhar Baddela, Karan Tilak Kumar, Yihang Li, Don Brace,
storagedev, HighPoint Linux Team, Tyrel Datwyler,
Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, linuxppc-dev, Brian King, Lee Duncan,
Chris Leech, Mike Christie, open-iscsi, Justin Tee, Paul Ely,
Kashyap Desai, Shivasharan S, Chandrakanth Patil,
megaraidlinux.pdl, Sathya Prakash Veerichetty, Sreekanth Reddy,
mpi3mr-linuxdrv.pdl, Suganath Prabu Subramani, Ranjan Kumar,
MPT-FusionLinux.pdl, Daniel Palmer, GOTO Masanori, YOKOTA Hiroshi,
Jack Wang, Geoff Levand, Michael Reed, Nilesh Javali,
GR-QLogic-Storage-Upstream, Narsimhulu Musini, K . Y . Srinivasan,
Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li, linux-hyperv,
Michael S . Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
Eugenio Perez, virtualization, Vishal Bhakta,
bcm-kernel-feedback-list, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, xen-devel, Bart Van Assche, Sumit Saxena
In-Reply-To: <20260609121806.2121755-1-sumit.saxena@broadcom.com>
From: Bart Van Assche <bvanassche@acm.org>
Original patch [1] by Bart Van Assche; this version is rebased onto the
current tree. In testing it improves IOPS by roughly 16-18% by removing
the fair-sharing throttle on shared tag queues.
This patch removes the following code and structure members:
- The function hctx_may_queue().
- blk_mq_hw_ctx.nr_active and request_queue.nr_active_requests_shared_tags
and also all the code that modifies these two member variables.
[1]: https://lore.kernel.org/linux-block/20240529213921.3166462-1-bvanassche@acm.org/
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
---
block/blk-core.c | 2 -
block/blk-mq-debugfs.c | 22 ++++++++-
block/blk-mq-tag.c | 4 --
block/blk-mq.c | 17 +------
block/blk-mq.h | 100 -----------------------------------------
include/linux/blk-mq.h | 6 ---
include/linux/blkdev.h | 2 -
7 files changed, 22 insertions(+), 131 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 17450058ea6d..129acc1b27e5 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -421,8 +421,6 @@ struct request_queue *blk_alloc_queue(struct queue_limits *lim, int node_id)
q->node = node_id;
- atomic_set(&q->nr_active_requests_shared_tags, 0);
-
timer_setup(&q->timeout, blk_rq_timed_out_timer, 0);
INIT_WORK(&q->timeout_work, blk_timeout_work);
INIT_LIST_HEAD(&q->icq_list);
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 047ec887456b..8b85a7f8e987 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -468,11 +468,31 @@ static int hctx_sched_tags_bitmap_show(void *data, struct seq_file *m)
return 0;
}
+struct count_active_params {
+ struct blk_mq_hw_ctx *hctx;
+ int *active;
+};
+
+static bool hctx_count_active(struct request *rq, void *data)
+{
+ const struct count_active_params *params = data;
+
+ if (rq->mq_hctx == params->hctx)
+ (*params->active)++;
+
+ return true;
+}
+
static int hctx_active_show(void *data, struct seq_file *m)
{
struct blk_mq_hw_ctx *hctx = data;
+ int active = 0;
+ struct count_active_params params = { .hctx = hctx, .active = &active };
+
+ blk_mq_all_tag_iter(hctx->sched_tags ?: hctx->tags, hctx_count_active,
+ ¶ms);
- seq_printf(m, "%d\n", __blk_mq_active_requests(hctx));
+ seq_printf(m, "%d\n", active);
return 0;
}
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 33946cdb5716..bfd27cc6249b 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -109,10 +109,6 @@ void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
static int __blk_mq_get_tag(struct blk_mq_alloc_data *data,
struct sbitmap_queue *bt)
{
- if (!data->q->elevator && !(data->flags & BLK_MQ_REQ_RESERVED) &&
- !hctx_may_queue(data->hctx, bt))
- return BLK_MQ_NO_TAG;
-
if (data->shallow_depth)
return sbitmap_queue_get_shallow(bt, data->shallow_depth);
else
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4c5c16cce4f8..bbac59a06044 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -489,8 +489,6 @@ __blk_mq_alloc_requests_batch(struct blk_mq_alloc_data *data)
}
} while (data->nr_tags > nr);
- if (!(data->rq_flags & RQF_SCHED_TAGS))
- blk_mq_add_active_requests(data->hctx, nr);
/* caller already holds a reference, add for remainder */
percpu_ref_get_many(&data->q->q_usage_counter, nr - 1);
data->nr_tags -= nr;
@@ -587,8 +585,6 @@ static struct request *__blk_mq_alloc_requests(struct blk_mq_alloc_data *data)
goto retry;
}
- if (!(data->rq_flags & RQF_SCHED_TAGS))
- blk_mq_inc_active_requests(data->hctx);
rq = blk_mq_rq_ctx_init(data, blk_mq_tags_from_data(data), tag);
blk_mq_rq_time_init(rq, alloc_time_ns);
return rq;
@@ -763,8 +759,6 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
tag = blk_mq_get_tag(&data);
if (tag == BLK_MQ_NO_TAG)
goto out_queue_exit;
- if (!(data.rq_flags & RQF_SCHED_TAGS))
- blk_mq_inc_active_requests(data.hctx);
rq = blk_mq_rq_ctx_init(&data, blk_mq_tags_from_data(&data), tag);
blk_mq_rq_time_init(rq, alloc_time_ns);
rq->__data_len = 0;
@@ -807,10 +801,8 @@ static void __blk_mq_free_request(struct request *rq)
blk_pm_mark_last_busy(rq);
rq->mq_hctx = NULL;
- if (rq->tag != BLK_MQ_NO_TAG) {
- blk_mq_dec_active_requests(hctx);
+ if (rq->tag != BLK_MQ_NO_TAG)
blk_mq_put_tag(hctx->tags, ctx, rq->tag);
- }
if (sched_tag != BLK_MQ_NO_TAG)
blk_mq_put_tag(hctx->sched_tags, ctx, sched_tag);
blk_mq_sched_restart(hctx);
@@ -1188,8 +1180,6 @@ static inline void blk_mq_flush_tag_batch(struct blk_mq_hw_ctx *hctx,
{
struct request_queue *q = hctx->queue;
- blk_mq_sub_active_requests(hctx, nr_tags);
-
blk_mq_put_tags(hctx->tags, tag_array, nr_tags);
percpu_ref_put_many(&q->q_usage_counter, nr_tags);
}
@@ -1875,9 +1865,6 @@ bool __blk_mq_alloc_driver_tag(struct request *rq)
if (blk_mq_tag_is_reserved(rq->mq_hctx->sched_tags, rq->internal_tag)) {
bt = &rq->mq_hctx->tags->breserved_tags;
tag_offset = 0;
- } else {
- if (!hctx_may_queue(rq->mq_hctx, bt))
- return false;
}
tag = __sbitmap_queue_get(bt);
@@ -1885,7 +1872,6 @@ bool __blk_mq_alloc_driver_tag(struct request *rq)
return false;
rq->tag = tag + tag_offset;
- blk_mq_inc_active_requests(rq->mq_hctx);
return true;
}
@@ -4058,7 +4044,6 @@ blk_mq_alloc_hctx(struct request_queue *q, struct blk_mq_tag_set *set,
if (!zalloc_cpumask_var_node(&hctx->cpumask, gfp, node))
goto free_hctx;
- atomic_set(&hctx->nr_active, 0);
if (node == NUMA_NO_NODE)
node = set->numa_node;
hctx->numa_node = node;
diff --git a/block/blk-mq.h b/block/blk-mq.h
index aa15d31aaae9..8dfb67c55f5d 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -291,70 +291,9 @@ static inline int blk_mq_get_rq_budget_token(struct request *rq)
return -1;
}
-static inline void __blk_mq_add_active_requests(struct blk_mq_hw_ctx *hctx,
- int val)
-{
- if (blk_mq_is_shared_tags(hctx->flags))
- atomic_add(val, &hctx->queue->nr_active_requests_shared_tags);
- else
- atomic_add(val, &hctx->nr_active);
-}
-
-static inline void __blk_mq_inc_active_requests(struct blk_mq_hw_ctx *hctx)
-{
- __blk_mq_add_active_requests(hctx, 1);
-}
-
-static inline void __blk_mq_sub_active_requests(struct blk_mq_hw_ctx *hctx,
- int val)
-{
- if (blk_mq_is_shared_tags(hctx->flags))
- atomic_sub(val, &hctx->queue->nr_active_requests_shared_tags);
- else
- atomic_sub(val, &hctx->nr_active);
-}
-
-static inline void __blk_mq_dec_active_requests(struct blk_mq_hw_ctx *hctx)
-{
- __blk_mq_sub_active_requests(hctx, 1);
-}
-
-static inline void blk_mq_add_active_requests(struct blk_mq_hw_ctx *hctx,
- int val)
-{
- if (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
- __blk_mq_add_active_requests(hctx, val);
-}
-
-static inline void blk_mq_inc_active_requests(struct blk_mq_hw_ctx *hctx)
-{
- if (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
- __blk_mq_inc_active_requests(hctx);
-}
-
-static inline void blk_mq_sub_active_requests(struct blk_mq_hw_ctx *hctx,
- int val)
-{
- if (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
- __blk_mq_sub_active_requests(hctx, val);
-}
-
-static inline void blk_mq_dec_active_requests(struct blk_mq_hw_ctx *hctx)
-{
- if (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
- __blk_mq_dec_active_requests(hctx);
-}
-
-static inline int __blk_mq_active_requests(struct blk_mq_hw_ctx *hctx)
-{
- if (blk_mq_is_shared_tags(hctx->flags))
- return atomic_read(&hctx->queue->nr_active_requests_shared_tags);
- return atomic_read(&hctx->nr_active);
-}
static inline void __blk_mq_put_driver_tag(struct blk_mq_hw_ctx *hctx,
struct request *rq)
{
- blk_mq_dec_active_requests(hctx);
blk_mq_put_tag(hctx->tags, rq->mq_ctx, rq->tag);
rq->tag = BLK_MQ_NO_TAG;
}
@@ -396,45 +335,6 @@ static inline void blk_mq_free_requests(struct list_head *list)
}
}
-/*
- * For shared tag users, we track the number of currently active users
- * and attempt to provide a fair share of the tag depth for each of them.
- */
-static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
- struct sbitmap_queue *bt)
-{
- unsigned int depth, users;
-
- if (!hctx || !(hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED))
- return true;
-
- /*
- * Don't try dividing an ant
- */
- if (bt->sb.depth == 1)
- return true;
-
- if (blk_mq_is_shared_tags(hctx->flags)) {
- struct request_queue *q = hctx->queue;
-
- if (!test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
- return true;
- } else {
- if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state))
- return true;
- }
-
- users = READ_ONCE(hctx->tags->active_queues);
- if (!users)
- return true;
-
- /*
- * Allow at least some tags
- */
- depth = max((bt->sb.depth + users - 1) / users, 4U);
- return __blk_mq_active_requests(hctx) < depth;
-}
-
/* run the code block in @dispatch_ops with rcu/srcu read lock held */
#define __blk_mq_run_dispatch_ops(q, check_sleep, dispatch_ops) \
do { \
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 18a2388ba581..ccbb07559402 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -432,12 +432,6 @@ struct blk_mq_hw_ctx {
/** @queue_num: Index of this hardware queue. */
unsigned int queue_num;
- /**
- * @nr_active: Number of active requests. Only used when a tag set is
- * shared across request queues.
- */
- atomic_t nr_active;
-
/** @cpuhp_online: List to store request if CPU is going to die */
struct hlist_node cpuhp_online;
/** @cpuhp_dead: List to store request if some CPU die. */
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 890128cdea1c..95525b1d7b74 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -567,8 +567,6 @@ struct request_queue {
struct timer_list timeout;
struct work_struct timeout_work;
- atomic_t nr_active_requests_shared_tags;
-
struct blk_mq_tags *sched_shared_tags;
struct list_head icq_list;
--
2.43.7
^ permalink raw reply related
* [PATCH v3 2/4] scsi: host: allocate struct Scsi_Host on the NUMA node of the host adapter
From: Sumit Saxena @ 2026-06-09 12:18 UTC (permalink / raw)
To: Martin K . Petersen, Jens Axboe
Cc: James E . J . Bottomley, linux-scsi, linux-block, Adam Radford,
Khalid Aziz, Adaptec OEM Raid Solutions, Matthew Wilcox,
Hannes Reinecke, Juergen E . Fischer, Russell King,
linux-arm-kernel, Finn Thain, Michael Schmitz, Anil Gurumurthy,
Sudarsana Kalluru, Oliver Neukum, Ali Akcaagac, Jamie Lenehan,
Ram Vegesna, target-devel, Bradley Grove, Satish Kharat,
Sesidhar Baddela, Karan Tilak Kumar, Yihang Li, Don Brace,
storagedev, HighPoint Linux Team, Tyrel Datwyler,
Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, linuxppc-dev, Brian King, Lee Duncan,
Chris Leech, Mike Christie, open-iscsi, Justin Tee, Paul Ely,
Kashyap Desai, Shivasharan S, Chandrakanth Patil,
megaraidlinux.pdl, Sathya Prakash Veerichetty, Sreekanth Reddy,
mpi3mr-linuxdrv.pdl, Suganath Prabu Subramani, Ranjan Kumar,
MPT-FusionLinux.pdl, Daniel Palmer, GOTO Masanori, YOKOTA Hiroshi,
Jack Wang, Geoff Levand, Michael Reed, Nilesh Javali,
GR-QLogic-Storage-Upstream, Narsimhulu Musini, K . Y . Srinivasan,
Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li, linux-hyperv,
Michael S . Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
Eugenio Perez, virtualization, Vishal Bhakta,
bcm-kernel-feedback-list, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, xen-devel, Sumit Saxena, John Garry
In-Reply-To: <20260609121806.2121755-1-sumit.saxena@broadcom.com>
scsi_host_alloc() used kzalloc(), which always picks an arbitrary node.
Extend the function to accept a 'struct device *dev' parameter and use
kzalloc_node() with dev_to_node(dev) so the Scsi_Host struct lands on
the same NUMA node as the HBA, mirroring the treatment already applied
to struct scsi_device, struct scsi_target, and shost_data.
When dev is NULL (legacy ISA/platform drivers without a dma_dev) the
allocation falls back to NUMA_NO_NODE, preserving existing behaviour.
Update all in-tree callers:
- PCI-based HBA drivers pass &pdev->dev (or the equivalent struct
member such as &phba->pcidev->dev, &h->pdev->dev, &ha->pdev->dev)
so their host struct is placed on the adapter's node.
- Non-PCI drivers (ISA, Amiga, ARM PCMCIA, virtio, Hyper-V, PS3, …)
pass NULL.
- libfc's libfc_host_alloc() inline helper passes NULL; FC drivers
that want NUMA awareness can open-code the call with their pdev.
Suggested-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
---
drivers/scsi/3w-9xxx.c | 2 +-
drivers/scsi/3w-sas.c | 2 +-
drivers/scsi/3w-xxxx.c | 2 +-
drivers/scsi/53c700.c | 2 +-
drivers/scsi/BusLogic.c | 2 +-
drivers/scsi/a100u2w.c | 2 +-
drivers/scsi/a2091.c | 2 +-
drivers/scsi/a3000.c | 2 +-
drivers/scsi/aacraid/linit.c | 2 +-
drivers/scsi/advansys.c | 6 +++---
drivers/scsi/aha152x.c | 2 +-
drivers/scsi/aha1542.c | 2 +-
drivers/scsi/aha1740.c | 2 +-
drivers/scsi/aic7xxx/aic79xx_osm.c | 2 +-
drivers/scsi/aic7xxx/aic7xxx_osm.c | 2 +-
drivers/scsi/aic94xx/aic94xx_init.c | 2 +-
drivers/scsi/am53c974.c | 2 +-
drivers/scsi/arcmsr/arcmsr_hba.c | 3 ++-
drivers/scsi/arm/acornscsi.c | 2 +-
drivers/scsi/arm/arxescsi.c | 2 +-
drivers/scsi/arm/cumana_1.c | 2 +-
drivers/scsi/arm/cumana_2.c | 2 +-
drivers/scsi/arm/eesox.c | 2 +-
drivers/scsi/arm/oak.c | 2 +-
drivers/scsi/arm/powertec.c | 2 +-
drivers/scsi/atari_scsi.c | 2 +-
drivers/scsi/atp870u.c | 2 +-
drivers/scsi/bfa/bfad_im.c | 2 +-
drivers/scsi/csiostor/csio_init.c | 4 ++--
drivers/scsi/dc395x.c | 2 +-
drivers/scsi/dmx3191d.c | 2 +-
drivers/scsi/elx/efct/efct_xport.c | 4 ++--
drivers/scsi/esas2r/esas2r_main.c | 2 +-
drivers/scsi/fdomain.c | 2 +-
drivers/scsi/fnic/fnic_main.c | 2 +-
drivers/scsi/g_NCR5380.c | 2 +-
drivers/scsi/gvp11.c | 2 +-
drivers/scsi/hisi_sas/hisi_sas_main.c | 2 +-
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 2 +-
drivers/scsi/hosts.c | 6 ++++--
drivers/scsi/hpsa.c | 2 +-
drivers/scsi/hptiop.c | 2 +-
drivers/scsi/ibmvscsi/ibmvfc.c | 2 +-
drivers/scsi/ibmvscsi/ibmvscsi.c | 2 +-
drivers/scsi/imm.c | 2 +-
drivers/scsi/initio.c | 2 +-
drivers/scsi/ipr.c | 2 +-
drivers/scsi/ips.c | 2 +-
drivers/scsi/isci/init.c | 2 +-
drivers/scsi/jazz_esp.c | 2 +-
drivers/scsi/libiscsi.c | 2 +-
drivers/scsi/lpfc/lpfc_init.c | 2 +-
drivers/scsi/mac53c94.c | 2 +-
drivers/scsi/mac_esp.c | 2 +-
drivers/scsi/mac_scsi.c | 2 +-
drivers/scsi/megaraid.c | 2 +-
drivers/scsi/megaraid/megaraid_mbox.c | 2 +-
drivers/scsi/megaraid/megaraid_sas_base.c | 2 +-
drivers/scsi/mesh.c | 2 +-
drivers/scsi/mpi3mr/mpi3mr_os.c | 2 +-
drivers/scsi/mpt3sas/mpt3sas_scsih.c | 4 ++--
drivers/scsi/mvme147.c | 2 +-
drivers/scsi/mvsas/mv_init.c | 2 +-
drivers/scsi/mvumi.c | 2 +-
drivers/scsi/myrb.c | 2 +-
drivers/scsi/myrs.c | 2 +-
drivers/scsi/ncr53c8xx.c | 2 +-
drivers/scsi/nsp32.c | 2 +-
drivers/scsi/pcmcia/nsp_cs.c | 2 +-
drivers/scsi/pcmcia/qlogic_stub.c | 2 +-
drivers/scsi/pcmcia/sym53c500_cs.c | 2 +-
drivers/scsi/pm8001/pm8001_init.c | 2 +-
drivers/scsi/pmcraid.c | 2 +-
drivers/scsi/ppa.c | 2 +-
drivers/scsi/ps3rom.c | 2 +-
drivers/scsi/qla1280.c | 2 +-
drivers/scsi/qla2xxx/qla_mid.c | 2 +-
drivers/scsi/qla2xxx/qla_os.c | 2 +-
drivers/scsi/qlogicfas.c | 2 +-
drivers/scsi/qlogicpti.c | 2 +-
drivers/scsi/scsi_debug.c | 2 +-
drivers/scsi/sgiwd93.c | 2 +-
drivers/scsi/smartpqi/smartpqi_init.c | 2 +-
drivers/scsi/snic/snic_main.c | 2 +-
drivers/scsi/stex.c | 2 +-
drivers/scsi/storvsc_drv.c | 2 +-
drivers/scsi/sun3_scsi.c | 2 +-
drivers/scsi/sun3x_esp.c | 2 +-
drivers/scsi/sun_esp.c | 2 +-
drivers/scsi/sym53c8xx_2/sym_glue.c | 2 +-
drivers/scsi/virtio_scsi.c | 2 +-
drivers/scsi/vmw_pvscsi.c | 2 +-
drivers/scsi/wd719x.c | 2 +-
drivers/scsi/xen-scsifront.c | 2 +-
drivers/scsi/zorro_esp.c | 2 +-
include/scsi/libfc.h | 2 +-
include/scsi/scsi_host.h | 3 ++-
97 files changed, 107 insertions(+), 103 deletions(-)
diff --git a/drivers/scsi/3w-9xxx.c b/drivers/scsi/3w-9xxx.c
index 9b93a2440af8..444578ee8070 100644
--- a/drivers/scsi/3w-9xxx.c
+++ b/drivers/scsi/3w-9xxx.c
@@ -2021,7 +2021,7 @@ static int twa_probe(struct pci_dev *pdev, const struct pci_device_id *dev_id)
goto out_disable_device;
}
- host = scsi_host_alloc(&driver_template, sizeof(TW_Device_Extension));
+ host = scsi_host_alloc(&driver_template, sizeof(TW_Device_Extension), &pdev->dev);
if (!host) {
TW_PRINTK(host, TW_DRIVER, 0x24, "Failed to allocate memory for device extension");
retval = -ENOMEM;
diff --git a/drivers/scsi/3w-sas.c b/drivers/scsi/3w-sas.c
index 52dc1aa639f7..d063d39faf4f 100644
--- a/drivers/scsi/3w-sas.c
+++ b/drivers/scsi/3w-sas.c
@@ -1576,7 +1576,7 @@ static int twl_probe(struct pci_dev *pdev, const struct pci_device_id *dev_id)
goto out_disable_device;
}
- host = scsi_host_alloc(&driver_template, sizeof(TW_Device_Extension));
+ host = scsi_host_alloc(&driver_template, sizeof(TW_Device_Extension), &pdev->dev);
if (!host) {
TW_PRINTK(host, TW_DRIVER, 0x19, "Failed to allocate memory for device extension");
retval = -ENOMEM;
diff --git a/drivers/scsi/3w-xxxx.c b/drivers/scsi/3w-xxxx.c
index c68678fa72c1..0ccb5f1f8805 100644
--- a/drivers/scsi/3w-xxxx.c
+++ b/drivers/scsi/3w-xxxx.c
@@ -2268,7 +2268,7 @@ static int tw_probe(struct pci_dev *pdev, const struct pci_device_id *dev_id)
goto out_disable_device;
}
- host = scsi_host_alloc(&driver_template, sizeof(TW_Device_Extension));
+ host = scsi_host_alloc(&driver_template, sizeof(TW_Device_Extension), &pdev->dev);
if (!host) {
printk(KERN_WARNING "3w-xxxx: Failed to allocate memory for device extension.");
retval = -ENOMEM;
diff --git a/drivers/scsi/53c700.c b/drivers/scsi/53c700.c
index c78f74b8f45c..e30d55ab5dea 100644
--- a/drivers/scsi/53c700.c
+++ b/drivers/scsi/53c700.c
@@ -341,7 +341,7 @@ NCR_700_detect(struct scsi_host_template *tpnt,
if(tpnt->proc_name == NULL)
tpnt->proc_name = "53c700";
- host = scsi_host_alloc(tpnt, 4);
+ host = scsi_host_alloc(tpnt, 4, NULL);
if (!host)
return NULL;
memset(hostdata->slots, 0, sizeof(struct NCR_700_command_slot)
diff --git a/drivers/scsi/BusLogic.c b/drivers/scsi/BusLogic.c
index 5304d2febd63..f865fdec4136 100644
--- a/drivers/scsi/BusLogic.c
+++ b/drivers/scsi/BusLogic.c
@@ -2302,7 +2302,7 @@ static int __init blogic_init(void)
*/
host = scsi_host_alloc(&blogic_template,
- sizeof(struct blogic_adapter));
+ sizeof(struct blogic_adapter), NULL);
if (host == NULL) {
release_region(myadapter->io_addr,
myadapter->addr_count);
diff --git a/drivers/scsi/a100u2w.c b/drivers/scsi/a100u2w.c
index 4365b896f5c4..9124c6103902 100644
--- a/drivers/scsi/a100u2w.c
+++ b/drivers/scsi/a100u2w.c
@@ -1106,7 +1106,7 @@ static int inia100_probe_one(struct pci_dev *pdev,
bios = inw(port + 0x50);
- shost = scsi_host_alloc(&inia100_template, sizeof(struct orc_host));
+ shost = scsi_host_alloc(&inia100_template, sizeof(struct orc_host), &pdev->dev);
if (!shost)
goto out_release_region;
diff --git a/drivers/scsi/a2091.c b/drivers/scsi/a2091.c
index 204448bfd04b..51effb2edefb 100644
--- a/drivers/scsi/a2091.c
+++ b/drivers/scsi/a2091.c
@@ -214,7 +214,7 @@ static int a2091_probe(struct zorro_dev *z, const struct zorro_device_id *ent)
return -EBUSY;
instance = scsi_host_alloc(&a2091_scsi_template,
- sizeof(struct a2091_hostdata));
+ sizeof(struct a2091_hostdata), NULL);
if (!instance) {
error = -ENOMEM;
goto fail_alloc;
diff --git a/drivers/scsi/a3000.c b/drivers/scsi/a3000.c
index bf054dd7682b..5b3d25b8ad37 100644
--- a/drivers/scsi/a3000.c
+++ b/drivers/scsi/a3000.c
@@ -235,7 +235,7 @@ static int __init amiga_a3000_scsi_probe(struct platform_device *pdev)
return -EBUSY;
instance = scsi_host_alloc(&amiga_a3000_scsi_template,
- sizeof(struct a3000_hostdata));
+ sizeof(struct a3000_hostdata), NULL);
if (!instance) {
error = -ENOMEM;
goto fail_alloc;
diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c
index 2fa8f7ddb703..d003667007f7 100644
--- a/drivers/scsi/aacraid/linit.c
+++ b/drivers/scsi/aacraid/linit.c
@@ -1636,7 +1636,7 @@ static int aac_probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
pci_set_master(pdev);
- shost = scsi_host_alloc(&aac_driver_template, sizeof(struct aac_dev));
+ shost = scsi_host_alloc(&aac_driver_template, sizeof(struct aac_dev), &pdev->dev);
if (!shost) {
error = -ENOMEM;
goto out_disable_pdev;
diff --git a/drivers/scsi/advansys.c b/drivers/scsi/advansys.c
index 5cdbf2bdb13d..e7ef433778a1 100644
--- a/drivers/scsi/advansys.c
+++ b/drivers/scsi/advansys.c
@@ -11237,7 +11237,7 @@ static int advansys_vlb_probe(struct device *dev, unsigned int id)
goto release_region;
err = -ENOMEM;
- shost = scsi_host_alloc(&advansys_template, sizeof(*board));
+ shost = scsi_host_alloc(&advansys_template, sizeof(*board), NULL);
if (!shost)
goto release_region;
@@ -11345,7 +11345,7 @@ static int advansys_eisa_probe(struct device *dev)
irq = advansys_eisa_irq_no(edev);
err = -ENOMEM;
- shost = scsi_host_alloc(&advansys_template, sizeof(*board));
+ shost = scsi_host_alloc(&advansys_template, sizeof(*board), NULL);
if (!shost)
goto release_region;
@@ -11462,7 +11462,7 @@ static int advansys_pci_probe(struct pci_dev *pdev,
ioport = pci_resource_start(pdev, 0);
err = -ENOMEM;
- shost = scsi_host_alloc(&advansys_template, sizeof(*board));
+ shost = scsi_host_alloc(&advansys_template, sizeof(*board), &pdev->dev);
if (!shost)
goto release_region;
diff --git a/drivers/scsi/aha152x.c b/drivers/scsi/aha152x.c
index e3ccb6bb62c0..d82ce80de098 100644
--- a/drivers/scsi/aha152x.c
+++ b/drivers/scsi/aha152x.c
@@ -734,7 +734,7 @@ struct Scsi_Host *aha152x_probe_one(struct aha152x_setup *setup)
{
struct Scsi_Host *shpnt;
- shpnt = scsi_host_alloc(&aha152x_driver_template, sizeof(struct aha152x_hostdata));
+ shpnt = scsi_host_alloc(&aha152x_driver_template, sizeof(struct aha152x_hostdata), NULL);
if (!shpnt) {
printk(KERN_ERR "aha152x: scsi_host_alloc failed\n");
return NULL;
diff --git a/drivers/scsi/aha1542.c b/drivers/scsi/aha1542.c
index fd766282d4a4..1a109c850785 100644
--- a/drivers/scsi/aha1542.c
+++ b/drivers/scsi/aha1542.c
@@ -752,7 +752,7 @@ static struct Scsi_Host *aha1542_hw_init(const struct scsi_host_template *tpnt,
if (!request_region(base_io, AHA1542_REGION_SIZE, "aha1542"))
return NULL;
- sh = scsi_host_alloc(tpnt, sizeof(struct aha1542_hostdata));
+ sh = scsi_host_alloc(tpnt, sizeof(struct aha1542_hostdata), NULL);
if (!sh)
goto release;
aha1542 = shost_priv(sh);
diff --git a/drivers/scsi/aha1740.c b/drivers/scsi/aha1740.c
index c435769359f2..31a52edf0748 100644
--- a/drivers/scsi/aha1740.c
+++ b/drivers/scsi/aha1740.c
@@ -583,7 +583,7 @@ static int aha1740_probe (struct device *dev)
printk(KERN_INFO "aha174x: Extended translation %sabled.\n",
translation ? "en" : "dis");
shpnt = scsi_host_alloc(&aha1740_template,
- sizeof(struct aha1740_hostdata));
+ sizeof(struct aha1740_hostdata), NULL);
if(shpnt == NULL)
goto err_release_region;
diff --git a/drivers/scsi/aic7xxx/aic79xx_osm.c b/drivers/scsi/aic7xxx/aic79xx_osm.c
index feb1707feb7e..76e30b0784b9 100644
--- a/drivers/scsi/aic7xxx/aic79xx_osm.c
+++ b/drivers/scsi/aic7xxx/aic79xx_osm.c
@@ -1214,7 +1214,7 @@ ahd_linux_register_host(struct ahd_softc *ahd, struct scsi_host_template *templa
int retval;
template->name = ahd->description;
- host = scsi_host_alloc(template, sizeof(struct ahd_softc *));
+ host = scsi_host_alloc(template, sizeof(struct ahd_softc *), NULL);
if (host == NULL)
return (ENOMEM);
diff --git a/drivers/scsi/aic7xxx/aic7xxx_osm.c b/drivers/scsi/aic7xxx/aic7xxx_osm.c
index d93b522695eb..0169509abd76 100644
--- a/drivers/scsi/aic7xxx/aic7xxx_osm.c
+++ b/drivers/scsi/aic7xxx/aic7xxx_osm.c
@@ -1083,7 +1083,7 @@ ahc_linux_register_host(struct ahc_softc *ahc, struct scsi_host_template *templa
int retval;
template->name = ahc->description;
- host = scsi_host_alloc(template, sizeof(struct ahc_softc *));
+ host = scsi_host_alloc(template, sizeof(struct ahc_softc *), NULL);
if (host == NULL)
return -ENOMEM;
diff --git a/drivers/scsi/aic94xx/aic94xx_init.c b/drivers/scsi/aic94xx/aic94xx_init.c
index 4400a3661d90..1336e5e38f8d 100644
--- a/drivers/scsi/aic94xx/aic94xx_init.c
+++ b/drivers/scsi/aic94xx/aic94xx_init.c
@@ -704,7 +704,7 @@ static int asd_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
err = -ENOMEM;
- shost = scsi_host_alloc(&aic94xx_sht, sizeof(void *));
+ shost = scsi_host_alloc(&aic94xx_sht, sizeof(void *), &dev->dev);
if (!shost)
goto Err;
diff --git a/drivers/scsi/am53c974.c b/drivers/scsi/am53c974.c
index f972a3c90a2f..4ca73e801232 100644
--- a/drivers/scsi/am53c974.c
+++ b/drivers/scsi/am53c974.c
@@ -388,7 +388,7 @@ static int pci_esp_probe_one(struct pci_dev *pdev,
goto fail_disable_device;
}
- shost = scsi_host_alloc(hostt, sizeof(struct esp));
+ shost = scsi_host_alloc(hostt, sizeof(struct esp), &pdev->dev);
if (!shost) {
dev_printk(KERN_INFO, &pdev->dev,
"failed to allocate scsi host\n");
diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
index 8aa948f06cac..f0cc59e756dc 100644
--- a/drivers/scsi/arcmsr/arcmsr_hba.c
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c
@@ -1087,7 +1087,8 @@ static int arcmsr_probe(struct pci_dev *pdev, const struct pci_device_id *id)
if(error){
return -ENODEV;
}
- host = scsi_host_alloc(&arcmsr_scsi_host_template, sizeof(struct AdapterControlBlock));
+ host = scsi_host_alloc(&arcmsr_scsi_host_template,
+ sizeof(struct AdapterControlBlock), &pdev->dev);
if(!host){
goto pci_disable_dev;
}
diff --git a/drivers/scsi/arm/acornscsi.c b/drivers/scsi/arm/acornscsi.c
index 79d7d7336b6a..97e3db7e6a7c 100644
--- a/drivers/scsi/arm/acornscsi.c
+++ b/drivers/scsi/arm/acornscsi.c
@@ -2806,7 +2806,7 @@ static int acornscsi_probe(struct expansion_card *ec, const struct ecard_id *id)
if (ret)
goto out;
- host = scsi_host_alloc(&acornscsi_template, sizeof(AS_Host));
+ host = scsi_host_alloc(&acornscsi_template, sizeof(AS_Host), NULL);
if (!host) {
ret = -ENOMEM;
goto out_release;
diff --git a/drivers/scsi/arm/arxescsi.c b/drivers/scsi/arm/arxescsi.c
index 925d0bd68aa5..32f0a3aefb44 100644
--- a/drivers/scsi/arm/arxescsi.c
+++ b/drivers/scsi/arm/arxescsi.c
@@ -272,7 +272,7 @@ static int arxescsi_probe(struct expansion_card *ec, const struct ecard_id *id)
goto out_region;
}
- host = scsi_host_alloc(&arxescsi_template, sizeof(struct arxescsi_info));
+ host = scsi_host_alloc(&arxescsi_template, sizeof(struct arxescsi_info), NULL);
if (!host) {
ret = -ENOMEM;
goto out_region;
diff --git a/drivers/scsi/arm/cumana_1.c b/drivers/scsi/arm/cumana_1.c
index d1a2a22ffe8c..d47ff9353c1b 100644
--- a/drivers/scsi/arm/cumana_1.c
+++ b/drivers/scsi/arm/cumana_1.c
@@ -238,7 +238,7 @@ static int cumanascsi1_probe(struct expansion_card *ec,
if (ret)
goto out;
- host = scsi_host_alloc(&cumanascsi_template, sizeof(struct NCR5380_hostdata));
+ host = scsi_host_alloc(&cumanascsi_template, sizeof(struct NCR5380_hostdata), NULL);
if (!host) {
ret = -ENOMEM;
goto out_release;
diff --git a/drivers/scsi/arm/cumana_2.c b/drivers/scsi/arm/cumana_2.c
index e460068f6834..e35afe3a1fe4 100644
--- a/drivers/scsi/arm/cumana_2.c
+++ b/drivers/scsi/arm/cumana_2.c
@@ -394,7 +394,7 @@ static int cumanascsi2_probe(struct expansion_card *ec,
}
host = scsi_host_alloc(&cumanascsi2_template,
- sizeof(struct cumanascsi2_info));
+ sizeof(struct cumanascsi2_info), NULL);
if (!host) {
ret = -ENOMEM;
goto out_region;
diff --git a/drivers/scsi/arm/eesox.c b/drivers/scsi/arm/eesox.c
index 99be9da8757f..de4d457f8ce7 100644
--- a/drivers/scsi/arm/eesox.c
+++ b/drivers/scsi/arm/eesox.c
@@ -510,7 +510,7 @@ static int eesoxscsi_probe(struct expansion_card *ec, const struct ecard_id *id)
}
host = scsi_host_alloc(&eesox_template,
- sizeof(struct eesoxscsi_info));
+ sizeof(struct eesoxscsi_info), NULL);
if (!host) {
ret = -ENOMEM;
goto out_region;
diff --git a/drivers/scsi/arm/oak.c b/drivers/scsi/arm/oak.c
index d69245007096..b2ff8616f963 100644
--- a/drivers/scsi/arm/oak.c
+++ b/drivers/scsi/arm/oak.c
@@ -126,7 +126,7 @@ static int oakscsi_probe(struct expansion_card *ec, const struct ecard_id *id)
if (ret)
goto out;
- host = scsi_host_alloc(&oakscsi_template, sizeof(struct NCR5380_hostdata));
+ host = scsi_host_alloc(&oakscsi_template, sizeof(struct NCR5380_hostdata), NULL);
if (!host) {
ret = -ENOMEM;
goto release;
diff --git a/drivers/scsi/arm/powertec.c b/drivers/scsi/arm/powertec.c
index 823c65ff6c12..045f35e50eff 100644
--- a/drivers/scsi/arm/powertec.c
+++ b/drivers/scsi/arm/powertec.c
@@ -318,7 +318,7 @@ static int powertecscsi_probe(struct expansion_card *ec,
}
host = scsi_host_alloc(&powertecscsi_template,
- sizeof (struct powertec_info));
+ sizeof(struct powertec_info), NULL);
if (!host) {
ret = -ENOMEM;
goto out_region;
diff --git a/drivers/scsi/atari_scsi.c b/drivers/scsi/atari_scsi.c
index 85055677666c..9a469cf3991f 100644
--- a/drivers/scsi/atari_scsi.c
+++ b/drivers/scsi/atari_scsi.c
@@ -785,7 +785,7 @@ static int __init atari_scsi_probe(struct platform_device *pdev)
}
instance = scsi_host_alloc(&atari_scsi_template,
- sizeof(struct NCR5380_hostdata));
+ sizeof(struct NCR5380_hostdata), NULL);
if (!instance) {
error = -ENOMEM;
goto fail_alloc;
diff --git a/drivers/scsi/atp870u.c b/drivers/scsi/atp870u.c
index 67459d81f479..57f0b4a11ba7 100644
--- a/drivers/scsi/atp870u.c
+++ b/drivers/scsi/atp870u.c
@@ -1579,7 +1579,7 @@ static int atp870u_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
pci_set_master(pdev);
err = -ENOMEM;
- shpnt = scsi_host_alloc(&atp870u_template, sizeof(struct atp_unit));
+ shpnt = scsi_host_alloc(&atp870u_template, sizeof(struct atp_unit), &pdev->dev);
if (!shpnt)
goto release_region;
diff --git a/drivers/scsi/bfa/bfad_im.c b/drivers/scsi/bfa/bfad_im.c
index 97990b285e17..bd14aee64886 100644
--- a/drivers/scsi/bfa/bfad_im.c
+++ b/drivers/scsi/bfa/bfad_im.c
@@ -740,7 +740,7 @@ bfad_scsi_host_alloc(struct bfad_im_port_s *im_port, struct bfad_s *bfad)
sht->sg_tablesize = bfad->cfg_data.io_max_sge;
- return scsi_host_alloc(sht, sizeof(struct bfad_im_port_pointer));
+ return scsi_host_alloc(sht, sizeof(struct bfad_im_port_pointer), NULL);
}
void
diff --git a/drivers/scsi/csiostor/csio_init.c b/drivers/scsi/csiostor/csio_init.c
index 238431524801..a4bf1ba03248 100644
--- a/drivers/scsi/csiostor/csio_init.c
+++ b/drivers/scsi/csiostor/csio_init.c
@@ -606,11 +606,11 @@ csio_shost_init(struct csio_hw *hw, struct device *dev,
if (dev == &hw->pdev->dev)
shost = scsi_host_alloc(
&csio_fcoe_shost_template,
- sizeof(struct csio_lnode));
+ sizeof(struct csio_lnode), &hw->pdev->dev);
else
shost = scsi_host_alloc(
&csio_fcoe_shost_vport_template,
- sizeof(struct csio_lnode));
+ sizeof(struct csio_lnode), &hw->pdev->dev);
if (!shost)
goto err;
diff --git a/drivers/scsi/dc395x.c b/drivers/scsi/dc395x.c
index 6183ce05d8cf..16adeac93aac 100644
--- a/drivers/scsi/dc395x.c
+++ b/drivers/scsi/dc395x.c
@@ -3984,7 +3984,7 @@ static int dc395x_init_one(struct pci_dev *dev, const struct pci_device_id *id)
/* allocate scsi host information (includes out adapter) */
scsi_host = scsi_host_alloc(&dc395x_driver_template,
- sizeof(struct AdapterCtlBlk));
+ sizeof(struct AdapterCtlBlk), &dev->dev);
if (!scsi_host)
goto fail;
diff --git a/drivers/scsi/dmx3191d.c b/drivers/scsi/dmx3191d.c
index d6d091b2f3c7..8ba17e3eefe3 100644
--- a/drivers/scsi/dmx3191d.c
+++ b/drivers/scsi/dmx3191d.c
@@ -74,7 +74,7 @@ static int dmx3191d_probe_one(struct pci_dev *pdev,
}
shost = scsi_host_alloc(&dmx3191d_driver_template,
- sizeof(struct NCR5380_hostdata));
+ sizeof(struct NCR5380_hostdata), &pdev->dev);
if (!shost)
goto out_release_region;
diff --git a/drivers/scsi/elx/efct/efct_xport.c b/drivers/scsi/elx/efct/efct_xport.c
index 9dcaef6fc188..74ef76e00eb5 100644
--- a/drivers/scsi/elx/efct/efct_xport.c
+++ b/drivers/scsi/elx/efct/efct_xport.c
@@ -378,7 +378,7 @@ efct_scsi_new_device(struct efct *efct)
int error = 0;
struct efct_vport *vport = NULL;
- shost = scsi_host_alloc(&efct_template, sizeof(*vport));
+ shost = scsi_host_alloc(&efct_template, sizeof(*vport), NULL);
if (!shost) {
efc_log_err(efct, "failed to allocate Scsi_Host struct\n");
return -ENOMEM;
@@ -902,7 +902,7 @@ efct_scsi_new_vport(struct efct *efct, struct device *dev)
int error = 0;
struct efct_vport *vport = NULL;
- shost = scsi_host_alloc(&efct_template, sizeof(*vport));
+ shost = scsi_host_alloc(&efct_template, sizeof(*vport), NULL);
if (!shost) {
efc_log_err(efct, "failed to allocate Scsi_Host struct\n");
return NULL;
diff --git a/drivers/scsi/esas2r/esas2r_main.c b/drivers/scsi/esas2r/esas2r_main.c
index ada278c24c51..4aac1f6db5e9 100644
--- a/drivers/scsi/esas2r/esas2r_main.c
+++ b/drivers/scsi/esas2r/esas2r_main.c
@@ -382,7 +382,7 @@ static int esas2r_probe(struct pci_dev *pcid,
"after pci_enable_device() enable_cnt: %d",
pcid->enable_cnt.counter);
- host = scsi_host_alloc(&driver_template, host_alloc_size);
+ host = scsi_host_alloc(&driver_template, host_alloc_size, &pcid->dev);
if (host == NULL) {
esas2r_log(ESAS2R_LOG_CRIT, "scsi_host_alloc() FAIL");
return -ENODEV;
diff --git a/drivers/scsi/fdomain.c b/drivers/scsi/fdomain.c
index 22fbb0222f07..66ba4551def8 100644
--- a/drivers/scsi/fdomain.c
+++ b/drivers/scsi/fdomain.c
@@ -537,7 +537,7 @@ struct Scsi_Host *fdomain_create(int base, int irq, int this_id,
return NULL;
}
- sh = scsi_host_alloc(&fdomain_template, sizeof(struct fdomain));
+ sh = scsi_host_alloc(&fdomain_template, sizeof(struct fdomain), NULL);
if (!sh)
return NULL;
diff --git a/drivers/scsi/fnic/fnic_main.c b/drivers/scsi/fnic/fnic_main.c
index 24d62c0874ac..688d85bc3f01 100644
--- a/drivers/scsi/fnic/fnic_main.c
+++ b/drivers/scsi/fnic/fnic_main.c
@@ -847,7 +847,7 @@ static int fnic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
{
host =
scsi_host_alloc(&fnic_host_template,
- sizeof(struct fnic *));
+ sizeof(struct fnic *), &pdev->dev);
if (!host) {
dev_err(&fnic->pdev->dev, "Unable to allocate scsi host\n");
err = -ENOMEM;
diff --git a/drivers/scsi/g_NCR5380.c b/drivers/scsi/g_NCR5380.c
index 270eae7ac427..8b9076d6a964 100644
--- a/drivers/scsi/g_NCR5380.c
+++ b/drivers/scsi/g_NCR5380.c
@@ -312,7 +312,7 @@ static int generic_NCR5380_init_one(const struct scsi_host_template *tpnt,
goto out_release;
}
- instance = scsi_host_alloc(tpnt, sizeof(struct NCR5380_hostdata));
+ instance = scsi_host_alloc(tpnt, sizeof(struct NCR5380_hostdata), NULL);
if (instance == NULL) {
ret = -ENOMEM;
goto out_unmap;
diff --git a/drivers/scsi/gvp11.c b/drivers/scsi/gvp11.c
index 0420bfe9bd42..ad5052db5a2e 100644
--- a/drivers/scsi/gvp11.c
+++ b/drivers/scsi/gvp11.c
@@ -353,7 +353,7 @@ static int gvp11_probe(struct zorro_dev *z, const struct zorro_device_id *ent)
goto fail_check_or_alloc;
instance = scsi_host_alloc(&gvp11_scsi_template,
- sizeof(struct gvp11_hostdata));
+ sizeof(struct gvp11_hostdata), NULL);
if (!instance) {
error = -ENOMEM;
goto fail_check_or_alloc;
diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 944ce19ae2fc..5696da8da6c7 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -2483,7 +2483,7 @@ static struct Scsi_Host *hisi_sas_shost_alloc(struct platform_device *pdev,
struct device *dev = &pdev->dev;
int error;
- shost = scsi_host_alloc(hw->sht, sizeof(*hisi_hba));
+ shost = scsi_host_alloc(hw->sht, sizeof(*hisi_hba), NULL);
if (!shost) {
dev_err(dev, "scsi host alloc failed\n");
return NULL;
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index c7430f7c4048..44e584496ed5 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -3469,7 +3469,7 @@ hisi_sas_shost_alloc_pci(struct pci_dev *pdev)
struct hisi_hba *hisi_hba;
struct device *dev = &pdev->dev;
- shost = scsi_host_alloc(&sht_v3_hw, sizeof(*hisi_hba));
+ shost = scsi_host_alloc(&sht_v3_hw, sizeof(*hisi_hba), &pdev->dev);
if (!shost) {
dev_err(dev, "shost alloc failed\n");
return NULL;
diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index e047747d4ecf..e1f42be79729 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -403,12 +403,14 @@ static const struct device_type scsi_host_type = {
* Return value:
* Pointer to a new Scsi_Host
**/
-struct Scsi_Host *scsi_host_alloc(const struct scsi_host_template *sht, int privsize)
+struct Scsi_Host *scsi_host_alloc(const struct scsi_host_template *sht, int privsize,
+ struct device *dev)
{
struct Scsi_Host *shost;
int index;
- shost = kzalloc(sizeof(struct Scsi_Host) + privsize, GFP_KERNEL);
+ shost = kzalloc_node(sizeof(struct Scsi_Host) + privsize, GFP_KERNEL,
+ dev ? dev_to_node(dev) : NUMA_NO_NODE);
if (!shost)
return NULL;
diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index a1b116cd4723..b9f9f18bd985 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -5837,7 +5837,7 @@ static int hpsa_scsi_host_alloc(struct ctlr_info *h)
{
struct Scsi_Host *sh;
- sh = scsi_host_alloc(&hpsa_driver_template, sizeof(struct ctlr_info *));
+ sh = scsi_host_alloc(&hpsa_driver_template, sizeof(struct ctlr_info *), &h->pdev->dev);
if (sh == NULL) {
dev_err(&h->pdev->dev, "scsi_host_alloc failed\n");
return -ENOMEM;
diff --git a/drivers/scsi/hptiop.c b/drivers/scsi/hptiop.c
index 7083c14c5302..7d79357be265 100644
--- a/drivers/scsi/hptiop.c
+++ b/drivers/scsi/hptiop.c
@@ -1311,7 +1311,7 @@ static int hptiop_probe(struct pci_dev *pcidev, const struct pci_device_id *id)
goto disable_pci_device;
}
- host = scsi_host_alloc(&driver_template, sizeof(struct hptiop_hba));
+ host = scsi_host_alloc(&driver_template, sizeof(struct hptiop_hba), &pcidev->dev);
if (!host) {
printk(KERN_ERR "hptiop: fail to alloc scsi host\n");
goto free_pci_regions;
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 3dd2adda195e..b11d564a21d9 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -6325,7 +6325,7 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
unsigned int max_scsi_queues = min((unsigned int)IBMVFC_MAX_SCSI_QUEUES, online_cpus);
ENTER;
- shost = scsi_host_alloc(&driver_template, sizeof(*vhost));
+ shost = scsi_host_alloc(&driver_template, sizeof(*vhost), NULL);
if (!shost) {
dev_err(dev, "Couldn't allocate host data\n");
goto out;
diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index 609bda730b3a..e8342e581246 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -2235,7 +2235,7 @@ static int ibmvscsi_probe(struct vio_dev *vdev, const struct vio_device_id *id)
dev_set_drvdata(&vdev->dev, NULL);
- host = scsi_host_alloc(&driver_template, sizeof(*hostdata));
+ host = scsi_host_alloc(&driver_template, sizeof(*hostdata), NULL);
if (!host) {
dev_err(&vdev->dev, "couldn't allocate host data\n");
goto scsi_host_alloc_failed;
diff --git a/drivers/scsi/imm.c b/drivers/scsi/imm.c
index 0535252e77e3..a6131f87fcaf 100644
--- a/drivers/scsi/imm.c
+++ b/drivers/scsi/imm.c
@@ -1221,7 +1221,7 @@ static int __imm_attach(struct parport *pb)
INIT_DELAYED_WORK(&dev->imm_tq, imm_interrupt);
err = -ENOMEM;
- host = scsi_host_alloc(&imm_template, sizeof(imm_struct *));
+ host = scsi_host_alloc(&imm_template, sizeof(imm_struct *), NULL);
if (!host)
goto out1;
host->io_port = pb->base;
diff --git a/drivers/scsi/initio.c b/drivers/scsi/initio.c
index 06fbe85dccfa..294f7f8d5dbb 100644
--- a/drivers/scsi/initio.c
+++ b/drivers/scsi/initio.c
@@ -2824,7 +2824,7 @@ static int initio_probe_one(struct pci_dev *pdev,
error = -ENODEV;
goto out_disable_device;
}
- shost = scsi_host_alloc(&initio_template, sizeof(struct initio_host));
+ shost = scsi_host_alloc(&initio_template, sizeof(struct initio_host), &pdev->dev);
if (!shost) {
printk(KERN_WARNING "initio: Could not allocate host structure.\n");
error = -ENOMEM;
diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c
index d207e5e81afe..85608804ff39 100644
--- a/drivers/scsi/ipr.c
+++ b/drivers/scsi/ipr.c
@@ -9379,7 +9379,7 @@ static int ipr_probe_ioa(struct pci_dev *pdev,
ENTER;
dev_info(&pdev->dev, "Found IOA with IRQ: %d\n", pdev->irq);
- host = scsi_host_alloc(&driver_template, sizeof(*ioa_cfg));
+ host = scsi_host_alloc(&driver_template, sizeof(*ioa_cfg), &pdev->dev);
if (!host) {
dev_err(&pdev->dev, "call to scsi_host_alloc failed!\n");
diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c
index 41ed73966a48..709a2a799f3e 100644
--- a/drivers/scsi/ips.c
+++ b/drivers/scsi/ips.c
@@ -6638,7 +6638,7 @@ ips_register_scsi(int index)
{
struct Scsi_Host *sh;
ips_ha_t *ha, *oldha = ips_ha[index];
- sh = scsi_host_alloc(&ips_driver_template, sizeof (ips_ha_t));
+ sh = scsi_host_alloc(&ips_driver_template, sizeof(ips_ha_t), &oldha->pcidev->dev);
if (!sh) {
IPS_PRINTK(KERN_WARNING, oldha->pcidev,
"Unable to register controller with SCSI subsystem\n");
diff --git a/drivers/scsi/isci/init.c b/drivers/scsi/isci/init.c
index acf0c2038d20..7da06ace20ad 100644
--- a/drivers/scsi/isci/init.c
+++ b/drivers/scsi/isci/init.c
@@ -538,7 +538,7 @@ static struct isci_host *isci_host_alloc(struct pci_dev *pdev, int id)
INIT_LIST_HEAD(&idev->node);
}
- shost = scsi_host_alloc(&isci_sht, sizeof(void *));
+ shost = scsi_host_alloc(&isci_sht, sizeof(void *), &pdev->dev);
if (!shost)
return NULL;
diff --git a/drivers/scsi/jazz_esp.c b/drivers/scsi/jazz_esp.c
index 35137f5cfb3a..1817246e4cc6 100644
--- a/drivers/scsi/jazz_esp.c
+++ b/drivers/scsi/jazz_esp.c
@@ -110,7 +110,7 @@ static int esp_jazz_probe(struct platform_device *dev)
struct resource *res;
int err;
- host = scsi_host_alloc(tpnt, sizeof(struct esp));
+ host = scsi_host_alloc(tpnt, sizeof(struct esp), NULL);
err = -ENOMEM;
if (!host)
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 160f02f2f51d..458955dfc0aa 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -2903,7 +2903,7 @@ struct Scsi_Host *iscsi_host_alloc(const struct scsi_host_template *sht,
struct Scsi_Host *shost;
struct iscsi_host *ihost;
- shost = scsi_host_alloc(sht, sizeof(struct iscsi_host) + dd_data_size);
+ shost = scsi_host_alloc(sht, sizeof(struct iscsi_host) + dd_data_size, NULL);
if (!shost)
return NULL;
ihost = shost_priv(shost);
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 82af59c913e9..25264866075f 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -4745,7 +4745,7 @@ lpfc_create_port(struct lpfc_hba *phba, int instance, struct device *dev)
template->sg_tablesize = lpfc_get_sg_tablesize(phba);
}
- shost = scsi_host_alloc(template, sizeof(struct lpfc_vport));
+ shost = scsi_host_alloc(template, sizeof(struct lpfc_vport), &phba->pcidev->dev);
if (!shost)
goto out;
diff --git a/drivers/scsi/mac53c94.c b/drivers/scsi/mac53c94.c
index de2bd860b9d7..737e5f2fef6f 100644
--- a/drivers/scsi/mac53c94.c
+++ b/drivers/scsi/mac53c94.c
@@ -426,7 +426,7 @@ static int mac53c94_probe(struct macio_dev *mdev, const struct of_device_id *mat
return -EBUSY;
}
- host = scsi_host_alloc(&mac53c94_template, sizeof(struct fsc_state));
+ host = scsi_host_alloc(&mac53c94_template, sizeof(struct fsc_state), NULL);
if (host == NULL) {
printk(KERN_ERR "mac53c94: couldn't register host");
rc = -ENOMEM;
diff --git a/drivers/scsi/mac_esp.c b/drivers/scsi/mac_esp.c
index a0ceaa2428c2..c8652bfdb3b8 100644
--- a/drivers/scsi/mac_esp.c
+++ b/drivers/scsi/mac_esp.c
@@ -301,7 +301,7 @@ static int esp_mac_probe(struct platform_device *dev)
if (dev->id > 1)
return -ENODEV;
- host = scsi_host_alloc(tpnt, sizeof(struct esp));
+ host = scsi_host_alloc(tpnt, sizeof(struct esp), NULL);
err = -ENOMEM;
if (!host)
diff --git a/drivers/scsi/mac_scsi.c b/drivers/scsi/mac_scsi.c
index a86bd839d08e..eeb00ee30aaa 100644
--- a/drivers/scsi/mac_scsi.c
+++ b/drivers/scsi/mac_scsi.c
@@ -474,7 +474,7 @@ static int __init mac_scsi_probe(struct platform_device *pdev)
mac_scsi_template.sg_tablesize = 1;
instance = scsi_host_alloc(&mac_scsi_template,
- sizeof(struct NCR5380_hostdata));
+ sizeof(struct NCR5380_hostdata), NULL);
if (!instance)
return -ENOMEM;
diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
index 9476a0d2c72d..701e54843193 100644
--- a/drivers/scsi/megaraid.c
+++ b/drivers/scsi/megaraid.c
@@ -4203,7 +4203,7 @@ megaraid_probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
}
/* Initialize SCSI Host structure */
- host = scsi_host_alloc(&megaraid_template, sizeof(adapter_t));
+ host = scsi_host_alloc(&megaraid_template, sizeof(adapter_t), &pdev->dev);
if (!host)
goto out_iounmap;
diff --git a/drivers/scsi/megaraid/megaraid_mbox.c b/drivers/scsi/megaraid/megaraid_mbox.c
index ce89032a5a74..17b015b3d35f 100644
--- a/drivers/scsi/megaraid/megaraid_mbox.c
+++ b/drivers/scsi/megaraid/megaraid_mbox.c
@@ -620,7 +620,7 @@ megaraid_io_attach(adapter_t *adapter)
struct Scsi_Host *host;
// Initialize SCSI Host structure
- host = scsi_host_alloc(&megaraid_template_g, 8);
+ host = scsi_host_alloc(&megaraid_template_g, 8, &pdev->dev);
if (!host) {
con_log(CL_ANN, (KERN_WARNING
"megaraid mbox: scsi_host_alloc failed\n"));
diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index ecd365d78ae3..bae1070371d5 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -7512,7 +7512,7 @@ static int megasas_probe_one(struct pci_dev *pdev,
pci_set_master(pdev);
host = scsi_host_alloc(&megasas_template,
- sizeof(struct megasas_instance));
+ sizeof(struct megasas_instance), &pdev->dev);
if (!host) {
dev_printk(KERN_DEBUG, &pdev->dev, "scsi_host_alloc failed\n");
diff --git a/drivers/scsi/mesh.c b/drivers/scsi/mesh.c
index dc1402b321da..a4ba6bc49d23 100644
--- a/drivers/scsi/mesh.c
+++ b/drivers/scsi/mesh.c
@@ -1877,7 +1877,7 @@ static int mesh_probe(struct macio_dev *mdev, const struct of_device_id *match)
printk(KERN_ERR "mesh: unable to request memory resources");
return -EBUSY;
}
- mesh_host = scsi_host_alloc(&mesh_template, sizeof(struct mesh_state));
+ mesh_host = scsi_host_alloc(&mesh_template, sizeof(struct mesh_state), NULL);
if (mesh_host == NULL) {
printk(KERN_ERR "mesh: couldn't register host");
goto out_release;
diff --git a/drivers/scsi/mpi3mr/mpi3mr_os.c b/drivers/scsi/mpi3mr/mpi3mr_os.c
index 402d1f35d214..c74e2addc77d 100644
--- a/drivers/scsi/mpi3mr/mpi3mr_os.c
+++ b/drivers/scsi/mpi3mr/mpi3mr_os.c
@@ -5468,7 +5468,7 @@ mpi3mr_probe(struct pci_dev *pdev, const struct pci_device_id *id)
}
shost = scsi_host_alloc(&mpi3mr_driver_template,
- sizeof(struct mpi3mr_ioc));
+ sizeof(struct mpi3mr_ioc), &pdev->dev);
if (!shost) {
retval = -ENODEV;
goto shost_failed;
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 6ff788557294..06c8df6261d4 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -13367,7 +13367,7 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id)
PCIE_LINK_STATE_L1 | PCIE_LINK_STATE_CLKPM);
/* Use mpt2sas driver host template for SAS 2.0 HBA's */
shost = scsi_host_alloc(&mpt2sas_driver_template,
- sizeof(struct MPT3SAS_ADAPTER));
+ sizeof(struct MPT3SAS_ADAPTER), &pdev->dev);
if (!shost)
return -ENODEV;
ioc = shost_priv(shost);
@@ -13399,7 +13399,7 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id)
case MPI26_VERSION:
/* Use mpt3sas driver host template for SAS 3.0 HBA's */
shost = scsi_host_alloc(&mpt3sas_driver_template,
- sizeof(struct MPT3SAS_ADAPTER));
+ sizeof(struct MPT3SAS_ADAPTER), &pdev->dev);
if (!shost)
return -ENODEV;
ioc = shost_priv(shost);
diff --git a/drivers/scsi/mvme147.c b/drivers/scsi/mvme147.c
index 98b99c0f5bc7..4d61e25de2cf 100644
--- a/drivers/scsi/mvme147.c
+++ b/drivers/scsi/mvme147.c
@@ -97,7 +97,7 @@ static int __init mvme147_init(void)
return 0;
mvme147_shost = scsi_host_alloc(&mvme147_host_template,
- sizeof(struct WD33C93_hostdata));
+ sizeof(struct WD33C93_hostdata), NULL);
if (!mvme147_shost)
goto err_out;
mvme147_shost->base = 0xfffe4000;
diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c
index 5abc17a2e261..fd90b5eec0b4 100644
--- a/drivers/scsi/mvsas/mv_init.c
+++ b/drivers/scsi/mvsas/mv_init.c
@@ -494,7 +494,7 @@ static int mvs_pci_init(struct pci_dev *pdev, const struct pci_device_id *ent)
if (rc)
goto err_out_regions;
- shost = scsi_host_alloc(&mvs_sht, sizeof(void *));
+ shost = scsi_host_alloc(&mvs_sht, sizeof(void *), &pdev->dev);
if (!shost) {
rc = -ENOMEM;
goto err_out_regions;
diff --git a/drivers/scsi/mvumi.c b/drivers/scsi/mvumi.c
index e70d336b4ab3..d12b33a32a09 100644
--- a/drivers/scsi/mvumi.c
+++ b/drivers/scsi/mvumi.c
@@ -2468,7 +2468,7 @@ static int mvumi_probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
if (ret)
goto fail_set_dma_mask;
- host = scsi_host_alloc(&mvumi_template, sizeof(*mhba));
+ host = scsi_host_alloc(&mvumi_template, sizeof(*mhba), &pdev->dev);
if (!host) {
dev_err(&pdev->dev, "scsi_host_alloc failed\n");
ret = -ENOMEM;
diff --git a/drivers/scsi/myrb.c b/drivers/scsi/myrb.c
index 3678b66310ed..f28c29b41cf6 100644
--- a/drivers/scsi/myrb.c
+++ b/drivers/scsi/myrb.c
@@ -3401,7 +3401,7 @@ static struct myrb_hba *myrb_detect(struct pci_dev *pdev,
struct Scsi_Host *shost;
struct myrb_hba *cb = NULL;
- shost = scsi_host_alloc(&myrb_template, sizeof(struct myrb_hba));
+ shost = scsi_host_alloc(&myrb_template, sizeof(struct myrb_hba), &pdev->dev);
if (!shost) {
dev_err(&pdev->dev, "Unable to allocate Controller\n");
return NULL;
diff --git a/drivers/scsi/myrs.c b/drivers/scsi/myrs.c
index afd68225221a..a8ce488e6520 100644
--- a/drivers/scsi/myrs.c
+++ b/drivers/scsi/myrs.c
@@ -1937,7 +1937,7 @@ static struct myrs_hba *myrs_alloc_host(struct pci_dev *pdev,
struct Scsi_Host *shost;
struct myrs_hba *cs;
- shost = scsi_host_alloc(&myrs_template, sizeof(struct myrs_hba));
+ shost = scsi_host_alloc(&myrs_template, sizeof(struct myrs_hba), &pdev->dev);
if (!shost)
return NULL;
diff --git a/drivers/scsi/ncr53c8xx.c b/drivers/scsi/ncr53c8xx.c
index 5369ca3fe4fd..009d4c55054e 100644
--- a/drivers/scsi/ncr53c8xx.c
+++ b/drivers/scsi/ncr53c8xx.c
@@ -8108,7 +8108,7 @@ struct Scsi_Host * __init ncr_attach(struct scsi_host_template *tpnt,
printk(KERN_INFO "ncr53c720-%d: rev 0x%x irq %d\n",
unit, device->chip.revision_id, device->slot.irq);
- instance = scsi_host_alloc(tpnt, sizeof(*host_data));
+ instance = scsi_host_alloc(tpnt, sizeof(*host_data), NULL);
if (!instance)
goto attach_error;
host_data = (struct host_data *) instance->hostdata;
diff --git a/drivers/scsi/nsp32.c b/drivers/scsi/nsp32.c
index e893d5677241..681e1d554657 100644
--- a/drivers/scsi/nsp32.c
+++ b/drivers/scsi/nsp32.c
@@ -2556,7 +2556,7 @@ static int nsp32_detect(struct pci_dev *pdev)
/*
* register this HBA as SCSI device
*/
- host = scsi_host_alloc(&nsp32_template, sizeof(nsp32_hw_data));
+ host = scsi_host_alloc(&nsp32_template, sizeof(nsp32_hw_data), &pdev->dev);
if (host == NULL) {
nsp32_msg (KERN_ERR, "failed to scsi register");
goto err;
diff --git a/drivers/scsi/pcmcia/nsp_cs.c b/drivers/scsi/pcmcia/nsp_cs.c
index ae70fda96ae9..32ca7872b7f8 100644
--- a/drivers/scsi/pcmcia/nsp_cs.c
+++ b/drivers/scsi/pcmcia/nsp_cs.c
@@ -1326,7 +1326,7 @@ static struct Scsi_Host *nsp_detect(struct scsi_host_template *sht)
nsp_hw_data *data_b = &nsp_data_base, *data;
nsp_dbg(NSP_DEBUG_INIT, "this_id=%d", sht->this_id);
- host = scsi_host_alloc(&nsp_driver_template, sizeof(nsp_hw_data));
+ host = scsi_host_alloc(&nsp_driver_template, sizeof(nsp_hw_data), NULL);
if (host == NULL) {
nsp_dbg(NSP_DEBUG_INIT, "host failed");
return NULL;
diff --git a/drivers/scsi/pcmcia/qlogic_stub.c b/drivers/scsi/pcmcia/qlogic_stub.c
index 5d8a434d3f66..b417b39ab723 100644
--- a/drivers/scsi/pcmcia/qlogic_stub.c
+++ b/drivers/scsi/pcmcia/qlogic_stub.c
@@ -106,7 +106,7 @@ static struct Scsi_Host *qlogic_detect(struct scsi_host_template *host,
qlogicfas408_setup(qbase, qinitid, INT_TYPE);
host->name = qlogic_name;
- shost = scsi_host_alloc(host, sizeof(struct qlogicfas408_priv));
+ shost = scsi_host_alloc(host, sizeof(struct qlogicfas408_priv), NULL);
if (!shost)
goto err;
shost->io_port = qbase;
diff --git a/drivers/scsi/pcmcia/sym53c500_cs.c b/drivers/scsi/pcmcia/sym53c500_cs.c
index 1530c1ad5d36..83aab6c69a62 100644
--- a/drivers/scsi/pcmcia/sym53c500_cs.c
+++ b/drivers/scsi/pcmcia/sym53c500_cs.c
@@ -752,7 +752,7 @@ SYM53C500_config(struct pcmcia_device *link)
chip_init(port_base);
- host = scsi_host_alloc(tpnt, sizeof(struct sym53c500_data));
+ host = scsi_host_alloc(tpnt, sizeof(struct sym53c500_data), NULL);
if (!host) {
printk("SYM53C500: Unable to register host, giving up.\n");
goto err_release;
diff --git a/drivers/scsi/pm8001/pm8001_init.c b/drivers/scsi/pm8001/pm8001_init.c
index e93ea76b565e..873810c6853c 100644
--- a/drivers/scsi/pm8001/pm8001_init.c
+++ b/drivers/scsi/pm8001/pm8001_init.c
@@ -1142,7 +1142,7 @@ static int pm8001_pci_probe(struct pci_dev *pdev,
if (rc)
goto err_out_regions;
- shost = scsi_host_alloc(&pm8001_sht, sizeof(void *));
+ shost = scsi_host_alloc(&pm8001_sht, sizeof(void *), &pdev->dev);
if (!shost) {
rc = -ENOMEM;
goto err_out_regions;
diff --git a/drivers/scsi/pmcraid.c b/drivers/scsi/pmcraid.c
index 942a99393204..a26c747806ef 100644
--- a/drivers/scsi/pmcraid.c
+++ b/drivers/scsi/pmcraid.c
@@ -5236,7 +5236,7 @@ static int pmcraid_probe(struct pci_dev *pdev,
}
host = scsi_host_alloc(&pmcraid_host_template,
- sizeof(struct pmcraid_instance));
+ sizeof(struct pmcraid_instance), &pdev->dev);
if (!host) {
dev_err(&pdev->dev, "scsi_host_alloc failed!\n");
diff --git a/drivers/scsi/ppa.c b/drivers/scsi/ppa.c
index 8a4e910d5758..40fe9c6acc3b 100644
--- a/drivers/scsi/ppa.c
+++ b/drivers/scsi/ppa.c
@@ -1101,7 +1101,7 @@ static int __ppa_attach(struct parport *pb)
INIT_DELAYED_WORK(&dev->ppa_tq, ppa_interrupt);
err = -ENOMEM;
- host = scsi_host_alloc(&ppa_template, sizeof(ppa_struct *));
+ host = scsi_host_alloc(&ppa_template, sizeof(ppa_struct *), NULL);
if (!host)
goto out1;
host->io_port = pb->base;
diff --git a/drivers/scsi/ps3rom.c b/drivers/scsi/ps3rom.c
index a9c727d22931..3542a35b137e 100644
--- a/drivers/scsi/ps3rom.c
+++ b/drivers/scsi/ps3rom.c
@@ -361,7 +361,7 @@ static int ps3rom_probe(struct ps3_system_bus_device *_dev)
goto fail_free_bounce;
host = scsi_host_alloc(&ps3rom_host_template,
- sizeof(struct ps3rom_private));
+ sizeof(struct ps3rom_private), NULL);
if (!host) {
dev_err(&dev->sbd.core, "%s:%u: scsi_host_alloc failed\n",
__func__, __LINE__);
diff --git a/drivers/scsi/qla1280.c b/drivers/scsi/qla1280.c
index cdd6fe002c32..f88f2e659baa 100644
--- a/drivers/scsi/qla1280.c
+++ b/drivers/scsi/qla1280.c
@@ -4142,7 +4142,7 @@ qla1280_probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
pci_set_master(pdev);
error = -ENOMEM;
- host = scsi_host_alloc(&qla1280_driver_template, sizeof(*ha));
+ host = scsi_host_alloc(&qla1280_driver_template, sizeof(*ha), &pdev->dev);
if (!host) {
printk(KERN_WARNING
"qla1280: Failed to register host, aborting.\n");
diff --git a/drivers/scsi/qla2xxx/qla_mid.c b/drivers/scsi/qla2xxx/qla_mid.c
index c563133f751e..4bafc367e21d 100644
--- a/drivers/scsi/qla2xxx/qla_mid.c
+++ b/drivers/scsi/qla2xxx/qla_mid.c
@@ -502,7 +502,7 @@ qla24xx_create_vhost(struct fc_vport *fc_vport)
vha = qla2x00_create_host(sht, ha);
if (!vha) {
ql_log(ql_log_warn, vha, 0xa005,
- "scsi_host_alloc() failed for vport.\n");
+ "scsi_host_alloc() failed for vport.\n", NULL);
return(NULL);
}
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 72b1c28e4dae..ce0d097f3317 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -5046,7 +5046,7 @@ struct scsi_qla_host *qla2x00_create_host(const struct scsi_host_template *sht,
struct Scsi_Host *host;
struct scsi_qla_host *vha = NULL;
- host = scsi_host_alloc(sht, sizeof(scsi_qla_host_t));
+ host = scsi_host_alloc(sht, sizeof(scsi_qla_host_t), &ha->pdev->dev);
if (!host) {
ql_log_pci(ql_log_fatal, ha->pdev, 0x0107,
"Failed to allocate host from the scsi layer, aborting.\n");
diff --git a/drivers/scsi/qlogicfas.c b/drivers/scsi/qlogicfas.c
index 8f05e3707d69..b9ead7dc371c 100644
--- a/drivers/scsi/qlogicfas.c
+++ b/drivers/scsi/qlogicfas.c
@@ -95,7 +95,7 @@ static struct Scsi_Host *__qlogicfas_detect(struct scsi_host_template *host,
qlogicfas408_setup(qbase, qinitid, INT_TYPE);
- hreg = scsi_host_alloc(host, sizeof(struct qlogicfas408_priv));
+ hreg = scsi_host_alloc(host, sizeof(struct qlogicfas408_priv), NULL);
if (!hreg)
goto err_release_mem;
priv = get_priv_by_host(hreg);
diff --git a/drivers/scsi/qlogicpti.c b/drivers/scsi/qlogicpti.c
index ea0a2b5a0a42..f67a9b400100 100644
--- a/drivers/scsi/qlogicpti.c
+++ b/drivers/scsi/qlogicpti.c
@@ -1316,7 +1316,7 @@ static int qpti_sbus_probe(struct platform_device *op)
if (op->archdata.irqs[0] == 0)
return -ENODEV;
- host = scsi_host_alloc(&qpti_template, sizeof(struct qlogicpti));
+ host = scsi_host_alloc(&qpti_template, sizeof(struct qlogicpti), NULL);
if (!host)
return -ENOMEM;
diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index bb6b0e7fb910..59488bf74ce0 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -9548,7 +9548,7 @@ static int sdebug_driver_probe(struct device *dev)
sdbg_host = dev_to_sdebug_host(dev);
- hpnt = scsi_host_alloc(&sdebug_driver_template, 0);
+ hpnt = scsi_host_alloc(&sdebug_driver_template, 0, NULL);
if (NULL == hpnt) {
pr_err("scsi_host_alloc failed\n");
error = -ENODEV;
diff --git a/drivers/scsi/sgiwd93.c b/drivers/scsi/sgiwd93.c
index 6594661db5f4..07fbe6fda7c2 100644
--- a/drivers/scsi/sgiwd93.c
+++ b/drivers/scsi/sgiwd93.c
@@ -231,7 +231,7 @@ static int sgiwd93_probe(struct platform_device *pdev)
unsigned int irq = pd->irq;
int err;
- host = scsi_host_alloc(&sgiwd93_template, sizeof(struct ip22_hostdata));
+ host = scsi_host_alloc(&sgiwd93_template, sizeof(struct ip22_hostdata), NULL);
if (!host) {
err = -ENOMEM;
goto out;
diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index 65ff50982978..a3163c06b3f8 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -7619,7 +7619,7 @@ static int pqi_register_scsi(struct pqi_ctrl_info *ctrl_info)
int rc;
struct Scsi_Host *shost;
- shost = scsi_host_alloc(&pqi_driver_template, sizeof(ctrl_info));
+ shost = scsi_host_alloc(&pqi_driver_template, sizeof(ctrl_info), &ctrl_info->pci_dev->dev);
if (!shost) {
dev_err(&ctrl_info->pci_dev->dev, "scsi_host_alloc failed\n");
return -ENOMEM;
diff --git a/drivers/scsi/snic/snic_main.c b/drivers/scsi/snic/snic_main.c
index 82953e6a0915..9edf6661e6f1 100644
--- a/drivers/scsi/snic/snic_main.c
+++ b/drivers/scsi/snic/snic_main.c
@@ -363,7 +363,7 @@ snic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
/*
* Allocate SCSI Host and setup association between host, and snic
*/
- shost = scsi_host_alloc(&snic_host_template, sizeof(struct snic));
+ shost = scsi_host_alloc(&snic_host_template, sizeof(struct snic), &pdev->dev);
if (!shost) {
SNIC_ERR("Unable to alloc scsi_host\n");
ret = -ENOMEM;
diff --git a/drivers/scsi/stex.c b/drivers/scsi/stex.c
index 6aeeb338633d..7d6b851fef24 100644
--- a/drivers/scsi/stex.c
+++ b/drivers/scsi/stex.c
@@ -1667,7 +1667,7 @@ static int stex_probe(struct pci_dev *pdev, const struct pci_device_id *id)
S6flag = 0;
register_reboot_notifier(&stex_notifier);
- host = scsi_host_alloc(&driver_template, sizeof(struct st_hba));
+ host = scsi_host_alloc(&driver_template, sizeof(struct st_hba), &pdev->dev);
if (!host) {
printk(KERN_ERR DRV_NAME "(%s): scsi_host_alloc failed\n",
diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 571ea549152b..fc4c05127dc4 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -1969,7 +1969,7 @@ static int storvsc_probe(struct hv_device *device,
(100 - ring_avail_percent_lowater) / 100;
host = scsi_host_alloc(&scsi_driver,
- sizeof(struct hv_host_device));
+ sizeof(struct hv_host_device), NULL);
if (!host)
return -ENOMEM;
diff --git a/drivers/scsi/sun3_scsi.c b/drivers/scsi/sun3_scsi.c
index ca9cd691cc32..ed41b605328e 100644
--- a/drivers/scsi/sun3_scsi.c
+++ b/drivers/scsi/sun3_scsi.c
@@ -578,7 +578,7 @@ static int __init sun3_scsi_probe(struct platform_device *pdev)
#endif
instance = scsi_host_alloc(&sun3_scsi_template,
- sizeof(struct NCR5380_hostdata));
+ sizeof(struct NCR5380_hostdata), NULL);
if (!instance) {
error = -ENOMEM;
goto fail_alloc;
diff --git a/drivers/scsi/sun3x_esp.c b/drivers/scsi/sun3x_esp.c
index 365406885b8e..f7e48f4c5444 100644
--- a/drivers/scsi/sun3x_esp.c
+++ b/drivers/scsi/sun3x_esp.c
@@ -175,7 +175,7 @@ static int esp_sun3x_probe(struct platform_device *dev)
struct resource *res;
int err = -ENOMEM;
- host = scsi_host_alloc(tpnt, sizeof(struct esp));
+ host = scsi_host_alloc(tpnt, sizeof(struct esp), NULL);
if (!host)
goto fail;
diff --git a/drivers/scsi/sun_esp.c b/drivers/scsi/sun_esp.c
index aa430501f0c7..bc4e4030acb6 100644
--- a/drivers/scsi/sun_esp.c
+++ b/drivers/scsi/sun_esp.c
@@ -457,7 +457,7 @@ static int esp_sbus_probe_one(struct platform_device *op,
struct esp *esp;
int err;
- host = scsi_host_alloc(tpnt, sizeof(struct esp));
+ host = scsi_host_alloc(tpnt, sizeof(struct esp), NULL);
err = -ENOMEM;
if (!host)
diff --git a/drivers/scsi/sym53c8xx_2/sym_glue.c b/drivers/scsi/sym53c8xx_2/sym_glue.c
index 27e22acaf1a7..16e821c3b59e 100644
--- a/drivers/scsi/sym53c8xx_2/sym_glue.c
+++ b/drivers/scsi/sym53c8xx_2/sym_glue.c
@@ -1300,7 +1300,7 @@ static struct Scsi_Host *sym_attach(const struct scsi_host_template *tpnt, int u
if (!fw)
goto attach_failed;
- shost = scsi_host_alloc(tpnt, sizeof(*sym_data));
+ shost = scsi_host_alloc(tpnt, sizeof(*sym_data), &pdev->dev);
if (!shost)
goto attach_failed;
sym_data = shost_priv(shost);
diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 5fdaa71f0652..88375574cb18 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -929,7 +929,7 @@ static int virtscsi_probe(struct virtio_device *vdev)
num_targets = virtscsi_config_get(vdev, max_target) + 1;
shost = scsi_host_alloc(&virtscsi_host_template,
- struct_size(vscsi, req_vqs, num_queues));
+ struct_size(vscsi, req_vqs, num_queues), NULL);
if (!shost)
return -ENOMEM;
diff --git a/drivers/scsi/vmw_pvscsi.c b/drivers/scsi/vmw_pvscsi.c
index 151cac9f9c2a..32c39c66c49b 100644
--- a/drivers/scsi/vmw_pvscsi.c
+++ b/drivers/scsi/vmw_pvscsi.c
@@ -1435,7 +1435,7 @@ static int pvscsi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
PVSCSI_MAX_NUM_REQ_ENTRIES_PER_PAGE;
pvscsi_template.cmd_per_lun =
min(pvscsi_template.can_queue, pvscsi_cmd_per_lun);
- host = scsi_host_alloc(&pvscsi_template, sizeof(struct pvscsi_adapter));
+ host = scsi_host_alloc(&pvscsi_template, sizeof(struct pvscsi_adapter), &pdev->dev);
if (!host) {
printk(KERN_ERR "vmw_pvscsi: failed to allocate host\n");
goto out_release_resources_and_disable;
diff --git a/drivers/scsi/wd719x.c b/drivers/scsi/wd719x.c
index 830d40f57f6a..0aa6bb093431 100644
--- a/drivers/scsi/wd719x.c
+++ b/drivers/scsi/wd719x.c
@@ -921,7 +921,7 @@ static int wd719x_pci_probe(struct pci_dev *pdev, const struct pci_device_id *d)
goto release_region;
err = -ENOMEM;
- sh = scsi_host_alloc(&wd719x_template, sizeof(struct wd719x));
+ sh = scsi_host_alloc(&wd719x_template, sizeof(struct wd719x), &pdev->dev);
if (!sh)
goto release_region;
diff --git a/drivers/scsi/xen-scsifront.c b/drivers/scsi/xen-scsifront.c
index 989bcaee42ca..d4d57f33cc15 100644
--- a/drivers/scsi/xen-scsifront.c
+++ b/drivers/scsi/xen-scsifront.c
@@ -899,7 +899,7 @@ static int scsifront_probe(struct xenbus_device *dev,
int err = -ENOMEM;
char name[TASK_COMM_LEN];
- host = scsi_host_alloc(&scsifront_sht, sizeof(*info));
+ host = scsi_host_alloc(&scsifront_sht, sizeof(*info), NULL);
if (!host) {
xenbus_dev_fatal(dev, err, "fail to allocate scsi host");
return err;
diff --git a/drivers/scsi/zorro_esp.c b/drivers/scsi/zorro_esp.c
index 1622285c9aec..5983015877a7 100644
--- a/drivers/scsi/zorro_esp.c
+++ b/drivers/scsi/zorro_esp.c
@@ -774,7 +774,7 @@ static int zorro_esp_probe(struct zorro_dev *z,
goto fail_free_zep;
}
- host = scsi_host_alloc(tpnt, sizeof(struct esp));
+ host = scsi_host_alloc(tpnt, sizeof(struct esp), NULL);
if (!host) {
pr_err("No host detected; board configuration problem?\n");
diff --git a/include/scsi/libfc.h b/include/scsi/libfc.h
index be0ffe1e3395..17e545fa5c7e 100644
--- a/include/scsi/libfc.h
+++ b/include/scsi/libfc.h
@@ -883,7 +883,7 @@ libfc_host_alloc(const struct scsi_host_template *sht, int priv_size)
struct fc_lport *lport;
struct Scsi_Host *shost;
- shost = scsi_host_alloc(sht, sizeof(*lport) + priv_size);
+ shost = scsi_host_alloc(sht, sizeof(*lport) + priv_size, NULL);
if (!shost)
return NULL;
lport = shost_priv(shost);
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 7e2011830ba4..09c82a41b7a1 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -796,7 +796,8 @@ static inline int scsi_host_in_recovery(struct Scsi_Host *shost)
extern int scsi_queue_work(struct Scsi_Host *, struct work_struct *);
extern void scsi_flush_work(struct Scsi_Host *);
-extern struct Scsi_Host *scsi_host_alloc(const struct scsi_host_template *, int);
+extern struct Scsi_Host *scsi_host_alloc(const struct scsi_host_template *sht,
+ int privsize, struct device *dev);
extern int __must_check scsi_add_host_with_dma(struct Scsi_Host *,
struct device *,
struct device *);
--
2.43.7
^ permalink raw reply related
* [PATCH v3 0/4] scsi/block: NUMA-local scan allocations, shared-tag path cleanup, and SCSI I/O counters
From: Sumit Saxena @ 2026-06-09 12:17 UTC (permalink / raw)
To: Martin K . Petersen, Jens Axboe
Cc: James E . J . Bottomley, linux-scsi, linux-block, Adam Radford,
Khalid Aziz, Adaptec OEM Raid Solutions, Matthew Wilcox,
Hannes Reinecke, Juergen E . Fischer, Russell King,
linux-arm-kernel, Finn Thain, Michael Schmitz, Anil Gurumurthy,
Sudarsana Kalluru, Oliver Neukum, Ali Akcaagac, Jamie Lenehan,
Ram Vegesna, target-devel, Bradley Grove, Satish Kharat,
Sesidhar Baddela, Karan Tilak Kumar, Yihang Li, Don Brace,
storagedev, HighPoint Linux Team, Tyrel Datwyler,
Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, linuxppc-dev, Brian King, Lee Duncan,
Chris Leech, Mike Christie, open-iscsi, Justin Tee, Paul Ely,
Kashyap Desai, Shivasharan S, Chandrakanth Patil,
megaraidlinux.pdl, Sathya Prakash Veerichetty, Sreekanth Reddy,
mpi3mr-linuxdrv.pdl, Suganath Prabu Subramani, Ranjan Kumar,
MPT-FusionLinux.pdl, Daniel Palmer, GOTO Masanori, YOKOTA Hiroshi,
Jack Wang, Geoff Levand, Michael Reed, Nilesh Javali,
GR-QLogic-Storage-Upstream, Narsimhulu Musini, K . Y . Srinivasan,
Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li, linux-hyperv,
Michael S . Tsirkin, Jason Wang, Paolo Bonzini, Stefan Hajnoczi,
Eugenio Perez, virtualization, Vishal Bhakta,
bcm-kernel-feedback-list, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, xen-devel, Sumit Saxena
This series contains three performance improvements targeting the SCSI
and block layers on multi-socket NUMA and heavily loaded SMP systems.
On multi-socket NUMA systems we observed extreme I/O throughput variance
of 50-60% between runs. This series identifies and fixes two root causes:
cross-node memory accesses due to NUMA-unaware allocations in the scan
path, and false sharing between hot atomic counters in struct request_queue
and struct scsi_device.
Performance notes:
Tested on a dual-socket NUMA system (2x 32-core, 256 GB/socket) with
an mpi3mr HBA, running fio (random read, 4K, QD 64, 16 jobs, 60 s,
direct I/O). IOPS figures are in KIOPS (thousands of IOPS):
Configuration Avg KIOPS Range (KIOPS) Spread
Baseline 6,255 4,200 - 6,700 ~37%
Baseline + all patches 7,350 7,000 - 7,700 ~10%
Key findings:
These patches combinedly reduces the observed 50-60% run-to-run variance
to under 10%, significantly improving workload predictability and
improves IOPs by 16-18%.
No functional regressions observed.
Changes in v3
-------------
-Handled feedback from Bart Van Assche and John Garry.
-Added a patch for shost local NUMA allocation.
-Converted ioerr_cnt and iotmo_cnt atomic counters into per-cpu counters.
Changes in v2
--------------
Patch 1 — Same functional goal as v1 patch 1: NUMA-local scsi_device /
scsi_target allocations in the scan path so steady-state I/O does not
habitually touch remote memory when the host has a fixed DMA/NUMA
affinity.
Patch 2 — Replaces v1’s ____cacheline_aligned_in_smp on
nr_active_requests_shared_tags with removal of the shared-tag fairness
throttling machinery (including hctx_may_queue(), blk_mq_hw_ctx.nr_active,
and request_queue.nr_active_requests_shared_tags and their updates).
This follows the earlier standalone proposal by Bart Van Assche [1],
rebased for the current tree; it removes the high-frequency atomic
accounting that motivated the v1 false-sharing workaround and, in our
testing, improves IOPS on the order of roughly 16–18% for the shared-tag
workload exercised.
Patch 3 — Replaces v1’s cache-line padding of iodone_cnt with
percpu_counter for both iorequest_cnt and iodone_cnt, so submission and
completion paths mostly update CPU-local state instead of bouncing a
single cache line, without inflating struct scsi_device for SMP
alignment.
Merge / review hints
--------------------
Patch 3 touches the block layer and should have block maintainer review;
rest of patches are SCSI-oriented. Please route or Ack as your subsystem
workflow requires.
Bart Van Assche (1):
block: drop shared-tag fairness throttling
James Rizzo (1):
scsi: scan: allocate sdev and starget on the NUMA node of the host
adapter
Sumit Saxena (2):
scsi: host: allocate struct Scsi_Host on the NUMA node of the host
adapter
scsi: use percpu counters for iostat counters in struct scsi_device
block/blk-core.c | 2 -
block/blk-mq-debugfs.c | 22 ++++-
block/blk-mq-tag.c | 4 -
block/blk-mq.c | 17 +---
block/blk-mq.h | 100 ----------------------
drivers/scsi/3w-9xxx.c | 2 +-
drivers/scsi/3w-sas.c | 2 +-
drivers/scsi/3w-xxxx.c | 2 +-
drivers/scsi/53c700.c | 2 +-
drivers/scsi/BusLogic.c | 2 +-
drivers/scsi/a100u2w.c | 2 +-
drivers/scsi/a2091.c | 2 +-
drivers/scsi/a3000.c | 2 +-
drivers/scsi/aacraid/linit.c | 2 +-
drivers/scsi/advansys.c | 6 +-
drivers/scsi/aha152x.c | 2 +-
drivers/scsi/aha1542.c | 2 +-
drivers/scsi/aha1740.c | 2 +-
drivers/scsi/aic7xxx/aic79xx_osm.c | 2 +-
drivers/scsi/aic7xxx/aic7xxx_osm.c | 2 +-
drivers/scsi/aic94xx/aic94xx_init.c | 2 +-
drivers/scsi/am53c974.c | 2 +-
drivers/scsi/arcmsr/arcmsr_hba.c | 3 +-
drivers/scsi/arm/acornscsi.c | 2 +-
drivers/scsi/arm/arxescsi.c | 2 +-
drivers/scsi/arm/cumana_1.c | 2 +-
drivers/scsi/arm/cumana_2.c | 2 +-
drivers/scsi/arm/eesox.c | 2 +-
drivers/scsi/arm/oak.c | 2 +-
drivers/scsi/arm/powertec.c | 2 +-
drivers/scsi/atari_scsi.c | 2 +-
drivers/scsi/atp870u.c | 2 +-
drivers/scsi/bfa/bfad_im.c | 2 +-
drivers/scsi/csiostor/csio_init.c | 4 +-
drivers/scsi/dc395x.c | 2 +-
drivers/scsi/dmx3191d.c | 2 +-
drivers/scsi/elx/efct/efct_xport.c | 4 +-
drivers/scsi/esas2r/esas2r_main.c | 2 +-
drivers/scsi/fdomain.c | 2 +-
drivers/scsi/fnic/fnic_main.c | 2 +-
drivers/scsi/g_NCR5380.c | 2 +-
drivers/scsi/gvp11.c | 2 +-
drivers/scsi/hisi_sas/hisi_sas_main.c | 2 +-
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 2 +-
drivers/scsi/hosts.c | 6 +-
drivers/scsi/hpsa.c | 2 +-
drivers/scsi/hptiop.c | 2 +-
drivers/scsi/ibmvscsi/ibmvfc.c | 2 +-
drivers/scsi/ibmvscsi/ibmvscsi.c | 2 +-
drivers/scsi/imm.c | 2 +-
drivers/scsi/initio.c | 2 +-
drivers/scsi/ipr.c | 2 +-
drivers/scsi/ips.c | 2 +-
drivers/scsi/isci/init.c | 2 +-
drivers/scsi/jazz_esp.c | 2 +-
drivers/scsi/libiscsi.c | 2 +-
drivers/scsi/lpfc/lpfc_init.c | 2 +-
drivers/scsi/mac53c94.c | 2 +-
drivers/scsi/mac_esp.c | 2 +-
drivers/scsi/mac_scsi.c | 2 +-
drivers/scsi/megaraid.c | 2 +-
drivers/scsi/megaraid/megaraid_mbox.c | 2 +-
drivers/scsi/megaraid/megaraid_sas_base.c | 2 +-
drivers/scsi/mesh.c | 2 +-
drivers/scsi/mpi3mr/mpi3mr_os.c | 2 +-
drivers/scsi/mpt3sas/mpt3sas_scsih.c | 4 +-
drivers/scsi/mvme147.c | 2 +-
drivers/scsi/mvsas/mv_init.c | 2 +-
drivers/scsi/mvumi.c | 2 +-
drivers/scsi/myrb.c | 2 +-
drivers/scsi/myrs.c | 2 +-
drivers/scsi/ncr53c8xx.c | 2 +-
drivers/scsi/nsp32.c | 2 +-
drivers/scsi/pcmcia/nsp_cs.c | 2 +-
drivers/scsi/pcmcia/qlogic_stub.c | 2 +-
drivers/scsi/pcmcia/sym53c500_cs.c | 2 +-
drivers/scsi/pm8001/pm8001_init.c | 2 +-
drivers/scsi/pmcraid.c | 2 +-
drivers/scsi/ppa.c | 2 +-
drivers/scsi/ps3rom.c | 2 +-
drivers/scsi/qla1280.c | 2 +-
drivers/scsi/qla2xxx/qla_mid.c | 2 +-
drivers/scsi/qla2xxx/qla_os.c | 2 +-
drivers/scsi/qlogicfas.c | 2 +-
drivers/scsi/qlogicpti.c | 2 +-
drivers/scsi/scsi_debug.c | 2 +-
drivers/scsi/scsi_error.c | 4 +-
drivers/scsi/scsi_lib.c | 10 +--
drivers/scsi/scsi_scan.c | 15 +++-
drivers/scsi/scsi_sysfs.c | 23 +++--
drivers/scsi/sd.c | 2 +-
drivers/scsi/sgiwd93.c | 2 +-
drivers/scsi/smartpqi/smartpqi_init.c | 2 +-
drivers/scsi/snic/snic_main.c | 2 +-
drivers/scsi/stex.c | 2 +-
drivers/scsi/storvsc_drv.c | 2 +-
drivers/scsi/sun3_scsi.c | 2 +-
drivers/scsi/sun3x_esp.c | 2 +-
drivers/scsi/sun_esp.c | 2 +-
drivers/scsi/sym53c8xx_2/sym_glue.c | 2 +-
drivers/scsi/virtio_scsi.c | 2 +-
drivers/scsi/vmw_pvscsi.c | 2 +-
drivers/scsi/wd719x.c | 2 +-
drivers/scsi/xen-scsifront.c | 2 +-
drivers/scsi/zorro_esp.c | 2 +-
include/linux/blk-mq.h | 6 --
include/linux/blkdev.h | 2 -
include/scsi/libfc.h | 2 +-
include/scsi/scsi_device.h | 9 +-
include/scsi/scsi_host.h | 3 +-
110 files changed, 168 insertions(+), 258 deletions(-)
--
2.43.7
^ permalink raw reply
* Re: [PATCH v4 4/8] block: implement NVMEM provider
From: Bartosz Golaszewski @ 2026-06-09 11:05 UTC (permalink / raw)
To: Loic Poulain
Cc: linux-mmc, devicetree, linux-kernel, linux-arm-msm, linux-block,
linux-wireless, ath10k, linux-bluetooth, netdev, daniel,
Ulf Hansson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Bjorn Andersson, Konrad Dybcio, Jens Axboe, Johannes Berg,
Jeff Johnson, Marcel Holtmann, Luiz Augusto von Dentz,
Balakrishna Godavarthi, Rocky Liao, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Srinivas Kandagatla,
Andrew Lunn, Heiner Kallweit, Russell King, Saravana Kannan,
Bartosz Golaszewski
In-Reply-To: <CAFEp6-1syMQsvuQ+dLU39bnDeL5Ok7vK1mA7CS0v1m7cjhyMQw@mail.gmail.com>
On Tue, 9 Jun 2026 12:13:08 +0200, Loic Poulain
<loic.poulain@oss.qualcomm.com> said:
> Hi bartosz,
>
...
>>
>> On the other hand, we don't want to not provide the block device just because
>> someone added a DT property on one that's too big. I'd say: warn, but return 0.
>> Does it make sense?
>
> It’s still technically an error in the sense that we cannot provide
> the required nvmem feature. However, it only becomes a real issue if a
> consumer actually attempts to use it.
> Also, the block device should still be added, since the return code
> from add_dev is not checked. In any case, I’m fine with either
> approach, as long as we emit a warning message.
>
We don't check the return value now (why?) but if we ever do, the safer option
is to return 0 IMO.
>>
>> > + }
>> > +
>> > + config.id = NVMEM_DEVID_NONE;
>> > + config.dev = dev;
>> > + config.name = dev_name(dev);
>> > + config.owner = THIS_MODULE;
>> > + config.priv = (void *)(uintptr_t)dev->devt;
>> > + config.reg_read = blk_nvmem_reg_read;
>> > + config.size = bdev_nr_bytes(bdev);
>> > + config.word_size = 1;
>> > + config.stride = 1;
>> > + config.read_only = true;
>> > + config.root_only = true;
>> > + config.ignore_wp = true;
>> > + config.of_node = to_of_node(dev->fwnode);
>> > +
>> > + return PTR_ERR_OR_ZERO(devm_nvmem_register(dev, &config));
>>
>> And that was a wrong suggestion on my part too because I was under the
>> impression that we're in the probe() path, not device_add(). You can't use
>> devres here as the device at this point is not yet bound and may never be.
>
> So I understand The bd_device is purely a class device with no bus, no driver.
> For driverless devices, devres_release_all() is called explicitly
> within device_del() .
>
Right.
>>
>> Which leads me to the second point: this is not the moment to add the nvmem
>> provider. This should happen at or after probe(). Once nvmem_register()
>> returns, you have a visible nvmem resource but nothing backing it in the block
>> layer.
>
> There is a short window during which a read attempt will 'properly'
> fail, but this does seem somewhat fragile indeed.
>
>> Either do this in block core when registering a new device or schedule
>> a notifier here for the BUS_NOTIFY_BOUND_DRIVER event and do it in the notifier
>> callback.
>
> So in the end, it seems that the simpler and more robust approach is
> probably to move away from the class_interface driver and instead
> register/unregister the nvmem directly in add_disk/del_gendisk.
> If that's ok I will move to this approach in the next version.
>
Yes, I think it's the right approach.
Bart
^ permalink raw reply
* Re: [PATCH v2 00/14] list: Prepare entry iterators to cache cursor state
From: Christian König @ 2026-06-09 10:33 UTC (permalink / raw)
To: Kaitao Cheng, Andy Shevchenko, Muchun Song, Philipp Reisner,
Lars Ellenberg, Christoph Böhmwalder, Jens Axboe,
Takashi Sakamoto, Andrzej Hajda, Neil Armstrong, Robert Foss,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
Tvrtko Ursulin, Huang Rui, Eddie James, Mark Brown,
Maxime Coquelin, Alexandre Torgue, Laxman Dewangan,
Thierry Reding, Jonathan Hunter, Sowjanya Komatineni,
Davidlohr Bueso, Paul E . McKenney, Josh Triplett, Peter Zijlstra,
Ingo Molnar, Will Deacon, Boqun Feng, Liam Girdwood,
Jaroslav Kysela, Takashi Iwai
Cc: Laurent Pinchart, Jonas Karlman, Jernej Skrabec, Matthew Auld,
Matthew Brost, Waiman Long, drbd-dev, linux-block,
linux1394-devel, dri-devel, intel-gfx, linux-spi, linux-stm32,
linux-arm-kernel, linux-tegra, linux-sound, linux-kernel,
Andrew Morton, Randy Dunlap, Christian Brauner, David Howells,
Luca Ceresoli, Kaito Cheng
In-Reply-To: <20260609061347.93688-1-kaitao.cheng@linux.dev>
On 6/9/26 08:13, Kaitao Cheng wrote:
> From: Kaito Cheng <chengkaitao@kylinos.cn>
>
> This series prepares for, and then updates, the list_for_each_entry()
> family so the common entry iterators cache their next or previous cursor
> before the loop body runs.
Why in the world would we want to do that?
The safe and non-safe variants have very distinct use cases and that is completely intentional.
What we could improve maybe is the documentation, from my experience an astonishing large amount of people have misconceptions about the safe variants.
> The first 13 patches open-code loops that intentionally depend on the
> old "derive the next entry from the current cursor at the end of the
> iteration" behaviour. These loops append work to the list being walked,
> restart traversal after dropping a lock, skip an entry consumed by the
> current iteration, or otherwise adjust the cursor in the loop body.
Well I have to clearly reject the changes for subsystems/components I'm maintaining, that just looks horrible to me and I clearly don't see a good reason for that.
Regards,
Christian.
>
> The final patch changes include/linux/list.h to keep a private cursor in
> the common entry iterators while preserving the public macro interface.
> The safe variants remain available when callers need the temporary
> cursor explicitly or have stronger mutation requirements.
>
> Changes in v2 (Muchun Song, Andy Shevchenko):
> - Drop the list_for_each_entry_mutable*() helpers from v1 and make the
> cursor change directly in the existing list_for_each_entry*() helpers.
> - Open-code special list walks that rely on updating the loop cursor in
> the body, preserving their existing traversal semantics.
>
> Link to v1:
> https://lore.kernel.org/all/20260529082149.76764-1-kaitao.cheng@linux.dev/
>
> Kaitao Cheng (14):
> drbd: Open-code transfer log list walk
> firewire: core: Open-code topology list walk
> drm/bridge: Open-code bridge chain list walks
> drm/i915/gt: Open-code active timeline walk
> drm/i915: Open-code DFS dependency list walk
> drm/ttm: Open-code reservation list walk
> spi: fsi: Open-code message transfer walk
> spi: stm32-ospi: Open-code message transfer walk
> spi: stm32-qspi: Open-code message transfer walk
> spi: tegra210-quad: Open-code message transfer walk
> locking/locktorture: Open-code ww mutex list walk
> locking/ww_mutex: Open-code stress reorder list walk
> ASoC: dapm: Open-code widget invalidation walk
> list: Cache cursors in entry iterators
>
> drivers/block/drbd/drbd_debugfs.c | 4 ++-
> drivers/firewire/core-topology.c | 4 ++-
> drivers/gpu/drm/drm_bridge.c | 7 ++--
> drivers/gpu/drm/i915/gt/intel_reset.c | 4 ++-
> drivers/gpu/drm/i915/i915_scheduler.c | 4 ++-
> drivers/gpu/drm/ttm/ttm_execbuf_util.c | 4 ++-
> drivers/spi/spi-fsi.c | 5 ++-
> drivers/spi/spi-stm32-ospi.c | 4 ++-
> drivers/spi/spi-stm32-qspi.c | 5 ++-
> drivers/spi/spi-tegra210-quad.c | 4 ++-
> include/linux/list.h | 46 ++++++++++++++++++++------
> kernel/locking/locktorture.c | 4 ++-
> kernel/locking/test-ww_mutex.c | 4 ++-
> sound/soc/soc-dapm.c | 4 ++-
> 14 files changed, 78 insertions(+), 25 deletions(-)
>
> --
> 2.43.0
>
^ permalink raw reply
* Re: [PATCH v4 4/8] block: implement NVMEM provider
From: Loic Poulain @ 2026-06-09 10:13 UTC (permalink / raw)
To: Bartosz Golaszewski
Cc: linux-mmc, devicetree, linux-kernel, linux-arm-msm, linux-block,
linux-wireless, ath10k, linux-bluetooth, netdev, daniel,
Ulf Hansson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Bjorn Andersson, Konrad Dybcio, Jens Axboe, Johannes Berg,
Jeff Johnson, Marcel Holtmann, Luiz Augusto von Dentz,
Balakrishna Godavarthi, Rocky Liao, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, Srinivas Kandagatla,
Andrew Lunn, Heiner Kallweit, Russell King, Saravana Kannan
In-Reply-To: <CAMRc=MfuiPMkSm=-G6VEJpqKhos0TD_pf0ScBGqfwLHT0uk8yQ@mail.gmail.com>
Hi bartosz,
On Tue, Jun 9, 2026 at 10:52 AM Bartosz Golaszewski <brgl@kernel.org> wrote:
>
> On Tue, 9 Jun 2026 09:52:29 +0200, Loic Poulain
> <loic.poulain@oss.qualcomm.com> said:
> > From: Daniel Golle <daniel@makrotopia.org>
> >
> > On embedded devices using an eMMC it is common that one or more partitions
> > on the eMMC are used to store MAC addresses and Wi-Fi calibration EEPROM
> > data. Allow referencing the partition in device tree for the kernel and
> > Wi-Fi drivers accessing it via the NVMEM layer.
> >
> > Signed-off-by: Daniel Golle <daniel@makrotopia.org>
> > Co-developed-by: Loic Poulain <loic.poulain@oss.qualcomm.com>
> > Signed-off-by: Loic Poulain <loic.poulain@oss.qualcomm.com>
> > ---
> > block/Kconfig | 9 +++++
> > block/Makefile | 1 +
> > block/blk-nvmem.c | 114 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > 3 files changed, 124 insertions(+)
> >
> > diff --git a/block/Kconfig b/block/Kconfig
> > index 15027963472d7b40e27b9097a5993c457b5b3054..0b33747e16dc33473683706f75c92bdf8b648f7c 100644
> > --- a/block/Kconfig
> > +++ b/block/Kconfig
> > @@ -209,6 +209,15 @@ config BLK_INLINE_ENCRYPTION_FALLBACK
> > by falling back to the kernel crypto API when inline
> > encryption hardware is not present.
> >
> > +config BLK_NVMEM
> > + bool "Block device NVMEM provider"
> > + depends on OF
> > + depends on NVMEM
> > + help
> > + Allow block devices (or partitions) to act as NVMEM providers,
> > + typically used with eMMC to store MAC addresses or Wi-Fi
> > + calibration data on embedded devices.
> > +
> > source "block/partitions/Kconfig"
> >
> > config BLK_PM
> > diff --git a/block/Makefile b/block/Makefile
> > index 7dce2e44276c4274c11a0a61121c83d9c43d6e0c..d7ac389e71902bc091a8800ea266190a43b3e63d 100644
> > --- a/block/Makefile
> > +++ b/block/Makefile
> > @@ -36,3 +36,4 @@ obj-$(CONFIG_BLK_INLINE_ENCRYPTION) += blk-crypto.o blk-crypto-profile.o \
> > blk-crypto-sysfs.o
> > obj-$(CONFIG_BLK_INLINE_ENCRYPTION_FALLBACK) += blk-crypto-fallback.o
> > obj-$(CONFIG_BLOCK_HOLDER_DEPRECATED) += holder.o
> > +obj-$(CONFIG_BLK_NVMEM) += blk-nvmem.o
> > diff --git a/block/blk-nvmem.c b/block/blk-nvmem.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..a6e62fa98675ee9bcb9c7035a611b5a573ab9091
> > --- /dev/null
> > +++ b/block/blk-nvmem.c
> > @@ -0,0 +1,114 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +/*
> > + * block device NVMEM provider
> > + *
> > + * Copyright (c) 2024 Daniel Golle <daniel@makrotopia.org>
> > + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> > + *
> > + * Useful on devices using a partition on an eMMC for MAC addresses or
> > + * Wi-Fi calibration EEPROM data.
> > + */
> > +
> > +#include <linux/file.h>
> > +#include <linux/nvmem-provider.h>
> > +#include <linux/nvmem-consumer.h>
> > +#include <linux/of.h>
> > +#include <linux/pagemap.h>
> > +#include <linux/property.h>
> > +
> > +#include "blk.h"
> > +
> > +static int blk_nvmem_reg_read(void *priv, unsigned int from,
> > + void *val, size_t bytes)
> > +{
> > + blk_mode_t mode = BLK_OPEN_READ | BLK_OPEN_RESTRICT_WRITES;
> > + dev_t devt = (dev_t)(uintptr_t)priv;
> > + size_t bytes_left = bytes;
> > + loff_t pos = from;
> > + int ret = 0;
> > +
> > + struct file *bdev_file __free(fput) = bdev_file_open_by_dev(devt, mode, priv, NULL);
> > + if (IS_ERR(bdev_file))
> > + return PTR_ERR(bdev_file);
> > +
> > + while (bytes_left) {
> > + pgoff_t f_index = pos >> PAGE_SHIFT;
> > + struct folio *folio;
> > + size_t folio_off;
> > + size_t to_read;
> > +
> > + folio = read_mapping_folio(bdev_file->f_mapping, f_index, NULL);
> > + if (IS_ERR(folio)) {
> > + ret = PTR_ERR(folio);
> > + break;
> > + }
> > +
> > + folio_off = offset_in_folio(folio, pos);
> > + to_read = min(bytes_left, folio_size(folio) - folio_off);
> > + memcpy_from_folio(val, folio, folio_off, to_read);
> > + pos += to_read;
> > + bytes_left -= to_read;
> > + val += to_read;
> > + folio_put(folio);
> > + }
> > +
> > + return ret;
> > +}
> > +
> > +static int blk_nvmem_register(struct device *dev)
> > +{
> > + struct block_device *bdev = dev_to_bdev(dev);
> > + struct nvmem_config config = {};
> > +
> > + /* skip devices which do not have a device tree node */
> > + if (!dev_of_node(dev))
> > + return 0;
> > +
> > + /* skip devices without an nvmem layout defined */
> > + struct device_node *child __free(device_node) =
> > + of_get_child_by_name(dev_of_node(dev), "nvmem-layout");
> > + if (!child)
> > + return 0;
> > +
> > + /*
> > + * skip block device too large to be represented as NVMEM devices,
> > + * the NVMEM reg_read callback uses an unsigned int offset
> > + */
> > + if (bdev_nr_bytes(bdev) > UINT_MAX) {
> > + dev_warn(dev, "block device too large to be an NVMEM provider\n");
> > + return -ENODEV;
>
> Wait, I must have suggested -ENODEV here on too little coffee. This callback
> is called from device_add(), not when the device is bound so it's not the same
> thing as returning -ENODEV from probe().
>
> On the other hand, we don't want to not provide the block device just because
> someone added a DT property on one that's too big. I'd say: warn, but return 0.
> Does it make sense?
It’s still technically an error in the sense that we cannot provide
the required nvmem feature. However, it only becomes a real issue if a
consumer actually attempts to use it.
Also, the block device should still be added, since the return code
from add_dev is not checked. In any case, I’m fine with either
approach, as long as we emit a warning message.
>
> > + }
> > +
> > + config.id = NVMEM_DEVID_NONE;
> > + config.dev = dev;
> > + config.name = dev_name(dev);
> > + config.owner = THIS_MODULE;
> > + config.priv = (void *)(uintptr_t)dev->devt;
> > + config.reg_read = blk_nvmem_reg_read;
> > + config.size = bdev_nr_bytes(bdev);
> > + config.word_size = 1;
> > + config.stride = 1;
> > + config.read_only = true;
> > + config.root_only = true;
> > + config.ignore_wp = true;
> > + config.of_node = to_of_node(dev->fwnode);
> > +
> > + return PTR_ERR_OR_ZERO(devm_nvmem_register(dev, &config));
>
> And that was a wrong suggestion on my part too because I was under the
> impression that we're in the probe() path, not device_add(). You can't use
> devres here as the device at this point is not yet bound and may never be.
So I understand The bd_device is purely a class device with no bus, no driver.
For driverless devices, devres_release_all() is called explicitly
within device_del() .
>
> Which leads me to the second point: this is not the moment to add the nvmem
> provider. This should happen at or after probe(). Once nvmem_register()
> returns, you have a visible nvmem resource but nothing backing it in the block
> layer.
There is a short window during which a read attempt will 'properly'
fail, but this does seem somewhat fragile indeed.
> Either do this in block core when registering a new device or schedule
> a notifier here for the BUS_NOTIFY_BOUND_DRIVER event and do it in the notifier
> callback.
So in the end, it seems that the simpler and more robust approach is
probably to move away from the class_interface driver and instead
register/unregister the nvmem directly in add_disk/del_gendisk.
If that's ok I will move to this approach in the next version.
Regards,
Loic
^ permalink raw reply
* Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Memory fragmentation with large block sizes
From: Vlastimil Babka @ 2026-06-09 9:37 UTC (permalink / raw)
To: Hannes Reinecke, Christoph Hellwig
Cc: lsf-pc, linux-nvme@lists.infradead.org,
linux-block@vger.kernel.org, linux-mm
In-Reply-To: <1225376d-dc29-44a1-add6-0cb14926a538@suse.de>
On 6/9/26 10:39, Hannes Reinecke wrote:
> On 6/9/26 09:28, Christoph Hellwig wrote:
>> Hannes,
>>
>> can you share your results on the mailing list?
>>
> I sure can.
>
> We have run a simple testcase with on fio job on an LBS-enabled device,
> and another job permanently allocating and deallocating arrays of pages
> of various array lengths.
>
> We then took snapshots of /proc/buddyinfo to track memory pressure
> over time.
>
> Results are visualized in the attach plot.
>
> With 4k block sizes we have seen a high number of 0- and 1- order pages,
> and then the expected decline towards higher orders.
>
> With 8k and 16k block sizes a noticeable 'bump' in free pages was
> developing in 2- and 3- order pages, which we think is down to
> compaction trying to merge pages together.
> The number of 0- order pages increased slightly, but only half of the
> maximum number of pages in the 'bump'.
>
> With 32k block sizes the picture changed completely; the 'bump'
> vanished, and there was only pronounces spike with 0-order pages
> (about four times the size of the spike with 4k block sizes).
>
> This led me to assume that compaction broke down at 32k block sizes;
> this assumption was confirmed by Vlastimil Babka who pointed out that
> there is a maximum order to which page compaction is attempted:
Yep, but after the LSF/MM session I've realized I made an off-by-one error
thanks to the misleading name of the define.
> include/linux/mmzone.h:
> /*
> * PAGE_ALLOC_COSTLY_ORDER is the order at which allocations are deemed
> * costly to service. That is between allocation orders which should
> * coalesce naturally under reasonable reclaim pressure and those which
> * will not.
> */
> #define PAGE_ALLOC_COSTLY_ORDER 3
Which is 32k
> and it's main usage is 'order > PAGE_ALLOC_COSTLY_ORDER'.
And indeed, this means any changes in behavior due to this should only
happen at 64k (order 4) or more. The value of 3 is in fact something like
"PAGE_ALLOC_MAX_CHEAP_ORDER". I'd send a rename patch (Kiryl fixed a similar
off-by-one gotcha with MAX_ORDER), but I suspect we'll be making more
involved changes here so I'd wait for that first.
> Which ties in directly with what we're seeing.
So it's probably not that straigtforward. We should investigate more first?
> It will probably make sense to align the maximum block size which we
> currently support (ie 64k) with this value to ensure that compaction
> works with larger block sizes. Or maybe even the other way round;
> tie the maximum block size which we support to PAGE_ALLOC_COSTLY_ORDER.
> But that would mean to restrict the blocksize to 16k, whereas xfs
> works happily with 32k. So we might want to raise PAGE_ALLOC_COSTLY_ORDER.
>
> Question is, though, how could we measure the impact?
> This particular value has been in since 2007 (commit 5ad333eb66ff1
> 'lumpy reclaim V4'), and it might well be that the original
> reasoning doesn't apply anymore.
>
> At the same time, this value is tied to a _LOT_ of things
> (not to mention the page allocator itself), so increasing it
> to '4' has an extremely high chance of impacting mm performance.
>
> I'll probably run mmtests and see what I get.
>
> Cheers,
>
> Hannes
>
>
> fragmentation.png
>
>
> _______________________________________________
> Lsf-pc mailing list
> Lsf-pc@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/lsf-pc
^ permalink raw reply
* Re: [PATCH v4 4/8] block: implement NVMEM provider
From: Bartosz Golaszewski @ 2026-06-09 8:52 UTC (permalink / raw)
To: Loic Poulain
Cc: linux-mmc, devicetree, linux-kernel, linux-arm-msm, linux-block,
linux-wireless, ath10k, linux-bluetooth, netdev, daniel,
Ulf Hansson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Bjorn Andersson, Konrad Dybcio, Jens Axboe, Johannes Berg,
Jeff Johnson, Bartosz Golaszewski, Marcel Holtmann,
Luiz Augusto von Dentz, Balakrishna Godavarthi, Rocky Liao,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Srinivas Kandagatla, Andrew Lunn, Heiner Kallweit,
Russell King, Saravana Kannan
In-Reply-To: <20260609-block-as-nvmem-v4-4-45712e6b22c6@oss.qualcomm.com>
On Tue, 9 Jun 2026 09:52:29 +0200, Loic Poulain
<loic.poulain@oss.qualcomm.com> said:
> From: Daniel Golle <daniel@makrotopia.org>
>
> On embedded devices using an eMMC it is common that one or more partitions
> on the eMMC are used to store MAC addresses and Wi-Fi calibration EEPROM
> data. Allow referencing the partition in device tree for the kernel and
> Wi-Fi drivers accessing it via the NVMEM layer.
>
> Signed-off-by: Daniel Golle <daniel@makrotopia.org>
> Co-developed-by: Loic Poulain <loic.poulain@oss.qualcomm.com>
> Signed-off-by: Loic Poulain <loic.poulain@oss.qualcomm.com>
> ---
> block/Kconfig | 9 +++++
> block/Makefile | 1 +
> block/blk-nvmem.c | 114 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 124 insertions(+)
>
> diff --git a/block/Kconfig b/block/Kconfig
> index 15027963472d7b40e27b9097a5993c457b5b3054..0b33747e16dc33473683706f75c92bdf8b648f7c 100644
> --- a/block/Kconfig
> +++ b/block/Kconfig
> @@ -209,6 +209,15 @@ config BLK_INLINE_ENCRYPTION_FALLBACK
> by falling back to the kernel crypto API when inline
> encryption hardware is not present.
>
> +config BLK_NVMEM
> + bool "Block device NVMEM provider"
> + depends on OF
> + depends on NVMEM
> + help
> + Allow block devices (or partitions) to act as NVMEM providers,
> + typically used with eMMC to store MAC addresses or Wi-Fi
> + calibration data on embedded devices.
> +
> source "block/partitions/Kconfig"
>
> config BLK_PM
> diff --git a/block/Makefile b/block/Makefile
> index 7dce2e44276c4274c11a0a61121c83d9c43d6e0c..d7ac389e71902bc091a8800ea266190a43b3e63d 100644
> --- a/block/Makefile
> +++ b/block/Makefile
> @@ -36,3 +36,4 @@ obj-$(CONFIG_BLK_INLINE_ENCRYPTION) += blk-crypto.o blk-crypto-profile.o \
> blk-crypto-sysfs.o
> obj-$(CONFIG_BLK_INLINE_ENCRYPTION_FALLBACK) += blk-crypto-fallback.o
> obj-$(CONFIG_BLOCK_HOLDER_DEPRECATED) += holder.o
> +obj-$(CONFIG_BLK_NVMEM) += blk-nvmem.o
> diff --git a/block/blk-nvmem.c b/block/blk-nvmem.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..a6e62fa98675ee9bcb9c7035a611b5a573ab9091
> --- /dev/null
> +++ b/block/blk-nvmem.c
> @@ -0,0 +1,114 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * block device NVMEM provider
> + *
> + * Copyright (c) 2024 Daniel Golle <daniel@makrotopia.org>
> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> + *
> + * Useful on devices using a partition on an eMMC for MAC addresses or
> + * Wi-Fi calibration EEPROM data.
> + */
> +
> +#include <linux/file.h>
> +#include <linux/nvmem-provider.h>
> +#include <linux/nvmem-consumer.h>
> +#include <linux/of.h>
> +#include <linux/pagemap.h>
> +#include <linux/property.h>
> +
> +#include "blk.h"
> +
> +static int blk_nvmem_reg_read(void *priv, unsigned int from,
> + void *val, size_t bytes)
> +{
> + blk_mode_t mode = BLK_OPEN_READ | BLK_OPEN_RESTRICT_WRITES;
> + dev_t devt = (dev_t)(uintptr_t)priv;
> + size_t bytes_left = bytes;
> + loff_t pos = from;
> + int ret = 0;
> +
> + struct file *bdev_file __free(fput) = bdev_file_open_by_dev(devt, mode, priv, NULL);
> + if (IS_ERR(bdev_file))
> + return PTR_ERR(bdev_file);
> +
> + while (bytes_left) {
> + pgoff_t f_index = pos >> PAGE_SHIFT;
> + struct folio *folio;
> + size_t folio_off;
> + size_t to_read;
> +
> + folio = read_mapping_folio(bdev_file->f_mapping, f_index, NULL);
> + if (IS_ERR(folio)) {
> + ret = PTR_ERR(folio);
> + break;
> + }
> +
> + folio_off = offset_in_folio(folio, pos);
> + to_read = min(bytes_left, folio_size(folio) - folio_off);
> + memcpy_from_folio(val, folio, folio_off, to_read);
> + pos += to_read;
> + bytes_left -= to_read;
> + val += to_read;
> + folio_put(folio);
> + }
> +
> + return ret;
> +}
> +
> +static int blk_nvmem_register(struct device *dev)
> +{
> + struct block_device *bdev = dev_to_bdev(dev);
> + struct nvmem_config config = {};
> +
> + /* skip devices which do not have a device tree node */
> + if (!dev_of_node(dev))
> + return 0;
> +
> + /* skip devices without an nvmem layout defined */
> + struct device_node *child __free(device_node) =
> + of_get_child_by_name(dev_of_node(dev), "nvmem-layout");
> + if (!child)
> + return 0;
> +
> + /*
> + * skip block device too large to be represented as NVMEM devices,
> + * the NVMEM reg_read callback uses an unsigned int offset
> + */
> + if (bdev_nr_bytes(bdev) > UINT_MAX) {
> + dev_warn(dev, "block device too large to be an NVMEM provider\n");
> + return -ENODEV;
Wait, I must have suggested -ENODEV here on too little coffee. This callback
is called from device_add(), not when the device is bound so it's not the same
thing as returning -ENODEV from probe().
On the other hand, we don't want to not provide the block device just because
someone added a DT property on one that's too big. I'd say: warn, but return 0.
Does it make sense?
> + }
> +
> + config.id = NVMEM_DEVID_NONE;
> + config.dev = dev;
> + config.name = dev_name(dev);
> + config.owner = THIS_MODULE;
> + config.priv = (void *)(uintptr_t)dev->devt;
> + config.reg_read = blk_nvmem_reg_read;
> + config.size = bdev_nr_bytes(bdev);
> + config.word_size = 1;
> + config.stride = 1;
> + config.read_only = true;
> + config.root_only = true;
> + config.ignore_wp = true;
> + config.of_node = to_of_node(dev->fwnode);
> +
> + return PTR_ERR_OR_ZERO(devm_nvmem_register(dev, &config));
And that was a wrong suggestion on my part too because I was under the
impression that we're in the probe() path, not device_add(). You can't use
devres here as the device at this point is not yet bound and may never be.
Which leads me to the second point: this is not the moment to add the nvmem
provider. This should happen at or after probe(). Once nvmem_register()
returns, you have a visible nvmem resource but nothing backing it in the block
layer.
Either do this in block core when registering a new device or schedule
a notifier here for the BUS_NOTIFY_BOUND_DRIVER event and do it in the notifier
callback.
Sorry, I should have paid more attention, I forgot how the class interface
works.
> +}
> +
> +static struct class_interface blk_nvmem_bus_interface __refdata = {
> + .class = &block_class,
> + .add_dev = &blk_nvmem_register,
> +};
> +
> +static int __init blk_nvmem_init(void)
> +{
> + int ret;
> +
> + ret = class_interface_register(&blk_nvmem_bus_interface);
> + if (ret)
> + return ret;
> +
> + return 0;
> +}
> +device_initcall(blk_nvmem_init);
>
> --
> 2.34.1
>
>
Bart
^ permalink raw reply
* Re: [LSF/MM/BPF TOPIC] Memory fragmentation with large block sizes
From: Hannes Reinecke @ 2026-06-09 8:39 UTC (permalink / raw)
To: Christoph Hellwig
Cc: lsf-pc, linux-nvme@lists.infradead.org,
linux-block@vger.kernel.org, linux-mm
In-Reply-To: <aifAm1rRKF75HVeM@infradead.org>
[-- Attachment #1: Type: text/plain, Size: 2764 bytes --]
On 6/9/26 09:28, Christoph Hellwig wrote:
> Hannes,
>
> can you share your results on the mailing list?
>
I sure can.
We have run a simple testcase with on fio job on an LBS-enabled device,
and another job permanently allocating and deallocating arrays of pages
of various array lengths.
We then took snapshots of /proc/buddyinfo to track memory pressure
over time.
Results are visualized in the attach plot.
With 4k block sizes we have seen a high number of 0- and 1- order pages,
and then the expected decline towards higher orders.
With 8k and 16k block sizes a noticeable 'bump' in free pages was
developing in 2- and 3- order pages, which we think is down to
compaction trying to merge pages together.
The number of 0- order pages increased slightly, but only half of the
maximum number of pages in the 'bump'.
With 32k block sizes the picture changed completely; the 'bump'
vanished, and there was only pronounces spike with 0-order pages
(about four times the size of the spike with 4k block sizes).
This led me to assume that compaction broke down at 32k block sizes;
this assumption was confirmed by Vlastimil Babka who pointed out that
there is a maximum order to which page compaction is attempted:
include/linux/mmzone.h:
/*
* PAGE_ALLOC_COSTLY_ORDER is the order at which allocations are deemed
* costly to service. That is between allocation orders which should
* coalesce naturally under reasonable reclaim pressure and those which
* will not.
*/
#define PAGE_ALLOC_COSTLY_ORDER 3
and it's main usage is 'order > PAGE_ALLOC_COSTLY_ORDER'.
Which ties in directly with what we're seeing.
It will probably make sense to align the maximum block size which we
currently support (ie 64k) with this value to ensure that compaction
works with larger block sizes. Or maybe even the other way round;
tie the maximum block size which we support to PAGE_ALLOC_COSTLY_ORDER.
But that would mean to restrict the blocksize to 16k, whereas xfs
works happily with 32k. So we might want to raise PAGE_ALLOC_COSTLY_ORDER.
Question is, though, how could we measure the impact?
This particular value has been in since 2007 (commit 5ad333eb66ff1
'lumpy reclaim V4'), and it might well be that the original
reasoning doesn't apply anymore.
At the same time, this value is tied to a _LOT_ of things
(not to mention the page allocator itself), so increasing it
to '4' has an extremely high chance of impacting mm performance.
I'll probably run mmtests and see what I get.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
[-- Attachment #2: fragmentation.png --]
[-- Type: image/png, Size: 7250 bytes --]
^ permalink raw reply
* Re: [PATCH] block: optimize I/O merge hot path with unlikely() hints
From: Steven Feng @ 2026-06-09 8:24 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Jens Axboe, linux-block, linux-kernel, Steven Feng
In-Reply-To: <ZmUjB3aF8K1xYzHr@lst.de>
On Tue, Jun 09, 2026 at 12:38:47AM -0700, Christoph Hellwig wrote:
> Umm, these are not failures. Just conditions to not merge because
> of this, and not merging, both because of these conditions and others,
> it the most common case for most workloads.
You're absolutely right. I misunderstood the workload characteristics.
I incorrectly assumed these were rare failure cases rather than common
merge rejection conditions.
> With your patch the object file size of blk-merge.o increases slightly
> for me on x86_64:
>
> What kind of optimization are you attempting and how did you measure
> the results of this "optimizatіon"?
I apologize - I did not properly measure the impact. I was focused on
code style cleanup (removing '== false' comparisons) and incorrectly
added unlikely() hints based on a flawed assumption about branch
frequency.
Given the increased code size and my incorrect understanding of the
workload, please disregard this patch.
Thank you for the detailed feedback. I'll be more careful to profile
and measure before claiming performance improvements in the future.
Steven
^ permalink raw reply
* Re: [PATCH] block: fix arg type in `blk_mq_update_nr_hw_queues`
From: Andreas Hindborg @ 2026-06-09 7:59 UTC (permalink / raw)
To: Bart Van Assche, Jens Axboe; +Cc: linux-block, linux-kernel
In-Reply-To: <5353c4b0-fc3b-4fa7-bcd6-c8cc490699ea@acm.org>
"Bart Van Assche" <bvanassche@acm.org> writes:
> On 6/8/26 1:39 AM, Andreas Hindborg wrote:
>> The type of the argument `nr_hw_queues` in the function
>> `blk_mq_update_nr_hw_queues` is a signed integer. This is wrong,
>> considering the field `nr_hw_queues` of `struct blk_mq_tag_set` is
>> unsigned. Thus, change the type of the parameter to unsigned.
>
> Will there ever be storage devices that support more than 2**31 hardware
> queues? If not,
Probably not.
> I think the word "wrong" in the commit message is too
> strong.
Right, I understand. I can change the wording. I chose it because it
seems wrong to me to use a signed type for a value that logically should
never assume a negative value. If the negative value was used to carry
some kind of information, it would make sense, but that is not the case
here.
> If this patch does not change the behavior of the code for any practical
> use case that would be good to mention.
The patch does not change behavior of the code as long as no overflows
occur, and they are not likely to occur. I will be sure to add that to
the commit message.
I came across this while designing the Rust API for these functions.
Rust is kind of particular about integer types, so it gave me some
checks/casts on the Rust side that really should no be required. I was
thinking we might as well change the type on the C side to unsigned when
the data it carries really should be unsigned.
Best regards,
Andreas Hindborg
^ permalink raw reply
* [PATCH v4 7/8] Bluetooth: qca: Set NVMEM BD address quirks when address is invalid
From: Loic Poulain @ 2026-06-09 7:52 UTC (permalink / raw)
To: Ulf Hansson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Bjorn Andersson, Konrad Dybcio, Jens Axboe, Johannes Berg,
Jeff Johnson, Bartosz Golaszewski, Marcel Holtmann,
Luiz Augusto von Dentz, Balakrishna Godavarthi, Rocky Liao,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Srinivas Kandagatla, Andrew Lunn, Heiner Kallweit,
Russell King, Saravana Kannan
Cc: linux-mmc, devicetree, linux-kernel, linux-arm-msm, linux-block,
linux-wireless, ath10k, linux-bluetooth, netdev, daniel,
Loic Poulain, Bartosz Golaszewski
In-Reply-To: <20260609-block-as-nvmem-v4-0-45712e6b22c6@oss.qualcomm.com>
When the controller BD address is invalid (zero or default),
set the NVMEM quirks to allow retrieving the address from a
'local-bd-address' NVMEM cell. The BD address is often stored
alongside the WiFi MAC address in big-endian format, so also
set the big-endian quirk.
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Loic Poulain <loic.poulain@oss.qualcomm.com>
---
drivers/bluetooth/btqca.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/bluetooth/btqca.c b/drivers/bluetooth/btqca.c
index dda76365726f0bfe0e80e05fe04859fa4f0592e1..df33eacfd29fa680f393f90215150743e6001d5b 100644
--- a/drivers/bluetooth/btqca.c
+++ b/drivers/bluetooth/btqca.c
@@ -721,8 +721,11 @@ static int qca_check_bdaddr(struct hci_dev *hdev, const struct qca_fw_config *co
}
bda = (struct hci_rp_read_bd_addr *)skb->data;
- if (!bacmp(&bda->bdaddr, &config->bdaddr))
+ if (!bacmp(&bda->bdaddr, &config->bdaddr)) {
hci_set_quirk(hdev, HCI_QUIRK_USE_BDADDR_PROPERTY);
+ hci_set_quirk(hdev, HCI_QUIRK_USE_BDADDR_NVMEM);
+ hci_set_quirk(hdev, HCI_QUIRK_BDADDR_NVMEM_BE);
+ }
kfree_skb(skb);
--
2.34.1
^ permalink raw reply related
* [PATCH v4 8/8] arm64: dts: qcom: arduino-imola: Describe NVMEM layout for WiFi/BT addresses
From: Loic Poulain @ 2026-06-09 7:52 UTC (permalink / raw)
To: Ulf Hansson, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Bjorn Andersson, Konrad Dybcio, Jens Axboe, Johannes Berg,
Jeff Johnson, Bartosz Golaszewski, Marcel Holtmann,
Luiz Augusto von Dentz, Balakrishna Godavarthi, Rocky Liao,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Srinivas Kandagatla, Andrew Lunn, Heiner Kallweit,
Russell King, Saravana Kannan
Cc: linux-mmc, devicetree, linux-kernel, linux-arm-msm, linux-block,
linux-wireless, ath10k, linux-bluetooth, netdev, daniel,
Loic Poulain, Konrad Dybcio, Bartosz Golaszewski
In-Reply-To: <20260609-block-as-nvmem-v4-0-45712e6b22c6@oss.qualcomm.com>
On Arduino Uno-Q, the eMMC boot1 partition is factory provisioned
with device-specific information such as the WiFi MAC address
and the Bluetooth BD address. This partition can serve as an
alternative to additional non-volatile memory, such as a
dedicated EEPROM.
The eMMC boot partitions are typically good candidates, as they
are relatively small, read-only by default (and can be enforced
as hardware read-only), and are not affected by board reflashing
procedures, which generally target the eMMC user or GP partitions.
Describe the corresponding nvmem-layout for the WiFi and Bluetooth
addresses, and point the WiFi and Bluetooth nodes to the appropriate
NVMEM cells to retrieve them.
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Loic Poulain <loic.poulain@oss.qualcomm.com>
---
arch/arm64/boot/dts/qcom/qrb2210-arduino-imola.dts | 39 ++++++++++++++++++++++
1 file changed, 39 insertions(+)
diff --git a/arch/arm64/boot/dts/qcom/qrb2210-arduino-imola.dts b/arch/arm64/boot/dts/qcom/qrb2210-arduino-imola.dts
index bf088fa9807f040f0c8f405f9111b01790b09377..128c7a7e76b5b089044745f5d6407d6391055fc2 100644
--- a/arch/arm64/boot/dts/qcom/qrb2210-arduino-imola.dts
+++ b/arch/arm64/boot/dts/qcom/qrb2210-arduino-imola.dts
@@ -409,7 +409,40 @@ &sdhc_1 {
no-sdio;
no-sd;
+ #address-cells = <1>;
+ #size-cells = <0>;
+
status = "okay";
+
+ card@0 {
+ compatible = "mmc-card";
+ reg = <0>;
+
+ partitions-boot1 {
+ compatible = "fixed-partitions";
+
+ #address-cells = <1>;
+ #size-cells = <1>;
+
+ nvmem-layout {
+ compatible = "fixed-layout";
+ #address-cells = <1>;
+ #size-cells = <1>;
+
+ wifi_mac_addr: mac-addr@4400 {
+ compatible = "mac-base";
+ reg = <0x4400 0x6>;
+ #nvmem-cell-cells = <1>;
+ };
+
+ bd_addr: bd-addr@5400 {
+ compatible = "mac-base";
+ reg = <0x5400 0x6>;
+ #nvmem-cell-cells = <1>;
+ };
+ };
+ };
+ };
};
&spi5 {
@@ -512,6 +545,9 @@ bluetooth {
vddch0-supply = <&pm4125_l22>;
enable-gpios = <&tlmm 87 GPIO_ACTIVE_HIGH>;
max-speed = <3000000>;
+
+ nvmem-cells = <&bd_addr 0>;
+ nvmem-cell-names = "local-bd-address";
};
};
@@ -557,6 +593,9 @@ &wifi {
qcom,ath10k-calibration-variant = "ArduinoImola";
firmware-name = "qcm2290";
+ nvmem-cells = <&wifi_mac_addr 0>;
+ nvmem-cell-names = "mac-address";
+
status = "okay";
};
--
2.34.1
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox