* [RFC PATCH V2 0/2] swiotlb: Add child io tlb mem support
@ 2022-05-02 12:54 ` Tianyu Lan
  0 siblings, 0 replies; 18+ messages in thread
From: Tianyu Lan @ 2022-05-02 12:54 UTC (permalink / raw)
  To: hch, m.szyprowski, robin.murphy, michael.h.kelley, kys
  Cc: Tianyu Lan, iommu, linux-kernel, vkuznets, brijesh.singh,
	konrad.wilk, hch, wei.liu, parri.andrea, thomas.lendacky,
	linux-hyperv, andi.kleen, kirill.shutemov

From: Tianyu Lan <Tianyu.Lan@microsoft.com>

Traditionally swiotlb was not performance critical because it was only
used for slow devices. But in some setups, like TDX/SEV confidential
guests, all IO has to go through swiotlb. Currently swiotlb only has a
single lock. Under high IO load with multiple CPUs this can lead to
significant lock contention on the swiotlb lock.

Patch 1 adds child IO TLB mem support to reduce spinlock contention
among a device's queues. Each device may allocate its own IO TLB mem and
set up child IO TLB mems according to its queue number. The number of
child IO TLB mems may be set equal to the device's queue number, which
reduces swiotlb spinlock contention among devices and queues.
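
To illustrate the idea, here is a minimal userspace sketch, not the
patch itself. It assumes one child pool per queue, each with its own
lock, so allocations on different queues never contend. All names
(struct child_pool, struct device_pool, device_pool_init(),
pool_alloc_slots()) are hypothetical, and a C11 atomic_flag stands in
for the kernel spinlock.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical model of one child pool; not the kernel's struct io_tlb_mem. */
struct child_pool {
	atomic_flag lock;	/* stands in for a per-child spinlock */
	unsigned int nslots;	/* slots owned by this child */
	unsigned int used;	/* slots already handed out */
};

struct device_pool {
	unsigned int num_child;	/* assumed equal to the device's queue number */
	struct child_pool *child;
};

static void child_lock(struct child_pool *c)
{
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;	/* spin until the flag is released */
}

static void child_unlock(struct child_pool *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Split nslots evenly across queue_num children. Returns 0 on success. */
int device_pool_init(struct device_pool *dp, unsigned int queue_num,
		     unsigned int nslots)
{
	unsigned int i;

	if (queue_num == 0)
		return -1;

	dp->child = calloc(queue_num, sizeof(*dp->child));
	if (!dp->child)
		return -1;

	dp->num_child = queue_num;
	for (i = 0; i < queue_num; i++) {
		atomic_flag_clear(&dp->child[i].lock);
		dp->child[i].nslots = nslots / queue_num;
	}
	return 0;
}

/*
 * Take n slots from the child that serves this queue. Only that child's
 * lock is held. Returns the first slot index, or -1 if the child is full.
 */
int pool_alloc_slots(struct device_pool *dp, unsigned int queue, unsigned int n)
{
	struct child_pool *c = &dp->child[queue % dp->num_child];
	int first = -1;

	child_lock(c);
	if (n <= c->nslots - c->used) {
		first = (int)c->used;
		c->used += n;
	}
	child_unlock(c);
	return first;
}

void device_pool_destroy(struct device_pool *dp)
{
	free(dp->child);
	dp->child = NULL;
	dp->num_child = 0;
}
```

In this model a queue that exhausts its child simply fails; whether the
real code should fall back to a sibling or to the parent pool is a
separate design question.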

Patch 2 introduces the IO TLB block concept and the
swiotlb_device_allocate() API to allocate a per-device swiotlb bounce
buffer. The new API accepts the queue number as the number of child IO
TLB mems to set up for the device's IO TLB mem.
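
As a rough illustration of the block granularity, the sketch below
converts a byte size into slots and then into whole blocks. The
2 KiB slot size (IO_TLB_SHIFT of 11) matches mainline swiotlb;
ASSUMED_BLOCKSIZE (slots per block) is a placeholder value chosen for
the example, and the helper names are hypothetical.

```c
#define IO_TLB_SHIFT		11u	/* 2 KiB slots, as in mainline swiotlb */
#define ASSUMED_BLOCKSIZE	512ul	/* assumption: slots per block */

/* Round x up to the next multiple of a. */
#define ALIGN_UP(x, a)	((((x) + (a) - 1) / (a)) * (a))

/* Number of slots reserved for a request of size bytes. */
unsigned long size_to_nslabs(unsigned long size)
{
	return ALIGN_UP(size >> IO_TLB_SHIFT, ASSUMED_BLOCKSIZE);
}

/* Number of whole blocks reserved for a request of size bytes. */
unsigned long size_to_blocks(unsigned long size)
{
	return size_to_nslabs(size) / ASSUMED_BLOCKSIZE;
}
```

Note that in this model a size below one slot rounds to zero blocks, so
a caller would presumably need to reject such a request.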

Tianyu Lan (2):
  swiotlb: Add Child IO TLB mem support
  Swiotlb: Add device bounce buffer allocation interface

 include/linux/swiotlb.h |  40 ++++++
 kernel/dma/swiotlb.c    | 290 ++++++++++++++++++++++++++++++++++++++--
 2 files changed, 317 insertions(+), 13 deletions(-)

-- 
2.25.1


* Re: [RFC PATCH V2 2/2] Swiotlb: Add device bounce buffer allocation interface
@ 2022-05-02 20:36 kernel test robot
From: kernel test robot @ 2022-05-02 20:36 UTC (permalink / raw)
  To: kbuild

[-- Attachment #1: Type: text/plain, Size: 8652 bytes --]

CC: kbuild-all(a)lists.01.org
BCC: lkp(a)intel.com
In-Reply-To: <20220502125436.23607-3-ltykernel@gmail.com>
References: <20220502125436.23607-3-ltykernel@gmail.com>
TO: Tianyu Lan <ltykernel@gmail.com>

Hi Tianyu,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on next-20220429]
[cannot apply to linus/master v5.18-rc5 v5.18-rc4 v5.18-rc3 v5.18-rc5]
[If your patch is applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/intel-lab-lkp/linux/commits/Tianyu-Lan/swiotlb-Add-child-io-tlb-mem-support/20220502-205700
base:    5469f0c06732a077c70a759a81f2a1f00b277694
:::::: branch date: 8 hours ago
:::::: commit date: 8 hours ago
config: x86_64-randconfig-m001 (https://download.01.org/0day-ci/archive/20220503/202205030442.Iugj4ezG-lkp(a)intel.com/config)
compiler: gcc-11 (Debian 11.2.0-20) 11.2.0

If you fix the issue, kindly add the following tags as appropriate
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

smatch warnings:
kernel/dma/swiotlb.c:958 swiotlb_alloc_block() error: uninitialized symbol 'nslot'.
kernel/dma/swiotlb.c:1024 swiotlb_device_allocate() error: double free of 'mem'

vim +/nslot +958 kernel/dma/swiotlb.c

3349f5b007cd7e Tianyu Lan 2022-05-02   926  
3349f5b007cd7e Tianyu Lan 2022-05-02   927  
3349f5b007cd7e Tianyu Lan 2022-05-02   928  static struct page *swiotlb_alloc_block(struct io_tlb_mem *mem, unsigned int block_num)
3349f5b007cd7e Tianyu Lan 2022-05-02   929  {
3349f5b007cd7e Tianyu Lan 2022-05-02   930  	unsigned int block_index, nslot;
3349f5b007cd7e Tianyu Lan 2022-05-02   931  	phys_addr_t tlb_addr;
3349f5b007cd7e Tianyu Lan 2022-05-02   932  	unsigned long flags;
3349f5b007cd7e Tianyu Lan 2022-05-02   933  	int i, j;
3349f5b007cd7e Tianyu Lan 2022-05-02   934  
3349f5b007cd7e Tianyu Lan 2022-05-02   935  	if (!mem || !mem->block)
3349f5b007cd7e Tianyu Lan 2022-05-02   936  		return NULL;
3349f5b007cd7e Tianyu Lan 2022-05-02   937  
3349f5b007cd7e Tianyu Lan 2022-05-02   938  	spin_lock_irqsave(&mem->lock, flags);
3349f5b007cd7e Tianyu Lan 2022-05-02   939  	block_index = mem->block_index;
3349f5b007cd7e Tianyu Lan 2022-05-02   940  
3349f5b007cd7e Tianyu Lan 2022-05-02   941  	/* Todo: Search more blocks. */
3349f5b007cd7e Tianyu Lan 2022-05-02   942  	if (mem->block[block_index].list < block_num) {
3349f5b007cd7e Tianyu Lan 2022-05-02   943  		spin_unlock_irqrestore(&mem->lock, flags);
3349f5b007cd7e Tianyu Lan 2022-05-02   944  		return NULL;
3349f5b007cd7e Tianyu Lan 2022-05-02   945  	}
3349f5b007cd7e Tianyu Lan 2022-05-02   946  
3349f5b007cd7e Tianyu Lan 2022-05-02   947  	/* Update block and slot list. */
3349f5b007cd7e Tianyu Lan 2022-05-02   948  	for (i = block_index; i < block_index + block_num; i++) {
3349f5b007cd7e Tianyu Lan 2022-05-02   949  		mem->block[i].list = 0;
3349f5b007cd7e Tianyu Lan 2022-05-02   950  		mem->block[i].alloc_size = IO_TLB_BLOCKSIZE;
3349f5b007cd7e Tianyu Lan 2022-05-02   951  		for (j = 0; j < IO_TLB_BLOCKSIZE; j++) {
3349f5b007cd7e Tianyu Lan 2022-05-02   952  			nslot = i * IO_TLB_BLOCKSIZE + j;
3349f5b007cd7e Tianyu Lan 2022-05-02   953  			mem->slots[nslot].list = 0;
3349f5b007cd7e Tianyu Lan 2022-05-02   954  			mem->slots[nslot].alloc_size = IO_TLB_SIZE;
3349f5b007cd7e Tianyu Lan 2022-05-02   955  		}
3349f5b007cd7e Tianyu Lan 2022-05-02   956  	}
3349f5b007cd7e Tianyu Lan 2022-05-02   957  
3349f5b007cd7e Tianyu Lan 2022-05-02  @958  	mem->index = nslot + 1;
3349f5b007cd7e Tianyu Lan 2022-05-02   959  	mem->block_index += block_num;
3349f5b007cd7e Tianyu Lan 2022-05-02   960  	mem->used += block_num * IO_TLB_BLOCKSIZE;
3349f5b007cd7e Tianyu Lan 2022-05-02   961  	spin_unlock_irqrestore(&mem->lock, flags);
3349f5b007cd7e Tianyu Lan 2022-05-02   962  
3349f5b007cd7e Tianyu Lan 2022-05-02   963  	tlb_addr = slot_addr(mem->start, block_index * IO_TLB_BLOCKSIZE);
3349f5b007cd7e Tianyu Lan 2022-05-02   964  	return pfn_to_page(PFN_DOWN(tlb_addr));
3349f5b007cd7e Tianyu Lan 2022-05-02   965  }
3349f5b007cd7e Tianyu Lan 2022-05-02   966  
3349f5b007cd7e Tianyu Lan 2022-05-02   967  /*
3349f5b007cd7e Tianyu Lan 2022-05-02   968   * swiotlb_device_allocate - Allocate bounce buffer for device from
3349f5b007cd7e Tianyu Lan 2022-05-02   969   * default io tlb pool. The allocation size should be aligned with
3349f5b007cd7e Tianyu Lan 2022-05-02   970   * IO_TLB_BLOCK_UNIT.
3349f5b007cd7e Tianyu Lan 2022-05-02   971   */
3349f5b007cd7e Tianyu Lan 2022-05-02   972  int swiotlb_device_allocate(struct device *dev,
3349f5b007cd7e Tianyu Lan 2022-05-02   973  			    unsigned int queue_num,
3349f5b007cd7e Tianyu Lan 2022-05-02   974  			    unsigned long size)
3349f5b007cd7e Tianyu Lan 2022-05-02   975  {
3349f5b007cd7e Tianyu Lan 2022-05-02   976  	struct io_tlb_mem *mem, *parent_mem = dev->dma_io_tlb_mem;
3349f5b007cd7e Tianyu Lan 2022-05-02   977  	unsigned long nslabs = ALIGN(size >> IO_TLB_SHIFT, IO_TLB_BLOCKSIZE);
3349f5b007cd7e Tianyu Lan 2022-05-02   978  	struct page *page;
3349f5b007cd7e Tianyu Lan 2022-05-02   979  	int ret = -ENOMEM;
3349f5b007cd7e Tianyu Lan 2022-05-02   980  
3349f5b007cd7e Tianyu Lan 2022-05-02   981  	page = swiotlb_alloc_block(parent_mem, nslabs / IO_TLB_BLOCKSIZE);
3349f5b007cd7e Tianyu Lan 2022-05-02   982  	if (!page)
3349f5b007cd7e Tianyu Lan 2022-05-02   983  		return -ENOMEM;
3349f5b007cd7e Tianyu Lan 2022-05-02   984  
3349f5b007cd7e Tianyu Lan 2022-05-02   985  	mem = kzalloc(sizeof(*mem), GFP_KERNEL);
3349f5b007cd7e Tianyu Lan 2022-05-02   986  	if (!mem)
3349f5b007cd7e Tianyu Lan 2022-05-02   987  		goto error_mem;
3349f5b007cd7e Tianyu Lan 2022-05-02   988  
3349f5b007cd7e Tianyu Lan 2022-05-02   989  	mem->slots = kzalloc(array_size(sizeof(*mem->slots), nslabs),
3349f5b007cd7e Tianyu Lan 2022-05-02   990  			     GFP_KERNEL);
3349f5b007cd7e Tianyu Lan 2022-05-02   991  	if (!mem->slots)
3349f5b007cd7e Tianyu Lan 2022-05-02   992  		goto error_slots;
3349f5b007cd7e Tianyu Lan 2022-05-02   993  
3349f5b007cd7e Tianyu Lan 2022-05-02   994  	mem->block = kcalloc(nslabs / IO_TLB_BLOCKSIZE,
3349f5b007cd7e Tianyu Lan 2022-05-02   995  				sizeof(struct io_tlb_block),
3349f5b007cd7e Tianyu Lan 2022-05-02   996  				GFP_KERNEL);
3349f5b007cd7e Tianyu Lan 2022-05-02   997  	if (!mem->block)
3349f5b007cd7e Tianyu Lan 2022-05-02   998  		goto error_block;
3349f5b007cd7e Tianyu Lan 2022-05-02   999  
3349f5b007cd7e Tianyu Lan 2022-05-02  1000  	mem->num_child = queue_num;
3349f5b007cd7e Tianyu Lan 2022-05-02  1001  	mem->child = kcalloc(queue_num,
3349f5b007cd7e Tianyu Lan 2022-05-02  1002  				sizeof(struct io_tlb_mem),
3349f5b007cd7e Tianyu Lan 2022-05-02  1003  				GFP_KERNEL);
3349f5b007cd7e Tianyu Lan 2022-05-02  1004  	if (!mem->child)
3349f5b007cd7e Tianyu Lan 2022-05-02  1005  		goto error_child;
3349f5b007cd7e Tianyu Lan 2022-05-02  1006  
3349f5b007cd7e Tianyu Lan 2022-05-02  1007  
3349f5b007cd7e Tianyu Lan 2022-05-02  1008  	swiotlb_init_io_tlb_mem(mem, page_to_phys(page), nslabs, true);
3349f5b007cd7e Tianyu Lan 2022-05-02  1009  	mem->force_bounce = true;
3349f5b007cd7e Tianyu Lan 2022-05-02  1010  	mem->for_alloc = true;
3349f5b007cd7e Tianyu Lan 2022-05-02  1011  
3349f5b007cd7e Tianyu Lan 2022-05-02  1012  	mem->vaddr = parent_mem->vaddr + page_to_phys(page) -  parent_mem->start;
3349f5b007cd7e Tianyu Lan 2022-05-02  1013  	dev->dma_io_tlb_mem->parent = parent_mem;
3349f5b007cd7e Tianyu Lan 2022-05-02  1014  	dev->dma_io_tlb_mem = mem;
3349f5b007cd7e Tianyu Lan 2022-05-02  1015  	return 0;
3349f5b007cd7e Tianyu Lan 2022-05-02  1016  
3349f5b007cd7e Tianyu Lan 2022-05-02  1017  error_child:
3349f5b007cd7e Tianyu Lan 2022-05-02  1018  	kfree(mem->block);
3349f5b007cd7e Tianyu Lan 2022-05-02  1019  error_block:
3349f5b007cd7e Tianyu Lan 2022-05-02  1020  	kfree(mem->slots);
3349f5b007cd7e Tianyu Lan 2022-05-02  1021  error_slots:
3349f5b007cd7e Tianyu Lan 2022-05-02  1022  	kfree(mem);
3349f5b007cd7e Tianyu Lan 2022-05-02  1023  error_mem:
3349f5b007cd7e Tianyu Lan 2022-05-02 @1024  	swiotlb_free_block(mem, page_to_phys(page), nslabs / IO_TLB_BLOCKSIZE);
3349f5b007cd7e Tianyu Lan 2022-05-02  1025  	return ret;
3349f5b007cd7e Tianyu Lan 2022-05-02  1026  }
3349f5b007cd7e Tianyu Lan 2022-05-02  1027  EXPORT_SYMBOL_GPL(swiotlb_device_allocate);
3349f5b007cd7e Tianyu Lan 2022-05-02  1028  
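
For reference, one way both smatch warnings could be avoided is
sketched below as a self-contained userspace model. This is not a
tested kernel fix. It assumes that a zero block count should be
rejected before any loop-computed index is used, and that the blocks
should be returned to the parent pool rather than passed through the
already-freed 'mem'. All names here are hypothetical, and the
fail_step parameter exists only to exercise the unwind paths.

```c
#include <stddef.h>
#include <stdlib.h>

#define BLOCKSIZE 4u	/* hypothetical stand-in for IO_TLB_BLOCKSIZE */

struct parent_pool {
	unsigned int nblocks;		/* total blocks in the pool */
	unsigned int block_index;	/* next free block */
};

struct child_mem {
	unsigned int *slots;
	unsigned int *block;
	unsigned int first_block;
	unsigned int nblocks;
};

/* Reserve block_num blocks; a count of 0 is rejected up front. */
static int reserve_blocks(struct parent_pool *p, unsigned int block_num)
{
	unsigned int first;

	if (!p || block_num == 0 || block_num > p->nblocks - p->block_index)
		return -1;

	first = p->block_index;
	p->block_index += block_num;
	return (int)first;
}

/* Simplified rollback: only undoes the most recent reservation. */
static void release_blocks(struct parent_pool *p, unsigned int first,
			   unsigned int block_num)
{
	if (p->block_index == first + block_num)
		p->block_index = first;
}

/* fail_step (1..3) forces the matching allocation to fail; 0 disables it. */
struct child_mem *device_allocate(struct parent_pool *parent,
				  unsigned int block_num, int fail_step)
{
	struct child_mem *mem;
	int first = reserve_blocks(parent, block_num);

	if (first < 0)
		return NULL;

	mem = fail_step == 1 ? NULL : calloc(1, sizeof(*mem));
	if (!mem)
		goto err_release;

	mem->slots = fail_step == 2 ? NULL :
		calloc((size_t)block_num * BLOCKSIZE, sizeof(*mem->slots));
	if (!mem->slots)
		goto err_mem;

	mem->block = fail_step == 3 ? NULL :
		calloc(block_num, sizeof(*mem->block));
	if (!mem->block)
		goto err_slots;

	mem->first_block = (unsigned int)first;
	mem->nblocks = block_num;
	return mem;

err_slots:
	free(mem->slots);
err_mem:
	free(mem);
err_release:
	/* Uses the parent pool only; 'mem' is never touched after free(). */
	release_blocks(parent, (unsigned int)first, block_num);
	return NULL;
}
```

The unwind labels run in reverse order of acquisition, and the final
step depends only on state that is still valid at that point.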

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp
