All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Gavin Shan <gwshan@linux.vnet.ibm.com>
Cc: Wei Yang <weiyang@linux.vnet.ibm.com>,
	linux-kernel@vger.kernel.org,
	Alex Williamson <alex.williamson@redhat.com>,
	Paul Mackerras <paulus@samba.org>,
	linuxppc-dev@lists.ozlabs.org,
	David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH kernel v10 16/34] powerpc/spapr: vfio: Replace iommu_table with iommu_table_group
Date: Thu, 14 May 2015 13:31:36 +1000	[thread overview]
Message-ID: <55541718.70304@ozlabs.ru> (raw)
In-Reply-To: <20150514012105.GA22701@gwshan>

On 05/14/2015 11:21 AM, Gavin Shan wrote:
> On Tue, May 12, 2015 at 01:39:05AM +1000, Alexey Kardashevskiy wrote:
>> Modern IBM POWERPC systems support multiple (currently two) TCE tables
>> per IOMMU group (a.k.a. PE). This adds a iommu_table_group container
>> for TCE tables. Right now just one table is supported.
>>
>> This defines iommu_table_group struct which stores pointers to
>> iommu_group and iommu_table(s). This replaces iommu_table with
>> iommu_table_group where iommu_table was used to identify a group:
>> - iommu_register_group();
>> - iommudata of generic iommu_group;
>>
>> This removes @data from iommu_table as it_table_group provides
>> same access to pnv_ioda_pe.
>>
>> For IODA, instead of embedding iommu_table, the new iommu_table_group
>> keeps pointers to those. The iommu_table structs are allocated
>> dynamically.
>>
>> For P5IOC2, both iommu_table_group and iommu_table are embedded into
>> PE struct. As there is no EEH and SRIOV support for P5IOC2,
>> iommu_free_table() should not be called on iommu_table struct pointers
>> so we can keep it embedded in pnv_phb::p5ioc2.
>>
>> For pSeries, this replaces multiple calls of kzalloc_node() with a new
>> iommu_pseries_alloc_group() helper and stores the table group struct
>> pointer into the pci_dn struct. For release, a iommu_table_free_group()
>> helper is added.
>>
>> This moves iommu_table struct allocation from SR-IOV code to
>> the generic DMA initialization code in pnv_pci_ioda_setup_dma_pe and
>> pnv_pci_ioda2_setup_dma_pe as this is where DMA is actually initialized.
>> This change is here because those lines had to be changed anyway.
>>
>> This should cause no behavioural change.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>
> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>
>> ---
>> Changes:
>> v10:
>> * new to the series, separated from
>> "powerpc/spapr: vfio: Switch from iommu_table to new iommu_table_group"
>> * iommu_table is not embedded into iommu_table_group but allocated
>> dynamically in most cases
>> * iommu_table allocation is moved to a single place for IODA2's
>> pnv_pci_ioda_setup_dma_pe where it belongs to
>> * added list of groups into iommu_table; most of the code just looks at
>> the first item to keep the patch simpler
>> ---
>> arch/powerpc/include/asm/iommu.h            |  17 +++--
>> arch/powerpc/include/asm/pci-bridge.h       |   2 +-
>> arch/powerpc/kernel/iommu.c                 |  17 ++---
>> arch/powerpc/platforms/powernv/pci-ioda.c   |  55 +++++++-------
>> arch/powerpc/platforms/powernv/pci-p5ioc2.c |  18 +++--
>> arch/powerpc/platforms/powernv/pci.h        |   3 +-
>> arch/powerpc/platforms/pseries/iommu.c      | 107 +++++++++++++++++++---------
>> drivers/vfio/vfio_iommu_spapr_tce.c         |  23 +++---
>> 8 files changed, 152 insertions(+), 90 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
>> index e2a45c3..61bde1a 100644
>> --- a/arch/powerpc/include/asm/iommu.h
>> +++ b/arch/powerpc/include/asm/iommu.h
>> @@ -92,13 +92,10 @@ struct iommu_table {
>> 	unsigned long *it_map;       /* A simple allocation bitmap for now */
>> 	unsigned long  it_page_shift;/* table iommu page size */
>> #ifdef CONFIG_IOMMU_API
>> -	struct iommu_group *it_group;
>> +	struct iommu_table_group *it_table_group;
>> #endif
>> 	struct iommu_table_ops *it_ops;
>> 	void (*set_bypass)(struct iommu_table *tbl, bool enable);
>> -#ifdef CONFIG_PPC_POWERNV
>> -	void           *data;
>> -#endif
>> };
>>
>> /* Pure 2^n version of get_order */
>> @@ -130,13 +127,21 @@ extern void iommu_free_table(struct iommu_table *tbl, const char *node_name);
>> extern struct iommu_table *iommu_init_table(struct iommu_table * tbl,
>> 					    int nid);
>> #ifdef CONFIG_IOMMU_API
>> -extern void iommu_register_group(struct iommu_table *tbl,
>> +
>> +#define IOMMU_TABLE_GROUP_MAX_TABLES	1
>> +
>> +struct iommu_table_group {
>> +	struct iommu_group *group;
>> +	struct iommu_table *tables[IOMMU_TABLE_GROUP_MAX_TABLES];
>> +};
>> +
>> +extern void iommu_register_group(struct iommu_table_group *table_group,
>> 				 int pci_domain_number, unsigned long pe_num);
>> extern int iommu_add_device(struct device *dev);
>> extern void iommu_del_device(struct device *dev);
>> extern int __init tce_iommu_bus_notifier_init(void);
>> #else
>> -static inline void iommu_register_group(struct iommu_table *tbl,
>> +static inline void iommu_register_group(struct iommu_table_group *table_group,
>> 					int pci_domain_number,
>> 					unsigned long pe_num)
>> {
>> diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
>> index 1811c44..e2d7479 100644
>> --- a/arch/powerpc/include/asm/pci-bridge.h
>> +++ b/arch/powerpc/include/asm/pci-bridge.h
>> @@ -185,7 +185,7 @@ struct pci_dn {
>>
>> 	struct  pci_dn *parent;
>> 	struct  pci_controller *phb;	/* for pci devices */
>> -	struct	iommu_table *iommu_table;	/* for phb's or bridges */
>> +	struct	iommu_table_group *table_group;	/* for phb's or bridges */
>> 	struct	device_node *node;	/* back-pointer to the device_node */
>>
>> 	int	pci_ext_config_space;	/* for pci devices */
>> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
>> index 16be6aa..79e8b43 100644
>> --- a/arch/powerpc/kernel/iommu.c
>> +++ b/arch/powerpc/kernel/iommu.c
>> @@ -886,11 +886,12 @@ EXPORT_SYMBOL_GPL(iommu_direction_to_tce_perm);
>>   */
>> static void group_release(void *iommu_data)
>> {
>> -	struct iommu_table *tbl = iommu_data;
>> -	tbl->it_group = NULL;
>> +	struct iommu_table_group *table_group = iommu_data;
>> +
>> +	table_group->group = NULL;
>> }
>>
>> -void iommu_register_group(struct iommu_table *tbl,
>> +void iommu_register_group(struct iommu_table_group *table_group,
>> 		int pci_domain_number, unsigned long pe_num)
>> {
>> 	struct iommu_group *grp;
>> @@ -902,8 +903,8 @@ void iommu_register_group(struct iommu_table *tbl,
>> 				PTR_ERR(grp));
>> 		return;
>> 	}
>> -	tbl->it_group = grp;
>> -	iommu_group_set_iommudata(grp, tbl, group_release);
>> +	table_group->group = grp;
>> +	iommu_group_set_iommudata(grp, table_group, group_release);
>> 	name = kasprintf(GFP_KERNEL, "domain%d-pe%lx",
>> 			pci_domain_number, pe_num);
>> 	if (!name)
>> @@ -1091,7 +1092,7 @@ int iommu_add_device(struct device *dev)
>> 	}
>>
>> 	tbl = get_iommu_table_base(dev);
>> -	if (!tbl || !tbl->it_group) {
>> +	if (!tbl || !tbl->it_table_group || !tbl->it_table_group->group) {
>> 		pr_debug("%s: Skipping device %s with no tbl\n",
>> 			 __func__, dev_name(dev));
>> 		return 0;
>> @@ -1099,7 +1100,7 @@ int iommu_add_device(struct device *dev)
>>
>> 	pr_debug("%s: Adding %s to iommu group %d\n",
>> 		 __func__, dev_name(dev),
>> -		 iommu_group_id(tbl->it_group));
>> +		 iommu_group_id(tbl->it_table_group->group));
>>
>> 	if (PAGE_SIZE < IOMMU_PAGE_SIZE(tbl)) {
>> 		pr_err("%s: Invalid IOMMU page size %lx (%lx) on %s\n",
>> @@ -1108,7 +1109,7 @@ int iommu_add_device(struct device *dev)
>> 		return -EINVAL;
>> 	}
>>
>> -	return iommu_group_add_device(tbl->it_group, dev);
>> +	return iommu_group_add_device(tbl->it_table_group->group, dev);
>> }
>> EXPORT_SYMBOL_GPL(iommu_add_device);
>>
>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>> index 1b43e25..02ed448 100644
>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>> @@ -1087,10 +1087,6 @@ static void pnv_ioda_setup_bus_PE(struct pci_bus *bus, int all)
>> 		return;
>> 	}
>>
>> -	pe->tce32_table = kzalloc_node(sizeof(struct iommu_table),
>> -			GFP_KERNEL, hose->node);
>> -	pe->tce32_table->data = pe;
>> -
>> 	/* Associate it with all child devices */
>> 	pnv_ioda_setup_same_PE(bus, pe);
>>
>> @@ -1292,11 +1288,12 @@ static void pnv_pci_ioda2_release_dma_pe(struct pci_dev *dev, struct pnv_ioda_pe
>> 	struct iommu_table    *tbl;
>> 	unsigned long         addr;
>> 	int64_t               rc;
>> +	struct iommu_table_group *table_group;
>>
>> 	bus = dev->bus;
>> 	hose = pci_bus_to_host(bus);
>> 	phb = hose->private_data;
>> -	tbl = pe->tce32_table;
>> +	tbl = pe->table_group.tables[0];
>> 	addr = tbl->it_base;
>>
>> 	opal_pci_map_pe_dma_window(phb->opal_id, pe->pe_number,
>> @@ -1311,13 +1308,14 @@ static void pnv_pci_ioda2_release_dma_pe(struct pci_dev *dev, struct pnv_ioda_pe
>> 	if (rc)
>> 		pe_warn(pe, "OPAL error %ld release DMA window\n", rc);
>>
>> -	if (tbl->it_group) {
>> -		iommu_group_put(tbl->it_group);
>> -		BUG_ON(tbl->it_group);
>> +	table_group = tbl->it_table_group;
>> +	if (table_group->group) {
>> +		iommu_group_put(table_group->group);
>> +		BUG_ON(table_group->group);
>> 	}
>> 	iommu_free_table(tbl, of_node_full_name(dev->dev.of_node));
>> 	free_pages(addr, get_order(TCE32_TABLE_SIZE));
>> -	pe->tce32_table = NULL;
>> +	pe->table_group.tables[0] = NULL;
>> }
>>
>> static void pnv_ioda_release_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>> @@ -1465,10 +1463,6 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>> 			continue;
>> 		}
>>
>> -		pe->tce32_table = kzalloc_node(sizeof(struct iommu_table),
>> -				GFP_KERNEL, hose->node);
>> -		pe->tce32_table->data = pe;
>> -
>> 		/* Put PE to the list */
>> 		mutex_lock(&phb->ioda.pe_list_mutex);
>> 		list_add_tail(&pe->list, &phb->ioda.pe_list);
>> @@ -1603,7 +1597,7 @@ static void pnv_pci_ioda_dma_dev_setup(struct pnv_phb *phb, struct pci_dev *pdev
>>
>> 	pe = &phb->ioda.pe_array[pdn->pe_number];
>> 	WARN_ON(get_dma_ops(&pdev->dev) != &dma_iommu_ops);
>> -	set_iommu_table_base(&pdev->dev, pe->tce32_table);
>> +	set_iommu_table_base(&pdev->dev, pe->table_group.tables[0]);
>> 	/*
>> 	 * Note: iommu_add_device() will fail here as
>> 	 * for physical PE: the device is already added by now;
>> @@ -1636,7 +1630,7 @@ static int pnv_pci_ioda_dma_set_mask(struct pnv_phb *phb,
>> 	} else {
>> 		dev_info(&pdev->dev, "Using 32-bit DMA via iommu\n");
>> 		set_dma_ops(&pdev->dev, &dma_iommu_ops);
>> -		set_iommu_table_base(&pdev->dev, pe->tce32_table);
>> +		set_iommu_table_base(&pdev->dev, pe->table_group.tables[0]);
>> 	}
>> 	*pdev->dev.dma_mask = dma_mask;
>> 	return 0;
>> @@ -1670,7 +1664,7 @@ static void pnv_ioda_setup_bus_dma(struct pnv_ioda_pe *pe,
>> 	struct pci_dev *dev;
>>
>> 	list_for_each_entry(dev, &bus->devices, bus_list) {
>> -		set_iommu_table_base(&dev->dev, pe->tce32_table);
>> +		set_iommu_table_base(&dev->dev, pe->table_group.tables[0]);
>> 		iommu_add_device(&dev->dev);
>>
>> 		if (dev->subordinate)
>> @@ -1681,7 +1675,8 @@ static void pnv_ioda_setup_bus_dma(struct pnv_ioda_pe *pe,
>> static void pnv_pci_ioda1_tce_invalidate(struct iommu_table *tbl,
>> 		unsigned long index, unsigned long npages, bool rm)
>> {
>> -	struct pnv_ioda_pe *pe = tbl->data;
>> +	struct pnv_ioda_pe *pe = container_of(tbl->it_table_group,
>> +			struct pnv_ioda_pe, table_group);
>> 	__be64 __iomem *invalidate = rm ?
>> 		(__be64 __iomem *)pe->tce_inval_reg_phys :
>> 		(__be64 __iomem *)tbl->it_index;
>> @@ -1758,7 +1753,8 @@ static struct iommu_table_ops pnv_ioda1_iommu_ops = {
>> static void pnv_pci_ioda2_tce_invalidate(struct iommu_table *tbl,
>> 		unsigned long index, unsigned long npages, bool rm)
>> {
>> -	struct pnv_ioda_pe *pe = tbl->data;
>> +	struct pnv_ioda_pe *pe = container_of(tbl->it_table_group,
>> +			struct pnv_ioda_pe, table_group);
>> 	unsigned long start, end, inc;
>> 	__be64 __iomem *invalidate = rm ?
>> 		(__be64 __iomem *)pe->tce_inval_reg_phys :
>> @@ -1834,8 +1830,12 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
>> 	if (WARN_ON(pe->tce32_seg >= 0))
>> 		return;
>>
>> -	tbl = pe->tce32_table;
>> -	iommu_register_group(tbl, phb->hose->global_number, pe->pe_number);
>> +	tbl = kzalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
>> +			phb->hose->node);
>> +	tbl->it_table_group = &pe->table_group;
>> +	pe->table_group.tables[0] = tbl;
>> +	iommu_register_group(&pe->table_group, phb->hose->global_number,
>> +			pe->pe_number);
>>
>> 	/* Grab a 32-bit TCE table */
>> 	pe->tce32_seg = base;
>> @@ -1914,7 +1914,8 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
>>
>> static void pnv_pci_ioda2_set_bypass(struct iommu_table *tbl, bool enable)
>> {
>> -	struct pnv_ioda_pe *pe = tbl->data;
>> +	struct pnv_ioda_pe *pe = container_of(tbl->it_table_group,
>> +			struct pnv_ioda_pe, table_group);
>> 	uint16_t window_id = (pe->pe_number << 1 ) + 1;
>> 	int64_t rc;
>>
>> @@ -1948,10 +1949,10 @@ static void pnv_pci_ioda2_setup_bypass_pe(struct pnv_phb *phb,
>> 	pe->tce_bypass_base = 1ull << 59;
>>
>> 	/* Install set_bypass callback for VFIO */
>> -	pe->tce32_table->set_bypass = pnv_pci_ioda2_set_bypass;
>> +	pe->table_group.tables[0]->set_bypass = pnv_pci_ioda2_set_bypass;
>
> It could be simplied as:
>
> 	tbl->set_bypass = pnv_pci_ioda2_set_bypass;

No, this is worse in this case. The whole idea of the patch to be 
mechanical as much as we can. That means:
s/tce32_table/table_group.tables[0]/
If I do what you suggest, the reviewers will have to look further if "tbl" 
has been initialized properly, etc.

And later in this patchset I am getting rid of 
pnv_pci_ioda2_setup_bypass_pe so there is no point to try making it look 
nice :)

>>
>> 	/* Enable bypass by default */
>> -	pnv_pci_ioda2_set_bypass(pe->tce32_table, true);
>> +	pnv_pci_ioda2_set_bypass(pe->table_group.tables[0], true);
>
> Similar to above:
>
> 	tbl->set_bypass(tbl, true);
>
>> }
>>
>> static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
>> @@ -1968,8 +1969,12 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
>> 	if (WARN_ON(pe->tce32_seg >= 0))
>> 		return;
>>
>> -	tbl = pe->tce32_table;
>> -	iommu_register_group(tbl, phb->hose->global_number, pe->pe_number);
>> +	tbl = kzalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
>> +			phb->hose->node);
>> +	tbl->it_table_group = &pe->table_group;
>> +	pe->table_group.tables[0] = tbl;
>> +	iommu_register_group(&pe->table_group, phb->hose->global_number,
>> +			pe->pe_number);
>>
>> 	/* The PE will reserve all possible 32-bits space */
>> 	pe->tce32_seg = 0;
>> diff --git a/arch/powerpc/platforms/powernv/pci-p5ioc2.c b/arch/powerpc/platforms/powernv/pci-p5ioc2.c
>> index 2722c1a..4ea9def 100644
>> --- a/arch/powerpc/platforms/powernv/pci-p5ioc2.c
>> +++ b/arch/powerpc/platforms/powernv/pci-p5ioc2.c
>> @@ -92,14 +92,16 @@ static struct iommu_table_ops pnv_p5ioc2_iommu_ops = {
>> static void pnv_pci_p5ioc2_dma_dev_setup(struct pnv_phb *phb,
>> 					 struct pci_dev *pdev)
>> {
>> -	if (phb->p5ioc2.iommu_table.it_map == NULL) {
>> -		phb->p5ioc2.iommu_table.it_ops = &pnv_p5ioc2_iommu_ops;
>> -		iommu_init_table(&phb->p5ioc2.iommu_table, phb->hose->node);
>> -		iommu_register_group(&phb->p5ioc2.iommu_table,
>> +	struct iommu_table *tbl = phb->p5ioc2.table_group.tables[0];
>> +
>> +	if (!tbl->it_map) {
>> +		tbl->it_ops = &pnv_p5ioc2_iommu_ops;
>> +		iommu_init_table(tbl, phb->hose->node);
>> +		iommu_register_group(&phb->p5ioc2.table_group,
>> 				pci_domain_nr(phb->hose->bus), phb->opal_id);
>> 	}
>>
>> -	set_iommu_table_base(&pdev->dev, &phb->p5ioc2.iommu_table);
>> +	set_iommu_table_base(&pdev->dev, tbl);
>> 	iommu_add_device(&pdev->dev);
>> }
>>
>> @@ -180,6 +182,12 @@ static void __init pnv_pci_init_p5ioc2_phb(struct device_node *np, u64 hub_id,
>> 	pnv_pci_setup_iommu_table(&phb->p5ioc2.iommu_table,
>> 				  tce_mem, tce_size, 0,
>> 				  IOMMU_PAGE_SHIFT_4K);
>> +	/*
>> +	 * We do not allocate iommu_table as we do not support
>> +	 * hotplug or SRIOV on P5IOC2 and therefore iommu_free_table()
>> +	 * should not be called for phb->p5ioc2.table_group.tables[0] ever.
>> +	 */
>> +	phb->p5ioc2.table_group.tables[0] = &phb->p5ioc2.iommu_table;
>> }
>>
>> void __init pnv_pci_init_p5ioc2_hub(struct device_node *np)
>> diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
>> index ec26afd..720cc99 100644
>> --- a/arch/powerpc/platforms/powernv/pci.h
>> +++ b/arch/powerpc/platforms/powernv/pci.h
>> @@ -57,7 +57,7 @@ struct pnv_ioda_pe {
>> 	/* "Base" iommu table, ie, 4K TCEs, 32-bit DMA */
>> 	int			tce32_seg;
>> 	int			tce32_segcount;
>> -	struct iommu_table	*tce32_table;
>> +	struct iommu_table_group table_group;
>> 	phys_addr_t		tce_inval_reg_phys;
>>
>> 	/* 64-bit TCE bypass region */
>> @@ -123,6 +123,7 @@ struct pnv_phb {
>> 	union {
>> 		struct {
>> 			struct iommu_table iommu_table;
>> +			struct iommu_table_group table_group;
>> 		} p5ioc2;
>>
>> 		struct {
>> diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
>> index 4f2ab90..ad5ac6d 100644
>> --- a/arch/powerpc/platforms/pseries/iommu.c
>> +++ b/arch/powerpc/platforms/pseries/iommu.c
>> @@ -52,14 +52,49 @@
>>
>> #include "pseries.h"
>>
>> -static void iommu_pseries_free_table(struct iommu_table *tbl,
>> +static struct iommu_table_group *iommu_pseries_alloc_group(int node)
>
> Since it's a static function, the name could be simplied to
> iommu_group_alloc(), or alloc_iommu_group(). But it might
> not the style you like :-)

Giving it the name like this I am telling people not to try reusing it for 
anything else + easier to grep.


>
>> +{
>> +	struct iommu_table_group *table_group = NULL;
>> +	struct iommu_table *tbl = NULL;
>> +
>> +	table_group = kzalloc_node(sizeof(struct iommu_table_group), GFP_KERNEL,
>> +			   node);
>> +	if (!table_group)
>> +		goto fail_exit;
>> +
>> +	tbl = kzalloc_node(sizeof(struct iommu_table), GFP_KERNEL, node);
>> +	if (!tbl)
>> +		goto fail_exit;
>> +
>> +	tbl->it_table_group = table_group;
>> +	table_group->tables[0] = tbl;
>> +
>> +	return table_group;
>> +
>> +fail_exit:
>> +	kfree(table_group);
>> +	kfree(tbl);
>> +
>> +	return NULL;
>> +}
>> +
>> +static void iommu_pseries_free_group(struct iommu_table_group *table_group,
>> 		const char *node_name)
>
> Same suggestion as above.
>
>> {
>> -	if (tbl->it_group) {
>> -		iommu_group_put(tbl->it_group);
>> -		BUG_ON(tbl->it_group);
>> +	struct iommu_table *tbl;
>> +
>> +	if (!table_group)
>> +		return;
>> +
>> +	if (table_group->group) {
>> +		iommu_group_put(table_group->group);
>> +		BUG_ON(table_group->group);
>> 	}
>> +
>> +	tbl = table_group->tables[0];
>> 	iommu_free_table(tbl, node_name);
>
> It might worthy to have one check:
>
> 	if (table_group->tables[0])
> 		iommu_free_table(table_group->tables[0], node_name);


BUG_ON() - may be. (table_group->tables[0] == NULL) cannot happen normally 
(no memory corruption, etc) as iommu_pseries_alloc_group() allocated both 
table_group and table or nothing.


>
>> +
>> +	kfree(table_group);
>> }
>>
>> static void tce_invalidate_pSeries_sw(struct iommu_table *tbl,
>> @@ -629,13 +664,13 @@ static void pci_dma_bus_setup_pSeries(struct pci_bus *bus)
>> 	pci->phb->dma_window_size = 0x8000000ul;
>> 	pci->phb->dma_window_base_cur = 0x8000000ul;
>>
>> -	tbl = kzalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
>> -			   pci->phb->node);
>> +	pci->table_group = iommu_pseries_alloc_group(pci->phb->node);
>> +	tbl = pci->table_group->tables[0];
>
> The orginal code isn't checking "!pci->table_group". If this function is
> called only at bootup time, it would be nice to see kernel crash. Otherwise,
> I guess it's still worthy to have the check :-)


kzalloc_node() is supposed to print something when fails.



-- 
Alexey

WARNING: multiple messages have this Message-ID (diff)
From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Gavin Shan <gwshan@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org,
	David Gibson <david@gibson.dropbear.id.au>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	Alex Williamson <alex.williamson@redhat.com>,
	Wei Yang <weiyang@linux.vnet.ibm.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH kernel v10 16/34] powerpc/spapr: vfio: Replace iommu_table with iommu_table_group
Date: Thu, 14 May 2015 13:31:36 +1000	[thread overview]
Message-ID: <55541718.70304@ozlabs.ru> (raw)
In-Reply-To: <20150514012105.GA22701@gwshan>

On 05/14/2015 11:21 AM, Gavin Shan wrote:
> On Tue, May 12, 2015 at 01:39:05AM +1000, Alexey Kardashevskiy wrote:
>> Modern IBM POWERPC systems support multiple (currently two) TCE tables
>> per IOMMU group (a.k.a. PE). This adds a iommu_table_group container
>> for TCE tables. Right now just one table is supported.
>>
>> This defines iommu_table_group struct which stores pointers to
>> iommu_group and iommu_table(s). This replaces iommu_table with
>> iommu_table_group where iommu_table was used to identify a group:
>> - iommu_register_group();
>> - iommudata of generic iommu_group;
>>
>> This removes @data from iommu_table as it_table_group provides
>> same access to pnv_ioda_pe.
>>
>> For IODA, instead of embedding iommu_table, the new iommu_table_group
>> keeps pointers to those. The iommu_table structs are allocated
>> dynamically.
>>
>> For P5IOC2, both iommu_table_group and iommu_table are embedded into
>> PE struct. As there is no EEH and SRIOV support for P5IOC2,
>> iommu_free_table() should not be called on iommu_table struct pointers
>> so we can keep it embedded in pnv_phb::p5ioc2.
>>
>> For pSeries, this replaces multiple calls of kzalloc_node() with a new
>> iommu_pseries_alloc_group() helper and stores the table group struct
>> pointer into the pci_dn struct. For release, a iommu_table_free_group()
>> helper is added.
>>
>> This moves iommu_table struct allocation from SR-IOV code to
>> the generic DMA initialization code in pnv_pci_ioda_setup_dma_pe and
>> pnv_pci_ioda2_setup_dma_pe as this is where DMA is actually initialized.
>> This change is here because those lines had to be changed anyway.
>>
>> This should cause no behavioural change.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>
> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>
>> ---
>> Changes:
>> v10:
>> * new to the series, separated from
>> "powerpc/spapr: vfio: Switch from iommu_table to new iommu_table_group"
>> * iommu_table is not embedded into iommu_table_group but allocated
>> dynamically in most cases
>> * iommu_table allocation is moved to a single place for IODA2's
>> pnv_pci_ioda_setup_dma_pe where it belongs to
>> * added list of groups into iommu_table; most of the code just looks at
>> the first item to keep the patch simpler
>> ---
>> arch/powerpc/include/asm/iommu.h            |  17 +++--
>> arch/powerpc/include/asm/pci-bridge.h       |   2 +-
>> arch/powerpc/kernel/iommu.c                 |  17 ++---
>> arch/powerpc/platforms/powernv/pci-ioda.c   |  55 +++++++-------
>> arch/powerpc/platforms/powernv/pci-p5ioc2.c |  18 +++--
>> arch/powerpc/platforms/powernv/pci.h        |   3 +-
>> arch/powerpc/platforms/pseries/iommu.c      | 107 +++++++++++++++++++---------
>> drivers/vfio/vfio_iommu_spapr_tce.c         |  23 +++---
>> 8 files changed, 152 insertions(+), 90 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
>> index e2a45c3..61bde1a 100644
>> --- a/arch/powerpc/include/asm/iommu.h
>> +++ b/arch/powerpc/include/asm/iommu.h
>> @@ -92,13 +92,10 @@ struct iommu_table {
>> 	unsigned long *it_map;       /* A simple allocation bitmap for now */
>> 	unsigned long  it_page_shift;/* table iommu page size */
>> #ifdef CONFIG_IOMMU_API
>> -	struct iommu_group *it_group;
>> +	struct iommu_table_group *it_table_group;
>> #endif
>> 	struct iommu_table_ops *it_ops;
>> 	void (*set_bypass)(struct iommu_table *tbl, bool enable);
>> -#ifdef CONFIG_PPC_POWERNV
>> -	void           *data;
>> -#endif
>> };
>>
>> /* Pure 2^n version of get_order */
>> @@ -130,13 +127,21 @@ extern void iommu_free_table(struct iommu_table *tbl, const char *node_name);
>> extern struct iommu_table *iommu_init_table(struct iommu_table * tbl,
>> 					    int nid);
>> #ifdef CONFIG_IOMMU_API
>> -extern void iommu_register_group(struct iommu_table *tbl,
>> +
>> +#define IOMMU_TABLE_GROUP_MAX_TABLES	1
>> +
>> +struct iommu_table_group {
>> +	struct iommu_group *group;
>> +	struct iommu_table *tables[IOMMU_TABLE_GROUP_MAX_TABLES];
>> +};
>> +
>> +extern void iommu_register_group(struct iommu_table_group *table_group,
>> 				 int pci_domain_number, unsigned long pe_num);
>> extern int iommu_add_device(struct device *dev);
>> extern void iommu_del_device(struct device *dev);
>> extern int __init tce_iommu_bus_notifier_init(void);
>> #else
>> -static inline void iommu_register_group(struct iommu_table *tbl,
>> +static inline void iommu_register_group(struct iommu_table_group *table_group,
>> 					int pci_domain_number,
>> 					unsigned long pe_num)
>> {
>> diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
>> index 1811c44..e2d7479 100644
>> --- a/arch/powerpc/include/asm/pci-bridge.h
>> +++ b/arch/powerpc/include/asm/pci-bridge.h
>> @@ -185,7 +185,7 @@ struct pci_dn {
>>
>> 	struct  pci_dn *parent;
>> 	struct  pci_controller *phb;	/* for pci devices */
>> -	struct	iommu_table *iommu_table;	/* for phb's or bridges */
>> +	struct	iommu_table_group *table_group;	/* for phb's or bridges */
>> 	struct	device_node *node;	/* back-pointer to the device_node */
>>
>> 	int	pci_ext_config_space;	/* for pci devices */
>> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
>> index 16be6aa..79e8b43 100644
>> --- a/arch/powerpc/kernel/iommu.c
>> +++ b/arch/powerpc/kernel/iommu.c
>> @@ -886,11 +886,12 @@ EXPORT_SYMBOL_GPL(iommu_direction_to_tce_perm);
>>   */
>> static void group_release(void *iommu_data)
>> {
>> -	struct iommu_table *tbl = iommu_data;
>> -	tbl->it_group = NULL;
>> +	struct iommu_table_group *table_group = iommu_data;
>> +
>> +	table_group->group = NULL;
>> }
>>
>> -void iommu_register_group(struct iommu_table *tbl,
>> +void iommu_register_group(struct iommu_table_group *table_group,
>> 		int pci_domain_number, unsigned long pe_num)
>> {
>> 	struct iommu_group *grp;
>> @@ -902,8 +903,8 @@ void iommu_register_group(struct iommu_table *tbl,
>> 				PTR_ERR(grp));
>> 		return;
>> 	}
>> -	tbl->it_group = grp;
>> -	iommu_group_set_iommudata(grp, tbl, group_release);
>> +	table_group->group = grp;
>> +	iommu_group_set_iommudata(grp, table_group, group_release);
>> 	name = kasprintf(GFP_KERNEL, "domain%d-pe%lx",
>> 			pci_domain_number, pe_num);
>> 	if (!name)
>> @@ -1091,7 +1092,7 @@ int iommu_add_device(struct device *dev)
>> 	}
>>
>> 	tbl = get_iommu_table_base(dev);
>> -	if (!tbl || !tbl->it_group) {
>> +	if (!tbl || !tbl->it_table_group || !tbl->it_table_group->group) {
>> 		pr_debug("%s: Skipping device %s with no tbl\n",
>> 			 __func__, dev_name(dev));
>> 		return 0;
>> @@ -1099,7 +1100,7 @@ int iommu_add_device(struct device *dev)
>>
>> 	pr_debug("%s: Adding %s to iommu group %d\n",
>> 		 __func__, dev_name(dev),
>> -		 iommu_group_id(tbl->it_group));
>> +		 iommu_group_id(tbl->it_table_group->group));
>>
>> 	if (PAGE_SIZE < IOMMU_PAGE_SIZE(tbl)) {
>> 		pr_err("%s: Invalid IOMMU page size %lx (%lx) on %s\n",
>> @@ -1108,7 +1109,7 @@ int iommu_add_device(struct device *dev)
>> 		return -EINVAL;
>> 	}
>>
>> -	return iommu_group_add_device(tbl->it_group, dev);
>> +	return iommu_group_add_device(tbl->it_table_group->group, dev);
>> }
>> EXPORT_SYMBOL_GPL(iommu_add_device);
>>
>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>> index 1b43e25..02ed448 100644
>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>> @@ -1087,10 +1087,6 @@ static void pnv_ioda_setup_bus_PE(struct pci_bus *bus, int all)
>> 		return;
>> 	}
>>
>> -	pe->tce32_table = kzalloc_node(sizeof(struct iommu_table),
>> -			GFP_KERNEL, hose->node);
>> -	pe->tce32_table->data = pe;
>> -
>> 	/* Associate it with all child devices */
>> 	pnv_ioda_setup_same_PE(bus, pe);
>>
>> @@ -1292,11 +1288,12 @@ static void pnv_pci_ioda2_release_dma_pe(struct pci_dev *dev, struct pnv_ioda_pe
>> 	struct iommu_table    *tbl;
>> 	unsigned long         addr;
>> 	int64_t               rc;
>> +	struct iommu_table_group *table_group;
>>
>> 	bus = dev->bus;
>> 	hose = pci_bus_to_host(bus);
>> 	phb = hose->private_data;
>> -	tbl = pe->tce32_table;
>> +	tbl = pe->table_group.tables[0];
>> 	addr = tbl->it_base;
>>
>> 	opal_pci_map_pe_dma_window(phb->opal_id, pe->pe_number,
>> @@ -1311,13 +1308,14 @@ static void pnv_pci_ioda2_release_dma_pe(struct pci_dev *dev, struct pnv_ioda_pe
>> 	if (rc)
>> 		pe_warn(pe, "OPAL error %ld release DMA window\n", rc);
>>
>> -	if (tbl->it_group) {
>> -		iommu_group_put(tbl->it_group);
>> -		BUG_ON(tbl->it_group);
>> +	table_group = tbl->it_table_group;
>> +	if (table_group->group) {
>> +		iommu_group_put(table_group->group);
>> +		BUG_ON(table_group->group);
>> 	}
>> 	iommu_free_table(tbl, of_node_full_name(dev->dev.of_node));
>> 	free_pages(addr, get_order(TCE32_TABLE_SIZE));
>> -	pe->tce32_table = NULL;
>> +	pe->table_group.tables[0] = NULL;
>> }
>>
>> static void pnv_ioda_release_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>> @@ -1465,10 +1463,6 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>> 			continue;
>> 		}
>>
>> -		pe->tce32_table = kzalloc_node(sizeof(struct iommu_table),
>> -				GFP_KERNEL, hose->node);
>> -		pe->tce32_table->data = pe;
>> -
>> 		/* Put PE to the list */
>> 		mutex_lock(&phb->ioda.pe_list_mutex);
>> 		list_add_tail(&pe->list, &phb->ioda.pe_list);
>> @@ -1603,7 +1597,7 @@ static void pnv_pci_ioda_dma_dev_setup(struct pnv_phb *phb, struct pci_dev *pdev
>>
>> 	pe = &phb->ioda.pe_array[pdn->pe_number];
>> 	WARN_ON(get_dma_ops(&pdev->dev) != &dma_iommu_ops);
>> -	set_iommu_table_base(&pdev->dev, pe->tce32_table);
>> +	set_iommu_table_base(&pdev->dev, pe->table_group.tables[0]);
>> 	/*
>> 	 * Note: iommu_add_device() will fail here as
>> 	 * for physical PE: the device is already added by now;
>> @@ -1636,7 +1630,7 @@ static int pnv_pci_ioda_dma_set_mask(struct pnv_phb *phb,
>> 	} else {
>> 		dev_info(&pdev->dev, "Using 32-bit DMA via iommu\n");
>> 		set_dma_ops(&pdev->dev, &dma_iommu_ops);
>> -		set_iommu_table_base(&pdev->dev, pe->tce32_table);
>> +		set_iommu_table_base(&pdev->dev, pe->table_group.tables[0]);
>> 	}
>> 	*pdev->dev.dma_mask = dma_mask;
>> 	return 0;
>> @@ -1670,7 +1664,7 @@ static void pnv_ioda_setup_bus_dma(struct pnv_ioda_pe *pe,
>> 	struct pci_dev *dev;
>>
>> 	list_for_each_entry(dev, &bus->devices, bus_list) {
>> -		set_iommu_table_base(&dev->dev, pe->tce32_table);
>> +		set_iommu_table_base(&dev->dev, pe->table_group.tables[0]);
>> 		iommu_add_device(&dev->dev);
>>
>> 		if (dev->subordinate)
>> @@ -1681,7 +1675,8 @@ static void pnv_ioda_setup_bus_dma(struct pnv_ioda_pe *pe,
>> static void pnv_pci_ioda1_tce_invalidate(struct iommu_table *tbl,
>> 		unsigned long index, unsigned long npages, bool rm)
>> {
>> -	struct pnv_ioda_pe *pe = tbl->data;
>> +	struct pnv_ioda_pe *pe = container_of(tbl->it_table_group,
>> +			struct pnv_ioda_pe, table_group);
>> 	__be64 __iomem *invalidate = rm ?
>> 		(__be64 __iomem *)pe->tce_inval_reg_phys :
>> 		(__be64 __iomem *)tbl->it_index;
>> @@ -1758,7 +1753,8 @@ static struct iommu_table_ops pnv_ioda1_iommu_ops = {
>> static void pnv_pci_ioda2_tce_invalidate(struct iommu_table *tbl,
>> 		unsigned long index, unsigned long npages, bool rm)
>> {
>> -	struct pnv_ioda_pe *pe = tbl->data;
>> +	struct pnv_ioda_pe *pe = container_of(tbl->it_table_group,
>> +			struct pnv_ioda_pe, table_group);
>> 	unsigned long start, end, inc;
>> 	__be64 __iomem *invalidate = rm ?
>> 		(__be64 __iomem *)pe->tce_inval_reg_phys :
>> @@ -1834,8 +1830,12 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
>> 	if (WARN_ON(pe->tce32_seg >= 0))
>> 		return;
>>
>> -	tbl = pe->tce32_table;
>> -	iommu_register_group(tbl, phb->hose->global_number, pe->pe_number);
>> +	tbl = kzalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
>> +			phb->hose->node);
>> +	tbl->it_table_group = &pe->table_group;
>> +	pe->table_group.tables[0] = tbl;
>> +	iommu_register_group(&pe->table_group, phb->hose->global_number,
>> +			pe->pe_number);
>>
>> 	/* Grab a 32-bit TCE table */
>> 	pe->tce32_seg = base;
>> @@ -1914,7 +1914,8 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
>>
>> static void pnv_pci_ioda2_set_bypass(struct iommu_table *tbl, bool enable)
>> {
>> -	struct pnv_ioda_pe *pe = tbl->data;
>> +	struct pnv_ioda_pe *pe = container_of(tbl->it_table_group,
>> +			struct pnv_ioda_pe, table_group);
>> 	uint16_t window_id = (pe->pe_number << 1 ) + 1;
>> 	int64_t rc;
>>
>> @@ -1948,10 +1949,10 @@ static void pnv_pci_ioda2_setup_bypass_pe(struct pnv_phb *phb,
>> 	pe->tce_bypass_base = 1ull << 59;
>>
>> 	/* Install set_bypass callback for VFIO */
>> -	pe->tce32_table->set_bypass = pnv_pci_ioda2_set_bypass;
>> +	pe->table_group.tables[0]->set_bypass = pnv_pci_ioda2_set_bypass;
>
> It could be simplied as:
>
> 	tbl->set_bypass = pnv_pci_ioda2_set_bypass;

No, this is worse in this case. The whole idea of the patch to be 
mechanical as much as we can. That means:
s/tce32_table/table_group.tables[0]/
If I do what you suggest, the reviewers will have to look further if "tbl" 
has been initialized properly, etc.

And later in this patchset I am getting rid of 
pnv_pci_ioda2_setup_bypass_pe so there is no point to try making it look 
nice :)

>>
>> 	/* Enable bypass by default */
>> -	pnv_pci_ioda2_set_bypass(pe->tce32_table, true);
>> +	pnv_pci_ioda2_set_bypass(pe->table_group.tables[0], true);
>
> Similar to above:
>
> 	tbl->set_bypass(tbl, true);
>
>> }
>>
>> static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
>> @@ -1968,8 +1969,12 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
>> 	if (WARN_ON(pe->tce32_seg >= 0))
>> 		return;
>>
>> -	tbl = pe->tce32_table;
>> -	iommu_register_group(tbl, phb->hose->global_number, pe->pe_number);
>> +	tbl = kzalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
>> +			phb->hose->node);
>> +	tbl->it_table_group = &pe->table_group;
>> +	pe->table_group.tables[0] = tbl;
>> +	iommu_register_group(&pe->table_group, phb->hose->global_number,
>> +			pe->pe_number);
>>
>> 	/* The PE will reserve all possible 32-bits space */
>> 	pe->tce32_seg = 0;
>> diff --git a/arch/powerpc/platforms/powernv/pci-p5ioc2.c b/arch/powerpc/platforms/powernv/pci-p5ioc2.c
>> index 2722c1a..4ea9def 100644
>> --- a/arch/powerpc/platforms/powernv/pci-p5ioc2.c
>> +++ b/arch/powerpc/platforms/powernv/pci-p5ioc2.c
>> @@ -92,14 +92,16 @@ static struct iommu_table_ops pnv_p5ioc2_iommu_ops = {
>> static void pnv_pci_p5ioc2_dma_dev_setup(struct pnv_phb *phb,
>> 					 struct pci_dev *pdev)
>> {
>> -	if (phb->p5ioc2.iommu_table.it_map == NULL) {
>> -		phb->p5ioc2.iommu_table.it_ops = &pnv_p5ioc2_iommu_ops;
>> -		iommu_init_table(&phb->p5ioc2.iommu_table, phb->hose->node);
>> -		iommu_register_group(&phb->p5ioc2.iommu_table,
>> +	struct iommu_table *tbl = phb->p5ioc2.table_group.tables[0];
>> +
>> +	if (!tbl->it_map) {
>> +		tbl->it_ops = &pnv_p5ioc2_iommu_ops;
>> +		iommu_init_table(tbl, phb->hose->node);
>> +		iommu_register_group(&phb->p5ioc2.table_group,
>> 				pci_domain_nr(phb->hose->bus), phb->opal_id);
>> 	}
>>
>> -	set_iommu_table_base(&pdev->dev, &phb->p5ioc2.iommu_table);
>> +	set_iommu_table_base(&pdev->dev, tbl);
>> 	iommu_add_device(&pdev->dev);
>> }
>>
>> @@ -180,6 +182,12 @@ static void __init pnv_pci_init_p5ioc2_phb(struct device_node *np, u64 hub_id,
>> 	pnv_pci_setup_iommu_table(&phb->p5ioc2.iommu_table,
>> 				  tce_mem, tce_size, 0,
>> 				  IOMMU_PAGE_SHIFT_4K);
>> +	/*
>> +	 * We do not allocate iommu_table as we do not support
>> +	 * hotplug or SRIOV on P5IOC2 and therefore iommu_free_table()
>> +	 * should not be called for phb->p5ioc2.table_group.tables[0] ever.
>> +	 */
>> +	phb->p5ioc2.table_group.tables[0] = &phb->p5ioc2.iommu_table;
>> }
>>
>> void __init pnv_pci_init_p5ioc2_hub(struct device_node *np)
>> diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
>> index ec26afd..720cc99 100644
>> --- a/arch/powerpc/platforms/powernv/pci.h
>> +++ b/arch/powerpc/platforms/powernv/pci.h
>> @@ -57,7 +57,7 @@ struct pnv_ioda_pe {
>> 	/* "Base" iommu table, ie, 4K TCEs, 32-bit DMA */
>> 	int			tce32_seg;
>> 	int			tce32_segcount;
>> -	struct iommu_table	*tce32_table;
>> +	struct iommu_table_group table_group;
>> 	phys_addr_t		tce_inval_reg_phys;
>>
>> 	/* 64-bit TCE bypass region */
>> @@ -123,6 +123,7 @@ struct pnv_phb {
>> 	union {
>> 		struct {
>> 			struct iommu_table iommu_table;
>> +			struct iommu_table_group table_group;
>> 		} p5ioc2;
>>
>> 		struct {
>> diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
>> index 4f2ab90..ad5ac6d 100644
>> --- a/arch/powerpc/platforms/pseries/iommu.c
>> +++ b/arch/powerpc/platforms/pseries/iommu.c
>> @@ -52,14 +52,49 @@
>>
>> #include "pseries.h"
>>
>> -static void iommu_pseries_free_table(struct iommu_table *tbl,
>> +static struct iommu_table_group *iommu_pseries_alloc_group(int node)
>
> Since it's a static function, the name could be simplied to
> iommu_group_alloc(), or alloc_iommu_group(). But it might
> not the style you like :-)

Giving it the name like this I am telling people not to try reusing it for 
anything else + easier to grep.


>
>> +{
>> +	struct iommu_table_group *table_group = NULL;
>> +	struct iommu_table *tbl = NULL;
>> +
>> +	table_group = kzalloc_node(sizeof(struct iommu_table_group), GFP_KERNEL,
>> +			   node);
>> +	if (!table_group)
>> +		goto fail_exit;
>> +
>> +	tbl = kzalloc_node(sizeof(struct iommu_table), GFP_KERNEL, node);
>> +	if (!tbl)
>> +		goto fail_exit;
>> +
>> +	tbl->it_table_group = table_group;
>> +	table_group->tables[0] = tbl;
>> +
>> +	return table_group;
>> +
>> +fail_exit:
>> +	kfree(table_group);
>> +	kfree(tbl);
>> +
>> +	return NULL;
>> +}
>> +
>> +static void iommu_pseries_free_group(struct iommu_table_group *table_group,
>> 		const char *node_name)
>
> Same suggestion as above.
>
>> {
>> -	if (tbl->it_group) {
>> -		iommu_group_put(tbl->it_group);
>> -		BUG_ON(tbl->it_group);
>> +	struct iommu_table *tbl;
>> +
>> +	if (!table_group)
>> +		return;
>> +
>> +	if (table_group->group) {
>> +		iommu_group_put(table_group->group);
>> +		BUG_ON(table_group->group);
>> 	}
>> +
>> +	tbl = table_group->tables[0];
>> 	iommu_free_table(tbl, node_name);
>
> It might worthy to have one check:
>
> 	if (table_group->tables[0])
> 		iommu_free_table(table_group->tables[0], node_name);


BUG_ON() - may be. (table_group->tables[0] == NULL) cannot happen normally 
(no memory corruption, etc) as iommu_pseries_alloc_group() allocated both 
table_group and table or nothing.


>
>> +
>> +	kfree(table_group);
>> }
>>
>> static void tce_invalidate_pSeries_sw(struct iommu_table *tbl,
>> @@ -629,13 +664,13 @@ static void pci_dma_bus_setup_pSeries(struct pci_bus *bus)
>> 	pci->phb->dma_window_size = 0x8000000ul;
>> 	pci->phb->dma_window_base_cur = 0x8000000ul;
>>
>> -	tbl = kzalloc_node(sizeof(struct iommu_table), GFP_KERNEL,
>> -			   pci->phb->node);
>> +	pci->table_group = iommu_pseries_alloc_group(pci->phb->node);
>> +	tbl = pci->table_group->tables[0];
>
> The orginal code isn't checking "!pci->table_group". If this function is
> called only at bootup time, it would be nice to see kernel crash. Otherwise,
> I guess it's still worthy to have the check :-)


kzalloc_node() is supposed to print something when fails.



-- 
Alexey

  reply	other threads:[~2015-05-14  3:31 UTC|newest]

Thread overview: 163+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-11 15:38 [PATCH kernel v10 00/34] powerpc/iommu/vfio: Enable Dynamic DMA windows Alexey Kardashevskiy
2015-05-11 15:38 ` Alexey Kardashevskiy
2015-05-11 15:38 ` [PATCH kernel v10 01/34] powerpc/eeh/ioda2: Use device::iommu_group to check IOMMU group Alexey Kardashevskiy
2015-05-11 15:38   ` Alexey Kardashevskiy
2015-05-12  1:51   ` Gavin Shan
2015-05-12  1:51     ` Gavin Shan
2015-05-11 15:38 ` [PATCH kernel v10 02/34] powerpc/iommu/powernv: Get rid of set_iommu_table_base_and_group Alexey Kardashevskiy
2015-05-11 15:38   ` Alexey Kardashevskiy
2015-05-13  5:18   ` Gavin Shan
2015-05-13  5:18     ` Gavin Shan
2015-05-13  7:26     ` Alexey Kardashevskiy
2015-05-13  7:26       ` Alexey Kardashevskiy
2015-05-11 15:38 ` [PATCH kernel v10 03/34] powerpc/powernv/ioda: Clean up IOMMU group registration Alexey Kardashevskiy
2015-05-11 15:38   ` Alexey Kardashevskiy
2015-05-13  5:21   ` Gavin Shan
2015-05-13  5:21     ` Gavin Shan
2015-05-11 15:38 ` [PATCH kernel v10 04/34] powerpc/iommu: Put IOMMU group explicitly Alexey Kardashevskiy
2015-05-11 15:38   ` Alexey Kardashevskiy
2015-05-13  5:27   ` Gavin Shan
2015-05-13  5:27     ` Gavin Shan
2015-05-11 15:38 ` [PATCH kernel v10 05/34] powerpc/iommu: Always release iommu_table in iommu_free_table() Alexey Kardashevskiy
2015-05-11 15:38   ` Alexey Kardashevskiy
2015-05-13  5:33   ` Gavin Shan
2015-05-13  5:33     ` Gavin Shan
2015-05-13  6:30     ` Alexey Kardashevskiy
2015-05-13  6:30       ` Alexey Kardashevskiy
2015-05-13 12:51       ` Thomas Huth
2015-05-13 12:51         ` Thomas Huth
2015-05-13 23:27         ` Gavin Shan
2015-05-13 23:27           ` Gavin Shan
2015-05-14  2:34           ` Alexey Kardashevskiy
2015-05-14  2:53             ` Alex Williamson
2015-05-14  2:53               ` Alex Williamson
2015-05-14  6:29               ` Alexey Kardashevskiy
2015-05-14  6:29                 ` Alexey Kardashevskiy
2015-05-11 15:38 ` [PATCH kernel v10 06/34] vfio: powerpc/spapr: Move page pinning from arch code to VFIO IOMMU driver Alexey Kardashevskiy
2015-05-11 15:38   ` Alexey Kardashevskiy
2015-05-13  5:58   ` Gavin Shan
2015-05-13  5:58     ` Gavin Shan
2015-05-13  6:32     ` Alexey Kardashevskiy
2015-05-13  6:32       ` Alexey Kardashevskiy
2015-05-11 15:38 ` [PATCH kernel v10 07/34] vfio: powerpc/spapr: Check that IOMMU page is fully contained by system page Alexey Kardashevskiy
2015-05-11 15:38   ` Alexey Kardashevskiy
2015-05-13  6:06   ` Gavin Shan
2015-05-13  6:06     ` Gavin Shan
2015-05-11 15:38 ` [PATCH kernel v10 08/34] vfio: powerpc/spapr: Use it_page_size Alexey Kardashevskiy
2015-05-11 15:38   ` Alexey Kardashevskiy
2015-05-13  6:12   ` Gavin Shan
2015-05-13  6:12     ` Gavin Shan
2015-05-11 15:38 ` [PATCH kernel v10 09/34] vfio: powerpc/spapr: Move locked_vm accounting to helpers Alexey Kardashevskiy
2015-05-11 15:38   ` Alexey Kardashevskiy
2015-05-13  6:18   ` Gavin Shan
2015-05-13  6:18     ` Gavin Shan
2015-05-11 15:38 ` [PATCH kernel v10 10/34] vfio: powerpc/spapr: Disable DMA mappings on disabled container Alexey Kardashevskiy
2015-05-11 15:38   ` Alexey Kardashevskiy
2015-05-13  6:20   ` Gavin Shan
2015-05-13  6:20     ` Gavin Shan
2015-05-11 15:39 ` [PATCH kernel v10 11/34] vfio: powerpc/spapr: Moving pinning/unpinning to helpers Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-13  6:32   ` Gavin Shan
2015-05-13  6:32     ` Gavin Shan
2015-05-13  7:30     ` Alexey Kardashevskiy
2015-05-13  7:30       ` Alexey Kardashevskiy
2015-05-11 15:39 ` [PATCH kernel v10 12/34] vfio: powerpc/spapr: Rework groups attaching Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-13 23:35   ` Gavin Shan
2015-05-13 23:35     ` Gavin Shan
2015-05-11 15:39 ` [PATCH kernel v10 13/34] powerpc/powernv: Do not set "read" flag if direction==DMA_NONE Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-14  0:00   ` Gavin Shan
2015-05-14  0:00     ` Gavin Shan
2015-05-14  2:51     ` Alexey Kardashevskiy
2015-05-14  2:51       ` Alexey Kardashevskiy
2015-05-11 15:39 ` [PATCH kernel v10 14/34] powerpc/iommu: Move tce_xxx callbacks from ppc_md to iommu_table Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-14  0:23   ` Gavin Shan
2015-05-14  0:23     ` Gavin Shan
2015-05-14  3:07     ` Alexey Kardashevskiy
2015-05-14  3:07       ` Alexey Kardashevskiy
2015-05-11 15:39 ` [PATCH kernel v10 15/34] powerpc/powernv/ioda/ioda2: Rework TCE invalidation in tce_build()/tce_free() Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-14  0:48   ` Gavin Shan
2015-05-14  0:48     ` Gavin Shan
2015-05-14  3:19     ` Alexey Kardashevskiy
2015-05-14  3:19       ` Alexey Kardashevskiy
2015-05-11 15:39 ` [PATCH kernel v10 16/34] powerpc/spapr: vfio: Replace iommu_table with iommu_table_group Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-13 21:30   ` Alex Williamson
2015-05-13 21:30     ` Alex Williamson
2015-05-14  1:21   ` Gavin Shan
2015-05-14  1:21     ` Gavin Shan
2015-05-14  3:31     ` Alexey Kardashevskiy [this message]
2015-05-14  3:31       ` Alexey Kardashevskiy
2015-05-11 15:39 ` [PATCH kernel v10 17/34] powerpc/spapr: vfio: Switch from iommu_table to new iommu_table_group Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-14  1:52   ` Gavin Shan
2015-05-14  1:52     ` Gavin Shan
2015-05-11 15:39 ` [PATCH kernel v10 18/34] vfio: powerpc/spapr/iommu/powernv/ioda2: Rework IOMMU ownership control Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-14  2:01   ` Gavin Shan
2015-05-14  2:01     ` Gavin Shan
2015-05-11 15:39 ` [PATCH kernel v10 19/34] powerpc/iommu: Fix IOMMU ownership control functions Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-14  3:36   ` Gavin Shan
2015-05-14  3:36     ` Gavin Shan
2015-05-11 15:39 ` [PATCH kernel v10 20/34] powerpc/powernv/ioda2: Move TCE kill register address to PE Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-14  2:10   ` Gavin Shan
2015-05-14  2:10     ` Gavin Shan
2015-05-14  3:39     ` Alexey Kardashevskiy
2015-05-14  3:39       ` Alexey Kardashevskiy
2015-05-11 15:39 ` [PATCH kernel v10 21/34] powerpc/powernv/ioda2: Add TCE invalidation for all attached groups Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-14  2:22   ` Gavin Shan
2015-05-14  2:22     ` Gavin Shan
2015-05-14  3:50     ` Alexey Kardashevskiy
2015-05-14  3:50       ` Alexey Kardashevskiy
2015-05-11 15:39 ` [PATCH kernel v10 22/34] powerpc/powernv: Implement accessor to TCE entry Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-14  2:34   ` Gavin Shan
2015-05-14  2:34     ` Gavin Shan
2015-05-11 15:39 ` [PATCH kernel v10 23/34] powerpc/iommu/powernv: Release replaced TCE Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-13 15:00   ` Thomas Huth
2015-05-13 15:00     ` Thomas Huth
2015-05-14  3:53     ` Alexey Kardashevskiy
2015-05-14  3:53       ` Alexey Kardashevskiy
2015-05-15  8:09       ` Thomas Huth
2015-05-15  8:09         ` Thomas Huth
2015-05-11 15:39 ` [PATCH kernel v10 24/34] powerpc/powernv/ioda2: Rework iommu_table creation Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-14  4:14   ` Gavin Shan
2015-05-14  4:14     ` Gavin Shan
2015-05-11 15:39 ` [PATCH kernel v10 25/34] powerpc/powernv/ioda2: Introduce helpers to allocate TCE pages Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-14  4:31   ` Gavin Shan
2015-05-14  4:31     ` Gavin Shan
2015-05-11 15:39 ` [PATCH kernel v10 26/34] powerpc/powernv/ioda2: Introduce pnv_pci_ioda2_set_window Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-14  5:01   ` Gavin Shan
2015-05-14  5:01     ` Gavin Shan
2015-05-11 15:39 ` [PATCH kernel v10 27/34] powerpc/powernv: Implement multilevel TCE tables Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-11 15:39 ` [PATCH kernel v10 28/34] vfio: powerpc/spapr: powerpc/powernv/ioda: Define and implement DMA windows API Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-13 21:30   ` Alex Williamson
2015-05-13 21:30     ` Alex Williamson
2015-05-11 15:39 ` [PATCH kernel v10 29/34] powerpc/powernv/ioda2: Use new helpers to do proper cleanup on PE release Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-11 15:39 ` [PATCH kernel v10 30/34] powerpc/iommu/ioda2: Add get_table_size() to calculate the size of future table Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-11 15:39 ` [PATCH kernel v10 31/34] vfio: powerpc/spapr: powerpc/powernv/ioda2: Use DMA windows API in ownership control Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-11 15:39 ` [PATCH kernel v10 32/34] powerpc/mmu: Add userspace-to-physical addresses translation cache Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-11 15:39 ` [PATCH kernel v10 33/34] vfio: powerpc/spapr: Register memory and define IOMMU v2 Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy
2015-05-13 21:30   ` Alex Williamson
2015-05-13 21:30     ` Alex Williamson
2015-05-14  6:08     ` Alexey Kardashevskiy
2015-05-14  6:08       ` Alexey Kardashevskiy
2015-05-11 15:39 ` [PATCH kernel v10 34/34] vfio: powerpc/spapr: Support Dynamic DMA windows Alexey Kardashevskiy
2015-05-11 15:39   ` Alexey Kardashevskiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55541718.70304@ozlabs.ru \
    --to=aik@ozlabs.ru \
    --cc=alex.williamson@redhat.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=gwshan@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=paulus@samba.org \
    --cc=weiyang@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.