linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH for-next V1 0/3] RoCE GID management fixes
@ 2015-08-03 13:08 Matan Barak
       [not found] ` <1438607342-11964-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Matan Barak @ 2015-08-03 13:08 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jason Gunthorpe, Matan Barak,
	Somnath Kotur, Haggai Eran, Or Gerlitz

Hi Doug,

This patch mainly fixes teardown and error flow fixes that Jason and
you have found in the recent submitted patchset. As suggested by
Jason, I've split the cache removal process to cleanup and release.
Although I don't expect the HW vendor drivers to use the cache after
it was cleaned up, releasing it in the put function could prevent
possible use-after-free errors. Saying that, we could alternatively
review the vendors' usage more carefully and probably put the release
function as part of the unregister device flow.

In addition, this patch also fixes a small error flow issue that
was found by Dan Carpenter's kbuild system and a possible dead-lock.

Thanks,
Matan

Matan Barak (3):
  IB/core: Access to one past end of array in _gid_table_setup_one
  IB/core: RoCE GID management separate cleanup and release
  IB/core: Fix possible deadlock in write_gid

 drivers/infiniband/core/cache.c     | 90 +++++++++++++++++++++++++++----------
 drivers/infiniband/core/core_priv.h |  1 +
 drivers/infiniband/core/device.c    |  4 ++
 drivers/infiniband/core/sysfs.c     |  2 +-
 4 files changed, 73 insertions(+), 24 deletions(-)

-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH for-next V1 1/3] IB/core: Access to one past end of array in _gid_table_setup_one
       [not found] ` <1438607342-11964-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-08-03 13:09   ` Matan Barak
       [not found]     ` <1438607342-11964-2-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-08-03 13:09   ` [PATCH for-next V1 2/3] IB/core: RoCE GID management separate cleanup and release Matan Barak
  2015-08-03 13:09   ` [PATCH for-next V1 3/3] IB/core: Fix possible deadlock in write_gid Matan Barak
  2 siblings, 1 reply; 13+ messages in thread
From: Matan Barak @ 2015-08-03 13:09 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jason Gunthorpe, Matan Barak,
	Somnath Kotur, Haggai Eran, Or Gerlitz

The error flow of _gid_table_setup_one accessed table[port] when
port was 1-based instead of 0-based.

Squash-into: 'IB/core: Add RoCE GID table management'
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/cache.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index a9d5c70..01b8792 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -582,8 +582,9 @@ static int _gid_table_setup_one(struct ib_device *ib_dev)
 	return 0;
 
 rollback_table_setup:
-	for (port = 1; port <= ib_dev->phys_port_cnt; port++)
-		free_gid_table(ib_dev, port, table[port]);
+	for (port = 0; port < ib_dev->phys_port_cnt; port++)
+		free_gid_table(ib_dev, port + rdma_start_port(ib_dev),
+			       table[port]);
 
 	kfree(table);
 	return err;
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH for-next V1 2/3] IB/core: RoCE GID management separate cleanup and release
       [not found] ` <1438607342-11964-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-08-03 13:09   ` [PATCH for-next V1 1/3] IB/core: Access to one past end of array in _gid_table_setup_one Matan Barak
@ 2015-08-03 13:09   ` Matan Barak
       [not found]     ` <1438607342-11964-3-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-08-03 13:09   ` [PATCH for-next V1 3/3] IB/core: Fix possible deadlock in write_gid Matan Barak
  2 siblings, 1 reply; 13+ messages in thread
From: Matan Barak @ 2015-08-03 13:09 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jason Gunthorpe, Matan Barak,
	Somnath Kotur, Haggai Eran, Or Gerlitz

Separate the cleanup and release IB cache functions.
The cleanup function delete all GIDs and unregister the event channel,
while the release function frees all memory. The cleanup function is
called after all clients were removed (in the device un-registration
process).

The release function is called after the device was put.
Although vendor drivers aren't expected to use IB cache in their
removal process, we postpone freeing the cache in order to avoid
possible use-after-free errors.

Squash-into: 'IB/core: Add RoCE GID table management'
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/cache.c     | 82 ++++++++++++++++++++++++++++---------
 drivers/infiniband/core/core_priv.h |  1 +
 drivers/infiniband/core/device.c    |  4 ++
 drivers/infiniband/core/sysfs.c     |  2 +-
 4 files changed, 68 insertions(+), 21 deletions(-)

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index 01b8792..58a44bd 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -464,8 +464,16 @@ err_free_table:
 	return NULL;
 }
 
-static void free_gid_table(struct ib_device *ib_dev, u8 port,
-			   struct ib_gid_table *table)
+static void release_gid_table(struct ib_gid_table *table)
+{
+	if (table) {
+		kfree(table->data_vec);
+		kfree(table);
+	}
+}
+
+static void cleanup_gid_table_port(struct ib_device *ib_dev, u8 port,
+				   struct ib_gid_table *table)
 {
 	int i;
 
@@ -479,8 +487,6 @@ static void free_gid_table(struct ib_device *ib_dev, u8 port,
 				table->data_vec[i].props &
 				GID_ATTR_FIND_MASK_DEFAULT);
 	}
-	kfree(table->data_vec);
-	kfree(table);
 }
 
 void ib_cache_gid_set_default_gid(struct ib_device *ib_dev, u8 port,
@@ -582,15 +588,17 @@ static int _gid_table_setup_one(struct ib_device *ib_dev)
 	return 0;
 
 rollback_table_setup:
-	for (port = 0; port < ib_dev->phys_port_cnt; port++)
-		free_gid_table(ib_dev, port + rdma_start_port(ib_dev),
-			       table[port]);
+	for (port = 0; port < ib_dev->phys_port_cnt; port++) {
+		cleanup_gid_table_port(ib_dev, port + rdma_start_port(ib_dev),
+				       table[port]);
+		release_gid_table(table[port]);
+	}
 
 	kfree(table);
 	return err;
 }
 
-static void gid_table_cleanup_one(struct ib_device *ib_dev)
+static void gid_table_release_one(struct ib_device *ib_dev)
 {
 	struct ib_gid_table **table = ib_dev->cache.gid_cache;
 	u8 port;
@@ -599,10 +607,23 @@ static void gid_table_cleanup_one(struct ib_device *ib_dev)
 		return;
 
 	for (port = 0; port < ib_dev->phys_port_cnt; port++)
-		free_gid_table(ib_dev, port + rdma_start_port(ib_dev),
-			       table[port]);
+		release_gid_table(table[port]);
 
 	kfree(table);
+	ib_dev->cache.gid_cache = NULL;
+}
+
+static void gid_table_cleanup_one(struct ib_device *ib_dev)
+{
+	struct ib_gid_table **table = ib_dev->cache.gid_cache;
+	u8 port;
+
+	if (!table)
+		return;
+
+	for (port = 0; port < ib_dev->phys_port_cnt; port++)
+		cleanup_gid_table_port(ib_dev, port + rdma_start_port(ib_dev),
+				       table[port]);
 }
 
 static int gid_table_setup_one(struct ib_device *ib_dev)
@@ -616,8 +637,10 @@ static int gid_table_setup_one(struct ib_device *ib_dev)
 
 	err = roce_rescan_device(ib_dev);
 
-	if (err)
+	if (err) {
 		gid_table_cleanup_one(ib_dev);
+		gid_table_release_one(ib_dev);
+	}
 
 	return err;
 }
@@ -902,9 +925,7 @@ int ib_cache_setup_one(struct ib_device *device)
 					  (rdma_end_port(device) -
 					   rdma_start_port(device) + 1),
 					  GFP_KERNEL);
-	err = gid_table_setup_one(device);
-
-	if (!device->cache.pkey_cache || !device->cache.gid_cache ||
+	if (!device->cache.pkey_cache ||
 	    !device->cache.lmc_cache) {
 		printk(KERN_WARNING "Couldn't allocate cache "
 		       "for %s\n", device->name);
@@ -912,6 +933,10 @@ int ib_cache_setup_one(struct ib_device *device)
 		goto err;
 	}
 
+	err = gid_table_setup_one(device);
+	if (err)
+		goto err;
+
 	for (p = 0; p <= rdma_end_port(device) - rdma_start_port(device); ++p) {
 		device->cache.pkey_cache[p] = NULL;
 		ib_cache_update(device, p + rdma_start_port(device));
@@ -929,29 +954,46 @@ err_cache:
 	for (p = 0; p <= rdma_end_port(device) - rdma_start_port(device); ++p)
 		kfree(device->cache.pkey_cache[p]);
 
+	gid_table_cleanup_one(device);
+	gid_table_release_one(device);
 err:
 	kfree(device->cache.pkey_cache);
-	gid_table_cleanup_one(device);
 	kfree(device->cache.lmc_cache);
 
 	return err;
 }
 
-void ib_cache_cleanup_one(struct ib_device *device)
+void ib_cache_release_one(struct ib_device *device)
 {
 	int p;
 
-	ib_unregister_event_handler(&device->cache.event_handler);
-	flush_workqueue(ib_wq);
-
+	/* The release function frees all the cache elements.
+	 * This function should be called as part of freeing
+	 * all the device's resources when the cache could no
+	 * longer be accessed.
+	 */
 	for (p = 0; p <= rdma_end_port(device) - rdma_start_port(device); ++p)
 		kfree(device->cache.pkey_cache[p]);
 
+	gid_table_release_one(device);
 	kfree(device->cache.pkey_cache);
-	gid_table_cleanup_one(device);
 	kfree(device->cache.lmc_cache);
 }
 
+void ib_cache_cleanup_one(struct ib_device *device)
+{
+	/* The cleanup function unregisters the event handler,
+	 * waits for all in-progress workqueue elements and cleans
+	 * up the GID cache. This function should be called after
+	 * the device was removed from the devices list and all
+	 * clients were removed, so the cache exists but is
+	 * non-functional and shouldn't be updated anymore.
+	 */
+	ib_unregister_event_handler(&device->cache.event_handler);
+	flush_workqueue(ib_wq);
+	gid_table_cleanup_one(device);
+}
+
 void __init ib_cache_setup(void)
 {
 	roce_gid_mgmt_init();
diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
index 210ddd2..b78adb5 100644
--- a/drivers/infiniband/core/core_priv.h
+++ b/drivers/infiniband/core/core_priv.h
@@ -98,5 +98,6 @@ int roce_rescan_device(struct ib_device *ib_dev);
 
 int ib_cache_setup_one(struct ib_device *device);
 void ib_cache_cleanup_one(struct ib_device *device);
+void ib_cache_release_one(struct ib_device *device);
 
 #endif /* _CORE_PRIV_H */
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 14ea709..abd511e 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -366,6 +366,10 @@ void ib_unregister_device(struct ib_device *device)
 
 	mutex_unlock(&device_mutex);
 
+	/* All clients were removed and the device is being removed, no calls
+	 * are expcted to ib_cache by clients/API.
+	 */
+	ib_cache_cleanup_one(device);
 	ib_device_unregister_sysfs(device);
 
 	spin_lock_irqsave(&device->client_data_lock, flags);
diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
index c10c6d3..5f45786 100644
--- a/drivers/infiniband/core/sysfs.c
+++ b/drivers/infiniband/core/sysfs.c
@@ -461,7 +461,7 @@ static void ib_device_release(struct device *device)
 {
 	struct ib_device *dev = container_of(device, struct ib_device, dev);
 
-	ib_cache_cleanup_one(dev);
+	ib_cache_release_one(dev);
 	kfree(dev->port_immutable);
 	kfree(dev);
 }
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH for-next V1 3/3] IB/core: Fix possible deadlock in write_gid
       [not found] ` <1438607342-11964-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-08-03 13:09   ` [PATCH for-next V1 1/3] IB/core: Access to one past end of array in _gid_table_setup_one Matan Barak
  2015-08-03 13:09   ` [PATCH for-next V1 2/3] IB/core: RoCE GID management separate cleanup and release Matan Barak
@ 2015-08-03 13:09   ` Matan Barak
       [not found]     ` <1438607342-11964-4-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2 siblings, 1 reply; 13+ messages in thread
From: Matan Barak @ 2015-08-03 13:09 UTC (permalink / raw)
  To: Doug Ledford
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jason Gunthorpe, Matan Barak,
	Somnath Kotur, Haggai Eran, Or Gerlitz

Fixing the possible deadlock in write_gid:
write_gid -> write_lock()
<interrupt>
	find_gid -> read_lock_irqsave()
		    ** DEAD LOCK **LOCK

Fixing it by changing the write_lock to write_lock_irqsave.

Squash-into: 'IB/core: Add RoCE GID table management'
Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
 drivers/infiniband/core/cache.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index 58a44bd..d7fdcb7 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -121,15 +121,16 @@ static int write_gid(struct ib_device *ib_dev, u8 port,
 {
 	int ret = 0;
 	struct net_device *old_net_dev;
+	unsigned long flags;
 
 	/* in rdma_cap_roce_gid_table, this funciton should be protected by a
 	 * sleep-able lock.
 	 */
-	write_lock(&table->data_vec[ix].lock);
+	write_lock_irqsave(&table->data_vec[ix].lock, flags);
 
 	if (rdma_cap_roce_gid_table(ib_dev, port)) {
 		table->data_vec[ix].props |= GID_TABLE_ENTRY_INVALID;
-		write_unlock(&table->data_vec[ix].lock);
+		write_unlock_irqrestore(&table->data_vec[ix].lock, flags);
 		/* GID_TABLE_WRITE_ACTION_MODIFY currently isn't supported by
 		 * RoCE providers and thus only updates the cache.
 		 */
@@ -139,7 +140,7 @@ static int write_gid(struct ib_device *ib_dev, u8 port,
 		else if (action == GID_TABLE_WRITE_ACTION_DEL)
 			ret = ib_dev->del_gid(ib_dev, port, ix,
 					      &table->data_vec[ix].context);
-		write_lock(&table->data_vec[ix].lock);
+		write_lock_irqsave(&table->data_vec[ix].lock, flags);
 	}
 
 	old_net_dev = table->data_vec[ix].attr.ndev;
@@ -161,7 +162,7 @@ static int write_gid(struct ib_device *ib_dev, u8 port,
 
 	table->data_vec[ix].props &= ~GID_TABLE_ENTRY_INVALID;
 
-	write_unlock(&table->data_vec[ix].lock);
+	write_unlock_irqrestore(&table->data_vec[ix].lock, flags);
 
 	if (!ret && rdma_cap_roce_gid_table(ib_dev, port)) {
 		struct ib_event event;
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH for-next V1 2/3] IB/core: RoCE GID management separate cleanup and release
       [not found]     ` <1438607342-11964-3-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-08-04  3:10       ` Jason Gunthorpe
       [not found]         ` <20150804031038.GA27627-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2015-08-14 21:56       ` Doug Ledford
  1 sibling, 1 reply; 13+ messages in thread
From: Jason Gunthorpe @ 2015-08-04  3:10 UTC (permalink / raw)
  To: Matan Barak
  Cc: Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Somnath Kotur,
	Haggai Eran, Or Gerlitz

On Mon, Aug 03, 2015 at 04:09:01PM +0300, Matan Barak wrote:

> The release function is called after the device was put.
> Although vendor drivers aren't expected to use IB cache in their
> removal process, we postpone freeing the cache in order to avoid
> possible use-after-free errors.

It isn't so much that they are not expected, it is that they may not
have a choice. A driver cannot tear down things like tasklets and work
queues until after removal finishes, and any of those async things
could call into the core. That is why a driver audit would be so hard..

> @@ -902,9 +925,7 @@ int ib_cache_setup_one(struct ib_device *device)
>  					  (rdma_end_port(device) -
>  					   rdma_start_port(device) + 1),
>  					  GFP_KERNEL);
> -	err = gid_table_setup_one(device);
> -
> -	if (!device->cache.pkey_cache || !device->cache.gid_cache ||
> +	if (!device->cache.pkey_cache ||
>  	    !device->cache.lmc_cache) {
>  		printk(KERN_WARNING "Couldn't allocate cache "
>  		       "for %s\n", device->name);
> @@ -912,6 +933,10 @@ int ib_cache_setup_one(struct ib_device *device)
>  		goto err;
>  	}
>  
> +	err = gid_table_setup_one(device);
> +	if (err)
> +		goto err;
> +
>  	for (p = 0; p <= rdma_end_port(device) - rdma_start_port(device); ++p) {
>  		device->cache.pkey_cache[p] = NULL;
>  		ib_cache_update(device, p + rdma_start_port(device));
> @@ -929,29 +954,46 @@ err_cache:
>  	for (p = 0; p <= rdma_end_port(device) - rdma_start_port(device); ++p)
>  		kfree(device->cache.pkey_cache[p]);
>  
> +	gid_table_cleanup_one(device);
> +	gid_table_release_one(device);
>  err:
>  	kfree(device->cache.pkey_cache);
> -	gid_table_cleanup_one(device);
>  	kfree(device->cache.lmc_cache);

This still seems to double kfree on error..

Pick a scheme and use it consistently, either rely on release to be
called on error via kref-put, or kfree and null, for all the kfress in
release_one.

> +	ib_cache_cleanup_one(device);
> 	ib_device_unregister_sysfs(device);

I didn't check closely, but I suspect the above order should be
swapped, and the matching swap in register. sysfs can legitimately
call into core code, but vice-versa shouldn't happen...

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH for-next V1 2/3] IB/core: RoCE GID management separate cleanup and release
       [not found]         ` <20150804031038.GA27627-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-08-04 12:09           ` Matan Barak
       [not found]             ` <CAAKD3BBESq61-UJJvqm=ni5vrtu8yuNJvC57mWwCpehQSd1k4A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Matan Barak @ 2015-08-04 12:09 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Matan Barak, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Somnath Kotur, Haggai Eran, Or Gerlitz

On Tue, Aug 4, 2015 at 6:10 AM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Mon, Aug 03, 2015 at 04:09:01PM +0300, Matan Barak wrote:
>
>> The release function is called after the device was put.
>> Although vendor drivers aren't expected to use IB cache in their
>> removal process, we postpone freeing the cache in order to avoid
>> possible use-after-free errors.
>
> It isn't so much that they are not expected, it is that they may not
> have a choice. A driver cannot tear down things like tasklets and work
> queues until after removal finishes, and any of those async things
> could call into the core. That is why a driver audit would be so hard..
>

Correct, I'll change this comment to:
    The release function is called after the device was put.
    This is in order to avoid use-after-free errors if the vendor
    driver's teardown code uses IB cache.


>> @@ -902,9 +925,7 @@ int ib_cache_setup_one(struct ib_device *device)
>>                                         (rdma_end_port(device) -
>>                                          rdma_start_port(device) + 1),
>>                                         GFP_KERNEL);
>> -     err = gid_table_setup_one(device);
>> -
>> -     if (!device->cache.pkey_cache || !device->cache.gid_cache ||
>> +     if (!device->cache.pkey_cache ||
>>           !device->cache.lmc_cache) {
>>               printk(KERN_WARNING "Couldn't allocate cache "
>>                      "for %s\n", device->name);
>> @@ -912,6 +933,10 @@ int ib_cache_setup_one(struct ib_device *device)
>>               goto err;
>>       }
>>
>> +     err = gid_table_setup_one(device);
>> +     if (err)
>> +             goto err;
>> +
>>       for (p = 0; p <= rdma_end_port(device) - rdma_start_port(device); ++p) {
>>               device->cache.pkey_cache[p] = NULL;
>>               ib_cache_update(device, p + rdma_start_port(device));
>> @@ -929,29 +954,46 @@ err_cache:
>>       for (p = 0; p <= rdma_end_port(device) - rdma_start_port(device); ++p)
>>               kfree(device->cache.pkey_cache[p]);
>>
>> +     gid_table_cleanup_one(device);
>> +     gid_table_release_one(device);
>>  err:
>>       kfree(device->cache.pkey_cache);
>> -     gid_table_cleanup_one(device);
>>       kfree(device->cache.lmc_cache);
>
> This still seems to double kfree on error..
>
> Pick a scheme and use it consistently, either rely on release to be
> called on error via kref-put, or kfree and null, for all the kfress in
> release_one.
>

I'll switch to kref-put cleanup. That means:
gid_table_setup_one - should only call cleanup and not release in error-flow
ib_cache_setup_one - shouldn't free any allocated memory/release the GID cache
          but just cleanup the GID cache.
ib_cache_release_one - should check if the pkey_cache is NULL before freeing it.


>> +     ib_cache_cleanup_one(device);
>>       ib_device_unregister_sysfs(device);
>
> I didn't check closely, but I suspect the above order should be
> swapped, and the matching swap in register. sysfs can legitimately
> call into core code, but vice-versa shouldn't happen...
>

I didn't understand this comment. The cleanup code calls del_gid which tells the
vendor to delete this GID (and dev_put the ndevs). The kref-put (which
is called when
the device is unregistered) frees the memory. If we switch the order,
we would have
use-after-free scenario.

> Jason

Thanks for looking at this patch.

Matan

> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH for-next V1 2/3] IB/core: RoCE GID management separate cleanup and release
       [not found]             ` <CAAKD3BBESq61-UJJvqm=ni5vrtu8yuNJvC57mWwCpehQSd1k4A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-08-04 16:46               ` Jason Gunthorpe
       [not found]                 ` <20150804164650.GA3858-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Jason Gunthorpe @ 2015-08-04 16:46 UTC (permalink / raw)
  To: Matan Barak
  Cc: Matan Barak, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Somnath Kotur, Haggai Eran, Or Gerlitz

On Tue, Aug 04, 2015 at 03:09:39PM +0300, Matan Barak wrote:
> Correct, I'll change this comment to:
>     The release function is called after the device was put.
>     This is in order to avoid use-after-free errors if the vendor
>     driver's teardown code uses IB cache.

.. the vendor driver uses IB cache from async contexts ..

> >> +     ib_cache_cleanup_one(device);
> >>       ib_device_unregister_sysfs(device);
> >
> > I didn't check closely, but I suspect the above order should be
> > swapped, and the matching swap in register. sysfs can legitimately
> > call into core code, but vice-versa shouldn't happen...
> >
> 
> I didn't understand this comment. The cleanup code calls del_gid
> which tells the vendor to delete this GID (and dev_put the
> ndevs). The kref-put (which is called when the device is
> unregistered) frees the memory. If we switch the order, we would
> have use-after-free scenario.

I don't understand your comment either.

What code path from ib_cache will go into ib_sysfs?

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH for-next V1 2/3] IB/core: RoCE GID management separate cleanup and release
       [not found]                 ` <20150804164650.GA3858-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-08-04 18:55                   ` Matan Barak
       [not found]                     ` <CAAKD3BDDKcyA0xitGpuMkKsr6=9onxFgdVXHE3n-zb=xjX4Uhg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Matan Barak @ 2015-08-04 18:55 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Matan Barak, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Somnath Kotur, Haggai Eran, Or Gerlitz

On Tue, Aug 4, 2015 at 7:46 PM, Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:
> On Tue, Aug 04, 2015 at 03:09:39PM +0300, Matan Barak wrote:
>> Correct, I'll change this comment to:
>>     The release function is called after the device was put.
>>     This is in order to avoid use-after-free errors if the vendor
>>     driver's teardown code uses IB cache.
>
> .. the vendor driver uses IB cache from async contexts ..
>

right (or just from the unregister code path of the vendor driver) :)

>> >> +     ib_cache_cleanup_one(device);
>> >>       ib_device_unregister_sysfs(device);
>> >
>> > I didn't check closely, but I suspect the above order should be
>> > swapped, and the matching swap in register. sysfs can legitimately
>> > call into core code, but vice-versa shouldn't happen...
>> >
>>
>> I didn't understand this comment. The cleanup code calls del_gid
>> which tells the vendor to delete this GID (and dev_put the
>> ndevs). The kref-put (which is called when the device is
>> unregistered) frees the memory. If we switch the order, we would
>> have use-after-free scenario.
>
> I don't understand your comment either.
>
> What code path from ib_cache will go into ib_sysfs?
>

The device code goes to ib_sysfs and ib_cache. I was rethinking this
and I think it still might be a bit problematic.
The ib_register_device error flow could fail either in
ib_device_register_sysfs or ib_cache_setup_one.
If it fails in ib_device_register_sysfs, the device release function
isn't called and the device pointer isn't freed.
If ib_cache_setup_one fails, we call ib_device_unregister_sysfs and
the memory will be freed. So it seems like
ib_device_register_sysfs should be the last call in
ib_reigster_device. But in the ib_unregister_device path, I still
need to first call ib_cache_cleanup_once and then call
ib_device_unregister_sysfs (otherwise I'll have use-after-free).
What do you think?

> Jason

Thanks again for the helpful comments.

Matan
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH for-next V1 2/3] IB/core: RoCE GID management separate cleanup and release
       [not found]                     ` <CAAKD3BDDKcyA0xitGpuMkKsr6=9onxFgdVXHE3n-zb=xjX4Uhg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-08-04 21:23                       ` Jason Gunthorpe
       [not found]                         ` <20150804212334.GB10934-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Jason Gunthorpe @ 2015-08-04 21:23 UTC (permalink / raw)
  To: Matan Barak
  Cc: Matan Barak, Doug Ledford, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Somnath Kotur, Haggai Eran, Or Gerlitz

On Tue, Aug 04, 2015 at 09:55:03PM +0300, Matan Barak wrote:

> If it fails in ib_device_register_sysfs, the device release function
> isn't called and the device pointer isn't freed.

Ugh, yes, the abuse in ib_dealloc_device needs to go too.

Attached is a compile tested patch that fixes that up. Feel free to
fix it up as necessary. It should be sequenced before your 'Add RoCE
GID table management'.. It would probably be helpful for doug to
resend that one patch adjusted.

> If ib_cache_setup_one fails, we call ib_device_unregister_sysfs and
> the memory will be freed.

?? ib_device_unregister_sysfs doesn't free memory..

At the driver the sequence should be:
 ib_alloc_device
 ib_register_device
 ib_unregister_device
 ib_dealloc_device

The issue I'm looking (which I suspected before, but just went and
confirmed) at is that sysfs's show_port_gid calls ib_query_gid which
your series now makes use the cache, so sysfs should ideally be shut
off before the cache stuff.

.. and we must setup the cache before setting up sysfs, otherwise
there is a race where userspace can trigger a cache call before
setup.. (null deref?)

> ib_device_unregister_sysfs (otherwise I'll have use-after-free).
> What do you think?

I'm still not sure what you are seeing..

You might also want the cache code to have an entry from
ib_alloc_device to setup locks and other no-fail initalization
things. AFIAK there isn't a strong reason to do this other than to
keep with the basic idiom.

Jason

>From 32bafdef61a9e571ef1457f7f1966a7372d1b8d7 Mon Sep 17 00:00:00 2001
From: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
Date: Tue, 4 Aug 2015 15:11:41 -0600
Subject: [PATCH] IB/core: Make ib_alloc_device initialize the kobject

This gets rid of the weird in-between state where struct ib_device
was allocated but the kobject didn't work.

Consequently ib_device_release is now guaranteed to be called in
all situations and we can't duplicate its kfrees on error paths.
Choose to just drop kfree(port_immutable)

Signed-off-by: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
---
 drivers/infiniband/core/core_priv.h |  3 --
 drivers/infiniband/core/device.c    | 92 ++++++++++++++++++++++++-------------
 drivers/infiniband/core/sysfs.c     | 51 ++------------------
 3 files changed, 65 insertions(+), 81 deletions(-)

diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
index 87d1936f5c1c..950583a62e3b 100644
--- a/drivers/infiniband/core/core_priv.h
+++ b/drivers/infiniband/core/core_priv.h
@@ -43,9 +43,6 @@ int  ib_device_register_sysfs(struct ib_device *device,
 						   u8, struct kobject *));
 void ib_device_unregister_sysfs(struct ib_device *device);
 
-int  ib_sysfs_setup(void);
-void ib_sysfs_cleanup(void);
-
 int  ib_cache_setup(void);
 void ib_cache_cleanup(void);
 
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 9567756ca4f9..0a2c28610026 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -152,6 +152,35 @@ static int alloc_name(char *name)
 	return 0;
 }
 
+static void ib_device_release(struct device *device)
+{
+	struct ib_device *dev = container_of(device, struct ib_device, dev);
+
+	kfree(dev->port_immutable);
+	kfree(dev);
+}
+
+static int ib_device_uevent(struct device *device,
+			    struct kobj_uevent_env *env)
+{
+	struct ib_device *dev = container_of(device, struct ib_device, dev);
+
+	if (add_uevent_var(env, "NAME=%s", dev->name))
+		return -ENOMEM;
+
+	/*
+	 * It would be nice to pass the node GUID with the event...
+	 */
+
+	return 0;
+}
+
+static struct class ib_class = {
+	.name    = "infiniband",
+	.dev_release = ib_device_release,
+	.dev_uevent = ib_device_uevent,
+};
+
 /**
  * ib_alloc_device - allocate an IB device struct
  * @size:size of structure to allocate
@@ -164,9 +193,27 @@ static int alloc_name(char *name)
  */
 struct ib_device *ib_alloc_device(size_t size)
 {
-	BUG_ON(size < sizeof (struct ib_device));
+	struct ib_device *device;
+
+	if (WARN_ON(size < sizeof(struct ib_device)))
+		return NULL;
+
+	device = kzalloc(size, GFP_KERNEL);
+	if (!device)
+		return NULL;
+
+	device->dev.class = &ib_class;
+	device_initialize(&device->dev);
+
+	dev_set_drvdata(&device->dev, device);
+
+	INIT_LIST_HEAD(&device->event_handler_list);
+	spin_lock_init(&device->event_handler_lock);
+	spin_lock_init(&device->client_data_lock);
+	INIT_LIST_HEAD(&device->client_data_list);
+	INIT_LIST_HEAD(&device->port_list);
 
-	return kzalloc(size, GFP_KERNEL);
+	return device;
 }
 EXPORT_SYMBOL(ib_alloc_device);
 
@@ -178,13 +225,8 @@ EXPORT_SYMBOL(ib_alloc_device);
  */
 void ib_dealloc_device(struct ib_device *device)
 {
-	if (device->reg_state == IB_DEV_UNINITIALIZED) {
-		kfree(device);
-		return;
-	}
-
-	BUG_ON(device->reg_state != IB_DEV_UNREGISTERED);
-
+	WARN_ON(device->reg_state != IB_DEV_UNREGISTERED &&
+		device->reg_state != IB_DEV_UNINITIALIZED);
 	kobject_put(&device->dev.kobj);
 }
 EXPORT_SYMBOL(ib_dealloc_device);
@@ -219,7 +261,7 @@ static int verify_immutable(const struct ib_device *dev, u8 port)
 
 static int read_port_immutable(struct ib_device *device)
 {
-	int ret = -ENOMEM;
+	int ret;
 	u8 start_port = rdma_start_port(device);
 	u8 end_port = rdma_end_port(device);
 	u8 port;
@@ -235,26 +277,18 @@ static int read_port_immutable(struct ib_device *device)
 					 * (end_port + 1),
 					 GFP_KERNEL);
 	if (!device->port_immutable)
-		goto err;
+		return -ENOMEM;
 
 	for (port = start_port; port <= end_port; ++port) {
 		ret = device->get_port_immutable(device, port,
 						 &device->port_immutable[port]);
 		if (ret)
-			goto err;
+			return ret;
 
-		if (verify_immutable(device, port)) {
-			ret = -EINVAL;
-			goto err;
-		}
+		if (verify_immutable(device, port))
+			return -EINVAL;
 	}
-
-	ret = 0;
-	goto out;
-err:
-	kfree(device->port_immutable);
-out:
-	return ret;
+	return 0;
 }
 
 /**
@@ -285,11 +319,6 @@ int ib_register_device(struct ib_device *device,
 		goto out;
 	}
 
-	INIT_LIST_HEAD(&device->event_handler_list);
-	INIT_LIST_HEAD(&device->client_data_list);
-	spin_lock_init(&device->event_handler_lock);
-	spin_lock_init(&device->client_data_lock);
-
 	ret = read_port_immutable(device);
 	if (ret) {
 		printk(KERN_WARNING "Couldn't create per port immutable data %s\n",
@@ -301,7 +330,6 @@ int ib_register_device(struct ib_device *device,
 	if (ret) {
 		printk(KERN_WARNING "Couldn't register device %s with driver model\n",
 		       device->name);
-		kfree(device->port_immutable);
 		goto out;
 	}
 
@@ -737,7 +765,7 @@ static int __init ib_core_init(void)
 	if (!ib_wq)
 		return -ENOMEM;
 
-	ret = ib_sysfs_setup();
+	ret = class_register(&ib_class);
 	if (ret) {
 		printk(KERN_WARNING "Couldn't create InfiniBand device class\n");
 		goto err;
@@ -761,7 +789,7 @@ err_nl:
 	ibnl_cleanup();
 
 err_sysfs:
-	ib_sysfs_cleanup();
+	class_unregister(&ib_class);
 
 err:
 	destroy_workqueue(ib_wq);
@@ -772,7 +800,7 @@ static void __exit ib_core_cleanup(void)
 {
 	ib_cache_cleanup();
 	ibnl_cleanup();
-	ib_sysfs_cleanup();
+	class_unregister(&ib_class);
 	/* Make sure that any pending umem accounting work is done. */
 	destroy_workqueue(ib_wq);
 }
diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
index 0b84a9cdfe5b..34cdd74b0a17 100644
--- a/drivers/infiniband/core/sysfs.c
+++ b/drivers/infiniband/core/sysfs.c
@@ -457,29 +457,6 @@ static struct kobj_type port_type = {
 	.default_attrs = port_default_attrs
 };
 
-static void ib_device_release(struct device *device)
-{
-	struct ib_device *dev = container_of(device, struct ib_device, dev);
-
-	kfree(dev->port_immutable);
-	kfree(dev);
-}
-
-static int ib_device_uevent(struct device *device,
-			    struct kobj_uevent_env *env)
-{
-	struct ib_device *dev = container_of(device, struct ib_device, dev);
-
-	if (add_uevent_var(env, "NAME=%s", dev->name))
-		return -ENOMEM;
-
-	/*
-	 * It would be nice to pass the node GUID with the event...
-	 */
-
-	return 0;
-}
-
 static struct attribute **
 alloc_group_attrs(ssize_t (*show)(struct ib_port *,
 				  struct port_attribute *, char *buf),
@@ -702,12 +679,6 @@ static struct device_attribute *ib_class_attributes[] = {
 	&dev_attr_node_desc
 };
 
-static struct class ib_class = {
-	.name    = "infiniband",
-	.dev_release = ib_device_release,
-	.dev_uevent = ib_device_uevent,
-};
-
 /* Show a given an attribute in the statistics group */
 static ssize_t show_protocol_stat(const struct device *device,
 			    struct device_attribute *attr, char *buf,
@@ -846,14 +817,12 @@ int ib_device_register_sysfs(struct ib_device *device,
 	int ret;
 	int i;
 
-	class_dev->class      = &ib_class;
-	class_dev->parent     = device->dma_device;
-	dev_set_name(class_dev, "%s", device->name);
-	dev_set_drvdata(class_dev, device);
-
-	INIT_LIST_HEAD(&device->port_list);
+	device->dev.parent = device->dma_device;
+	ret = dev_set_name(class_dev, "%s", device->name);
+	if (ret)
+		return ret;
 
-	ret = device_register(class_dev);
+	ret = device_add(class_dev);
 	if (ret)
 		goto err;
 
@@ -916,13 +885,3 @@ void ib_device_unregister_sysfs(struct ib_device *device)
 
 	device_unregister(&device->dev);
 }
-
-int ib_sysfs_setup(void)
-{
-	return class_register(&ib_class);
-}
-
-void ib_sysfs_cleanup(void)
-{
-	class_unregister(&ib_class);
-}
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH for-next V1 1/3] IB/core: Access to one past end of array in _gid_table_setup_one
       [not found]     ` <1438607342-11964-2-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-08-14 21:04       ` Doug Ledford
  0 siblings, 0 replies; 13+ messages in thread
From: Doug Ledford @ 2015-08-14 21:04 UTC (permalink / raw)
  To: Matan Barak
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jason Gunthorpe, Somnath Kotur,
	Haggai Eran, Or Gerlitz

[-- Attachment #1: Type: text/plain, Size: 431 bytes --]

On 08/03/2015 09:09 AM, Matan Barak wrote:
> The error flow of _gid_table_setup_one accessed table[port] when
> port was 1-based instead of 0-based.
> 
> Squash-into: 'IB/core: Add RoCE GID table management'
> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Squashed as requested.


-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH for-next V1 3/3] IB/core: Fix possible deadlock in write_gid
       [not found]     ` <1438607342-11964-4-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2015-08-14 21:16       ` Doug Ledford
  0 siblings, 0 replies; 13+ messages in thread
From: Doug Ledford @ 2015-08-14 21:16 UTC (permalink / raw)
  To: Matan Barak
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jason Gunthorpe, Somnath Kotur,
	Haggai Eran, Or Gerlitz

[-- Attachment #1: Type: text/plain, Size: 545 bytes --]

On 08/03/2015 09:09 AM, Matan Barak wrote:
> Fixing the possible deadlock in write_gid:
> write_gid -> write_lock()
> <interrupt>
> 	find_gid -> read_lock_irqsave()
> 		    ** DEAD LOCK **LOCK
> 
> Fixing it by changing the write_lock to write_lock_irqsave.
> 
> Squash-into: 'IB/core: Add RoCE GID table management'
> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

Squashed as requested.


-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH for-next V1 2/3] IB/core: RoCE GID management separate cleanup and release
       [not found]                         ` <20150804212334.GB10934-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-08-14 21:49                           ` Doug Ledford
  0 siblings, 0 replies; 13+ messages in thread
From: Doug Ledford @ 2015-08-14 21:49 UTC (permalink / raw)
  To: Jason Gunthorpe, Matan Barak
  Cc: Matan Barak, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Somnath Kotur,
	Haggai Eran, Or Gerlitz

[-- Attachment #1: Type: text/plain, Size: 2974 bytes --]

On 08/04/2015 05:23 PM, Jason Gunthorpe wrote:
> On Tue, Aug 04, 2015 at 09:55:03PM +0300, Matan Barak wrote:
> 
>> If it fails in ib_device_register_sysfs, the device release function
>> isn't called and the device pointer isn't freed.
> 
> Ugh, yes, the abuse in ib_dealloc_device needs to go too.
> 
> Attached is a compile tested patch that fixes that up. Feel free to
> fix it up as necessary. It should be sequenced before your 'Add RoCE
> GID table management'.. It would probably be helpful for doug to
> resend that one patch adjusted.
> 
>> If ib_cache_setup_one fails, we call ib_device_unregister_sysfs and
>> the memory will be freed.
> 
> ?? ib_device_unregister_sysfs doesn't free memory..
> 
> At the driver the sequence should be:
>  ib_alloc_device
>  ib_register_device
>  ib_unregister_device
>  ib_dealloc_device
> 
> The issue I'm looking (which I suspected before, but just went and
> confirmed) at is that sysfs's show_port_gid calls ib_query_gid which
> your series now makes use the cache, so sysfs should ideally be shut
> off before the cache stuff.
> 
> .. and we must setup the cache before setting up sysfs, otherwise
> there is a race where userspace can trigger a cache call before
> setup.. (null deref?)
> 
>> ib_device_unregister_sysfs (otherwise I'll have use-after-free).
>> What do you think?
> 
> I'm still not sure what you are seeing..
> 
> You might also want the cache code to have an entry from
> ib_alloc_device to setup locks and other no-fail initalization
> things. AFIAK there isn't a strong reason to do this other than to
> keep with the basic idiom.
> 
> Jason
> 
> From 32bafdef61a9e571ef1457f7f1966a7372d1b8d7 Mon Sep 17 00:00:00 2001
> From: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
> Date: Tue, 4 Aug 2015 15:11:41 -0600
> Subject: [PATCH] IB/core: Make ib_alloc_device initialize the kobject
> 
> This gets rid of the weird in-between state where struct ib_device
> was allocated but the kobject didn't work.
> 
> Consequently ib_device_release is now guaranteed to be called in
> all situations and we can't duplicate its kfrees on error paths.
> Choose to just drop kfree(port_immutable)
> 
> Signed-off-by: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>

I reworded the commit message some, but the patch was good and makes
sense.  I've inserted it into the GID series and, as expected, it caused
failures in the original "IB/core: Add RoCE GID table management" patch.
 However, the fixups for those failures seem pretty obvious to me, so I
made them myself.  When I finally push this, it will be in this branch
on my github repo:

v4.2-rc6-based/gid-table-v7

You and Matan might want to double check that I fixed things up properly
and didn't miss something.

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH for-next V1 2/3] IB/core: RoCE GID management separate cleanup and release
       [not found]     ` <1438607342-11964-3-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2015-08-04  3:10       ` Jason Gunthorpe
@ 2015-08-14 21:56       ` Doug Ledford
  1 sibling, 0 replies; 13+ messages in thread
From: Doug Ledford @ 2015-08-14 21:56 UTC (permalink / raw)
  To: Matan Barak
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Jason Gunthorpe, Somnath Kotur,
	Haggai Eran, Or Gerlitz

[-- Attachment #1: Type: text/plain, Size: 926 bytes --]

On 08/03/2015 09:09 AM, Matan Barak wrote:
> Separate the cleanup and release IB cache functions.
> The cleanup function delete all GIDs and unregister the event channel,
> while the release function frees all memory. The cleanup function is
> called after all clients were removed (in the device un-registration
> process).
> 
> The release function is called after the device was put.
> Although vendor drivers aren't expected to use IB cache in their
> removal process, we postpone freeing the cache in order to avoid
> possible use-after-free errors.
> 
> Squash-into: 'IB/core: Add RoCE GID table management'
> Signed-off-by: Matan Barak <matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

I squashed this after applying Jason's follow-up patch first in the
series and fixes up the conflicts.


-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 884 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2015-08-14 21:56 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-08-03 13:08 [PATCH for-next V1 0/3] RoCE GID management fixes Matan Barak
     [not found] ` <1438607342-11964-1-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-08-03 13:09   ` [PATCH for-next V1 1/3] IB/core: Access to one past end of array in _gid_table_setup_one Matan Barak
     [not found]     ` <1438607342-11964-2-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-08-14 21:04       ` Doug Ledford
2015-08-03 13:09   ` [PATCH for-next V1 2/3] IB/core: RoCE GID management separate cleanup and release Matan Barak
     [not found]     ` <1438607342-11964-3-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-08-04  3:10       ` Jason Gunthorpe
     [not found]         ` <20150804031038.GA27627-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-08-04 12:09           ` Matan Barak
     [not found]             ` <CAAKD3BBESq61-UJJvqm=ni5vrtu8yuNJvC57mWwCpehQSd1k4A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-08-04 16:46               ` Jason Gunthorpe
     [not found]                 ` <20150804164650.GA3858-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-08-04 18:55                   ` Matan Barak
     [not found]                     ` <CAAKD3BDDKcyA0xitGpuMkKsr6=9onxFgdVXHE3n-zb=xjX4Uhg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-08-04 21:23                       ` Jason Gunthorpe
     [not found]                         ` <20150804212334.GB10934-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-08-14 21:49                           ` Doug Ledford
2015-08-14 21:56       ` Doug Ledford
2015-08-03 13:09   ` [PATCH for-next V1 3/3] IB/core: Fix possible deadlock in write_gid Matan Barak
     [not found]     ` <1438607342-11964-4-git-send-email-matanb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-08-14 21:16       ` Doug Ledford

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).