public inbox for linux-kernel@vger.kernel.org
* [PATCH] staging: lustre: add error handling for try_module_get
From: Zhouyang Jia @ 2018-06-12  4:49 UTC
  Cc: Zhouyang Jia, Oleg Drokin, Andreas Dilger, James Simmons,
	Greg Kroah-Hartman, NeilBrown, Haneen Mohammed, Al Viro,
	Gustavo A. R. Silva, lustre-devel, devel, linux-kernel

When try_module_get fails, the lack of error-handling code may
cause unexpected results.

This patch adds error-handling code after calling try_module_get.

Signed-off-by: Zhouyang Jia <jiazhouyang09@gmail.com>
---
 drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
index 7086678..72a42bd 100644
--- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
+++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
@@ -2422,7 +2422,10 @@ ksocknal_base_startup(void)
 
 	/* flag lists/ptrs/locks initialised */
 	ksocknal_data.ksnd_init = SOCKNAL_INIT_DATA;
-	try_module_get(THIS_MODULE);
+	if (!try_module_get(THIS_MODULE)) {
+		CERROR("%s: cannot get module\n", __func__);
+		goto failed;
+	}
 
 	ksocknal_data.ksnd_sched_info = cfs_percpt_alloc(lnet_cpt_table(),
 							 sizeof(*info));
-- 
2.7.4



* Re: [PATCH] staging: lustre: add error handling for try_module_get
From: NeilBrown @ 2018-06-12  5:26 UTC
  To: Zhouyang Jia
  Cc: Zhouyang Jia, Oleg Drokin, Andreas Dilger, James Simmons,
	Greg Kroah-Hartman, Haneen Mohammed, Al Viro, Gustavo A. R. Silva,
	lustre-devel, devel, linux-kernel

On Tue, Jun 12 2018, Zhouyang Jia wrote:

> When try_module_get fails, the lack of error-handling code may
> cause unexpected results.
>
> This patch adds error-handling code after calling try_module_get.
>
> Signed-off-by: Zhouyang Jia <jiazhouyang09@gmail.com>
> ---
>  drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
> index 7086678..72a42bd 100644
> --- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
> +++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
> @@ -2422,7 +2422,10 @@ ksocknal_base_startup(void)
>  
>  	/* flag lists/ptrs/locks initialised */
>  	ksocknal_data.ksnd_init = SOCKNAL_INIT_DATA;
> -	try_module_get(THIS_MODULE);
> +	if (!try_module_get(THIS_MODULE)) {
> +		CERROR("%s: cannot get module\n", __func__);
> +		goto failed;
> +	}
>  
>  	ksocknal_data.ksnd_sched_info = cfs_percpt_alloc(lnet_cpt_table(),
>  							 sizeof(*info));

Thanks for the patch.
I agree that this is probably a bug, but the code is still buggy after
your patch, just in a different way.
Try following through the code and see what happens when you 'goto
failed'.

NeilBrown




* Re: [PATCH] staging: lustre: add error handling for try_module_get
From: Greg Kroah-Hartman @ 2018-06-12  6:31 UTC
  To: Zhouyang Jia
  Cc: Oleg Drokin, Andreas Dilger, James Simmons, NeilBrown,
	Haneen Mohammed, Al Viro, Gustavo A. R. Silva, lustre-devel,
	devel, linux-kernel

On Tue, Jun 12, 2018 at 12:49:26PM +0800, Zhouyang Jia wrote:
> When try_module_get fails, the lack of error-handling code may
> cause unexpected results.
> 
> This patch adds error-handling code after calling try_module_get.
> 
> Signed-off-by: Zhouyang Jia <jiazhouyang09@gmail.com>
> ---
>  drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c | 5 ++++-

This patch does not apply to Linus's tree.  Always be sure to work
against linux-next to catch things like this.

thanks,

greg k-h


* RE: [PATCH] staging: lustre: add error handling for try_module_get
From: David Laight @ 2018-06-13 10:53 UTC
  To: 'Zhouyang Jia'
  Cc: Oleg Drokin, Andreas Dilger, James Simmons, Greg Kroah-Hartman,
	NeilBrown, Haneen Mohammed, Al Viro, Gustavo A. R. Silva,
	lustre-devel@lists.lustre.org, devel@driverdev.osuosl.org,
	linux-kernel@vger.kernel.org

From: Zhouyang Jia
> Sent: 12 June 2018 05:49
> 
> When try_module_get fails, the lack of error-handling code may
> cause unexpected results.
> 
> This patch adds error-handling code after calling try_module_get.
...
> +++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
> @@ -2422,7 +2422,10 @@ ksocknal_base_startup(void)
> 
>  	/* flag lists/ptrs/locks initialised */
>  	ksocknal_data.ksnd_init = SOCKNAL_INIT_DATA;
> -	try_module_get(THIS_MODULE);
> +	if (!try_module_get(THIS_MODULE)) {
> +		CERROR("%s: cannot get module\n", __func__);
> +		goto failed;
> +	}


Can try_module_get(THIS_MODULE) ever fail?
Since you are running code in 'THIS_MODULE' the caller must have a
reference that can't go away.
So try_module_get() just increments the count that is already greater
than zero.

Similarly module_put(THIS_MODULE) must never be able to release the
last reference.
Any such calls that aren't in error paths after try_module_get() are
probably buggy.

	David



* RE: [PATCH] staging: lustre: add error handling for try_module_get
From: NeilBrown @ 2018-06-13 12:02 UTC
  To: David Laight, 'Zhouyang Jia'
  Cc: Oleg Drokin, Andreas Dilger, James Simmons, Greg Kroah-Hartman,
	Haneen Mohammed, Al Viro, Gustavo A. R. Silva,
	lustre-devel@lists.lustre.org, devel@driverdev.osuosl.org,
	linux-kernel@vger.kernel.org

On Wed, Jun 13 2018, David Laight wrote:

> From: Zhouyang Jia
>> Sent: 12 June 2018 05:49
>> 
>> When try_module_get fails, the lack of error-handling code may
>> cause unexpected results.
>> 
>> This patch adds error-handling code after calling try_module_get.
> ...
>> +++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
>> @@ -2422,7 +2422,10 @@ ksocknal_base_startup(void)
>> 
>>  	/* flag lists/ptrs/locks initialised */
>>  	ksocknal_data.ksnd_init = SOCKNAL_INIT_DATA;
>> -	try_module_get(THIS_MODULE);
>> +	if (!try_module_get(THIS_MODULE)) {
>> +		CERROR("%s: cannot get module\n", __func__);
>> +		goto failed;
>> +	}
>
>
> Can try_module_get(THIS_MODULE) ever fail?

Yes.

> Since you are running code in 'THIS_MODULE' the caller must have a
> reference that can't go away.

Not necessarily, though it does usually work that way.

try_module_get() can fail while the exit function is running, but it is
safe to run code in the module until the exit function completes.
So if the exit function takes a lock, then other code can safely run
code in the module while holding the lock, but not holding a reference
to the module.  If this code calls try_module_get(), it could fail.

That is exactly what is happening here.
ksocklnd_exit() calls lnet_unregister_lnd(), which takes
the_lnet.ln_lnd_mutex.

ksocknal_base_startup() is called from ksocknal_startup(),
which is the_ksocklnd.lnd_startup and is called from
lnet_startup_lndni() with that lock held.
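
To make the scenario above concrete, here is a minimal user-space sketch.
It is not kernel code and not the real try_module_get() implementation;
every name in it (model_module, model_try_get, model_put,
model_base_startup, lnd_mutex_held) is hypothetical, and the only
behaviour it assumes is the one described above: the get fails once the
module has started unloading, even though code in the module is still
running safely under a lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a module; this is not struct module. */
enum model_state { MODEL_LIVE, MODEL_GOING };

struct model_module {
	enum model_state state;	/* MODEL_GOING once unloading has begun */
	int refcnt;		/* references currently held */
	bool lnd_mutex_held;	/* stands in for the_lnet.ln_lnd_mutex */
};

/* Assumed behaviour: refuse new references once unloading has begun. */
static bool model_try_get(struct model_module *m)
{
	if (m->state != MODEL_LIVE)
		return false;
	m->refcnt++;
	return true;
}

static void model_put(struct model_module *m)
{
	assert(m->refcnt > 0);	/* a put without a matching get is a bug */
	m->refcnt--;
}

/*
 * Startup path: runs with the lock held but without a module reference,
 * so the get can legitimately fail here.
 */
static int model_base_startup(struct model_module *m)
{
	assert(m->lnd_mutex_held);
	if (!model_try_get(m))
		return -1;	/* nothing was taken, so nothing may be put */
	return 0;
}
```

With a live module, model_base_startup() returns 0 and the count goes to 1.
Once the state is MODEL_GOING it returns -1 and the count stays at 0, so
any cleanup that unconditionally calls model_put() would trip the assert.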

> So try_module_get() just increments the count that is already greater
> than zero.
>
> Similarly module_put(THIS_MODULE) must never be able to release the
> last reference.

It can if a suitable lock is held.

> Any such calls that aren't in error paths after try_module_get() are
> probably buggy.

Being in an error path doesn't make it safe.
module_put(THIS_MODULE) can only be safe if a lock is held which
prevents the exit function from completing.  Some code outside the
module must release the lock.

Having said that, I don't really like this approach.  I much prefer for
the module reference to be taken and put outside of the module - it
seems less error-prone.
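
As a rough illustration of that preference, here is a user-space sketch
in which the core, not the driver, takes and drops the reference through
an owner pointer. It is not kernel code and not how lnet is written
today; all names (model_module, model_driver, core_start, core_stop,
demo_startup, demo_shutdown) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are kernel types. */
struct model_module {
	bool live;	/* false once unloading has begun */
	int refcnt;	/* references currently held */
};

struct model_driver {
	struct model_module *owner;	/* who must be pinned to call us */
	int (*startup)(void);
	void (*shutdown)(void);
};

static bool model_try_get(struct model_module *m)
{
	if (!m->live)
		return false;
	m->refcnt++;
	return true;
}

static void model_put(struct model_module *m)
{
	assert(m->refcnt > 0);
	m->refcnt--;
}

/* The core pins the owner before calling into the driver at all. */
static int core_start(struct model_driver *d)
{
	int rc;

	if (!model_try_get(d->owner))
		return -1;		/* owner is unloading: do not call it */
	rc = d->startup();
	if (rc != 0)
		model_put(d->owner);	/* startup failed: drop the pin */
	return rc;
}

/* The core drops the pin only after the driver has finished shutting down. */
static void core_stop(struct model_driver *d)
{
	d->shutdown();
	model_put(d->owner);
}

/* Trivial driver callbacks so the sketch can be exercised. */
static int demo_startup(void) { return 0; }
static void demo_shutdown(void) { }
```

The driver callbacks never touch the count, so each get has exactly one
matching put and both live in the same place.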

NeilBrown
