Linux ATA/IDE development
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: Dan Williams <dan.j.williams@intel.com>
Cc: linux-scsi@vger.kernel.org, linux-ide@vger.kernel.org
Subject: Re: [libsas PATCH v12 04/11] sysfs: handle 'parent deleted before child added'
Date: Thu, 22 Mar 2012 07:39:59 -0700
Message-ID: <20120322143959.GF19835@kroah.com>
In-Reply-To: <20120322063214.22036.77957.stgit@dwillia2-linux.jf.intel.com>

On Wed, Mar 21, 2012 at 11:32:14PM -0700, Dan Williams wrote:
> In scsi at least two cases of the parent device being deleted before the
> child is added have been observed.
> 
> 1/ scsi is performing async scans and the device is removed prior to the
>    async scan thread running.
> 
> 2/ libsas discovery event running after the parent port has been torn
>    down.
> 
> These result in crash signatures like:
>  BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
>  IP: [<ffffffff8115e100>] sysfs_create_dir+0x32/0xb6
>  ...
>  Process scsi_scan_8 (pid: 5417, threadinfo ffff88080bd16000, task ffff880801b8a0b0)
>  Stack:
>   00000000fffffffe ffff880813470628 ffff88080bd17cd0 ffff88080614b7e8
>   ffff88080b45c108 00000000fffffffe ffff88080bd17d20 ffffffff8125e4a8
>   ffff88080bd17cf0 ffffffff81075149 ffff88080bd17d30 ffff88080614b7e8
>  Call Trace:
>   [<ffffffff8125e4a8>] kobject_add_internal+0x120/0x1e3
>   [<ffffffff81075149>] ? trace_hardirqs_on+0xd/0xf
>   [<ffffffff8125e641>] kobject_add_varg+0x41/0x50
>   [<ffffffff8125e70b>] kobject_add+0x64/0x66
>   [<ffffffff8131122b>] device_add+0x12d/0x63a
> 
> These scenarios need to be cleaned up, but in the meantime the system
> need not crash if this ordering occurs.  Instead report:
> 
>  kobject_add_internal failed for target8:0:16 (error: -2 parent: end_device-8:0:24)
>  Pid: 2942, comm: scsi_scan_8 Not tainted 3.3.0-rc7-isci+ #2
>  Call Trace:
>   [<ffffffff8125e551>] kobject_add_internal+0x1c1/0x1f3
>   [<ffffffff81075149>] ? trace_hardirqs_on+0xd/0xf
>   [<ffffffff8125e659>] kobject_add_varg+0x41/0x50
>   [<ffffffff8125e723>] kobject_add+0x64/0x66
>   [<ffffffff8131124b>] device_add+0x12d/0x63a
>   [<ffffffff8125e0ef>] ? kobject_put+0x4c/0x50
>   [<ffffffff8132f370>] scsi_sysfs_add_sdev+0x4e/0x28a
>   [<ffffffff8132dce3>] do_scan_async+0x9c/0x145
> 
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  fs/sysfs/dir.c |    3 +++
>  lib/kobject.c  |    7 ++++---
>  2 files changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
> index 7fdf6a7..86521ee 100644
> --- a/fs/sysfs/dir.c
> +++ b/fs/sysfs/dir.c
> @@ -714,6 +714,9 @@ int sysfs_create_dir(struct kobject * kobj)
>  	else
>  		parent_sd = &sysfs_root;
>  
> +	if (!parent_sd)
> +		return -ENOENT;
> +
>  	if (sysfs_ns_type(parent_sd))
>  		ns = kobj->ktype->namespace(kobj);
>  	type = sysfs_read_ns_type(kobj);

So what happens if this is true?  Does this patch fix the oops?  On which
kernels has this problem been seen, and which should this be applied to?

> diff --git a/lib/kobject.c b/lib/kobject.c
> index c33d7a1..e5f86c0 100644
> --- a/lib/kobject.c
> +++ b/lib/kobject.c
> @@ -192,13 +192,14 @@ static int kobject_add_internal(struct kobject *kobj)
>  
>  		/* be noisy on error issues */
>  		if (error == -EEXIST)
> -			printk(KERN_ERR "%s failed for %s with "
> +			pr_err("%s failed for %s with "
>  			       "-EEXIST, don't try to register things with "
>  			       "the same name in the same directory.\n",
>  			       __func__, kobject_name(kobj));
>  		else
> -			printk(KERN_ERR "%s failed for %s (%d)\n",
> -			       __func__, kobject_name(kobj), error);
> +			pr_err("%s failed for %s (error: %d parent: %s)\n",
> +			       __func__, kobject_name(kobj), error,
> +			       parent ? kobject_name(parent) : "'none'");
>  		dump_stack();
>  	} else
>  		kobj->state_in_sysfs = 1;

These changes have nothing to do with the above fix, so why include them
here?

And note, I hate pr_err(), what's wrong with printk() in this instance?

greg k-h
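
For readers following the question about the first hunk, here is a minimal
user-space sketch of the behavior the patch describes. It is not the kernel
code: struct kobj_sketch, struct dirent_sketch, create_dir_sketch() and
format_failure_sketch() are hypothetical stand-ins invented for illustration.
It assumes that a parent whose sysfs directory has already been removed is
left with a NULL directory pointer, which is the condition the new check
tests for, and it reuses the message format from the second hunk.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the sysfs directory entry. */
struct dirent_sketch {
	const char *name;
};

/* Hypothetical stand-in for a kobject; sd is NULL when no sysfs dir exists. */
struct kobj_sketch {
	const char *name;
	struct kobj_sketch *parent;
	struct dirent_sketch *sd;
};

static struct dirent_sketch root_sketch = { "/" };

/*
 * Mirrors the shape of the patched path: pick the parent directory,
 * then bail out with -ENOENT instead of dereferencing a NULL parent.
 */
static int create_dir_sketch(struct kobj_sketch *kobj,
			     struct dirent_sketch *new_sd)
{
	struct dirent_sketch *parent_sd;

	if (kobj->parent)
		parent_sd = kobj->parent->sd;
	else
		parent_sd = &root_sketch;

	if (!parent_sd)
		return -ENOENT;	/* parent deleted before child added */

	kobj->sd = new_sd;
	return 0;
}

/* Builds the failure message in the format proposed by the second hunk. */
static int format_failure_sketch(char *buf, size_t len,
				 const struct kobj_sketch *kobj, int error)
{
	return snprintf(buf, len, "%s failed for %s (error: %d parent: %s)",
			"kobject_add_internal", kobj->name, error,
			kobj->parent ? kobj->parent->name : "'none'");
}
```

In this sketch a child whose parent has sd == NULL gets -ENOENT back from
create_dir_sketch(), and the caller can report the parent's name rather
than oopsing. Whether the real parent's sd is reliably NULL in both
scenarios from the changelog is exactly what the questions above ask.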
