From: Jiang Liu <liuj97@gmail.com>
To: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Jiang Liu <jiang.liu@huawei.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Yinghai Lu <yinghai@kernel.org>,
	"Alexander E . Patrakov" <patrakov@gmail.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Yijing Wang <wangyijing@huawei.com>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	stable@vger.kernel.org
Subject: Re: [BUGFIX 2/9] ACPIPHP: fix device destroying order issue when handling dock notification
Date: Fri, 14 Jun 2013 21:53:57 +0800
Message-ID: <51BB2075.5040600@gmail.com>
In-Reply-To: <2448481.HbOiE9Npmq@vostro.rjw.lan>

On 06/14/2013 03:59 AM, Rafael J. Wysocki wrote:
> On Friday, June 14, 2013 12:32:25 AM Jiang Liu wrote:
>> The current ACPI glue logic expects physical devices to be destroyed
>> before their companion ACPI devices; otherwise the ACPI unbind logic
>> breaks and emits the following warning messages:
>> [  185.026073] usb usb5: Oops, 'acpi_handle' corrupt
>> [  185.035150] pci 0000:1b:00.0: Oops, 'acpi_handle' corrupt
>> [  185.035515] pci 0000:18:02.0: Oops, 'acpi_handle' corrupt
>> [  180.013656]  port1: Oops, 'acpi_handle' corrupt
>> Please refer to https://bugzilla.kernel.org/attachment.cgi?id=104321
>> for full log message.
> 
> So my question is, did we have this problem before commit 3b63aaa70e1?
> 
> If we did, then when did it start?  Or was it present forever?
I think this issue already existed before commit "PCI: acpiphp: Do not use
ACPI PCI subdriver mechanism". It may trace back to the changes that removed
acpi_pci_bind()/acpi_pci_unbind().

> 
>> The above warning messages are caused by the following scenario:
>> 1) acpi_dock_notifier_call() queues a task (T1) onto kacpi_hotplug_wq
>> 2) kacpi_hotplug_wq handles T1, which invokes acpi_dock_deferred_cb()
>>    ->dock_notify()-> handle_eject_request()->hotplug_dock_devices()
>> 3) hotplug_dock_devices() first invokes registered hotplug callbacks to
>>    destroy physical devices, then destroys all affected ACPI devices.
>>    Everything seems perfect until now. But the acpiphp dock notification
>>    handler will queue another task (T2) onto kacpi_hotplug_wq to really
>>    destroy affected physical devices.
> 
> Would not the solution be to modify it so that it didn't spawn the other
> task (T2), but removed the affected physical devices synchronously?
Yes, that's the way I'm going to fix this issue.

> 
>> 4) kacpi_hotplug_wq finishes T1, and all affected ACPI devices have
>>    been destroyed.
>> 5) kacpi_hotplug_wq handles T2, which destroys all affected physical
>>    devices.
>>
>> So it breaks the ACPI glue logic's expectation because ACPI devices are
>> destroyed in step 3 and physical devices are destroyed in step 5.
>>
>> Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
>> Reported-by: Alexander E. Patrakov <patrakov@gmail.com>
>> Cc: Bjorn Helgaas <bhelgaas@google.com>
>> Cc: Yinghai Lu <yinghai@kernel.org>
>> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
>> Cc: linux-pci@vger.kernel.org
>> Cc: linux-kernel@vger.kernel.org
>> Cc: stable@vger.kernel.org
>> ---
>> Hi Bjorn and Rafael,
>>      The recursive lock changes haven't been tested yet, need help
>> from Alexander for testing.
> 
> Well, let's just say I'm not a fan of recursive locks.  Is that unavoidable
> here?
Yeah, you are right. We hit another deadlock issue here, as reported
by Alexander, so I need to find a new solution.
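
For reference, the owner-check idea in the patch can be sketched in
userspace with pthreads as below. This is only an illustration under
assumptions: the struct and function names are hypothetical, and a
thread-local marker replaces the patch's ds->owner field so that the
ownership test does not race with other threads. It covers only the
re-entry path described in the comment above hotplug_dock_devices(), not
the other deadlock mentioned here.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct dock_station. */
struct dock_station_sketch {
	pthread_mutex_t hp_lock;
	int nr_hotplug_devices;
};

/* Set to the station whose hp_lock the current thread holds, if any. */
static _Thread_local struct dock_station_sketch *hp_lock_held;

/* Add a device; skip locking when this thread already owns hp_lock. */
void add_hotplug_device(struct dock_station_sketch *ds)
{
	if (hp_lock_held == ds) {
		ds->nr_hotplug_devices++;
	} else {
		pthread_mutex_lock(&ds->hp_lock);
		ds->nr_hotplug_devices++;
		pthread_mutex_unlock(&ds->hp_lock);
	}
}

/* Invoke a handler with hp_lock held, as hotplug_dock_devices() does. */
void hotplug_devices(struct dock_station_sketch *ds,
		     void (*handler)(struct dock_station_sketch *))
{
	pthread_mutex_lock(&ds->hp_lock);
	hp_lock_held = ds;
	handler(ds);
	hp_lock_held = NULL;
	pthread_mutex_unlock(&ds->hp_lock);
}

/* A handler that re-enters the registration path. */
void reentrant_handler(struct dock_station_sketch *ds)
{
	add_hotplug_device(ds);
}

/* One add outside the lock, one from inside a handler. */
int demo_reentrant_add(void)
{
	struct dock_station_sketch ds;
	int count;

	pthread_mutex_init(&ds.hp_lock, NULL);
	ds.nr_hotplug_devices = 0;
	add_hotplug_device(&ds);
	hotplug_devices(&ds, reentrant_handler);
	count = ds.nr_hotplug_devices;
	pthread_mutex_destroy(&ds.hp_lock);
	return count;
}
```

Without the ownership test, the call made from reentrant_handler() would
try to take a non-recursive mutex that the same thread already holds.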

> 
> Rafael
> 
> 
>> ---
>>  drivers/acpi/dock.c                | 33 +++++++++++++++++++++++++++------
>>  drivers/pci/hotplug/acpiphp_glue.c | 32 ++++++++++++++++++--------------
>>  2 files changed, 45 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/acpi/dock.c b/drivers/acpi/dock.c
>> index 02b0563..79c8d9e 100644
>> --- a/drivers/acpi/dock.c
>> +++ b/drivers/acpi/dock.c
>> @@ -65,6 +65,7 @@ struct dock_station {
>>  	u32 flags;
>>  	spinlock_t dd_lock;
>>  	struct mutex hp_lock;
>> +	struct task_struct *owner;
>>  	struct list_head dependent_devices;
>>  	struct list_head hotplug_devices;
>>  
>> @@ -131,9 +132,13 @@ static void
>>  dock_add_hotplug_device(struct dock_station *ds,
>>  			struct dock_dependent_device *dd)
>>  {
>> -	mutex_lock(&ds->hp_lock);
>> -	list_add_tail(&dd->hotplug_list, &ds->hotplug_devices);
>> -	mutex_unlock(&ds->hp_lock);
>> +	if (mutex_is_locked(&ds->hp_lock) && ds->owner == current) {
>> +		list_add_tail(&dd->hotplug_list, &ds->hotplug_devices);
>> +	} else {
>> +		mutex_lock(&ds->hp_lock);
>> +		list_add_tail(&dd->hotplug_list, &ds->hotplug_devices);
>> +		mutex_unlock(&ds->hp_lock);
>> +	}
>>  }
>>  
>>  /**
>> @@ -147,9 +152,13 @@ static void
>>  dock_del_hotplug_device(struct dock_station *ds,
>>  			struct dock_dependent_device *dd)
>>  {
>> -	mutex_lock(&ds->hp_lock);
>> -	list_del(&dd->hotplug_list);
>> -	mutex_unlock(&ds->hp_lock);
>> +	if (mutex_is_locked(&ds->hp_lock) && ds->owner == current) {
>> +		list_del_init(&dd->hotplug_list);
>> +	} else {
>> +		mutex_lock(&ds->hp_lock);
>> +		list_del_init(&dd->hotplug_list);
>> +		mutex_unlock(&ds->hp_lock);
>> +	}
>>  }
>>  
>>  /**
>> @@ -355,7 +364,17 @@ static void hotplug_dock_devices(struct dock_station *ds, u32 event)
>>  {
>>  	struct dock_dependent_device *dd;
>>  
>> +	/*
>> +	 * There is a deadlock scenario as below:
>> +	 *	hotplug_dock_devices()
>> +	 *	    mutex_lock(&ds->hp_lock)
>> +	 *		dd->ops->handler()
>> +	 *		    register_hotplug_dock_device()
>> +	 *			mutex_lock(&ds->hp_lock)
>> +	 * So we need recursive lock semantics here, do it by ourselves.
>> +	 */
>>  	mutex_lock(&ds->hp_lock);
>> +	ds->owner = current;
>>  
>>  	/*
>>  	 * First call driver specific hotplug functions
>> @@ -376,6 +395,8 @@ static void hotplug_dock_devices(struct dock_station *ds, u32 event)
>>  		else
>>  			dock_create_acpi_device(dd->handle);
>>  	}
>> +
>> +	ds->owner = NULL;
>>  	mutex_unlock(&ds->hp_lock);
>>  }
>>  
>> diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
>> index 716aa93..699b8ca 100644
>> --- a/drivers/pci/hotplug/acpiphp_glue.c
>> +++ b/drivers/pci/hotplug/acpiphp_glue.c
>> @@ -61,6 +61,8 @@ static DEFINE_MUTEX(bridge_mutex);
>>  static void handle_hotplug_event_bridge (acpi_handle, u32, void *);
>>  static void acpiphp_sanitize_bus(struct pci_bus *bus);
>>  static void acpiphp_set_hpp_values(struct pci_bus *bus);
>> +static void _handle_hotplug_event_func(acpi_handle handle, u32 type,
>> +				       void *context);
>>  static void handle_hotplug_event_func(acpi_handle handle, u32 type, void *context);
>>  static void free_bridge(struct kref *kref);
>>  
>> @@ -147,7 +149,7 @@ static int post_dock_fixups(struct notifier_block *nb, unsigned long val,
>>  
>>  
>>  static const struct acpi_dock_ops acpiphp_dock_ops = {
>> -	.handler = handle_hotplug_event_func,
>> +	.handler = _handle_hotplug_event_func,
>>  };
>>  
>>  /* Check whether the PCI device is managed by native PCIe hotplug driver */
>> @@ -1065,22 +1067,13 @@ static void handle_hotplug_event_bridge(acpi_handle handle, u32 type,
>>  	alloc_acpi_hp_work(handle, type, context, _handle_hotplug_event_bridge);
>>  }
>>  
>> -static void _handle_hotplug_event_func(struct work_struct *work)
>> +static void _handle_hotplug_event_func(acpi_handle handle, u32 type,
>> +				       void *context)
>>  {
>> -	struct acpiphp_func *func;
>> +	struct acpiphp_func *func = context;
>>  	char objname[64];
>>  	struct acpi_buffer buffer = { .length = sizeof(objname),
>>  				      .pointer = objname };
>> -	struct acpi_hp_work *hp_work;
>> -	acpi_handle handle;
>> -	u32 type;
>> -
>> -	hp_work = container_of(work, struct acpi_hp_work, work);
>> -	handle = hp_work->handle;
>> -	type = hp_work->type;
>> -	func = (struct acpiphp_func *)hp_work->context;
>> -
>> -	acpi_scan_lock_acquire();
>>  
>>  	acpi_get_name(handle, ACPI_FULL_PATHNAME, &buffer);
>>  
>> @@ -1113,7 +1106,18 @@ static void _handle_hotplug_event_func(struct work_struct *work)
>>  		warn("notify_handler: unknown event type 0x%x for %s\n", type, objname);
>>  		break;
>>  	}
>> +}
>> +
>> +static void _handle_hotplug_event_cb(struct work_struct *work)
>> +{
>> +	struct acpiphp_func *func;
>> +	struct acpi_hp_work *hp_work;
>>  
>> +	hp_work = container_of(work, struct acpi_hp_work, work);
>> +	func = (struct acpiphp_func *)hp_work->context;
>> +	acpi_scan_lock_acquire();
>> +	_handle_hotplug_event_func(hp_work->handle, hp_work->type,
>> +				    hp_work->context);
>>  	acpi_scan_lock_release();
>>  	kfree(hp_work); /* allocated in handle_hotplug_event_func */
>>  	put_bridge(func->slot->bridge);
>> @@ -1141,7 +1145,7 @@ static void handle_hotplug_event_func(acpi_handle handle, u32 type,
>>  	 * don't deadlock on hotplug actions.
>>  	 */
>>  	get_bridge(func->slot->bridge);
>> -	alloc_acpi_hp_work(handle, type, context, _handle_hotplug_event_func);
>> +	alloc_acpi_hp_work(handle, type, context, _handle_hotplug_event_cb);
>>  }
>>  
>>  /*
>>
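
The acpiphp part of the patch splits the handler into a core function that
can be called directly and a thin work callback that unpacks the work item
and calls the core. A minimal userspace sketch of that shape follows. All
names ending in _sketch, the int handle and the counter context are
assumptions made for illustration; container_of is defined locally in the
usual offsetof form.

```c
#include <stddef.h>
#include <stdlib.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for work_struct and acpi_hp_work. */
struct work_sketch {
	void (*fn)(struct work_sketch *work);
};

struct hp_work_sketch {
	struct work_sketch work;
	int handle;
	unsigned int type;
	void *context;
};

/* Core handler: callable synchronously, e.g. from a dock handler. */
void handle_event_sketch(int handle, unsigned int type, void *context)
{
	int *seen = context;

	(void)handle;
	*seen += (int)type;
}

/* Work callback: unpack the work item, call the core, free the item. */
void handle_event_cb_sketch(struct work_sketch *work)
{
	struct hp_work_sketch *hp =
		container_of(work, struct hp_work_sketch, work);

	handle_event_sketch(hp->handle, hp->type, hp->context);
	free(hp);
}

/* Package an event for deferred handling; returns NULL on failure. */
struct work_sketch *alloc_hp_work_sketch(int handle, unsigned int type,
					  void *context)
{
	struct hp_work_sketch *hp = malloc(sizeof(*hp));

	if (!hp)
		return NULL;
	hp->work.fn = handle_event_cb_sketch;
	hp->handle = handle;
	hp->type = type;
	hp->context = context;
	return &hp->work;
}

/* Exercise both paths: one direct call, one through a work item. */
int demo_both_paths(void)
{
	int seen = 0;
	struct work_sketch *w;

	handle_event_sketch(1, 3, &seen);
	w = alloc_hp_work_sketch(1, 4, &seen);
	if (!w)
		return -1;
	w->fn(w);
	return seen;
}
```

In the patch itself the callback additionally takes and releases the ACPI
scan lock and drops the bridge reference; the sketch leaves those out.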

