public inbox for linux-acpi@vger.kernel.org
 help / color / mirror / Atom feed
From: Toshi Kani <toshi.kani@hp.com>
To: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: ACPI Devel Maling List <linux-acpi@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Bjorn Helgaas <bhelgaas@google.com>, Jiang Liu <liuj97@gmail.com>,
	Yinghai Lu <yinghai@kernel.org>,
	Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>,
	Myron Stowe <mstowe@redhat.com>,
	linux-pci@vger.kernel.org
Subject: Re: [Update][PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks
Date: Wed, 13 Feb 2013 10:43:58 -0700	[thread overview]
Message-ID: <1360777438.3869.100.camel@misato.fc.hp.com> (raw)
In-Reply-To: <1459286.L93riBCyLC@vostro.rjw.lan>

On Wed, 2013-02-13 at 14:16 +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> This changeset is aimed at fixing a few different but related
> problems in the ACPI hotplug infrastructure.
> 
> First of all, since notify handlers may be run in parallel with
> acpi_bus_scan(), acpi_bus_trim() and acpi_bus_hot_remove_device()
> and some of them are installed for ACPI handles that have no struct
> acpi_device objects attached (i.e. before those objects are created),
> those notify handlers have to take acpi_scan_lock to prevent races
> from taking place (e.g. a struct acpi_device is found to be present
> for the given ACPI handle, but right after that it is removed by
> acpi_bus_trim() running in parallel to the given notify handler).
> Moreover, since some of them call acpi_bus_scan() and
> acpi_bus_trim(), this leads to the conclusion that acpi_scan_lock
> should be acquired by the callers of these two funtions rather by
> these functions themselves.
> 
> For these reasons, make all notify handlers that can handle device
> addition and eject events take acpi_scan_lock and remove the
> acpi_scan_lock locking from acpi_bus_scan() and acpi_bus_trim().
> Accordingly, update all of their users to make sure that they
> are always called under acpi_scan_lock.
> 
> Furthermore, since eject operations are carried out asynchronously
> with respect to the notify events that trigger them, with the help
> of acpi_bus_hot_remove_device(), even if notify handlers take the
> ACPI scan lock, it still is possible that, for example,
> acpi_bus_trim() will run between acpi_bus_hot_remove_device() and
> the notify handler that scheduled its execution and that
> acpi_bus_trim() will remove the device node passed to
> acpi_bus_hot_remove_device() for ejection.  In that case, the struct
> acpi_device object obtained by acpi_bus_hot_remove_device() will be
> invalid and not-so-funny things will ensue.  To protect agaist that,
> make the users of acpi_bus_hot_remove_device() run get_device() on
> ACPI device node objects that are about to be passed to it and make
> acpi_bus_hot_remove_device() run put_device() on them and check if
> their ACPI handles are not NULL (make acpi_device_unregister() clear
> the device nodes' ACPI handles for that check to work).
> 
> Finally, observe that acpi_os_hotplug_execute() actually can fail,
> in which case its caller ought to free memory allocated for the
> context object to prevent leaks from happening.  It also needs to
> run put_device() on the device node that it ran get_device() on
> previously in that case.  Modify the code accordingly.

I am concerned with this approach.  ACPICA calls notify handlers through
kacpi_notify_wq, which has the max active set to 1.  We then use
kacpi_hotplug_wq (which also has the max active set to 1) so that a
hotplug procedure does not block the notify handlers since they can be
used for non-hotplug events as well.  Acquiring the scan lock in a
notify handler means that a hotplug procedure can block any notify
events.

So, I'd prefer the following approach.

 - Change all hot-plug procedures (i.e. both add and delete) to proceed
under kacpi_hotplug_wq by calling acpi_os_hotplug_execute(). This
serializes all hotplug procedures, and prevents blocking other notify
events.  (Ideally, we should also run all online/offline procedures
under a same work-queue, just like my RFC patchset did, but this is a
different topic for now.)

 - Revert 5993c4670 unless this change is absolutely necessary.  From
the change log, it is not clear to me why this change was needed.  It
changed acpi_bus_hot_remove_device() to take an acpi_device, instead of
an acpi_handle, which introduced a race condition with acpi_device.
acpi_bus_hot_remove_device() should take an acpi_handle, and then obtain
its acpi_device from the acpi_handle since this function is serialized.

 - Remove sanity checks with an acpi_device in the notify handlers,
which have a race condition with acpi_device.  These type-specific
checks will need to be removed when we have a common notify handler
anyway.  The notify handler can continue to check the status of ACPI
device object with an acpi_handle.  Type-specific sanity checks /
validations can be performed within a hotplug procedure, instead.

Thanks,
-Toshi

  reply	other threads:[~2013-02-13 17:43 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-02-13  0:19 [PATCH] ACPI / hotplug: Fix concurrency issues and memory leaks Rafael J. Wysocki
2013-02-13  1:55 ` Yinghai Lu
2013-02-13 13:08   ` Rafael J. Wysocki
2013-02-13  3:08 ` Yasuaki Ishimatsu
2013-02-13  3:31   ` Yasuaki Ishimatsu
2013-02-13 13:12     ` Rafael J. Wysocki
2013-02-13 13:16 ` [Update][PATCH] " Rafael J. Wysocki
2013-02-13 17:43   ` Toshi Kani [this message]
2013-02-13 20:52     ` Rafael J. Wysocki
2013-02-13 23:09       ` Toshi Kani
2013-02-13 23:42         ` Rafael J. Wysocki
2013-02-14  0:16           ` Toshi Kani
2013-02-14  2:31           ` Moore, Robert
2013-02-14 12:03             ` Rafael J. Wysocki
2013-02-14 20:45               ` Moore, Robert
2013-02-14 20:59                 ` Rafael J. Wysocki
2013-02-14 23:45                   ` Moore, Robert
2013-02-15  0:23                     ` Rafael J. Wysocki
2013-02-15  0:28                       ` Toshi Kani
2013-02-15 12:49                         ` Rafael J. Wysocki
2013-02-15 15:18                           ` Toshi Kani
2013-02-15 16:33                             ` Moore, Robert
2013-02-15 17:22                               ` Toshi Kani
2013-02-14 20:05   ` Yinghai Lu
2013-02-14 20:17     ` Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1360777438.3869.100.camel@misato.fc.hp.com \
    --to=toshi.kani@hp.com \
    --cc=bhelgaas@google.com \
    --cc=isimatu.yasuaki@jp.fujitsu.com \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=liuj97@gmail.com \
    --cc=mstowe@redhat.com \
    --cc=rjw@sisk.pl \
    --cc=yinghai@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox