public inbox for linux-kernel@vger.kernel.org
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: David Hildenbrand <david@redhat.com>
Cc: linux-kernel@vger.kernel.org,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Saravana Kannan <saravanak@google.com>,
	Heikki Krogerus <heikki.krogerus@linux.intel.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Michal Hocko <mhocko@kernel.org>
Subject: Re: [PATCH v1] driver core: check for dead devices before onlining/offlining
Date: Fri, 24 Jan 2020 10:00:52 +0100
Message-ID: <20200124090052.GA2958140@kroah.com>
In-Reply-To: <20200120104909.13991-1-david@redhat.com>

On Mon, Jan 20, 2020 at 11:49:09AM +0100, David Hildenbrand wrote:
> We can have rare cases where the removal of a device races with
> somebody trying to online it (esp. via sysfs). We can simply check
> if the device is already removed or getting removed under the dev->lock.
> 
> E.g., right now, if memory block devices are removed (remove_memory()),
> we do a:
> 
> remove_memory() -> lock_device_hotplug() -> mem_hotplug_begin() ->
> lock_device() -> dev->dead = true
> 
> Somebody coming via sysfs (/sys/devices/system/memory/memoryX/online)
> triggers a:
> 
> lock_device_hotplug_sysfs() -> device_online() -> lock_device() ...
> 
> So if we made it just before the lock_device_hotplug_sysfs() but get
> delayed until remove_memory() released all locks, we will continue
> taking locks and trying to online the device - which is then a zombie
> device.
> 
> Note that at least the memory onlining path seems to be protected by
> checking if all memory sections are still present (something we can then
> get rid of). We do have other sysfs attributes
> (e.g., /sys/devices/system/memory/memoryX/valid_zones) that don't do any
> such locking yet and might race with memory removal in a similar way. For
> these users, we can then do a
> 
> device_lock(dev);
> if (!device_is_dead(dev)) {
> 	/* magic */
> }
> device_unlock(dev);
> 
> Introduce and use device_is_dead() right away.
> 
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> Cc: Saravana Kannan <saravanak@google.com>
> Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
> 
> Am I missing any obvious mechanism in the device core that handles
> something like this already? (especially also for other sysfs attributes?)

So is a sysfs attribute causing the device itself to go away?  We have
had problems with that in the past. Look at how the scsi layer handled
it; I think there's a specific call you should be making instead of
trying to rely on this "dead" flag.

thanks,

greg k-h
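
For readers following the thread, the check proposed in the quoted patch
description can be sketched in userspace as below. This is only an
illustration under stated assumptions, not kernel code and not the patch
itself: struct toy_device and the toy_device_*() helpers are hypothetical
names, and a pthread mutex stands in for the per-device lock. It shows the
online path re-checking a "dead" flag after taking the lock, so a caller
that was delayed until removal finished gets an error instead of onlining
a zombie device.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a device; not the kernel's struct device. */
struct toy_device {
	pthread_mutex_t lock;	/* plays the role of the per-device lock */
	bool dead;		/* set once removal has started */
	bool online;
};

void toy_device_init(struct toy_device *dev)
{
	assert(dev != NULL);
	pthread_mutex_init(&dev->lock, NULL);
	dev->dead = false;
	dev->online = false;
}

/* Removal path: mark the device dead while holding its lock. */
void toy_device_remove(struct toy_device *dev)
{
	assert(dev != NULL);
	pthread_mutex_lock(&dev->lock);
	dev->dead = true;
	dev->online = false;
	pthread_mutex_unlock(&dev->lock);
}

/* Online path: re-check the dead flag only after the lock is held. */
int toy_device_online(struct toy_device *dev)
{
	int ret = 0;

	assert(dev != NULL);
	pthread_mutex_lock(&dev->lock);
	if (dev->dead)
		ret = -ENODEV;	/* lost the race with removal */
	else
		dev->online = true;
	pthread_mutex_unlock(&dev->lock);
	return ret;
}
```

In this sketch, a toy_device_online() call that arrives after
toy_device_remove() returns -ENODEV and leaves the device offline. Whether
the real driver core should rely on such a flag at all is exactly what the
reply above questions.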
