linux-mm.kvack.org archive mirror
* [PATCH v1 0/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals
@ 2023-06-27 11:22 David Hildenbrand
From: David Hildenbrand @ 2023-06-27 11:22 UTC
  To: linux-kernel
  Cc: linux-mm, virtualization, David Hildenbrand, Andrew Morton,
	Michael S. Tsirkin, John Hubbard, Oscar Salvador, Michal Hocko,
	Jason Wang, Xuan Zhuo

As raised by John Hubbard [1], offline_and_remove_memory() failing on
fatal signals can be sub-optimal for out-of-tree drivers: a dying user
space process might be the last one holding a device node open.

As that device node gets closed, the driver might unplug the device
and trigger offline_and_remove_memory() to unplug previously
hotplugged device memory. This, however, will fail reliably when fatal
signals are pending on the dying process, leaving the device unusable until
the machine gets rebooted.

That can be optimized easily by ignoring fatal signals. In fact, checking
for fatal signals in the case of offline_and_remove_memory() doesn't
make too much sense; the check makes sense when offlining is triggered
directly via sysfs.  However, we actually do want a way to not end up
stuck in offline_and_remove_memory() forever.

What offline_and_remove_memory() users actually want is to fail after some
given timeout and not to care about fatal signals.
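
To illustrate the intended semantics, here is a minimal user-space sketch
of "retry until a deadline, ignoring signals". It is not the code from this
series: offline_with_timeout(), the callback type, the two sample callbacks,
the 1 ms retry pause, and the -ETIMEDOUT return value are all hypothetical
names and choices made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical: one offlining attempt. Returns 0 on success, -EAGAIN if
 * the caller should retry, or another negative errno on a hard failure. */
typedef int (*offline_attempt_fn)(void *ctx);

/* Monotonic clock in milliseconds, unaffected by wall-clock changes. */
static long long now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Retry attempt() until it stops asking for a retry or the deadline
 * passes. Pending signals are deliberately never consulted here; only
 * the deadline bounds the loop. -ETIMEDOUT is an assumed error code. */
int offline_with_timeout(offline_attempt_fn attempt, void *ctx,
			 long long timeout_ms)
{
	const long long deadline = now_ms() + timeout_ms;
	const struct timespec pause = { .tv_sec = 0, .tv_nsec = 1000000 };

	assert(attempt != NULL);

	for (;;) {
		int rc = attempt(ctx);

		if (rc != -EAGAIN)
			return rc;		/* success or hard failure */
		if (now_ms() >= deadline)
			return -ETIMEDOUT;	/* gave up after the timeout */
		nanosleep(&pause, NULL);	/* short pause before retrying */
	}
}

/* Sample callback: memory that never becomes free. */
int always_busy(void *ctx)
{
	(void)ctx;
	return -EAGAIN;
}

/* Sample callback: ctx points to an int holding the number of attempts
 * that still have to fail before offlining succeeds. */
int succeed_after(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -EAGAIN;
	}
	return 0;
}
```

With these sample callbacks, offline_with_timeout(always_busy, NULL, 20)
gives up after roughly 20 ms, whereas a succeed_after counter of 3 succeeds
on the fourth attempt. A caller such as a driver release path would then
see a bounded failure rather than one that depends on the signal state of
the exiting process.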

So let's implement that, optimizing virtio-mem along the way.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>

[1] https://lkml.kernel.org/r/20230620011719.155379-1-jhubbard@nvidia.com

David Hildenbrand (5):
  mm/memory_hotplug: check for fatal signals only in offline_pages()
  virtio-mem: convert most offline_and_remove_memory() errors to -EBUSY
  mm/memory_hotplug: make offline_and_remove_memory() timeout instead of
    failing on fatal signals
  virtio-mem: set the timeout for offline_and_remove_memory() to 10
    seconds
  virtio-mem: check if the config changed before (fake) offlining memory

 drivers/virtio/virtio_mem.c    | 22 +++++++++++++--
 include/linux/memory_hotplug.h |  2 +-
 mm/memory_hotplug.c            | 50 ++++++++++++++++++++++++++++++++--
 3 files changed, 68 insertions(+), 6 deletions(-)


base-commit: 6995e2de6891c724bfeb2db33d7b87775f913ad1
-- 
2.40.1





Thread overview: 16+ messages
2023-06-27 11:22 [PATCH v1 0/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals David Hildenbrand
2023-06-27 11:22 ` [PATCH v1 1/5] mm/memory_hotplug: check for fatal signals only in offline_pages() David Hildenbrand
2023-06-27 12:34   ` Michal Hocko
2023-06-27 13:28     ` David Hildenbrand
2023-06-27 14:07       ` Michal Hocko
2023-06-27 11:22 ` [PATCH v1 2/5] virtio-mem: convert most offline_and_remove_memory() errors to -EBUSY David Hildenbrand
2023-06-27 11:22 ` [PATCH v1 3/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals David Hildenbrand
2023-06-27 12:40   ` Michal Hocko
2023-06-27 13:14     ` David Hildenbrand
2023-06-27 14:17       ` Michal Hocko
2023-06-27 14:57         ` David Hildenbrand
2023-06-27 15:14           ` Michal Hocko
2023-06-27 21:34             ` John Hubbard
2023-06-28  2:00   ` kernel test robot
2023-06-27 11:22 ` [PATCH v1 4/5] virtio-mem: set the timeout for offline_and_remove_memory() to 10 seconds David Hildenbrand
2023-06-27 11:22 ` [PATCH v1 5/5] virtio-mem: check if the config changed before (fake) offlining memory David Hildenbrand
