Re: [PATCH 06/10] mm: page_alloc: Distinguish between being unable to sleep, unwilling to unwilling and avoiding waking kswapd

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Mel Gorman <mgorman@techsingularity.net>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Linux-MM <linux-mm@kvack.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Rik van Riel <riel@redhat.com>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Michal Hocko <mhocko@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 06/10] mm: page_alloc: Distinguish between being unable to sleep, unwilling to unwilling and avoiding waking kswapd
Date: Fri, 21 Aug 2015 21:39:42 +0100	[thread overview]
Message-ID: <20150821203942.GF12432@techsingularity.net> (raw)
In-Reply-To: <55D72ABD.8020205@suse.cz>

On Fri, Aug 21, 2015 at 03:42:21PM +0200, Vlastimil Babka wrote:
> On 08/12/2015 12:45 PM, Mel Gorman wrote:
> >__GFP_WAIT has been used to identify atomic context in callers that hold
> >spinlocks or are in interrupts. They are expected to be high priority and
> >have access one of two watermarks lower than "min". __GFP_HIGH users get
> >access to the first lower watermark and can be called the "high priority
> >reserve". Atomic users and interrupts access yet another lower watermark
> >that can be called the "atomic reserve".
> >
> >Over time, callers had a requirement to not block when fallback options
> >were available. Some have abused __GFP_WAIT leading to a situation where
> >an optimisitic allocation with a fallback option can access atomic reserves.
> >
> >This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
> >cannot sleep and have no alternative. High priority users continue to use
> >__GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and are
> >willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify callers
> >that want to wake kswapd for background reclaim. __GFP_WAIT is redefined
> >as a caller that is willing to enter direct reclaim and wake kswapd for
> >background reclaim.
> >
> >This patch then converts a number of sites
> >
> >o __GFP_ATOMIC is used by callers that are high priority and have memory
> >   pools for those requests. GFP_ATOMIC uses this flag. Callers with
> >   interrupts disabled still automatically use the atomic reserves.
> 
> Hm I can't see where the latter happens? In gfp_to_alloc_flags(),
> ALLOC_HARDER is set for __GFP_ATOMIC, or rt-tasks *not* in interrupt? What
> am I missing?
> 

It was a mistake from an earlier version of the patch that was itself
buggy. I forgot to fix the changelog properly.

> >o Callers that have a limited mempool to guarantee forward progress use
> >   __GFP_DIRECT_RECLAIM. bio allocations fall into this category where
> >   kswapd will still be woken but atomic reserves are not used as there
> >   is a one-entry mempool to guarantee progress.
> >
> >o Callers that are checking if they are non-blocking should use the
> >   helper gfpflags_allows_blocking() where possible. This is because
> 
> A bit subjective but gfpflags_allow_blocking() sounds better to me.
> Or shorter gfp_allows_blocking()?
> 

I'll use gfpflags_allow_blocking.

> >   checking for __GFP_WAIT as was done historically now can trigger false
> >   positives. Some exceptions like dm-crypt.c exist where the code intent
> >   is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
> >   flag manipulations.
> >
> >The key hazard to watch out for is callers that removed __GFP_WAIT and
> >was depending on access to atomic reserves for inconspicuous reasons.
> >In some cases it may be appropriate for them to use __GFP_HIGH.
> 
> Hm we might also have a (non-fatal) hazard of callers that directly combined
> __GFP_* flags that didn't include __GFP_WAIT, but did wake up kswapd, and
> now might be missing __GFP_KSWAPD_RECLAIM. Did you try checking for those? I
> imagine it's not a simple task...
> 

I hadn't searched but there are a small number of callers that
potentially care. I fixed them.

> >Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> 
> >
> >diff --git a/Documentation/vm/balance b/Documentation/vm/balance
> >index c46e68cf9344..6f1f6fae30f5 100644
> >--- a/Documentation/vm/balance
> >+++ b/Documentation/vm/balance
> >@@ -1,12 +1,14 @@
> >  Started Jan 2000 by Kanoj Sarcar <kanoj@sgi.com>
> >
> >-Memory balancing is needed for non __GFP_WAIT as well as for non
> >-__GFP_IO allocations.
> >+Memory balancing is needed for !__GFP_ATOMIC and !__GFP_KSWAPD_RECLAIM as
> >+well as for non __GFP_IO allocations.
> >
> >-There are two reasons to be requesting non __GFP_WAIT allocations:
> >-the caller can not sleep (typically intr context), or does not want
> >-to incur cost overheads of page stealing and possible swap io for
> >-whatever reasons.
> >+The first reason why a caller may avoid reclaim is that the caller can not
> >+sleep due to holding a spinlock or is in interrupt context. The second may
> >+be that the caller is willing to fail the allocation without incurring the
> >+overhead of page stealing. This may happen for opportunistic high-order
> 
> I think "page stealing" has nowadays a different meaning in the
> anti-fragmentation context? Should it just say "reclaim"?
> 

Good point, corrected.

> >+allocation requests that have order-0 fallback options. In such cases,
> >+the caller may also wish to avoid waking kswapd.
> >
> >  __GFP_IO allocation requests are made to prevent file system deadlocks.
> >
> >diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> >index cba12f34ff77..100d3fbaebae 100644
> >--- a/arch/arm/mm/dma-mapping.c
> >+++ b/arch/arm/mm/dma-mapping.c
> >@@ -650,7 +650,7 @@ static void *__dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
> >
> >  	if (is_coherent || nommu())
> >  		addr = __alloc_simple_buffer(dev, size, gfp, &page);
> >-	else if (!(gfp & __GFP_WAIT))
> >+	else if (gfp & __GFP_ATOMIC)
> >  		addr = __alloc_from_pool(size, &page);
> >  	else if (!dev_get_cma_area(dev))
> >  		addr = __alloc_remap_buffer(dev, size, gfp, prot, &page, caller, want_vaddr);
> >@@ -1369,7 +1369,7 @@ static void *arm_iommu_alloc_attrs(struct device *dev, size_t size,
> >  	*handle = DMA_ERROR_CODE;
> >  	size = PAGE_ALIGN(size);
> >
> >-	if (!(gfp & __GFP_WAIT))
> >+	if (gfp & __GFP_ATOMIC)
> >  		return __iommu_alloc_atomic(dev, size, handle);
> >
> >  	/*
> >diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> >index d16a1cead23f..713d963fb96b 100644
> >--- a/arch/arm64/mm/dma-mapping.c
> >+++ b/arch/arm64/mm/dma-mapping.c
> >@@ -100,7 +100,7 @@ static void *__dma_alloc_coherent(struct device *dev, size_t size,
> >  	if (IS_ENABLED(CONFIG_ZONE_DMA) &&
> >  	    dev->coherent_dma_mask <= DMA_BIT_MASK(32))
> >  		flags |= GFP_DMA;
> >-	if (IS_ENABLED(CONFIG_DMA_CMA) && (flags & __GFP_WAIT)) {
> >+	if (IS_ENABLED(CONFIG_DMA_CMA) && (flags & __GFP_DIRECT_RECLAIM)) {
> >  		struct page *page;
> >  		void *addr;
> >
> >@@ -147,7 +147,7 @@ static void *__dma_alloc(struct device *dev, size_t size,
> >
> >  	size = PAGE_ALIGN(size);
> >
> >-	if (!coherent && !(flags & __GFP_WAIT)) {
> >+	if (!coherent && (flags & __GFP_ATOMIC)) {
> >  		struct page *page = NULL;
> >  		void *addr = __alloc_from_pool(size, &page, flags);
> >
> 
> Hmm these change the lack of __GFP_WAIT to expect __GFP_ATOMIC, so it's
> potentially one of those "key hazards" mentioned in the changelog, right?
> But here it's not just about using atomic reserves, but using a completely
> different allocation function.
> E.g. in case of arch/arm/mm/dma-mapping.c:__dma_alloc() I see it can go to
> __alloc_remap_buffer -> __dma_alloc_remap -> dma_common_contiguous_remap
> which does kmalloc(..., GFP_KERNEL) and has comment "Cannot be used in
> non-sleeping contexts".
> 

I completely missed that. It needs to be a check for
__GFP_DIRECT_RECLAIM and similar in the other arm file.

> So I think callers that cannot sleep and did clear __GFP_WAIT before, are
> now dangerous unless they set __GFP_ATOMIC?
> 
> >diff --git a/drivers/firewire/core-cdev.c b/drivers/firewire/core-cdev.c
> >index 2a3973a7c441..dc611c8cad10 100644
> >--- a/drivers/firewire/core-cdev.c
> >+++ b/drivers/firewire/core-cdev.c
> >@@ -486,7 +486,7 @@ static int ioctl_get_info(struct client *client, union ioctl_arg *arg)
> >  static int add_client_resource(struct client *client,
> >  			       struct client_resource *resource, gfp_t gfp_mask)
> >  {
> >-	bool preload = !!(gfp_mask & __GFP_WAIT);
> >+	bool preload = !!(gfp_mask & __GFP_DIRECT_RECLAIM);
> 
> Use the helper here to avoid !! as a bonus?
> 

Done.

> >--- a/drivers/infiniband/core/sa_query.c
> >+++ b/drivers/infiniband/core/sa_query.c
> >@@ -619,7 +619,7 @@ static void init_mad(struct ib_sa_mad *mad, struct ib_mad_agent *agent)
> >
> >  static int send_mad(struct ib_sa_query *query, int timeout_ms, gfp_t gfp_mask)
> >  {
> >-	bool preload = !!(gfp_mask & __GFP_WAIT);
> >+	bool preload = !!(gfp_mask & __GFP_DIRECT_RECLAIM);
> >  	unsigned long flags;
> >  	int ret, id;
> >
> 
> Same here.
> 

Done.

> >diff --git a/drivers/usb/host/u132-hcd.c b/drivers/usb/host/u132-hcd.c
> >index d51687780b61..06badad3ab75 100644
> >--- a/drivers/usb/host/u132-hcd.c
> >+++ b/drivers/usb/host/u132-hcd.c
> >@@ -2247,7 +2247,7 @@ static int u132_urb_enqueue(struct usb_hcd *hcd, struct urb *urb,
> >  {
> >  	struct u132 *u132 = hcd_to_u132(hcd);
> >  	if (irqs_disabled()) {
> >-		if (__GFP_WAIT & mem_flags) {
> >+		if (__GFP_DIRECT_RECLAIM & mem_flags) {
> >  			printk(KERN_ERR "invalid context for function that migh"
> >  				"t sleep\n");
> >  			return -EINVAL;
> 
> And here - no other flag manipulations and it would match the printk.

Fixed

> >--- a/fs/btrfs/extent_io.c
> >+++ b/fs/btrfs/extent_io.c
> >@@ -594,7 +594,7 @@ int clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
> >  	if (bits & (EXTENT_IOBITS | EXTENT_BOUNDARY))
> >  		clear = 1;
> >  again:
> >-	if (!prealloc && (mask & __GFP_WAIT)) {
> >+	if (!prealloc && (mask & __GFP_DIRECT_RECLAIM)) {
> >  		/*
> >  		 * Don't care for allocation failure here because we might end
> >  		 * up not needing the pre-allocated extent state at all, which
> >@@ -850,7 +850,7 @@ __set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
> >
> >  	bits |= EXTENT_FIRST_DELALLOC;
> >  again:
> >-	if (!prealloc && (mask & __GFP_WAIT)) {
> >+	if (!prealloc && (mask & __GFP_DIRECT_RECLAIM)) {
> >  		prealloc = alloc_extent_state(mask);
> >  		BUG_ON(!prealloc);
> >  	}
> >@@ -1076,7 +1076,7 @@ int convert_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
> >  	btrfs_debug_check_extent_io_range(tree, start, end);
> >
> >  again:
> >-	if (!prealloc && (mask & __GFP_WAIT)) {
> >+	if (!prealloc && (mask & __GFP_DIRECT_RECLAIM)) {
> >  		/*
> >  		 * Best effort, don't worry if extent state allocation fails
> >  		 * here for the first iteration. We might have a cached state
> >@@ -4265,7 +4265,7 @@ int try_release_extent_mapping(struct extent_map_tree *map,
> >  	u64 start = page_offset(page);
> >  	u64 end = start + PAGE_CACHE_SIZE - 1;
> >
> >-	if ((mask & __GFP_WAIT) &&
> >+	if ((mask & __GFP_DIRECT_RECLAIM) &&
> >  	    page->mapping->host->i_size > 16 * 1024 * 1024) {
> >  		u64 len;
> >  		while (start <= end) {
> 
> Why not here as well.
> 

Done

> >--- a/kernel/smp.c
> >+++ b/kernel/smp.c
> >@@ -669,7 +669,7 @@ void on_each_cpu_cond(bool (*cond_func)(int cpu, void *info),
> >  	cpumask_var_t cpus;
> >  	int cpu, ret;
> >
> >-	might_sleep_if(gfp_flags & __GFP_WAIT);
> >+	might_sleep_if(gfp_flags & __GFP_DIRECT_RECLAIM);
> >
> >  	if (likely(zalloc_cpumask_var(&cpus, (gfp_flags|__GFP_NOWARN)))) {
> >  		preempt_disable();
> >diff --git a/lib/idr.c b/lib/idr.c
> >index 5335c43adf46..e5118fc82961 100644
> >--- a/lib/idr.c
> >+++ b/lib/idr.c
> >@@ -399,7 +399,7 @@ void idr_preload(gfp_t gfp_mask)
> >  	 * allocation guarantee.  Disallow usage from those contexts.
> >  	 */
> >  	WARN_ON_ONCE(in_interrupt());
> >-	might_sleep_if(gfp_mask & __GFP_WAIT);
> >+	might_sleep_if(gfp_mask & __GFP_DIRECT_RECLAIM);
> >
> >  	preempt_disable();
> >
> >@@ -453,7 +453,7 @@ int idr_alloc(struct idr *idr, void *ptr, int start, int end, gfp_t gfp_mask)
> >  	struct idr_layer *pa[MAX_IDR_LEVEL + 1];
> >  	int id;
> >
> >-	might_sleep_if(gfp_mask & __GFP_WAIT);
> >+	might_sleep_if(gfp_mask & __GFP_DIRECT_RECLAIM);
> >
> >  	/* sanity checks */
> >  	if (WARN_ON_ONCE(start < 0))
> >diff --git a/lib/radix-tree.c b/lib/radix-tree.c
> >index f9ebe1c82060..cc5fdc3fb734 100644
> >--- a/lib/radix-tree.c
> >+++ b/lib/radix-tree.c
> >@@ -188,7 +188,7 @@ radix_tree_node_alloc(struct radix_tree_root *root)
> >  	 * preloading in the interrupt anyway as all the allocations have to
> >  	 * be atomic. So just do normal allocation when in interrupt.
> >  	 */
> >-	if (!(gfp_mask & __GFP_WAIT) && !in_interrupt()) {
> >+	if (!(gfp_mask & __GFP_DIRECT_RECLAIM) && !in_interrupt()) {
> >  		struct radix_tree_preload *rtp;
> >
> >  		/*
> 
> These too?
> 

Yep

> >diff --git a/mm/backing-dev.c b/mm/backing-dev.c
> >index dac5bf59309d..2056d16807de 100644
> >--- a/mm/backing-dev.c
> >+++ b/mm/backing-dev.c
> >@@ -632,7 +632,7 @@ struct bdi_writeback *wb_get_create(struct backing_dev_info *bdi,
> >  {
> >  	struct bdi_writeback *wb;
> >
> >-	might_sleep_if(gfp & __GFP_WAIT);
> >+	might_sleep_if(gfp & __GFP_DIRECT_RECLAIM);
> >
> >  	if (!memcg_css->parent)
> >  		return &bdi->wb;
> 
> ditto
> 

Indeed.

> >--- a/mm/page_alloc.c
> >+++ b/mm/page_alloc.c
> >@@ -2143,7 +2143,7 @@ static bool should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
> >  		return false;
> >  	if (fail_page_alloc.ignore_gfp_highmem && (gfp_mask & __GFP_HIGHMEM))
> >  		return false;
> >-	if (fail_page_alloc.ignore_gfp_wait && (gfp_mask & __GFP_WAIT))
> >+	if (fail_page_alloc.ignore_gfp_wait && (gfp_mask & (__GFP_ATOMIC|__GFP_DIRECT_RECLAIM)))
> 
> Should __GFP_ATOMIC really be here?
> 

I felt it was safer because it felt in line with the intent of
alloc.ignore_gfp_wait.

> >diff --git a/net/sctp/associola.c b/net/sctp/associola.c
> >index 197c3f59ecbf..c5fcdd6f85b7 100644
> >--- a/net/sctp/associola.c
> >+++ b/net/sctp/associola.c
> >@@ -1588,7 +1588,7 @@ int sctp_assoc_lookup_laddr(struct sctp_association *asoc,
> >  /* Set an association id for a given association */
> >  int sctp_assoc_set_id(struct sctp_association *asoc, gfp_t gfp)
> >  {
> >-	bool preload = !!(gfp & __GFP_WAIT);
> >+	bool preload = !!(gfp & __GFP_DIRECT_RECLAIM);
> >  	int ret;
> >
> >  	/* If the id is already assigned, keep it. */
> 
> helper?
> 

Yes. Thanks very much

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)

From: Mel Gorman <mgorman@techsingularity.net>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Linux-MM <linux-mm@kvack.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Rik van Riel <riel@redhat.com>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Michal Hocko <mhocko@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 06/10] mm: page_alloc: Distinguish between being unable to sleep, unwilling to unwilling and avoiding waking kswapd
Date: Fri, 21 Aug 2015 21:39:42 +0100	[thread overview]
Message-ID: <20150821203942.GF12432@techsingularity.net> (raw)
In-Reply-To: <55D72ABD.8020205@suse.cz>

On Fri, Aug 21, 2015 at 03:42:21PM +0200, Vlastimil Babka wrote:
> On 08/12/2015 12:45 PM, Mel Gorman wrote:
> >__GFP_WAIT has been used to identify atomic context in callers that hold
> >spinlocks or are in interrupts. They are expected to be high priority and
> >have access one of two watermarks lower than "min". __GFP_HIGH users get
> >access to the first lower watermark and can be called the "high priority
> >reserve". Atomic users and interrupts access yet another lower watermark
> >that can be called the "atomic reserve".
> >
> >Over time, callers had a requirement to not block when fallback options
> >were available. Some have abused __GFP_WAIT leading to a situation where
> >an optimisitic allocation with a fallback option can access atomic reserves.
> >
> >This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
> >cannot sleep and have no alternative. High priority users continue to use
> >__GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and are
> >willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify callers
> >that want to wake kswapd for background reclaim. __GFP_WAIT is redefined
> >as a caller that is willing to enter direct reclaim and wake kswapd for
> >background reclaim.
> >
> >This patch then converts a number of sites
> >
> >o __GFP_ATOMIC is used by callers that are high priority and have memory
> >   pools for those requests. GFP_ATOMIC uses this flag. Callers with
> >   interrupts disabled still automatically use the atomic reserves.
> 
> Hm I can't see where the latter happens? In gfp_to_alloc_flags(),
> ALLOC_HARDER is set for __GFP_ATOMIC, or rt-tasks *not* in interrupt? What
> am I missing?
> 

It was a mistake from an earlier version of the patch that was itself
buggy. I forgot to fix the changelog properly.

> >o Callers that have a limited mempool to guarantee forward progress use
> >   __GFP_DIRECT_RECLAIM. bio allocations fall into this category where
> >   kswapd will still be woken but atomic reserves are not used as there
> >   is a one-entry mempool to guarantee progress.
> >
> >o Callers that are checking if they are non-blocking should use the
> >   helper gfpflags_allows_blocking() where possible. This is because
> 
> A bit subjective but gfpflags_allow_blocking() sounds better to me.
> Or shorter gfp_allows_blocking()?
> 

I'll use gfpflags_allow_blocking.

> >   checking for __GFP_WAIT as was done historically now can trigger false
> >   positives. Some exceptions like dm-crypt.c exist where the code intent
> >   is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
> >   flag manipulations.
> >
> >The key hazard to watch out for is callers that removed __GFP_WAIT and
> >was depending on access to atomic reserves for inconspicuous reasons.
> >In some cases it may be appropriate for them to use __GFP_HIGH.
> 
> Hm we might also have a (non-fatal) hazard of callers that directly combined
> __GFP_* flags that didn't include __GFP_WAIT, but did wake up kswapd, and
> now might be missing __GFP_KSWAPD_RECLAIM. Did you try checking for those? I
> imagine it's not a simple task...
> 

I hadn't searched but there are a small number of callers that
potentially care. I fixed them.

> >Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> 
> >
> >diff --git a/Documentation/vm/balance b/Documentation/vm/balance
> >index c46e68cf9344..6f1f6fae30f5 100644
> >--- a/Documentation/vm/balance
> >+++ b/Documentation/vm/balance
> >@@ -1,12 +1,14 @@
> >  Started Jan 2000 by Kanoj Sarcar <kanoj@sgi.com>
> >
> >-Memory balancing is needed for non __GFP_WAIT as well as for non
> >-__GFP_IO allocations.
> >+Memory balancing is needed for !__GFP_ATOMIC and !__GFP_KSWAPD_RECLAIM as
> >+well as for non __GFP_IO allocations.
> >
> >-There are two reasons to be requesting non __GFP_WAIT allocations:
> >-the caller can not sleep (typically intr context), or does not want
> >-to incur cost overheads of page stealing and possible swap io for
> >-whatever reasons.
> >+The first reason why a caller may avoid reclaim is that the caller can not
> >+sleep due to holding a spinlock or is in interrupt context. The second may
> >+be that the caller is willing to fail the allocation without incurring the
> >+overhead of page stealing. This may happen for opportunistic high-order
> 
> I think "page stealing" has nowadays a different meaning in the
> anti-fragmentation context? Should it just say "reclaim"?
> 

Good point, corrected.

> >+allocation requests that have order-0 fallback options. In such cases,
> >+the caller may also wish to avoid waking kswapd.
> >
> >  __GFP_IO allocation requests are made to prevent file system deadlocks.
> >
> >diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> >index cba12f34ff77..100d3fbaebae 100644
> >--- a/arch/arm/mm/dma-mapping.c
> >+++ b/arch/arm/mm/dma-mapping.c
> >@@ -650,7 +650,7 @@ static void *__dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
> >
> >  	if (is_coherent || nommu())
> >  		addr = __alloc_simple_buffer(dev, size, gfp, &page);
> >-	else if (!(gfp & __GFP_WAIT))
> >+	else if (gfp & __GFP_ATOMIC)
> >  		addr = __alloc_from_pool(size, &page);
> >  	else if (!dev_get_cma_area(dev))
> >  		addr = __alloc_remap_buffer(dev, size, gfp, prot, &page, caller, want_vaddr);
> >@@ -1369,7 +1369,7 @@ static void *arm_iommu_alloc_attrs(struct device *dev, size_t size,
> >  	*handle = DMA_ERROR_CODE;
> >  	size = PAGE_ALIGN(size);
> >
> >-	if (!(gfp & __GFP_WAIT))
> >+	if (gfp & __GFP_ATOMIC)
> >  		return __iommu_alloc_atomic(dev, size, handle);
> >
> >  	/*
> >diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> >index d16a1cead23f..713d963fb96b 100644
> >--- a/arch/arm64/mm/dma-mapping.c
> >+++ b/arch/arm64/mm/dma-mapping.c
> >@@ -100,7 +100,7 @@ static void *__dma_alloc_coherent(struct device *dev, size_t size,
> >  	if (IS_ENABLED(CONFIG_ZONE_DMA) &&
> >  	    dev->coherent_dma_mask <= DMA_BIT_MASK(32))
> >  		flags |= GFP_DMA;
> >-	if (IS_ENABLED(CONFIG_DMA_CMA) && (flags & __GFP_WAIT)) {
> >+	if (IS_ENABLED(CONFIG_DMA_CMA) && (flags & __GFP_DIRECT_RECLAIM)) {
> >  		struct page *page;
> >  		void *addr;
> >
> >@@ -147,7 +147,7 @@ static void *__dma_alloc(struct device *dev, size_t size,
> >
> >  	size = PAGE_ALIGN(size);
> >
> >-	if (!coherent && !(flags & __GFP_WAIT)) {
> >+	if (!coherent && (flags & __GFP_ATOMIC)) {
> >  		struct page *page = NULL;
> >  		void *addr = __alloc_from_pool(size, &page, flags);
> >
> 
> Hmm these change the lack of __GFP_WAIT to expect __GFP_ATOMIC, so it's
> potentially one of those "key hazards" mentioned in the changelog, right?
> But here it's not just about using atomic reserves, but using a completely
> different allocation function.
> E.g. in case of arch/arm/mm/dma-mapping.c:__dma_alloc() I see it can go to
> __alloc_remap_buffer -> __dma_alloc_remap -> dma_common_contiguous_remap
> which does kmalloc(..., GFP_KERNEL) and has comment "Cannot be used in
> non-sleeping contexts".
> 

I completely missed that. It needs to be a check for
__GFP_DIRECT_RECLAIM and similar in the other arm file.

> So I think callers that cannot sleep and did clear __GFP_WAIT before, are
> now dangerous unless they set __GFP_ATOMIC?
> 
> >diff --git a/drivers/firewire/core-cdev.c b/drivers/firewire/core-cdev.c
> >index 2a3973a7c441..dc611c8cad10 100644
> >--- a/drivers/firewire/core-cdev.c
> >+++ b/drivers/firewire/core-cdev.c
> >@@ -486,7 +486,7 @@ static int ioctl_get_info(struct client *client, union ioctl_arg *arg)
> >  static int add_client_resource(struct client *client,
> >  			       struct client_resource *resource, gfp_t gfp_mask)
> >  {
> >-	bool preload = !!(gfp_mask & __GFP_WAIT);
> >+	bool preload = !!(gfp_mask & __GFP_DIRECT_RECLAIM);
> 
> Use the helper here to avoid !! as a bonus?
> 

Done.

> >--- a/drivers/infiniband/core/sa_query.c
> >+++ b/drivers/infiniband/core/sa_query.c
> >@@ -619,7 +619,7 @@ static void init_mad(struct ib_sa_mad *mad, struct ib_mad_agent *agent)
> >
> >  static int send_mad(struct ib_sa_query *query, int timeout_ms, gfp_t gfp_mask)
> >  {
> >-	bool preload = !!(gfp_mask & __GFP_WAIT);
> >+	bool preload = !!(gfp_mask & __GFP_DIRECT_RECLAIM);
> >  	unsigned long flags;
> >  	int ret, id;
> >
> 
> Same here.
> 

Done.

> >diff --git a/drivers/usb/host/u132-hcd.c b/drivers/usb/host/u132-hcd.c
> >index d51687780b61..06badad3ab75 100644
> >--- a/drivers/usb/host/u132-hcd.c
> >+++ b/drivers/usb/host/u132-hcd.c
> >@@ -2247,7 +2247,7 @@ static int u132_urb_enqueue(struct usb_hcd *hcd, struct urb *urb,
> >  {
> >  	struct u132 *u132 = hcd_to_u132(hcd);
> >  	if (irqs_disabled()) {
> >-		if (__GFP_WAIT & mem_flags) {
> >+		if (__GFP_DIRECT_RECLAIM & mem_flags) {
> >  			printk(KERN_ERR "invalid context for function that migh"
> >  				"t sleep\n");
> >  			return -EINVAL;
> 
> And here - no other flag manipulations and it would match the printk.

Fixed

> >--- a/fs/btrfs/extent_io.c
> >+++ b/fs/btrfs/extent_io.c
> >@@ -594,7 +594,7 @@ int clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
> >  	if (bits & (EXTENT_IOBITS | EXTENT_BOUNDARY))
> >  		clear = 1;
> >  again:
> >-	if (!prealloc && (mask & __GFP_WAIT)) {
> >+	if (!prealloc && (mask & __GFP_DIRECT_RECLAIM)) {
> >  		/*
> >  		 * Don't care for allocation failure here because we might end
> >  		 * up not needing the pre-allocated extent state at all, which
> >@@ -850,7 +850,7 @@ __set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
> >
> >  	bits |= EXTENT_FIRST_DELALLOC;
> >  again:
> >-	if (!prealloc && (mask & __GFP_WAIT)) {
> >+	if (!prealloc && (mask & __GFP_DIRECT_RECLAIM)) {
> >  		prealloc = alloc_extent_state(mask);
> >  		BUG_ON(!prealloc);
> >  	}
> >@@ -1076,7 +1076,7 @@ int convert_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
> >  	btrfs_debug_check_extent_io_range(tree, start, end);
> >
> >  again:
> >-	if (!prealloc && (mask & __GFP_WAIT)) {
> >+	if (!prealloc && (mask & __GFP_DIRECT_RECLAIM)) {
> >  		/*
> >  		 * Best effort, don't worry if extent state allocation fails
> >  		 * here for the first iteration. We might have a cached state
> >@@ -4265,7 +4265,7 @@ int try_release_extent_mapping(struct extent_map_tree *map,
> >  	u64 start = page_offset(page);
> >  	u64 end = start + PAGE_CACHE_SIZE - 1;
> >
> >-	if ((mask & __GFP_WAIT) &&
> >+	if ((mask & __GFP_DIRECT_RECLAIM) &&
> >  	    page->mapping->host->i_size > 16 * 1024 * 1024) {
> >  		u64 len;
> >  		while (start <= end) {
> 
> Why not here as well.
> 

Done

> >--- a/kernel/smp.c
> >+++ b/kernel/smp.c
> >@@ -669,7 +669,7 @@ void on_each_cpu_cond(bool (*cond_func)(int cpu, void *info),
> >  	cpumask_var_t cpus;
> >  	int cpu, ret;
> >
> >-	might_sleep_if(gfp_flags & __GFP_WAIT);
> >+	might_sleep_if(gfp_flags & __GFP_DIRECT_RECLAIM);
> >
> >  	if (likely(zalloc_cpumask_var(&cpus, (gfp_flags|__GFP_NOWARN)))) {
> >  		preempt_disable();
> >diff --git a/lib/idr.c b/lib/idr.c
> >index 5335c43adf46..e5118fc82961 100644
> >--- a/lib/idr.c
> >+++ b/lib/idr.c
> >@@ -399,7 +399,7 @@ void idr_preload(gfp_t gfp_mask)
> >  	 * allocation guarantee.  Disallow usage from those contexts.
> >  	 */
> >  	WARN_ON_ONCE(in_interrupt());
> >-	might_sleep_if(gfp_mask & __GFP_WAIT);
> >+	might_sleep_if(gfp_mask & __GFP_DIRECT_RECLAIM);
> >
> >  	preempt_disable();
> >
> >@@ -453,7 +453,7 @@ int idr_alloc(struct idr *idr, void *ptr, int start, int end, gfp_t gfp_mask)
> >  	struct idr_layer *pa[MAX_IDR_LEVEL + 1];
> >  	int id;
> >
> >-	might_sleep_if(gfp_mask & __GFP_WAIT);
> >+	might_sleep_if(gfp_mask & __GFP_DIRECT_RECLAIM);
> >
> >  	/* sanity checks */
> >  	if (WARN_ON_ONCE(start < 0))
> >diff --git a/lib/radix-tree.c b/lib/radix-tree.c
> >index f9ebe1c82060..cc5fdc3fb734 100644
> >--- a/lib/radix-tree.c
> >+++ b/lib/radix-tree.c
> >@@ -188,7 +188,7 @@ radix_tree_node_alloc(struct radix_tree_root *root)
> >  	 * preloading in the interrupt anyway as all the allocations have to
> >  	 * be atomic. So just do normal allocation when in interrupt.
> >  	 */
> >-	if (!(gfp_mask & __GFP_WAIT) && !in_interrupt()) {
> >+	if (!(gfp_mask & __GFP_DIRECT_RECLAIM) && !in_interrupt()) {
> >  		struct radix_tree_preload *rtp;
> >
> >  		/*
> 
> These too?
> 

Yep

> >diff --git a/mm/backing-dev.c b/mm/backing-dev.c
> >index dac5bf59309d..2056d16807de 100644
> >--- a/mm/backing-dev.c
> >+++ b/mm/backing-dev.c
> >@@ -632,7 +632,7 @@ struct bdi_writeback *wb_get_create(struct backing_dev_info *bdi,
> >  {
> >  	struct bdi_writeback *wb;
> >
> >-	might_sleep_if(gfp & __GFP_WAIT);
> >+	might_sleep_if(gfp & __GFP_DIRECT_RECLAIM);
> >
> >  	if (!memcg_css->parent)
> >  		return &bdi->wb;
> 
> ditto
> 

Indeed.

> >--- a/mm/page_alloc.c
> >+++ b/mm/page_alloc.c
> >@@ -2143,7 +2143,7 @@ static bool should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
> >  		return false;
> >  	if (fail_page_alloc.ignore_gfp_highmem && (gfp_mask & __GFP_HIGHMEM))
> >  		return false;
> >-	if (fail_page_alloc.ignore_gfp_wait && (gfp_mask & __GFP_WAIT))
> >+	if (fail_page_alloc.ignore_gfp_wait && (gfp_mask & (__GFP_ATOMIC|__GFP_DIRECT_RECLAIM)))
> 
> Should __GFP_ATOMIC really be here?
> 

I felt it was safer because it felt in line with the intent of
alloc.ignore_gfp_wait.

> >diff --git a/net/sctp/associola.c b/net/sctp/associola.c
> >index 197c3f59ecbf..c5fcdd6f85b7 100644
> >--- a/net/sctp/associola.c
> >+++ b/net/sctp/associola.c
> >@@ -1588,7 +1588,7 @@ int sctp_assoc_lookup_laddr(struct sctp_association *asoc,
> >  /* Set an association id for a given association */
> >  int sctp_assoc_set_id(struct sctp_association *asoc, gfp_t gfp)
> >  {
> >-	bool preload = !!(gfp & __GFP_WAIT);
> >+	bool preload = !!(gfp & __GFP_DIRECT_RECLAIM);
> >  	int ret;
> >
> >  	/* If the id is already assigned, keep it. */
> 
> helper?
> 

Yes. Thanks very much

-- 
Mel Gorman
SUSE Labs

next prev parent reply	other threads:[~2015-08-21 20:39 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-08-12 10:45 [PATCH 00/10] Remove zonelist cache and high-order watermark checking v2 Mel Gorman
2015-08-12 10:45 ` Mel Gorman
2015-08-12 10:45 ` [PATCH 01/10] mm, page_alloc: Delete the zonelist_cache Mel Gorman
2015-08-12 10:45   ` Mel Gorman
2015-08-20 13:18   ` Michal Hocko
2015-08-20 13:18     ` Michal Hocko
2015-08-20 13:42     ` Mel Gorman
2015-08-20 13:42       ` Mel Gorman
2015-08-21  9:29       ` Michal Hocko
2015-08-21  9:29         ` Michal Hocko
2015-08-20 13:30   ` Vlastimil Babka
2015-08-20 13:30     ` Vlastimil Babka
2015-08-20 14:17     ` Mel Gorman
2015-08-20 14:17       ` Mel Gorman
2015-08-20 14:45       ` Vlastimil Babka
2015-08-20 14:45         ` Vlastimil Babka
2015-08-12 10:45 ` [PATCH 02/10] mm, page_alloc: Remove unnecessary parameter from zone_watermark_ok_safe Mel Gorman
2015-08-12 10:45   ` Mel Gorman
2015-08-20 12:30   ` Michal Hocko
2015-08-20 12:30     ` Michal Hocko
2015-08-12 10:45 ` [PATCH 03/10] mm, page_alloc: Remove unnecessary recalculations for dirty zone balancing Mel Gorman
2015-08-12 10:45   ` Mel Gorman
2015-08-20 12:45   ` Michal Hocko
2015-08-20 12:45     ` Michal Hocko
2015-08-20 13:45     ` Mel Gorman
2015-08-20 13:45       ` Mel Gorman
2015-08-20 14:25       ` Michal Hocko
2015-08-20 14:25         ` Michal Hocko
2015-08-12 10:45 ` [PATCH 04/10] mm, page_alloc: Remove unnecessary taking of a seqlock when cpusets are disabled Mel Gorman
2015-08-12 10:45   ` Mel Gorman
2015-08-13  0:16   ` David Rientjes
2015-08-13  0:16     ` David Rientjes
2015-08-17 11:58     ` Mel Gorman
2015-08-17 11:58       ` Mel Gorman
2015-08-12 10:45 ` [PATCH 05/10] mm, page_alloc: Use masks and shifts when converting GFP flags to migrate types Mel Gorman
2015-08-12 10:45   ` Mel Gorman
2015-08-12 14:45   ` Christoph Lameter
2015-08-12 14:45     ` Christoph Lameter
2015-08-12 10:45 ` [PATCH 06/10] mm: page_alloc: Distinguish between being unable to sleep, unwilling to unwilling and avoiding waking kswapd Mel Gorman
2015-08-12 10:45   ` Mel Gorman
2015-08-12 13:22   ` Michal Hocko
2015-08-12 13:22     ` Michal Hocko
2015-08-19 14:44   ` Vlastimil Babka
2015-08-19 14:44     ` Vlastimil Babka
2015-08-20  9:14     ` Mel Gorman
2015-08-20  9:14       ` Mel Gorman
2015-08-21 13:42   ` Vlastimil Babka
2015-08-21 13:42     ` Vlastimil Babka
2015-08-21 20:39     ` Mel Gorman [this message]
2015-08-21 20:39       ` Mel Gorman
2015-08-12 10:45 ` [PATCH 07/10] mm: page_alloc: Rename __GFP_WAIT to __GFP_RECLAIM Mel Gorman
2015-08-12 10:45   ` Mel Gorman
2015-08-20 12:28   ` Michal Hocko
2015-08-20 12:28     ` Michal Hocko
2015-08-20 12:46     ` Michal Hocko
2015-08-20 12:46       ` Michal Hocko
2015-08-21 14:20   ` Vlastimil Babka
2015-08-21 14:20     ` Vlastimil Babka
2015-08-21 20:56     ` Mel Gorman
2015-08-21 20:56       ` Mel Gorman
2015-08-12 10:45 ` [PATCH 08/10] mm, page_alloc: Remove MIGRATE_RESERVE Mel Gorman
2015-08-12 10:45   ` Mel Gorman
2015-08-12 10:45 ` [PATCH 09/10] mm, page_alloc: Reserve pageblocks for high-order atomic allocations on demand Mel Gorman
2015-08-12 10:45   ` Mel Gorman
2015-08-12 10:45 ` [PATCH 10/10] mm, page_alloc: Only enforce watermarks for order-0 allocations Mel Gorman
2015-08-12 10:45   ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150821203942.GF12432@techsingularity.net \
    --to=mgorman@techsingularity.net \
    --cc=hannes@cmpxchg.org \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=riel@redhat.com \
    --cc=rientjes@google.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.