* [PATCH 1/4] mm: prevent potential recursive reclaim due to clearing PF_MEMALLOC
[not found] ` <20170405074700.29871-1-vbabka-AlSwsSmVLrQ@public.gmane.org>
@ 2017-04-05 7:46 ` Vlastimil Babka
2017-04-05 11:21 ` Michal Hocko
` (2 more replies)
2017-04-05 7:46 ` [PATCH 3/4] treewide: convert PF_MEMALLOC manipulations to new helpers Vlastimil Babka
2017-04-05 7:47 ` [PATCH 4/4] mtd: nand: nandsim: convert to memalloc_noreclaim_*() Vlastimil Babka
2 siblings, 3 replies; 20+ messages in thread
From: Vlastimil Babka @ 2017-04-05 7:46 UTC (permalink / raw)
To: Andrew Morton
Cc: nbd-general-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
linux-scsi-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA,
stable-u79uwXL29TY76Z2rM5mHXA, Michal Hocko,
linux-block-u79uwXL29TY76Z2rM5mHXA,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Johannes Weiner, Andrey Ryabinin,
open-iscsi-/JYPxA39Uh5TLH3MbocFFw, Mel Gorman, Vlastimil Babka
The function __alloc_pages_direct_compact() sets PF_MEMALLOC to prevent
deadlock during page migration by lock_page() (see the comment in
__unmap_and_move()). Then it unconditionally clears the flag, which can clear a
pre-existing PF_MEMALLOC flag and result in recursive reclaim. This was not a
problem until commit a8161d1ed609 ("mm, page_alloc: restructure direct
compaction handling in slowpath"), because direct compaction was called only
after direct reclaim, which was skipped when PF_MEMALLOC flag was set.
Even now it's only a theoretical issue, as the new callsite of
__alloc_pages_direct_compact() is reached only for costly orders and when
gfp_pfmemalloc_allowed() is true, which means either __GFP_NOMEMALLOC is in
gfp_flags or in_interrupt() is true. There is no such known context, but let's
play it safe and make __alloc_pages_direct_compact() robust for cases where
PF_MEMALLOC is already set.
Fixes: a8161d1ed609 ("mm, page_alloc: restructure direct compaction handling in slowpath")
Reported-by: Andrey Ryabinin <aryabinin-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org>
Signed-off-by: Vlastimil Babka <vbabka-AlSwsSmVLrQ@public.gmane.org>
Cc: <stable-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
---
mm/page_alloc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3589f8be53be..b84e6ffbe756 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3288,6 +3288,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
 		enum compact_priority prio, enum compact_result *compact_result)
 {
 	struct page *page;
+	unsigned int noreclaim_flag = current->flags & PF_MEMALLOC;

 	if (!order)
 		return NULL;
@@ -3295,7 +3296,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
 	current->flags |= PF_MEMALLOC;
 	*compact_result = try_to_compact_pages(gfp_mask, order, alloc_flags, ac,
 									prio);
-	current->flags &= ~PF_MEMALLOC;
+	current->flags = (current->flags & ~PF_MEMALLOC) | noreclaim_flag;

 	if (*compact_result <= COMPACT_INACTIVE)
 		return NULL;
--
2.12.2
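
The difference between the old and the patched flag handling can be modelled in user space. The following is a minimal sketch under stated assumptions, not kernel code: task_flags stands in for current->flags, PF_OTHER is an arbitrary unrelated bit invented for the illustration, and the compaction work itself is elided.

```c
#include <stdbool.h>

/* Illustrative stand-ins; only PF_MEMALLOC corresponds to a real kernel flag. */
#define PF_MEMALLOC 0x00000800u
#define PF_OTHER    0x00000001u	/* hypothetical unrelated flag bit */

static unsigned int task_flags;	/* models current->flags */

/* Old behaviour: set the flag, do the work, clear it unconditionally. */
static void compact_old(void)
{
	task_flags |= PF_MEMALLOC;
	/* ... try_to_compact_pages() would run here ... */
	task_flags &= ~PF_MEMALLOC;
}

/* Patched behaviour: remember the caller's PF_MEMALLOC bit and put it back. */
static void compact_new(void)
{
	unsigned int noreclaim_flag = task_flags & PF_MEMALLOC;

	task_flags |= PF_MEMALLOC;
	/* ... try_to_compact_pages() would run here ... */
	task_flags = (task_flags & ~PF_MEMALLOC) | noreclaim_flag;
}

/* Returns true if PF_MEMALLOC survives a call made with the flag already set. */
static bool flag_survives(void (*compact)(void))
{
	task_flags = PF_MEMALLOC | PF_OTHER;
	compact();
	return (task_flags & PF_MEMALLOC) != 0;
}
```

In this model flag_survives(compact_old) reports that the caller's flag was lost, while flag_survives(compact_new) reports that it was kept. A caller that entered without PF_MEMALLOC still leaves without it in both versions.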
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 1/4] mm: prevent potential recursive reclaim due to clearing PF_MEMALLOC
2017-04-05 7:46 ` [PATCH 1/4] mm: prevent potential recursive reclaim due to clearing PF_MEMALLOC Vlastimil Babka
@ 2017-04-05 11:21 ` Michal Hocko
2017-04-05 11:40 ` Andrey Ryabinin
2017-04-07 7:33 ` Hillf Danton
2 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2017-04-05 11:21 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, linux-mm, linux-kernel, Mel Gorman,
Johannes Weiner, linux-block, nbd-general, open-iscsi, linux-scsi,
netdev, stable, Andrey Ryabinin
On Wed 05-04-17 09:46:57, Vlastimil Babka wrote:
> The function __alloc_pages_direct_compact() sets PF_MEMALLOC to prevent
> deadlock during page migration by lock_page() (see the comment in
> __unmap_and_move()). Then it unconditionally clears the flag, which can clear a
> pre-existing PF_MEMALLOC flag and result in recursive reclaim. This was not a
> problem until commit a8161d1ed609 ("mm, page_alloc: restructure direct
> compaction handling in slowpath"), because direct compaction was called only
> after direct reclaim, which was skipped when PF_MEMALLOC flag was set.
>
> Even now it's only a theoretical issue, as the new callsite of
> __alloc_pages_direct_compact() is reached only for costly orders and when
> gfp_pfmemalloc_allowed() is true, which means either __GFP_NOMEMALLOC is in
> gfp_flags or in_interrupt() is true. There is no such known context, but let's
> play it safe and make __alloc_pages_direct_compact() robust for cases where
> PF_MEMALLOC is already set.
>
> Fixes: a8161d1ed609 ("mm, page_alloc: restructure direct compaction handling in slowpath")
> Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Cc: <stable@vger.kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
> ---
> mm/page_alloc.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3589f8be53be..b84e6ffbe756 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3288,6 +3288,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
> enum compact_priority prio, enum compact_result *compact_result)
> {
> struct page *page;
> + unsigned int noreclaim_flag = current->flags & PF_MEMALLOC;
>
> if (!order)
> return NULL;
> @@ -3295,7 +3296,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
> current->flags |= PF_MEMALLOC;
> *compact_result = try_to_compact_pages(gfp_mask, order, alloc_flags, ac,
> prio);
> - current->flags &= ~PF_MEMALLOC;
> + current->flags = (current->flags & ~PF_MEMALLOC) | noreclaim_flag;
>
> if (*compact_result <= COMPACT_INACTIVE)
> return NULL;
> --
> 2.12.2
--
Michal Hocko
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/4] mm: prevent potential recursive reclaim due to clearing PF_MEMALLOC
2017-04-05 7:46 ` [PATCH 1/4] mm: prevent potential recursive reclaim due to clearing PF_MEMALLOC Vlastimil Babka
2017-04-05 11:21 ` Michal Hocko
@ 2017-04-05 11:40 ` Andrey Ryabinin
2017-04-07 9:21 ` Vlastimil Babka
2017-04-07 7:33 ` Hillf Danton
2 siblings, 1 reply; 20+ messages in thread
From: Andrey Ryabinin @ 2017-04-05 11:40 UTC (permalink / raw)
To: Vlastimil Babka, Andrew Morton
Cc: linux-mm, linux-kernel, Michal Hocko, Mel Gorman, Johannes Weiner,
linux-block, nbd-general, open-iscsi, linux-scsi, netdev, stable
On 04/05/2017 10:46 AM, Vlastimil Babka wrote:
> The function __alloc_pages_direct_compact() sets PF_MEMALLOC to prevent
> deadlock during page migration by lock_page() (see the comment in
> __unmap_and_move()). Then it unconditionally clears the flag, which can clear a
> pre-existing PF_MEMALLOC flag and result in recursive reclaim. This was not a
> problem until commit a8161d1ed609 ("mm, page_alloc: restructure direct
> compaction handling in slowpath"), because direct compaction was called only
> after direct reclaim, which was skipped when PF_MEMALLOC flag was set.
>
> Even now it's only a theoretical issue, as the new callsite of
> __alloc_pages_direct_compact() is reached only for costly orders and when
> gfp_pfmemalloc_allowed() is true, which means either __GFP_NOMEMALLOC is in
is false
> gfp_flags or in_interrupt() is true. There is no such known context, but let's
> play it safe and make __alloc_pages_direct_compact() robust for cases where
> PF_MEMALLOC is already set.
>
> Fixes: a8161d1ed609 ("mm, page_alloc: restructure direct compaction handling in slowpath")
> Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Cc: <stable@vger.kernel.org>
> ---
> mm/page_alloc.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3589f8be53be..b84e6ffbe756 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3288,6 +3288,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
> enum compact_priority prio, enum compact_result *compact_result)
> {
> struct page *page;
> + unsigned int noreclaim_flag = current->flags & PF_MEMALLOC;
>
> if (!order)
> return NULL;
> @@ -3295,7 +3296,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
> current->flags |= PF_MEMALLOC;
> *compact_result = try_to_compact_pages(gfp_mask, order, alloc_flags, ac,
> prio);
> - current->flags &= ~PF_MEMALLOC;
> + current->flags = (current->flags & ~PF_MEMALLOC) | noreclaim_flag;
Perhaps this would look better:
tsk_restore_flags(current, noreclaim_flag, PF_MEMALLOC);
?
> if (*compact_result <= COMPACT_INACTIVE)
> return NULL;
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/4] mm: prevent potential recursive reclaim due to clearing PF_MEMALLOC
2017-04-05 11:40 ` Andrey Ryabinin
@ 2017-04-07 9:21 ` Vlastimil Babka
0 siblings, 0 replies; 20+ messages in thread
From: Vlastimil Babka @ 2017-04-07 9:21 UTC (permalink / raw)
To: Andrey Ryabinin, Andrew Morton
Cc: linux-mm, linux-kernel, Michal Hocko, Mel Gorman, Johannes Weiner,
linux-block, nbd-general, open-iscsi, linux-scsi, netdev, stable
On 04/05/2017 01:40 PM, Andrey Ryabinin wrote:
> On 04/05/2017 10:46 AM, Vlastimil Babka wrote:
>> The function __alloc_pages_direct_compact() sets PF_MEMALLOC to prevent
>> deadlock during page migration by lock_page() (see the comment in
>> __unmap_and_move()). Then it unconditionally clears the flag, which can clear a
>> pre-existing PF_MEMALLOC flag and result in recursive reclaim. This was not a
>> problem until commit a8161d1ed609 ("mm, page_alloc: restructure direct
>> compaction handling in slowpath"), because direct compaction was called only
>> after direct reclaim, which was skipped when PF_MEMALLOC flag was set.
>>
>> Even now it's only a theoretical issue, as the new callsite of
>> __alloc_pages_direct_compact() is reached only for costly orders and when
>> gfp_pfmemalloc_allowed() is true, which means either __GFP_NOMEMALLOC is in
> is false
>
>> gfp_flags or in_interrupt() is true. There is no such known context, but let's
>> play it safe and make __alloc_pages_direct_compact() robust for cases where
>> PF_MEMALLOC is already set.
>>
>> Fixes: a8161d1ed609 ("mm, page_alloc: restructure direct compaction handling in slowpath")
>> Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
>> Cc: <stable@vger.kernel.org>
>> ---
>> mm/page_alloc.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 3589f8be53be..b84e6ffbe756 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -3288,6 +3288,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
>> enum compact_priority prio, enum compact_result *compact_result)
>> {
>> struct page *page;
>> + unsigned int noreclaim_flag = current->flags & PF_MEMALLOC;
>>
>> if (!order)
>> return NULL;
>> @@ -3295,7 +3296,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
>> current->flags |= PF_MEMALLOC;
>> *compact_result = try_to_compact_pages(gfp_mask, order, alloc_flags, ac,
>> prio);
>> - current->flags &= ~PF_MEMALLOC;
>> + current->flags = (current->flags & ~PF_MEMALLOC) | noreclaim_flag;
>
> Perhaps this would look better:
>
> tsk_restore_flags(current, noreclaim_flag, PF_MEMALLOC);
>
> ?
Well, I didn't care much considering this is for stable only, and patch 2/4
rewrites this to the new api.
>> if (*compact_result <= COMPACT_INACTIVE)
>> return NULL;
>>
>
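
For comparison, the open-coded restore in patch 1/4 and the tsk_restore_flags() form suggested in the quoted reply should leave the flags in the same state. A minimal user-space sketch, assuming tsk_restore_flags() clears the masked bits and then copies them back from the saved value; struct task and the _model suffix are hypothetical stand-ins, not kernel definitions.

```c
/* User-space model; struct task is a hypothetical stand-in for task_struct. */
#define PF_MEMALLOC 0x00000800UL

struct task {
	unsigned long flags;
};

/*
 * Modelled on the generic helper (an assumption about its behaviour):
 * restore only the bits selected by mask from orig_flags.
 */
static void tsk_restore_flags_model(struct task *t, unsigned long orig_flags,
				    unsigned long mask)
{
	t->flags &= ~mask;
	t->flags |= orig_flags & mask;
}

/* The open-coded form used in patch 1/4. */
static void open_coded_restore(struct task *t, unsigned long noreclaim_flag)
{
	t->flags = (t->flags & ~PF_MEMALLOC) | noreclaim_flag;
}
```

Both forms touch only the PF_MEMALLOC bit, so the choice between them is about readability rather than behaviour, which fits the reply that the open-coded version is only meant for stable.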
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 1/4] mm: prevent potential recursive reclaim due to clearing PF_MEMALLOC
2017-04-05 7:46 ` [PATCH 1/4] mm: prevent potential recursive reclaim due to clearing PF_MEMALLOC Vlastimil Babka
2017-04-05 11:21 ` Michal Hocko
2017-04-05 11:40 ` Andrey Ryabinin
@ 2017-04-07 7:33 ` Hillf Danton
2 siblings, 0 replies; 20+ messages in thread
From: Hillf Danton @ 2017-04-07 7:33 UTC (permalink / raw)
To: 'Vlastimil Babka', 'Andrew Morton'
Cc: linux-mm, linux-kernel, 'Michal Hocko',
'Mel Gorman', 'Johannes Weiner', linux-block,
nbd-general, open-iscsi, linux-scsi, netdev, stable,
'Andrey Ryabinin'
On April 05, 2017 3:47 PM Vlastimil Babka wrote:
>
> The function __alloc_pages_direct_compact() sets PF_MEMALLOC to prevent
> deadlock during page migration by lock_page() (see the comment in
> __unmap_and_move()). Then it unconditionally clears the flag, which can clear a
> pre-existing PF_MEMALLOC flag and result in recursive reclaim. This was not a
> problem until commit a8161d1ed609 ("mm, page_alloc: restructure direct
> compaction handling in slowpath"), because direct compaction was called only
> after direct reclaim, which was skipped when PF_MEMALLOC flag was set.
>
> Even now it's only a theoretical issue, as the new callsite of
> __alloc_pages_direct_compact() is reached only for costly orders and when
> gfp_pfmemalloc_allowed() is true, which means either __GFP_NOMEMALLOC is in
> gfp_flags or in_interrupt() is true. There is no such known context, but let's
> play it safe and make __alloc_pages_direct_compact() robust for cases where
> PF_MEMALLOC is already set.
>
> Fixes: a8161d1ed609 ("mm, page_alloc: restructure direct compaction handling in slowpath")
> Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Cc: <stable@vger.kernel.org>
> ---
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
> mm/page_alloc.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3589f8be53be..b84e6ffbe756 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3288,6 +3288,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
> enum compact_priority prio, enum compact_result *compact_result)
> {
> struct page *page;
> + unsigned int noreclaim_flag = current->flags & PF_MEMALLOC;
>
> if (!order)
> return NULL;
> @@ -3295,7 +3296,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
> current->flags |= PF_MEMALLOC;
> *compact_result = try_to_compact_pages(gfp_mask, order, alloc_flags, ac,
> prio);
> - current->flags &= ~PF_MEMALLOC;
> + current->flags = (current->flags & ~PF_MEMALLOC) | noreclaim_flag;
>
> if (*compact_result <= COMPACT_INACTIVE)
> return NULL;
> --
> 2.12.2
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH 3/4] treewide: convert PF_MEMALLOC manipulations to new helpers
[not found] ` <20170405074700.29871-1-vbabka-AlSwsSmVLrQ@public.gmane.org>
2017-04-05 7:46 ` [PATCH 1/4] mm: prevent potential recursive reclaim due to clearing PF_MEMALLOC Vlastimil Babka
@ 2017-04-05 7:46 ` Vlastimil Babka
[not found] ` <20170405074700.29871-4-vbabka-AlSwsSmVLrQ@public.gmane.org>
2017-04-05 7:47 ` [PATCH 4/4] mtd: nand: nandsim: convert to memalloc_noreclaim_*() Vlastimil Babka
2 siblings, 1 reply; 20+ messages in thread
From: Vlastimil Babka @ 2017-04-05 7:46 UTC (permalink / raw)
To: Andrew Morton
Cc: nbd-general-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Chris Leech,
linux-scsi-u79uwXL29TY76Z2rM5mHXA, Josef Bacik,
netdev-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA, Michal Hocko,
linux-block-u79uwXL29TY76Z2rM5mHXA,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Eric Dumazet, Lee Duncan,
Johannes Weiner, open-iscsi-/JYPxA39Uh5TLH3MbocFFw, Mel Gorman,
David S. Miller, Vlastimil Babka
We now have memalloc_noreclaim_{save,restore} helpers for robust setting and
clearing of PF_MEMALLOC. Let's convert the code which was using the generic
tsk_restore_flags(). No functional change.
Signed-off-by: Vlastimil Babka <vbabka-AlSwsSmVLrQ@public.gmane.org>
Cc: Josef Bacik <jbacik-b10kYP2dOMg@public.gmane.org>
Cc: Lee Duncan <lduncan-IBi9RG/b67k@public.gmane.org>
Cc: Chris Leech <cleech-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: "David S. Miller" <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
Cc: Eric Dumazet <edumazet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
---
drivers/block/nbd.c | 7 ++++---
drivers/scsi/iscsi_tcp.c | 7 ++++---
net/core/dev.c | 7 ++++---
net/core/sock.c | 7 ++++---
4 files changed, 16 insertions(+), 12 deletions(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 03ae72985c79..929fc548c7fb 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -18,6 +18,7 @@
#include <linux/module.h>
#include <linux/init.h>
#include <linux/sched.h>
+#include <linux/sched/mm.h>
#include <linux/fs.h>
#include <linux/bio.h>
#include <linux/stat.h>
@@ -210,7 +211,7 @@ static int sock_xmit(struct nbd_device *nbd, int index, int send,
struct socket *sock = nbd->socks[index]->sock;
int result;
struct msghdr msg;
- unsigned long pflags = current->flags;
+ unsigned int noreclaim_flag;
if (unlikely(!sock)) {
dev_err_ratelimited(disk_to_dev(nbd->disk),
@@ -221,7 +222,7 @@ static int sock_xmit(struct nbd_device *nbd, int index, int send,
msg.msg_iter = *iter;
- current->flags |= PF_MEMALLOC;
+ noreclaim_flag = memalloc_noreclaim_save();
do {
sock->sk->sk_allocation = GFP_NOIO | __GFP_MEMALLOC;
msg.msg_name = NULL;
@@ -244,7 +245,7 @@ static int sock_xmit(struct nbd_device *nbd, int index, int send,
*sent += result;
} while (msg_data_left(&msg));
- tsk_restore_flags(current, pflags, PF_MEMALLOC);
+ memalloc_noreclaim_restore(noreclaim_flag);
return result;
}
diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index 4228aba1f654..4842fc0e809d 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -30,6 +30,7 @@
#include <linux/types.h>
#include <linux/inet.h>
#include <linux/slab.h>
+#include <linux/sched/mm.h>
#include <linux/file.h>
#include <linux/blkdev.h>
#include <linux/delay.h>
@@ -371,10 +372,10 @@ static inline int iscsi_sw_tcp_xmit_qlen(struct iscsi_conn *conn)
static int iscsi_sw_tcp_pdu_xmit(struct iscsi_task *task)
{
struct iscsi_conn *conn = task->conn;
- unsigned long pflags = current->flags;
+ unsigned int noreclaim_flag;
int rc = 0;
- current->flags |= PF_MEMALLOC;
+ noreclaim_flag = memalloc_noreclaim_save();
while (iscsi_sw_tcp_xmit_qlen(conn)) {
rc = iscsi_sw_tcp_xmit(conn);
@@ -387,7 +388,7 @@ static int iscsi_sw_tcp_pdu_xmit(struct iscsi_task *task)
rc = 0;
}
- tsk_restore_flags(current, pflags, PF_MEMALLOC);
+ memalloc_noreclaim_restore(noreclaim_flag);
return rc;
}
diff --git a/net/core/dev.c b/net/core/dev.c
index fde8b3f7136b..e0705a126b24 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -81,6 +81,7 @@
#include <linux/hash.h>
#include <linux/slab.h>
#include <linux/sched.h>
+#include <linux/sched/mm.h>
#include <linux/mutex.h>
#include <linux/string.h>
#include <linux/mm.h>
@@ -4227,7 +4228,7 @@ static int __netif_receive_skb(struct sk_buff *skb)
int ret;
if (sk_memalloc_socks() && skb_pfmemalloc(skb)) {
- unsigned long pflags = current->flags;
+ unsigned int noreclaim_flag;
/*
* PFMEMALLOC skbs are special, they should
@@ -4238,9 +4239,9 @@ static int __netif_receive_skb(struct sk_buff *skb)
* Use PF_MEMALLOC as this saves us from propagating the allocation
* context down to all allocation sites.
*/
- current->flags |= PF_MEMALLOC;
+ noreclaim_flag = memalloc_noreclaim_save();
ret = __netif_receive_skb_core(skb, true);
- tsk_restore_flags(current, pflags, PF_MEMALLOC);
+ memalloc_noreclaim_restore(noreclaim_flag);
} else
ret = __netif_receive_skb_core(skb, false);
diff --git a/net/core/sock.c b/net/core/sock.c
index 392f9b6f96e2..0b2d06b4c308 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -102,6 +102,7 @@
#include <linux/proc_fs.h>
#include <linux/seq_file.h>
#include <linux/sched.h>
+#include <linux/sched/mm.h>
#include <linux/timer.h>
#include <linux/string.h>
#include <linux/sockios.h>
@@ -372,14 +373,14 @@ EXPORT_SYMBOL_GPL(sk_clear_memalloc);
int __sk_backlog_rcv(struct sock *sk, struct sk_buff *skb)
{
int ret;
- unsigned long pflags = current->flags;
+ unsigned int noreclaim_flag;
/* these should have been dropped before queueing */
BUG_ON(!sock_flag(sk, SOCK_MEMALLOC));
- current->flags |= PF_MEMALLOC;
+ noreclaim_flag = memalloc_noreclaim_save();
ret = sk->sk_backlog_rcv(sk, skb);
- tsk_restore_flags(current, pflags, PF_MEMALLOC);
+ memalloc_noreclaim_restore(noreclaim_flag);
return ret;
}
--
2.12.2
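
Patch 2/4, which introduces the helpers, is not part of this excerpt. The following is a minimal user-space sketch of how such a save/restore pair is assumed to behave when sections nest, for example when a transmit path is entered with PF_MEMALLOC already set. The _model names, task_flags, and the inner/outer functions are hypothetical and exist only for this illustration.

```c
#define PF_MEMALLOC 0x00000800u

static unsigned int task_flags;	/* models current->flags */

/* Assumed shape of the save helper: return the old bit, then set it. */
static unsigned int noreclaim_save_model(void)
{
	unsigned int flags = task_flags & PF_MEMALLOC;

	task_flags |= PF_MEMALLOC;
	return flags;
}

/* Assumed shape of the restore helper: put back exactly the saved bit. */
static void noreclaim_restore_model(unsigned int flags)
{
	task_flags = (task_flags & ~PF_MEMALLOC) | flags;
}

/* Inner section; returns the PF_MEMALLOC bit as seen inside the section. */
static unsigned int inner_section(void)
{
	unsigned int saved = noreclaim_save_model();
	unsigned int seen = task_flags & PF_MEMALLOC;

	noreclaim_restore_model(saved);
	return seen;
}

/* Outer section; returns the PF_MEMALLOC bit seen after the inner one ends. */
static unsigned int outer_section(void)
{
	unsigned int saved = noreclaim_save_model();
	unsigned int after_inner;

	inner_section();
	after_inner = task_flags & PF_MEMALLOC;
	noreclaim_restore_model(saved);
	return after_inner;
}
```

In this model the inner section cannot clear the flag on behalf of the outer one, and the flag is gone again once the outermost section restores its saved value.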
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH 4/4] mtd: nand: nandsim: convert to memalloc_noreclaim_*()
[not found] ` <20170405074700.29871-1-vbabka-AlSwsSmVLrQ@public.gmane.org>
2017-04-05 7:46 ` [PATCH 1/4] mm: prevent potential recursive reclaim due to clearing PF_MEMALLOC Vlastimil Babka
2017-04-05 7:46 ` [PATCH 3/4] treewide: convert PF_MEMALLOC manipulations to new helpers Vlastimil Babka
@ 2017-04-05 7:47 ` Vlastimil Babka
2017-04-05 11:31 ` Michal Hocko
2 siblings, 1 reply; 20+ messages in thread
From: Vlastimil Babka @ 2017-04-05 7:47 UTC (permalink / raw)
To: Andrew Morton
Cc: nbd-general-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Boris Brezillon,
Richard Weinberger, linux-scsi-u79uwXL29TY76Z2rM5mHXA,
netdev-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA, Michal Hocko,
linux-block-u79uwXL29TY76Z2rM5mHXA,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Johannes Weiner,
open-iscsi-/JYPxA39Uh5TLH3MbocFFw, Mel Gorman, Vlastimil Babka
Nandsim has its own functions set_memalloc() and clear_memalloc() for robust
setting and clearing of PF_MEMALLOC. Replace them with the new generic helpers.
No functional change.
Signed-off-by: Vlastimil Babka <vbabka-AlSwsSmVLrQ@public.gmane.org>
Cc: Boris Brezillon <boris.brezillon-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8@public.gmane.org>
Cc: Richard Weinberger <richard-/L3Ra7n9ekc@public.gmane.org>
---
drivers/mtd/nand/nandsim.c | 29 +++++++++--------------------
1 file changed, 9 insertions(+), 20 deletions(-)
diff --git a/drivers/mtd/nand/nandsim.c b/drivers/mtd/nand/nandsim.c
index cef818f535ed..03a0d057bf2f 100644
--- a/drivers/mtd/nand/nandsim.c
+++ b/drivers/mtd/nand/nandsim.c
@@ -40,6 +40,7 @@
 #include <linux/list.h>
 #include <linux/random.h>
 #include <linux/sched.h>
+#include <linux/sched/mm.h>
 #include <linux/fs.h>
 #include <linux/pagemap.h>
 #include <linux/seq_file.h>
@@ -1368,31 +1369,18 @@ static int get_pages(struct nandsim *ns, struct file *file, size_t count, loff_t
 	return 0;
 }

-static int set_memalloc(void)
-{
-	if (current->flags & PF_MEMALLOC)
-		return 0;
-	current->flags |= PF_MEMALLOC;
-	return 1;
-}
-
-static void clear_memalloc(int memalloc)
-{
-	if (memalloc)
-		current->flags &= ~PF_MEMALLOC;
-}
-
 static ssize_t read_file(struct nandsim *ns, struct file *file, void *buf, size_t count, loff_t pos)
 {
 	ssize_t tx;
-	int err, memalloc;
+	int err;
+	unsigned int noreclaim_flag;

 	err = get_pages(ns, file, count, pos);
 	if (err)
 		return err;
-	memalloc = set_memalloc();
+	noreclaim_flag = memalloc_noreclaim_save();
 	tx = kernel_read(file, pos, buf, count);
-	clear_memalloc(memalloc);
+	memalloc_noreclaim_restore(noreclaim_flag);
 	put_pages(ns);
 	return tx;
 }
@@ -1400,14 +1388,15 @@ static ssize_t read_file(struct nandsim *ns, struct file *file, void *buf, size_
 static ssize_t write_file(struct nandsim *ns, struct file *file, void *buf, size_t count, loff_t pos)
 {
 	ssize_t tx;
-	int err, memalloc;
+	int err;
+	unsigned int noreclaim_flag;

 	err = get_pages(ns, file, count, pos);
 	if (err)
 		return err;
-	memalloc = set_memalloc();
+	noreclaim_flag = memalloc_noreclaim_save();
 	tx = kernel_write(file, buf, count, pos);
-	clear_memalloc(memalloc);
+	memalloc_noreclaim_restore(noreclaim_flag);
 	put_pages(ns);
 	return tx;
 }
--
2.12.2
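
The "no functional change" claim can be checked against a small user-space model. This is a sketch under stated assumptions: set_memalloc() and clear_memalloc() are copied from the removed lines with task_flags substituted for current->flags, while the two _model helpers reflect the assumed behaviour of the generic helpers from patch 2/4, which is not shown in this thread.

```c
#define PF_MEMALLOC 0x00000800u

static unsigned int task_flags;	/* models current->flags */

/* Removed nandsim helpers, adapted to the model. */
static int set_memalloc(void)
{
	if (task_flags & PF_MEMALLOC)
		return 0;
	task_flags |= PF_MEMALLOC;
	return 1;
}

static void clear_memalloc(int memalloc)
{
	if (memalloc)
		task_flags &= ~PF_MEMALLOC;
}

/* Assumed shape of the generic helpers (hypothetical names). */
static unsigned int noreclaim_save_model(void)
{
	unsigned int flags = task_flags & PF_MEMALLOC;

	task_flags |= PF_MEMALLOC;
	return flags;
}

static void noreclaim_restore_model(unsigned int flags)
{
	task_flags = (task_flags & ~PF_MEMALLOC) | flags;
}

/* Final flags after one protected section using the old helpers. */
static unsigned int after_old(unsigned int initial)
{
	int memalloc;

	task_flags = initial;
	memalloc = set_memalloc();
	/* ... kernel_read() or kernel_write() would run here ... */
	clear_memalloc(memalloc);
	return task_flags;
}

/* Final flags after one protected section using the new helpers. */
static unsigned int after_new(unsigned int initial)
{
	unsigned int noreclaim_flag;

	task_flags = initial;
	noreclaim_flag = noreclaim_save_model();
	/* ... kernel_read() or kernel_write() would run here ... */
	noreclaim_restore_model(noreclaim_flag);
	return task_flags;
}
```

Both variants set the flag for the duration of the section and leave the caller's flags as they found them, whether or not PF_MEMALLOC was set on entry.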
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] mtd: nand: nandsim: convert to memalloc_noreclaim_*()
2017-04-05 7:47 ` [PATCH 4/4] mtd: nand: nandsim: convert to memalloc_noreclaim_*() Vlastimil Babka
@ 2017-04-05 11:31 ` Michal Hocko
2017-04-05 11:36 ` Richard Weinberger
0 siblings, 1 reply; 20+ messages in thread
From: Michal Hocko @ 2017-04-05 11:31 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Andrew Morton, linux-mm, linux-kernel, Mel Gorman,
Johannes Weiner, linux-block, nbd-general, open-iscsi, linux-scsi,
netdev, Boris Brezillon, Richard Weinberger
On Wed 05-04-17 09:47:00, Vlastimil Babka wrote:
> Nandsim has own functions set_memalloc() and clear_memalloc() for robust
> setting and clearing of PF_MEMALLOC. Replace them by the new generic helpers.
> No functional change.
This one smells like an abuser. Why the hell should read/write path
touch memory reserves at all!
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> Cc: Boris Brezillon <boris.brezillon@free-electrons.com>
> Cc: Richard Weinberger <richard@nod.at>
> ---
> drivers/mtd/nand/nandsim.c | 29 +++++++++--------------------
> 1 file changed, 9 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/mtd/nand/nandsim.c b/drivers/mtd/nand/nandsim.c
> index cef818f535ed..03a0d057bf2f 100644
> --- a/drivers/mtd/nand/nandsim.c
> +++ b/drivers/mtd/nand/nandsim.c
> @@ -40,6 +40,7 @@
> #include <linux/list.h>
> #include <linux/random.h>
> #include <linux/sched.h>
> +#include <linux/sched/mm.h>
> #include <linux/fs.h>
> #include <linux/pagemap.h>
> #include <linux/seq_file.h>
> @@ -1368,31 +1369,18 @@ static int get_pages(struct nandsim *ns, struct file *file, size_t count, loff_t
> return 0;
> }
>
> -static int set_memalloc(void)
> -{
> - if (current->flags & PF_MEMALLOC)
> - return 0;
> - current->flags |= PF_MEMALLOC;
> - return 1;
> -}
> -
> -static void clear_memalloc(int memalloc)
> -{
> - if (memalloc)
> - current->flags &= ~PF_MEMALLOC;
> -}
> -
> static ssize_t read_file(struct nandsim *ns, struct file *file, void *buf, size_t count, loff_t pos)
> {
> ssize_t tx;
> - int err, memalloc;
> + int err;
> + unsigned int noreclaim_flag;
>
> err = get_pages(ns, file, count, pos);
> if (err)
> return err;
> - memalloc = set_memalloc();
> + noreclaim_flag = memalloc_noreclaim_save();
> tx = kernel_read(file, pos, buf, count);
> - clear_memalloc(memalloc);
> + memalloc_noreclaim_restore(noreclaim_flag);
> put_pages(ns);
> return tx;
> }
> @@ -1400,14 +1388,15 @@ static ssize_t read_file(struct nandsim *ns, struct file *file, void *buf, size_
> static ssize_t write_file(struct nandsim *ns, struct file *file, void *buf, size_t count, loff_t pos)
> {
> ssize_t tx;
> - int err, memalloc;
> + int err;
> + unsigned int noreclaim_flag;
>
> err = get_pages(ns, file, count, pos);
> if (err)
> return err;
> - memalloc = set_memalloc();
> + noreclaim_flag = memalloc_noreclaim_save();
> tx = kernel_write(file, buf, count, pos);
> - clear_memalloc(memalloc);
> + memalloc_noreclaim_restore(noreclaim_flag);
> put_pages(ns);
> return tx;
> }
> --
> 2.12.2
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] mtd: nand: nandsim: convert to memalloc_noreclaim_*()
2017-04-05 11:31 ` Michal Hocko
@ 2017-04-05 11:36 ` Richard Weinberger
2017-04-05 11:39 ` Vlastimil Babka
0 siblings, 1 reply; 20+ messages in thread
From: Richard Weinberger @ 2017-04-05 11:36 UTC (permalink / raw)
To: Michal Hocko, Vlastimil Babka
Cc: Andrew Morton, linux-mm, linux-kernel, Mel Gorman,
Johannes Weiner, linux-block, nbd-general, open-iscsi, linux-scsi,
netdev, Boris Brezillon, Adrian Hunter
Michal,
Am 05.04.2017 um 13:31 schrieb Michal Hocko:
> On Wed 05-04-17 09:47:00, Vlastimil Babka wrote:
>> Nandsim has own functions set_memalloc() and clear_memalloc() for robust
>> setting and clearing of PF_MEMALLOC. Replace them by the new generic helpers.
>> No functional change.
>
> This one smells like an abuser. Why the hell should read/write path
> touch memory reserves at all!
Could be. Let's ask Adrian, AFAIK he wrote that code.
Adrian, can you please clarify why nandsim needs to play with PF_MEMALLOC?
Thanks,
//richard
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] mtd: nand: nandsim: convert to memalloc_noreclaim_*()
2017-04-05 11:36 ` Richard Weinberger
@ 2017-04-05 11:39 ` Vlastimil Babka
2017-04-05 12:09 ` Michal Hocko
2017-04-06 6:33 ` Adrian Hunter
0 siblings, 2 replies; 20+ messages in thread
From: Vlastimil Babka @ 2017-04-05 11:39 UTC (permalink / raw)
To: Richard Weinberger, Michal Hocko
Cc: Andrew Morton, linux-mm, linux-kernel, Mel Gorman,
Johannes Weiner, linux-block, nbd-general, open-iscsi, linux-scsi,
netdev, Boris Brezillon, Adrian Hunter
On 04/05/2017 01:36 PM, Richard Weinberger wrote:
> Michal,
>
> Am 05.04.2017 um 13:31 schrieb Michal Hocko:
>> On Wed 05-04-17 09:47:00, Vlastimil Babka wrote:
>>> Nandsim has own functions set_memalloc() and clear_memalloc() for robust
>>> setting and clearing of PF_MEMALLOC. Replace them by the new generic helpers.
>>> No functional change.
>>
>> This one smells like an abuser. Why the hell should read/write path
>> touch memory reserves at all!
>
> Could be. Let's ask Adrian, AFAIK he wrote that code.
> Adrian, can you please clarify why nandsim needs to play with PF_MEMALLOC?
I was thinking about it and concluded that since the simulator can be
used as a block device where reclaimed pages go to, writing the data out
is a memalloc operation. Then reading can be called as part of r-m-w
cycle, so reading as well. But it would be great if somebody more
knowledgeable confirmed this.
> Thanks,
> //richard
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] mtd: nand: nandsim: convert to memalloc_noreclaim_*()
2017-04-05 11:39 ` Vlastimil Babka
@ 2017-04-05 12:09 ` Michal Hocko
2017-04-06 6:33 ` Adrian Hunter
1 sibling, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2017-04-05 12:09 UTC (permalink / raw)
To: Vlastimil Babka
Cc: Richard Weinberger, Andrew Morton, linux-mm, linux-kernel,
Mel Gorman, Johannes Weiner, linux-block, nbd-general, open-iscsi,
linux-scsi, netdev, Boris Brezillon, Adrian Hunter
On Wed 05-04-17 13:39:16, Vlastimil Babka wrote:
> On 04/05/2017 01:36 PM, Richard Weinberger wrote:
> > Michal,
> >
> > Am 05.04.2017 um 13:31 schrieb Michal Hocko:
> >> On Wed 05-04-17 09:47:00, Vlastimil Babka wrote:
> >>> Nandsim has own functions set_memalloc() and clear_memalloc() for robust
> >>> setting and clearing of PF_MEMALLOC. Replace them by the new generic helpers.
> >>> No functional change.
> >>
> >> This one smells like an abuser. Why the hell should read/write path
> >> touch memory reserves at all!
> >
> > Could be. Let's ask Adrian, AFAIK he wrote that code.
> > Adrian, can you please clarify why nandsim needs to play with PF_MEMALLOC?
>
> I was thinking about it and concluded that since the simulator can be
> used as a block device where reclaimed pages go to, writing the data out
> is a memalloc operation. Then reading can be called as part of r-m-w
> cycle, so reading as well. But it would be great if somebody more
> knowledgeable confirmed this.
then this deserves a big fat comment explaining all the details,
including how the complete depletion of reserves is prevented.
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 4/4] mtd: nand: nandsim: convert to memalloc_noreclaim_*()
2017-04-05 11:39 ` Vlastimil Babka
2017-04-05 12:09 ` Michal Hocko
@ 2017-04-06 6:33 ` Adrian Hunter
[not found] ` <fe1c21a4-0bc6-529c-5446-382b01d4c99e-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
1 sibling, 1 reply; 20+ messages in thread
From: Adrian Hunter @ 2017-04-06 6:33 UTC (permalink / raw)
To: Vlastimil Babka, Richard Weinberger, Michal Hocko
Cc: Andrew Morton, linux-mm, linux-kernel, Mel Gorman,
Johannes Weiner, linux-block, nbd-general, open-iscsi, linux-scsi,
netdev, Boris Brezillon
On 05/04/17 14:39, Vlastimil Babka wrote:
> On 04/05/2017 01:36 PM, Richard Weinberger wrote:
>> Michal,
>>
>> Am 05.04.2017 um 13:31 schrieb Michal Hocko:
>>> On Wed 05-04-17 09:47:00, Vlastimil Babka wrote:
>>>> Nandsim has own functions set_memalloc() and clear_memalloc() for robust
>>>> setting and clearing of PF_MEMALLOC. Replace them by the new generic helpers.
>>>> No functional change.
>>>
>>> This one smells like an abuser. Why the hell should read/write path
>>> touch memory reserves at all!
>>
>> Could be. Let's ask Adrian, AFAIK he wrote that code.
>> Adrian, can you please clarify why nandsim needs to play with PF_MEMALLOC?
>
> I was thinking about it and concluded that since the simulator can be
> used as a block device where reclaimed pages go to, writing the data out
> is a memalloc operation. Then reading can be called as part of r-m-w
> cycle, so reading as well.
IIRC it was to avoid getting stuck with nandsim waiting on memory reclaim
and memory reclaim waiting on nandsim.
^ permalink raw reply [flat|nested] 20+ messages in thread