* [PATCH -mmotm 00/30] [RFC] swap over nfs -v21
@ 2010-07-13 10:16 Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 01/30] mm: serialize access to min_free_kbytes Xiaotian Feng
` (30 more replies)
0 siblings, 31 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:16 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
Hi,
Here's the latest version of the swap over NFS series, the first since -v20 last October. We decided
to push this feature because it is useful for NAS and virtualization environments.
The patches are against mmotm-2010-07-01. The patchset can be split into the following parts:
Patch 1 - 12: Provide a generic reserve framework. This framework
could also be used to get rid of some of the __GFP_NOFAIL users.
Patch 13 - 15: Provide some generic network infrastructure needed later on.
Patch 16 - 21: Reserve a small pool to act as a receive buffer; this allows us to
inspect packets before tossing them.
Patch 22 - 23: Generic VM infrastructure to handle swapping to a filesystem instead of a block
device.
Patch 24 - 27: Convert NFS to make use of the new network and VM infrastructure to
provide swap over NFS.
Patch 28 - 30: Minor bug fixes against the latest -mmotm.
[some history]
v19: http://lwn.net/Articles/301915/
v20: http://lwn.net/Articles/355350/
Changes since v20:
- rebased to mmotm-2010-07-01
- dropped the null pointer deref patch, since the root cause was a wrong SWP_FILE enum
- some minor build fixes
- fixed a null pointer deref with mmotm-2010-07-01
- fixed a bug when swapping to multiple files on the same NFS server
Regards
Xiaotian
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
* [PATCH -mmotm 01/30] mm: serialize access to min_free_kbytes
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
@ 2010-07-13 10:17 ` Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 02/30] Swap over network documentation Xiaotian Feng
` (29 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:17 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 02/30] Swap over network documentation
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 01/30] mm: serialize access to min_free_kbytes Xiaotian Feng
@ 2010-07-13 10:17 ` Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 03/30] mm: expose gfp_to_alloc_flags() Xiaotian Feng
` (28 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:17 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 03/30] mm: expose gfp_to_alloc_flags()
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 01/30] mm: serialize access to min_free_kbytes Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 02/30] Swap over network documentation Xiaotian Feng
@ 2010-07-13 10:17 ` Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 04/30] mm: tag reseve pages Xiaotian Feng
` (27 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:17 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 04/30] mm: tag reseve pages
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (2 preceding siblings ...)
2010-07-13 10:17 ` [PATCH -mmotm 03/30] mm: expose gfp_to_alloc_flags() Xiaotian Feng
@ 2010-07-13 10:17 ` Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 05/30] mm: sl[au]b: add knowledge of reserve pages Xiaotian Feng
` (26 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:17 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 05/30] mm: sl[au]b: add knowledge of reserve pages
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (3 preceding siblings ...)
2010-07-13 10:17 ` [PATCH -mmotm 04/30] mm: tag reseve pages Xiaotian Feng
@ 2010-07-13 10:17 ` Xiaotian Feng
2010-07-13 20:33 ` Pekka Enberg
2010-07-13 10:17 ` [PATCH -mmotm 06/30] mm: kmem_alloc_estimate() Xiaotian Feng
` (25 subsequent siblings)
30 siblings, 1 reply; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:17 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 06/30] mm: kmem_alloc_estimate()
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (4 preceding siblings ...)
2010-07-13 10:17 ` [PATCH -mmotm 05/30] mm: sl[au]b: add knowledge of reserve pages Xiaotian Feng
@ 2010-07-13 10:17 ` Xiaotian Feng
2010-07-13 10:18 ` [PATCH -mmotm 07/30] mm: allow PF_MEMALLOC from softirq context Xiaotian Feng
` (24 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:17 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 07/30] mm: allow PF_MEMALLOC from softirq context
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (5 preceding siblings ...)
2010-07-13 10:17 ` [PATCH -mmotm 06/30] mm: kmem_alloc_estimate() Xiaotian Feng
@ 2010-07-13 10:18 ` Xiaotian Feng
2010-07-13 10:18 ` [PATCH -mmotm 08/30] mm: emergency pool Xiaotian Feng
` (23 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:18 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 08/30] mm: emergency pool
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (6 preceding siblings ...)
2010-07-13 10:18 ` [PATCH -mmotm 07/30] mm: allow PF_MEMALLOC from softirq context Xiaotian Feng
@ 2010-07-13 10:18 ` Xiaotian Feng
2010-07-13 10:18 ` [PATCH -mmotm 09/30] mm: system wide ALLOC_NO_WATERMARK Xiaotian Feng
` (22 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:18 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 09/30] mm: system wide ALLOC_NO_WATERMARK
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (7 preceding siblings ...)
2010-07-13 10:18 ` [PATCH -mmotm 08/30] mm: emergency pool Xiaotian Feng
@ 2010-07-13 10:18 ` Xiaotian Feng
2010-07-13 10:18 ` [PATCH -mmotm 10/30] mm: __GFP_MEMALLOC Xiaotian Feng
` (21 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:18 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 10/30] mm: __GFP_MEMALLOC
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (8 preceding siblings ...)
2010-07-13 10:18 ` [PATCH -mmotm 09/30] mm: system wide ALLOC_NO_WATERMARK Xiaotian Feng
@ 2010-07-13 10:18 ` Xiaotian Feng
2010-07-13 10:18 ` [PATCH -mmotm 11/30] mm: memory reserve management Xiaotian Feng
` (20 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:18 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 11/30] mm: memory reserve management
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (9 preceding siblings ...)
2010-07-13 10:18 ` [PATCH -mmotm 10/30] mm: __GFP_MEMALLOC Xiaotian Feng
@ 2010-07-13 10:18 ` Xiaotian Feng
2010-07-13 10:19 ` [PATCH -mmotm 12/30] selinux: tag avc cache alloc as non-critical Xiaotian Feng
` (19 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:18 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 12/30] selinux: tag avc cache alloc as non-critical
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (10 preceding siblings ...)
2010-07-13 10:18 ` [PATCH -mmotm 11/30] mm: memory reserve management Xiaotian Feng
@ 2010-07-13 10:19 ` Xiaotian Feng
2010-07-13 10:55 ` Mitchell Erblich
2010-07-13 10:19 ` [PATCH -mmotm 13/30] net: packet split receive api Xiaotian Feng
` (18 subsequent siblings)
30 siblings, 1 reply; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:19 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 13/30] net: packet split receive api
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (11 preceding siblings ...)
2010-07-13 10:19 ` [PATCH -mmotm 12/30] selinux: tag avc cache alloc as non-critical Xiaotian Feng
@ 2010-07-13 10:19 ` Xiaotian Feng
2010-07-13 10:19 ` [PATCH -mmotm 14/30] net: sk_allocation() - concentrate socket related allocations Xiaotian Feng
` (17 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:19 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 14/30] net: sk_allocation() - concentrate socket related allocations
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (12 preceding siblings ...)
2010-07-13 10:19 ` [PATCH -mmotm 13/30] net: packet split receive api Xiaotian Feng
@ 2010-07-13 10:19 ` Xiaotian Feng
2010-07-13 10:19 ` [PATCH -mmotm 15/30] netvm: network reserve infrastructure Xiaotian Feng
` (16 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:19 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 15/30] netvm: network reserve infrastructure
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (13 preceding siblings ...)
2010-07-13 10:19 ` [PATCH -mmotm 14/30] net: sk_allocation() - concentrate socket related allocations Xiaotian Feng
@ 2010-07-13 10:19 ` Xiaotian Feng
2010-07-13 10:19 ` [PATCH -mmotm 16/30] netvm: INET reserves Xiaotian Feng
` (15 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:19 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 16/30] netvm: INET reserves
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (14 preceding siblings ...)
2010-07-13 10:19 ` [PATCH -mmotm 15/30] netvm: network reserve infrastructure Xiaotian Feng
@ 2010-07-13 10:19 ` Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 17/30] netvm: hook skb allocation to reserves Xiaotian Feng
` (14 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:19 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 17/30] netvm: hook skb allocation to reserves
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (15 preceding siblings ...)
2010-07-13 10:19 ` [PATCH -mmotm 16/30] netvm: INET reserves Xiaotian Feng
@ 2010-07-13 10:20 ` Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 18/30] netvm: filter emergency skbs Xiaotian Feng
` (13 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:20 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 18/30] netvm: filter emergency skbs
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (16 preceding siblings ...)
2010-07-13 10:20 ` [PATCH -mmotm 17/30] netvm: hook skb allocation to reserves Xiaotian Feng
@ 2010-07-13 10:20 ` Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 19/30] netvm: prevent a stream specific deadlock Xiaotian Feng
` (12 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:20 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 19/30] netvm: prevent a stream specific deadlock
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (17 preceding siblings ...)
2010-07-13 10:20 ` [PATCH -mmotm 18/30] netvm: filter emergency skbs Xiaotian Feng
@ 2010-07-13 10:20 ` Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 20/30] netfilter: NF_QUEUE vs emergency skbs Xiaotian Feng
` (11 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:20 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 20/30] netfilter: NF_QUEUE vs emergency skbs
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (18 preceding siblings ...)
2010-07-13 10:20 ` [PATCH -mmotm 19/30] netvm: prevent a stream specific deadlock Xiaotian Feng
@ 2010-07-13 10:20 ` Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 21/30] netvm: skb processing Xiaotian Feng
` (10 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:20 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 21/30] netvm: skb processing
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (19 preceding siblings ...)
2010-07-13 10:20 ` [PATCH -mmotm 20/30] netfilter: NF_QUEUE vs emergency skbs Xiaotian Feng
@ 2010-07-13 10:20 ` Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 22/30] mm: add support for non block device backed swap files Xiaotian Feng
` (9 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:20 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 22/30] mm: add support for non block device backed swap files
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (20 preceding siblings ...)
2010-07-13 10:20 ` [PATCH -mmotm 21/30] netvm: skb processing Xiaotian Feng
@ 2010-07-13 10:20 ` Xiaotian Feng
2010-07-13 10:21 ` [PATCH -mmotm 23/30] mm: methods for teaching filesystems about PG_swapcache pages Xiaotian Feng
` (8 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:20 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 23/30] mm: methods for teaching filesystems about PG_swapcache pages
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (21 preceding siblings ...)
2010-07-13 10:20 ` [PATCH -mmotm 22/30] mm: add support for non block device backed swap files Xiaotian Feng
@ 2010-07-13 10:21 ` Xiaotian Feng
2010-07-13 10:21 ` [PATCH -mmotm 24/30] nfs: teach the NFS client how to treat " Xiaotian Feng
` (7 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:21 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 24/30] nfs: teach the NFS client how to treat PG_swapcache pages
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (22 preceding siblings ...)
2010-07-13 10:21 ` [PATCH -mmotm 23/30] mm: methods for teaching filesystems about PG_swapcache pages Xiaotian Feng
@ 2010-07-13 10:21 ` Xiaotian Feng
2010-07-13 10:21 ` [PATCH -mmotm 25/30] nfs: disable data cache revalidation for swapfiles Xiaotian Feng
` (6 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:21 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 25/30] nfs: disable data cache revalidation for swapfiles
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (23 preceding siblings ...)
2010-07-13 10:21 ` [PATCH -mmotm 24/30] nfs: teach the NFS client how to treat " Xiaotian Feng
@ 2010-07-13 10:21 ` Xiaotian Feng
2010-07-13 10:21 ` [PATCH -mmotm 26/30] nfs: enable swap on NFS Xiaotian Feng
` (5 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:21 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 26/30] nfs: enable swap on NFS
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (24 preceding siblings ...)
2010-07-13 10:21 ` [PATCH -mmotm 25/30] nfs: disable data cache revalidation for swapfiles Xiaotian Feng
@ 2010-07-13 10:21 ` Xiaotian Feng
2010-07-13 10:21 ` [PATCH -mmotm 27/30] nfs: fix various memory recursions possible with swap over NFS Xiaotian Feng
` (4 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:21 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 27/30] nfs: fix various memory recursions possible with swap over NFS
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (25 preceding siblings ...)
2010-07-13 10:21 ` [PATCH -mmotm 26/30] nfs: enable swap on NFS Xiaotian Feng
@ 2010-07-13 10:21 ` Xiaotian Feng
2010-07-13 10:22 ` [PATCH -mmotm 28/30] build fix for skb_emergency_protocol Xiaotian Feng
` (3 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:21 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 28/30] build fix for skb_emergency_protocol
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (26 preceding siblings ...)
2010-07-13 10:21 ` [PATCH -mmotm 27/30] nfs: fix various memory recursions possible with swap over NFS Xiaotian Feng
@ 2010-07-13 10:22 ` Xiaotian Feng
2010-07-13 10:22 ` [PATCH -mmotm 29/30] fix null pointer deref in swap_entry_free Xiaotian Feng
` (2 subsequent siblings)
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:22 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 29/30] fix null pointer deref in swap_entry_free
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (27 preceding siblings ...)
2010-07-13 10:22 ` [PATCH -mmotm 28/30] build fix for skb_emergency_protocol Xiaotian Feng
@ 2010-07-13 10:22 ` Xiaotian Feng
2010-07-13 10:22 ` [PATCH -mmotm 30/30] fix mess up on swap with multi files from same nfs server Xiaotian Feng
2010-07-13 12:53 ` [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Américo Wang
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:22 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* [PATCH -mmotm 30/30] fix mess up on swap with multi files from same nfs server
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (28 preceding siblings ...)
2010-07-13 10:22 ` [PATCH -mmotm 29/30] fix null pointer deref in swap_entry_free Xiaotian Feng
@ 2010-07-13 10:22 ` Xiaotian Feng
2010-07-13 12:53 ` [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Américo Wang
30 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-13 10:22 UTC (permalink / raw)
To: linux-mm, linux-nfs, netdev
Cc: riel, cl, a.p.zijlstra, Xiaotian Feng, linux-kernel, lwang,
penberg, akpm, davem
* Re: [PATCH -mmotm 12/30] selinux: tag avc cache alloc as non-critical
2010-07-13 10:19 ` [PATCH -mmotm 12/30] selinux: tag avc cache alloc as non-critical Xiaotian Feng
@ 2010-07-13 10:55 ` Mitchell Erblich
2010-07-15 11:51 ` Xiaotian Feng
0 siblings, 1 reply; 37+ messages in thread
From: Mitchell Erblich @ 2010-07-13 10:55 UTC (permalink / raw)
To: Xiaotian Feng
Cc: linux-mm, linux-nfs, netdev, riel, cl, a.p.zijlstra, linux-kernel,
lwang, penberg, akpm, davem
On Jul 13, 2010, at 3:19 AM, Xiaotian Feng wrote:
> From 6c3a91091b2910c23908a9f9953efcf3df14e522 Mon Sep 17 00:00:00 2001
> From: Xiaotian Feng <dfeng@redhat.com>
> Date: Tue, 13 Jul 2010 11:02:41 +0800
> Subject: [PATCH 12/30] selinux: tag avc cache alloc as non-critical
>
> Failing to allocate a cache entry will only harm performance not correctness.
> Do not consume valuable reserve pages for something like that.
>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
> Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
> ---
> security/selinux/avc.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/security/selinux/avc.c b/security/selinux/avc.c
> index 3662b0f..9029395 100644
> --- a/security/selinux/avc.c
> +++ b/security/selinux/avc.c
> @@ -284,7 +284,7 @@ static struct avc_node *avc_alloc_node(void)
> {
> struct avc_node *node;
>
> - node = kmem_cache_zalloc(avc_node_cachep, GFP_ATOMIC);
> + node = kmem_cache_zalloc(avc_node_cachep, GFP_ATOMIC|__GFP_NOMEMALLOC);
> if (!node)
> goto out;
>
> --
> 1.7.1.1
>
Why not just replace GFP_ATOMIC with GFP_NOWAIT?
This would NOT consume the valuable last pages.
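As a rough aid for comparing the two options, here is a user-space sketch. It assumes the allocator grants access to the emergency reserves roughly the way gfp_to_alloc_flags() does, that is, only to PF_MEMALLOC contexts that did not pass __GFP_NOMEMALLOC. The SKETCH_* names and the flag values are stand-ins for illustration, not code from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in flag bits for illustration; the real values live in include/linux/gfp.h. */
#define SKETCH_GFP_HIGH        0x20u     /* may dip below the min watermark */
#define SKETCH_GFP_NOMEMALLOC  0x10000u  /* never touch the emergency reserves */

#define SKETCH_GFP_ATOMIC  (SKETCH_GFP_HIGH)
#define SKETCH_GFP_NOWAIT  (SKETCH_GFP_ATOMIC & ~SKETCH_GFP_HIGH)

/*
 * Simplified model (an assumption, not the kernel's code): an allocation may
 * ignore the watermarks only when the task runs with PF_MEMALLOC and the
 * caller did not pass __GFP_NOMEMALLOC.
 */
static inline bool sketch_may_use_reserves(unsigned int gfp, bool task_memalloc)
{
	if (gfp & SKETCH_GFP_NOMEMALLOC)
		return false;
	return task_memalloc;
}

/* Compare both proposals for a task that has PF_MEMALLOC set. */
static inline void sketch_compare(void)
{
	/* The patch: GFP_ATOMIC plus an explicit opt-out of the reserves. */
	assert(!sketch_may_use_reserves(SKETCH_GFP_ATOMIC | SKETCH_GFP_NOMEMALLOC, true));
	/* The suggestion: GFP_NOWAIT carries no opt-out bit in this model. */
	assert(sketch_may_use_reserves(SKETCH_GFP_NOWAIT, true));
}
```

Under this assumption, GFP_NOWAIT drops __GFP_HIGH but does not by itself keep a PF_MEMALLOC context out of the reserves, whereas __GFP_NOMEMALLOC does.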
Mitchell Erblich
* Re: [PATCH -mmotm 00/30] [RFC] swap over nfs -v21
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
` (29 preceding siblings ...)
2010-07-13 10:22 ` [PATCH -mmotm 30/30] fix mess up on swap with multi files from same nfs server Xiaotian Feng
@ 2010-07-13 12:53 ` Américo Wang
30 siblings, 0 replies; 37+ messages in thread
From: Américo Wang @ 2010-07-13 12:53 UTC (permalink / raw)
To: Xiaotian Feng
Cc: linux-mm, linux-nfs, netdev, riel, cl, a.p.zijlstra, linux-kernel,
lwang, penberg, akpm, davem
On Tue, Jul 13, 2010 at 6:16 PM, Xiaotian Feng <dfeng@redhat.com> wrote:
> Hi,
>
> Here's the latest version of swap over NFS series since -v20 last October. We decide to push
> this feature as it is useful for NAS or virt environment.
>
> The patches are against the mmotm-2010-07-01. We can split the patchset into following parts:
>
> Patch 1 - 12: provides a generic reserve framework. This framework
> could also be used to get rid of some of the __GFP_NOFAIL users.
>
> Patch 13 - 15: Provide some generic network infrastructure needed later on.
>
> Patch 16 - 21: reserve a little pool to act as a receive buffer, this allows us to
> inspect packets before tossing them.
>
> Patch 22 - 23: Generic vm infrastructure to handle swapping to a filesystem instead of a block
> device.
>
> Patch 24 - 27: convert NFS to make use of the new network and vm infrastructure to
> provide swap over NFS.
>
> Patch 28 - 30: minor bug fixing with latest -mmotm.
>
> [some history]
> v19: http://lwn.net/Articles/301915/
> v20: http://lwn.net/Articles/355350/
>
> Changes since v20:
> - rebased to mmotm-2010-07-01
> - dropped the null pointer deref patch for the root cause is wrong SWP_FILE enum
> - some minor build fixes
> - fix a null pointer deref with mmotm-2010-07-01
> - fix a bug when swap with multi files on the same nfs server
Please use the "From:" line correctly, as stated in
Documentation/SubmittingPatches:
    The "from" line must be the very first line in the message body,
    and has the form:

        From: Original Author <author@example.com>

    The "from" line specifies who will be credited as the author of the
    patch in the permanent changelog. If the "from" line is missing,
    then the "From:" line from the email header will be used to determine
    the patch author in the changelog.
I think you are using git format-patch to generate these patches. Please supply
--author=<author> to git commit when you commit them to your local tree (or use
git am if the patches you received already have the correct From: line).
Thanks.
* Re: [PATCH -mmotm 05/30] mm: sl[au]b: add knowledge of reserve pages
2010-07-13 10:17 ` [PATCH -mmotm 05/30] mm: sl[au]b: add knowledge of reserve pages Xiaotian Feng
@ 2010-07-13 20:33 ` Pekka Enberg
2010-07-15 12:37 ` Xiaotian Feng
2010-08-03 1:44 ` Neil Brown
0 siblings, 2 replies; 37+ messages in thread
From: Pekka Enberg @ 2010-07-13 20:33 UTC (permalink / raw)
To: Xiaotian Feng
Cc: linux-mm, linux-nfs, netdev, riel, cl, a.p.zijlstra, linux-kernel,
lwang, akpm, davem
Hi Xiaotian!
I would actually prefer that the SLAB, SLOB, and SLUB changes were in
separate patches to make reviewing easier.
Looking at SLUB:
On Tue, Jul 13, 2010 at 1:17 PM, Xiaotian Feng <dfeng@redhat.com> wrote:
> diff --git a/mm/slub.c b/mm/slub.c
> index 7bb7940..7a5d6dc 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -27,6 +27,8 @@
> #include <linux/memory.h>
> #include <linux/math64.h>
> #include <linux/fault-inject.h>
> +#include "internal.h"
> +
>
> /*
> * Lock order:
> @@ -1139,7 +1141,8 @@ static void setup_object(struct kmem_cache *s, struct page *page,
> s->ctor(object);
> }
>
> -static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
> +static
> +struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node, int *reserve)
> {
> struct page *page;
> void *start;
> @@ -1153,6 +1156,8 @@ static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
> if (!page)
> goto out;
>
> + *reserve = page->reserve;
> +
> inc_slabs_node(s, page_to_nid(page), page->objects);
> page->slab = s;
> page->flags |= 1 << PG_slab;
> @@ -1606,10 +1611,20 @@ static void *__slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
> {
> void **object;
> struct page *new;
> + int reserve;
>
> /* We handle __GFP_ZERO in the caller */
> gfpflags &= ~__GFP_ZERO;
>
> + if (unlikely(c->reserve)) {
> + /*
> + * If the current slab is a reserve slab and the current
> + * allocation context does not allow access to the reserves we
> + * must force an allocation to test the current levels.
> + */
> + if (!(gfp_to_alloc_flags(gfpflags) & ALLOC_NO_WATERMARKS))
> + goto grow_slab;
OK, so assume that:
(1) c->reserve is set to one
(2) GFP flags don't allow dipping into the reserves
(3) we've managed to free enough pages so normal
allocations are fine
(4) the page from reserves is not yet empty
we will call flush_slab(), put the "emergency page" on the partial list,
and clear c->reserve. This effectively means that some other
allocation can now fetch the partial page and start to use it. Is this OK?
Who makes sure the emergency reserves are large enough for the next
out-of-memory condition where we swap over NFS?
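The sequence in steps (1) to (4) can be traced with a small user-space model. This is only a sketch of the scenario as described above: the sketch_* types and helpers are hypothetical stand-ins for the per-cpu slab, flush_slab() and the grow_slab path, and they ignore locking, NUMA nodes and debugging.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_PARTIAL 8

struct sketch_page {
	bool from_reserve;  /* page was allocated from the emergency reserves */
	int  free_objects;  /* objects still available in this slab page */
};

struct sketch_cpu_slab {
	struct sketch_page *page;  /* current per-cpu slab page */
	bool reserve;              /* mirrors c->reserve in the quoted diff */
};

struct sketch_partial_list {
	struct sketch_page *pages[SKETCH_MAX_PARTIAL];
	size_t count;
};

/* Models flush_slab(): a page that still has free objects goes on the partial list. */
static inline void sketch_flush_slab(struct sketch_cpu_slab *c,
				     struct sketch_partial_list *partial)
{
	if (c->page && c->page->free_objects > 0 &&
	    partial->count < SKETCH_MAX_PARTIAL)
		partial->pages[partial->count++] = c->page;
	c->page = NULL;
}

/* Models the grow_slab path: record the new page's origin, flush the old page. */
static inline void sketch_grow_slab(struct sketch_cpu_slab *c,
				    struct sketch_partial_list *partial,
				    struct sketch_page *fresh)
{
	assert(c && partial && fresh);
	c->reserve = fresh->from_reserve;
	if (c->page)
		sketch_flush_slab(c, partial);
	c->page = fresh;
}

/* True if any page on the partial list originally came from the reserves. */
static inline bool sketch_reserve_page_on_partial(const struct sketch_partial_list *partial)
{
	for (size_t i = 0; i < partial->count; i++)
		if (partial->pages[i]->from_reserve)
			return true;
	return false;
}
```

Starting with a partly used reserve page as the per-cpu slab and growing with a normal page leaves c->reserve cleared while the reserve page sits on the partial list, which is the situation the question is about.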
> + }
> if (!c->page)
> goto new_slab;
>
> @@ -1623,8 +1638,8 @@ load_freelist:
> object = c->page->freelist;
> if (unlikely(!object))
> goto another_slab;
> - if (unlikely(SLABDEBUG && PageSlubDebug(c->page)))
> - goto debug;
> + if (unlikely(SLABDEBUG && PageSlubDebug(c->page) || c->reserve))
> + goto slow_path;
>
> c->freelist = get_freepointer(s, object);
> c->page->inuse = c->page->objects;
> @@ -1646,16 +1661,18 @@ new_slab:
> goto load_freelist;
> }
>
> +grow_slab:
> if (gfpflags & __GFP_WAIT)
> local_irq_enable();
>
> - new = new_slab(s, gfpflags, node);
> + new = new_slab(s, gfpflags, node, &reserve);
>
> if (gfpflags & __GFP_WAIT)
> local_irq_disable();
>
> if (new) {
> c = __this_cpu_ptr(s->cpu_slab);
> + c->reserve = reserve;
> stat(s, ALLOC_SLAB);
> if (c->page)
> flush_slab(s, c);
> @@ -1667,10 +1684,20 @@ new_slab:
> if (!(gfpflags & __GFP_NOWARN) && printk_ratelimit())
> slab_out_of_memory(s, gfpflags, node);
> return NULL;
> -debug:
> - if (!alloc_debug_processing(s, c->page, object, addr))
> +
> +slow_path:
> + if (!c->reserve && !alloc_debug_processing(s, c->page, object, addr))
> goto another_slab;
>
> + /*
> + * Avoid the slub fast path in slab_alloc() by not setting
> + * c->freelist and the fast path in slab_free() by making
> + * node_match() fail by setting c->node to -1.
> + *
> + * We use this for for debug and reserve checks which need
> + * to be done for each allocation.
> + */
> +
> c->page->inuse++;
> c->page->freelist = get_freepointer(s, object);
> c->node = -1;
> @@ -2095,10 +2122,11 @@ static void early_kmem_cache_node_alloc(gfp_t gfpflags, int node)
> struct page *page;
> struct kmem_cache_node *n;
> unsigned long flags;
> + int reserve;
>
> BUG_ON(kmalloc_caches->size < sizeof(struct kmem_cache_node));
>
> - page = new_slab(kmalloc_caches, gfpflags, node);
> + page = new_slab(kmalloc_caches, gfpflags, node, &reserve);
>
> BUG_ON(!page);
> if (page_to_nid(page) != node) {
> --
> 1.7.1.1
>
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH -mmotm 12/30] selinux: tag avc cache alloc as non-critical
2010-07-13 10:55 ` Mitchell Erblich
@ 2010-07-15 11:51 ` Xiaotian Feng
0 siblings, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-15 11:51 UTC (permalink / raw)
To: Mitchell Erblich
Cc: linux-mm, linux-nfs, netdev, riel, cl, a.p.zijlstra, linux-kernel,
lwang, penberg, akpm, davem
On 07/13/2010 06:55 PM, Mitchell Erblich wrote:
>
> On Jul 13, 2010, at 3:19 AM, Xiaotian Feng wrote:
>
>> From 6c3a91091b2910c23908a9f9953efcf3df14e522 Mon Sep 17 00:00:00 2001
>> From: Xiaotian Feng <dfeng@redhat.com>
>> Date: Tue, 13 Jul 2010 11:02:41 +0800
>> Subject: [PATCH 12/30] selinux: tag avc cache alloc as non-critical
>>
>> Failing to allocate a cache entry will only harm performance not correctness.
>> Do not consume valuable reserve pages for something like that.
>>
>> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
>> Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de>
>> Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
>> ---
>> security/selinux/avc.c | 2 +-
>> 1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/security/selinux/avc.c b/security/selinux/avc.c
>> index 3662b0f..9029395 100644
>> --- a/security/selinux/avc.c
>> +++ b/security/selinux/avc.c
>> @@ -284,7 +284,7 @@ static struct avc_node *avc_alloc_node(void)
>> {
>> struct avc_node *node;
>>
>> - node = kmem_cache_zalloc(avc_node_cachep, GFP_ATOMIC);
>> + node = kmem_cache_zalloc(avc_node_cachep, GFP_ATOMIC|__GFP_NOMEMALLOC);
>> if (!node)
>> goto out;
>>
>> --
>> 1.7.1.1
>>
>
> Why not just replace GFP_ATOMIC with GFP_NOWAIT?
>
> This would NOT consume the valuable last pages.
But replacing GFP_ATOMIC with GFP_NOWAIT cannot prevent avc_alloc_node()
from consuming reserved pages.
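To illustrate the point above, here is a minimal sketch of how the flags relate to reserve access. It is a simplified model, not the kernel's gfp_to_alloc_flags(): the X_* constants, their bit values, and may_use_reserves() are hypothetical names used only for illustration. It assumes, as in kernels of this era, that GFP_NOWAIT is GFP_ATOMIC without __GFP_HIGH, and that a context entitled to the reserves (for example one running with PF_MEMALLOC) may ignore the watermarks unless __GFP_NOMEMALLOC is set.

```c
#include <stdbool.h>

/* Hypothetical bit values; the real kernel constants differ. */
#define X_GFP_HIGH       0x1u           /* stands in for __GFP_HIGH */
#define X_GFP_NOMEMALLOC 0x2u           /* stands in for __GFP_NOMEMALLOC */
#define X_GFP_ATOMIC     (X_GFP_HIGH)   /* stands in for GFP_ATOMIC */
#define X_GFP_NOWAIT     (0x0u)         /* GFP_ATOMIC with __GFP_HIGH removed */

/*
 * Simplified model: returns true when the allocation may ignore the
 * watermarks and therefore consume reserve pages.
 * ctx_memalloc models "the calling context is entitled to the reserves".
 */
static bool may_use_reserves(unsigned int gfp, bool ctx_memalloc)
{
    /* Only the NOMEMALLOC bit explicitly forbids touching the reserves. */
    if (gfp & X_GFP_NOMEMALLOC)
        return false;

    /* Dropping X_GFP_HIGH alone does not change this outcome. */
    return ctx_memalloc;
}
```

In this model, may_use_reserves(X_GFP_NOWAIT, true) is still true, while may_use_reserves(X_GFP_ATOMIC | X_GFP_NOMEMALLOC, true) is false, which is why the patch adds the flag instead of switching to GFP_NOWAIT.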
>
> Mitchell Erblich
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH -mmotm 05/30] mm: sl[au]b: add knowledge of reserve pages
2010-07-13 20:33 ` Pekka Enberg
@ 2010-07-15 12:37 ` Xiaotian Feng
2010-08-03 1:44 ` Neil Brown
1 sibling, 0 replies; 37+ messages in thread
From: Xiaotian Feng @ 2010-07-15 12:37 UTC (permalink / raw)
To: Pekka Enberg
Cc: linux-mm, linux-nfs, netdev, riel, cl, a.p.zijlstra, linux-kernel,
lwang, akpm, davem
On 07/14/2010 04:33 AM, Pekka Enberg wrote:
> Hi Xiaotian!
>
> I would actually prefer that the SLAB, SLOB, and SLUB changes were in
> separate patches to make reviewing easier.
>
> Looking at SLUB:
>
> On Tue, Jul 13, 2010 at 1:17 PM, Xiaotian Feng <dfeng@redhat.com> wrote:
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 7bb7940..7a5d6dc 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -27,6 +27,8 @@
>> #include <linux/memory.h>
>> #include <linux/math64.h>
>> #include <linux/fault-inject.h>
>> +#include "internal.h"
>> +
>>
>> /*
>> * Lock order:
>> @@ -1139,7 +1141,8 @@ static void setup_object(struct kmem_cache *s, struct page *page,
>> s->ctor(object);
>> }
>>
>> -static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
>> +static
>> +struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node, int *reserve)
>> {
>> struct page *page;
>> void *start;
>> @@ -1153,6 +1156,8 @@ static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
>> if (!page)
>> goto out;
>>
>> + *reserve = page->reserve;
>> +
>> inc_slabs_node(s, page_to_nid(page), page->objects);
>> page->slab = s;
>> page->flags |= 1 << PG_slab;
>> @@ -1606,10 +1611,20 @@ static void *__slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
>> {
>> void **object;
>> struct page *new;
>> + int reserve;
>>
>> /* We handle __GFP_ZERO in the caller */
>> gfpflags &= ~__GFP_ZERO;
>>
>> + if (unlikely(c->reserve)) {
>> + /*
>> + * If the current slab is a reserve slab and the current
>> + * allocation context does not allow access to the reserves we
>> + * must force an allocation to test the current levels.
>> + */
>> + if (!(gfp_to_alloc_flags(gfpflags) & ALLOC_NO_WATERMARKS))
>> + goto grow_slab;
>
> OK, so assume that:
>
> (1) c->reserve is set to one
>
> (2) GFP flags don't allow dipping into the reserves
>
> (3) we've managed to free enough pages so normal
> allocations are fine
>
> (4) the page from reserves is not yet empty
>
> we will call flush_slab() and put the "emergency page" on partial list
> and clear c->reserve. This effectively means that now some other
> allocation can fetch the partial page and start to use it. Is this OK?
> Who makes sure the emergency reserves are large enough for the next
> out-of-memory condition where we swap over NFS?
>
Good catch. I'm just wondering whether the above check is necessary. For
an "emergency page", we don't set c->freelist. How can we get a
reserve slab if the GFP flags don't allow dipping into the reserves?
>> + }
>> if (!c->page)
>> goto new_slab;
>>
>> @@ -1623,8 +1638,8 @@ load_freelist:
>> object = c->page->freelist;
>> if (unlikely(!object))
>> goto another_slab;
>> - if (unlikely(SLABDEBUG && PageSlubDebug(c->page)))
>> - goto debug;
>> + if (unlikely(SLABDEBUG && PageSlubDebug(c->page) || c->reserve))
>> + goto slow_path;
>>
>> c->freelist = get_freepointer(s, object);
>> c->page->inuse = c->page->objects;
>> @@ -1646,16 +1661,18 @@ new_slab:
>> goto load_freelist;
>> }
>>
>> +grow_slab:
>> if (gfpflags & __GFP_WAIT)
>> local_irq_enable();
>>
>> - new = new_slab(s, gfpflags, node);
>> + new = new_slab(s, gfpflags, node, &reserve);
>>
>> if (gfpflags & __GFP_WAIT)
>> local_irq_disable();
>>
>> if (new) {
>> c = __this_cpu_ptr(s->cpu_slab);
>> + c->reserve = reserve;
>> stat(s, ALLOC_SLAB);
>> if (c->page)
>> flush_slab(s, c);
>> @@ -1667,10 +1684,20 @@ new_slab:
>> if (!(gfpflags & __GFP_NOWARN) && printk_ratelimit())
>> slab_out_of_memory(s, gfpflags, node);
>> return NULL;
>> -debug:
>> - if (!alloc_debug_processing(s, c->page, object, addr))
>> +
>> +slow_path:
>> + if (!c->reserve && !alloc_debug_processing(s, c->page, object, addr))
>> goto another_slab;
>>
>> + /*
>> + * Avoid the slub fast path in slab_alloc() by not setting
>> + * c->freelist and the fast path in slab_free() by making
>> + * node_match() fail by setting c->node to -1.
>> + *
>> + * We use this for debug and reserve checks which need
>> + * to be done for each allocation.
>> + */
>> +
>> c->page->inuse++;
>> c->page->freelist = get_freepointer(s, object);
>> c->node = -1;
>> @@ -2095,10 +2122,11 @@ static void early_kmem_cache_node_alloc(gfp_t gfpflags, int node)
>> struct page *page;
>> struct kmem_cache_node *n;
>> unsigned long flags;
>> + int reserve;
>>
>> BUG_ON(kmalloc_caches->size < sizeof(struct kmem_cache_node));
>>
>> - page = new_slab(kmalloc_caches, gfpflags, node);
>> + page = new_slab(kmalloc_caches, gfpflags, node, &reserve);
>>
>> BUG_ON(!page);
>> if (page_to_nid(page) != node) {
>> --
>> 1.7.1.1
>>
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH -mmotm 05/30] mm: sl[au]b: add knowledge of reserve pages
2010-07-13 20:33 ` Pekka Enberg
2010-07-15 12:37 ` Xiaotian Feng
@ 2010-08-03 1:44 ` Neil Brown
1 sibling, 0 replies; 37+ messages in thread
From: Neil Brown @ 2010-08-03 1:44 UTC (permalink / raw)
To: Pekka Enberg
Cc: Xiaotian Feng, linux-mm, linux-nfs, netdev, riel, cl,
a.p.zijlstra, linux-kernel, lwang, akpm, davem
On Tue, 13 Jul 2010 23:33:14 +0300
Pekka Enberg <penberg@cs.helsinki.fi> wrote:
> Hi Xiaotian!
>
> I would actually prefer that the SLAB, SLOB, and SLUB changes were in
> separate patches to make reviewing easier.
>
> Looking at SLUB:
>
> On Tue, Jul 13, 2010 at 1:17 PM, Xiaotian Feng <dfeng@redhat.com> wrote:
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 7bb7940..7a5d6dc 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -27,6 +27,8 @@
> > #include <linux/memory.h>
> > #include <linux/math64.h>
> > #include <linux/fault-inject.h>
> > +#include "internal.h"
> > +
> >
> > /*
> > * Lock order:
> > @@ -1139,7 +1141,8 @@ static void setup_object(struct kmem_cache *s, struct page *page,
> > s->ctor(object);
> > }
> >
> > -static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
> > +static
> > +struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node, int *reserve)
> > {
> > struct page *page;
> > void *start;
> > @@ -1153,6 +1156,8 @@ static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
> > if (!page)
> > goto out;
> >
> > + *reserve = page->reserve;
> > +
> > inc_slabs_node(s, page_to_nid(page), page->objects);
> > page->slab = s;
> > page->flags |= 1 << PG_slab;
> > @@ -1606,10 +1611,20 @@ static void *__slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
> > {
> > void **object;
> > struct page *new;
> > + int reserve;
> >
> > /* We handle __GFP_ZERO in the caller */
> > gfpflags &= ~__GFP_ZERO;
> >
> > + if (unlikely(c->reserve)) {
> > + /*
> > + * If the current slab is a reserve slab and the current
> > + * allocation context does not allow access to the reserves we
> > + * must force an allocation to test the current levels.
> > + */
> > + if (!(gfp_to_alloc_flags(gfpflags) & ALLOC_NO_WATERMARKS))
> > + goto grow_slab;
>
> OK, so assume that:
>
> (1) c->reserve is set to one
>
> (2) GFP flags don't allow dipping into the reserves
>
> (3) we've managed to free enough pages so normal
> allocations are fine
>
> (4) the page from reserves is not yet empty
>
> we will call flush_slab() and put the "emergency page" on partial list
> and clear c->reserve. This effectively means that now some other
> allocation can fetch the partial page and start to use it. Is this OK?
> Who makes sure the emergency reserves are large enough for the next
> out-of-memory condition where we swap over NFS?
Yes, this is OK. The emergency reserves are maintained at a lower level -
within alloc_page.
The fact that (3) normal allocations are fine means that there are enough
free pages to satisfy any swap-out allocation - so any pages that were
previously allocated as 'emergency' pages can have their emergency status
forgotten (the emergency has passed).
This is a subtle but important aspect of the emergency reservation scheme in
swap-over-NFS. It is the act-of-allocating that is emergency-or-not. The
memory itself, once allocated, is not special.
c->reserve means "the last page allocated required an emergency allocation".
This means that parts of that page, or any other page, can only be given as
emergency allocations. Once the slab succeeds at a non-emergency allocation,
the flag should obviously be cleared.
Similarly the page->reserve flag does not mean "this is a reserve page", but
simply "when this page was allocated, it was an emergency allocation". The
flag is often soon lost as it is in a union with e.g. freelist. But that
doesn't matter as it is only really meaningful at the moment of allocation.
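The behaviour described above can be sketched as a small model. This is an illustration only, not the SLUB code: struct cpu_slab_model, struct page_alloc_model, model_alloc_page() and model_slab_alloc() are hypothetical names, the partial list is ignored, and a cached slab is assumed never to run out of objects. What it tries to show is that the reserve flag records how the last page was obtained, that a caller without access to the reserves forces a fresh page allocation to test the current levels, and that the flag clears as soon as a normal allocation succeeds.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU slab state. */
struct cpu_slab_model {
    bool has_page;  /* a slab page is currently cached */
    bool reserve;   /* the last page allocation needed the reserves */
};

/* Hypothetical stand-in for the page allocator. */
struct page_alloc_model {
    int free_pages; /* pages currently free */
    int watermark;  /* at or below this, only entitled callers succeed */
};

/* Allocate one page; *from_reserve reports whether it dipped into reserves. */
static bool model_alloc_page(struct page_alloc_model *pa, bool entitled,
                             bool *from_reserve)
{
    if (pa->free_pages > pa->watermark) {
        pa->free_pages--;
        *from_reserve = false;  /* ordinary allocation */
        return true;
    }
    if (entitled && pa->free_pages > 0) {
        pa->free_pages--;
        *from_reserve = true;   /* emergency allocation */
        return true;
    }
    return false;
}

/* Returns true if an object can be handed to the caller. */
static bool model_slab_alloc(struct cpu_slab_model *c,
                             struct page_alloc_model *pa, bool entitled)
{
    bool from_reserve;

    /* Serve from the cached slab unless it is a reserve slab and the
     * caller is not entitled to the reserves. */
    if (c->has_page && (!c->reserve || entitled))
        return true;

    /* Otherwise force a page allocation to test the current levels. */
    if (!model_alloc_page(pa, entitled, &from_reserve))
        return false;

    c->has_page = true;
    c->reserve = from_reserve;  /* cleared once a normal allocation works */
    return true;
}
```

In this model, a non-entitled caller fails while memory is tight, but once free_pages rises above the watermark its next call allocates a normal page and clears the flag, matching the "emergency has passed" reasoning.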
I hope that clarifies the situation,
NeilBrown
>
> > + }
> > if (!c->page)
> > goto new_slab;
> >
> > @@ -1623,8 +1638,8 @@ load_freelist:
> > object = c->page->freelist;
> > if (unlikely(!object))
> > goto another_slab;
> > - if (unlikely(SLABDEBUG && PageSlubDebug(c->page)))
> > - goto debug;
> > + if (unlikely(SLABDEBUG && PageSlubDebug(c->page) || c->reserve))
> > + goto slow_path;
> >
> > c->freelist = get_freepointer(s, object);
> > c->page->inuse = c->page->objects;
> > @@ -1646,16 +1661,18 @@ new_slab:
> > goto load_freelist;
> > }
> >
> > +grow_slab:
> > if (gfpflags & __GFP_WAIT)
> > local_irq_enable();
> >
> > - new = new_slab(s, gfpflags, node);
> > + new = new_slab(s, gfpflags, node, &reserve);
> >
> > if (gfpflags & __GFP_WAIT)
> > local_irq_disable();
> >
> > if (new) {
> > c = __this_cpu_ptr(s->cpu_slab);
> > + c->reserve = reserve;
> > stat(s, ALLOC_SLAB);
> > if (c->page)
> > flush_slab(s, c);
> > @@ -1667,10 +1684,20 @@ new_slab:
> > if (!(gfpflags & __GFP_NOWARN) && printk_ratelimit())
> > slab_out_of_memory(s, gfpflags, node);
> > return NULL;
> > -debug:
> > - if (!alloc_debug_processing(s, c->page, object, addr))
> > +
> > +slow_path:
> > + if (!c->reserve && !alloc_debug_processing(s, c->page, object, addr))
> > goto another_slab;
> >
> > + /*
> > + * Avoid the slub fast path in slab_alloc() by not setting
> > + * c->freelist and the fast path in slab_free() by making
> > + * node_match() fail by setting c->node to -1.
> > + *
> > + * We use this for debug and reserve checks which need
> > + * to be done for each allocation.
> > + */
> > +
> > c->page->inuse++;
> > c->page->freelist = get_freepointer(s, object);
> > c->node = -1;
> > @@ -2095,10 +2122,11 @@ static void early_kmem_cache_node_alloc(gfp_t gfpflags, int node)
> > struct page *page;
> > struct kmem_cache_node *n;
> > unsigned long flags;
> > + int reserve;
> >
> > BUG_ON(kmalloc_caches->size < sizeof(struct kmem_cache_node));
> >
> > - page = new_slab(kmalloc_caches, gfpflags, node);
> > + page = new_slab(kmalloc_caches, gfpflags, node, &reserve);
> >
> > BUG_ON(!page);
> > if (page_to_nid(page) != node) {
> > --
> > 1.7.1.1
> >
^ permalink raw reply [flat|nested] 37+ messages in thread
Thread overview: 37+ messages
2010-07-13 10:16 [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 01/30] mm: serialize access to min_free_kbytes Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 02/30] Swap over network documentation Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 03/30] mm: expose gfp_to_alloc_flags() Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 04/30] mm: tag reseve pages Xiaotian Feng
2010-07-13 10:17 ` [PATCH -mmotm 05/30] mm: sl[au]b: add knowledge of reserve pages Xiaotian Feng
2010-07-13 20:33 ` Pekka Enberg
2010-07-15 12:37 ` Xiaotian Feng
2010-08-03 1:44 ` Neil Brown
2010-07-13 10:17 ` [PATCH -mmotm 06/30] mm: kmem_alloc_estimate() Xiaotian Feng
2010-07-13 10:18 ` [PATCH -mmotm 07/30] mm: allow PF_MEMALLOC from softirq context Xiaotian Feng
2010-07-13 10:18 ` [PATCH -mmotm 08/30] mm: emergency pool Xiaotian Feng
2010-07-13 10:18 ` [PATCH -mmotm 09/30] mm: system wide ALLOC_NO_WATERMARK Xiaotian Feng
2010-07-13 10:18 ` [PATCH -mmotm 10/30] mm: __GFP_MEMALLOC Xiaotian Feng
2010-07-13 10:18 ` [PATCH -mmotm 11/30] mm: memory reserve management Xiaotian Feng
2010-07-13 10:19 ` [PATCH -mmotm 12/30] selinux: tag avc cache alloc as non-critical Xiaotian Feng
2010-07-13 10:55 ` Mitchell Erblich
2010-07-15 11:51 ` Xiaotian Feng
2010-07-13 10:19 ` [PATCH -mmotm 13/30] net: packet split receive api Xiaotian Feng
2010-07-13 10:19 ` [PATCH -mmotm 14/30] net: sk_allocation() - concentrate socket related allocations Xiaotian Feng
2010-07-13 10:19 ` [PATCH -mmotm 15/30] netvm: network reserve infrastructure Xiaotian Feng
2010-07-13 10:19 ` [PATCH -mmotm 16/30] netvm: INET reserves Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 17/30] netvm: hook skb allocation to reserves Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 18/30] netvm: filter emergency skbs Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 19/30] netvm: prevent a stream specific deadlock Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 20/30] netfilter: NF_QUEUE vs emergency skbs Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 21/30] netvm: skb processing Xiaotian Feng
2010-07-13 10:20 ` [PATCH -mmotm 22/30] mm: add support for non block device backed swap files Xiaotian Feng
2010-07-13 10:21 ` [PATCH -mmotm 23/30] mm: methods for teaching filesystems about PG_swapcache pages Xiaotian Feng
2010-07-13 10:21 ` [PATCH -mmotm 24/30] nfs: teach the NFS client how to treat " Xiaotian Feng
2010-07-13 10:21 ` [PATCH -mmotm 25/30] nfs: disable data cache revalidation for swapfiles Xiaotian Feng
2010-07-13 10:21 ` [PATCH -mmotm 26/30] nfs: enable swap on NFS Xiaotian Feng
2010-07-13 10:21 ` [PATCH -mmotm 27/30] nfs: fix various memory recursions possible with swap over NFS Xiaotian Feng
2010-07-13 10:22 ` [PATCH -mmotm 28/30] build fix for skb_emergency_protocol Xiaotian Feng
2010-07-13 10:22 ` [PATCH -mmotm 29/30] fix null pointer deref in swap_entry_free Xiaotian Feng
2010-07-13 10:22 ` [PATCH -mmotm 30/30] fix mess up on swap with multi files from same nfs server Xiaotian Feng
2010-07-13 12:53 ` [PATCH -mmotm 00/30] [RFC] swap over nfs -v21 Américo Wang