* [PATCH 1/2 v4][rfc] shmem: provide vm_ops when also providing a mem policy
2012-07-02 20:26 [PATCH 0/2 v4][rfc] tmpfs not interleaving properly Nathan Zimmer
@ 2012-07-02 20:28 ` Nathan Zimmer
2012-07-02 20:28 ` [PATCH 2/2 v4][rfc] tmpfs: interleave the starting node of /dev/shmem Nathan Zimmer
2012-07-02 20:54 ` [PATCH 0/2 v4][rfc] tmpfs not interleaving properly Nathan Zimmer
2 siblings, 0 replies; 6+ messages in thread
From: Nathan Zimmer @ 2012-07-02 20:28 UTC
To: Nathan Zimmer
Cc: linux-mm, linux-kernel, Christoph Lameter, Nick Piggin,
Hugh Dickins, Lee Schermerhorn, KOSAKI Motohiro, Rik van Riel
Update shmem_get_policy() to use vma->vm_policy if one is provided.
This allows us to safely provide shmem_vm_ops to the vma when vm_file
has not been set up, which is the case for the pseudo vmas.
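The lookup order described above can be sketched in user-space C. This is only an illustration, not the kernel code: the sketch_* types are hypothetical stand-ins for the kernel structures, and the per-index mpol_shared_policy_lookup() is reduced to a single policy pointer per inode.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct mempolicy {
	int mode;
};

struct sketch_inode {
	struct mempolicy *shared_policy;	/* stands in for the shared policy tree */
};

struct sketch_file {
	struct sketch_inode *inode;
};

struct sketch_vma {
	struct mempolicy *vm_policy;	/* set directly on pseudo vmas */
	struct sketch_file *vm_file;	/* NULL on pseudo vmas */
};

/*
 * Same order as the patched shmem_get_policy(): prefer the policy carried
 * by the vma and only dereference vm_file when the vma has none.
 */
static struct mempolicy *sketch_get_policy(const struct sketch_vma *vma)
{
	if (vma->vm_policy)
		return vma->vm_policy;

	return vma->vm_file->inode->shared_policy;
}
```

A pseudo vma with vm_policy set and vm_file == NULL returns its own policy without touching vm_file; a vma with no policy of its own falls back to the lookup through the file.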
Cc: Christoph Lameter <cl@linux.com>
Cc: Nick Piggin <npiggin@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Nathan T Zimmer <nzimmer@sgi.com>
---
mm/shmem.c | 18 +++++++++++++++---
1 files changed, 15 insertions(+), 3 deletions(-)
diff --git a/mm/shmem.c b/mm/shmem.c
index a15a466..9bd599b 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -921,8 +921,11 @@ static struct page *shmem_swapin(swp_entry_t swap, gfp_t gfp,
/* Create a pseudo vma that just contains the policy */
pvma.vm_start = 0;
pvma.vm_pgoff = index;
- pvma.vm_ops = NULL;
pvma.vm_policy = spol;
+ if( pvma.vm_policy )
+ pvma.vm_ops = &shmem_vm_ops;
+ else
+ pvma.vm_ops = NULL;
return swapin_readahead(swap, gfp, &pvma, 0);
}
@@ -934,8 +937,11 @@ static struct page *shmem_alloc_page(gfp_t gfp,
/* Create a pseudo vma that just contains the policy */
pvma.vm_start = 0;
pvma.vm_pgoff = index;
- pvma.vm_ops = NULL;
pvma.vm_policy = mpol_shared_policy_lookup(&info->policy, index);
+ if( pvma.vm_policy )
+ pvma.vm_ops = &shmem_vm_ops;
+ else
+ pvma.vm_ops = NULL;
/*
* alloc_page_vma() will drop the shared policy reference
@@ -1296,8 +1302,14 @@ static int shmem_set_policy(struct vm_area_struct *vma, struct mempolicy *mpol)
static struct mempolicy *shmem_get_policy(struct vm_area_struct *vma,
unsigned long addr)
{
- struct inode *inode = vma->vm_file->f_path.dentry->d_inode;
pgoff_t index;
+ struct inode *inode;
+
+ /* If the vma knows what policy it wants, use that one. */
+ if (vma->vm_policy)
+ return vma->vm_policy;
+
+ inode = vma->vm_file->f_path.dentry->d_inode;
index = ((addr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
return mpol_shared_policy_lookup(&SHMEM_I(inode)->policy, index);
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* [PATCH 2/2 v4][rfc] tmpfs: interleave the starting node of /dev/shmem
2012-07-02 20:26 [PATCH 0/2 v4][rfc] tmpfs not interleaving properly Nathan Zimmer
2012-07-02 20:28 ` [PATCH 1/2 v4][rfc] shmem: provide vm_ops when also providing a mem policy Nathan Zimmer
@ 2012-07-02 20:28 ` Nathan Zimmer
2012-07-03 12:49 ` Cong Wang
2012-07-02 20:54 ` [PATCH 0/2 v4][rfc] tmpfs not interleaving properly Nathan Zimmer
2 siblings, 1 reply; 6+ messages in thread
From: Nathan Zimmer @ 2012-07-02 20:28 UTC
To: Nathan Zimmer
Cc: linux-mm, linux-kernel, Christoph Lameter, Nick Piggin,
Hugh Dickins, Lee Schermerhorn, KOSAKI Motohiro, Rik van Riel
The tmpfs superblock grants an offset to each inode as it is created. Each
inode then uses that offset to provide a preferred first node for its
interleave in shmem_interleave().
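A small user-space sketch of the idea, under stated assumptions: the il_* names are hypothetical, the node count is fixed at 4, the superblock counter is assumed to advance round-robin (the patch does not show how next_pref_node is updated), and node selection is reduced to a plain modulo rather than the nodemask walk done by offset_il_node().

```c
#include <stddef.h>

#define IL_NR_NODES 4	/* assumed node count for the example */

/* Hypothetical superblock state: the next bias to hand out. */
struct il_sb {
	unsigned long next_pref_node;
};

/* Hypothetical per-inode state: the bias fixed at creation time. */
struct il_inode {
	unsigned long node_offset;
};

/* Assumption: each new inode takes the next offset, round-robin. */
static void il_inode_init(struct il_sb *sb, struct il_inode *inode)
{
	inode->node_offset = sb->next_pref_node;
	sb->next_pref_node = (sb->next_pref_node + 1) % IL_NR_NODES;
}

/* Node for page 'index' of a file: bias plus page index, modulo nodes. */
static unsigned int il_interleave_node(const struct il_inode *inode,
				       unsigned long index)
{
	return (unsigned int)((inode->node_offset + index) % IL_NR_NODES);
}

/* Count where the first page of nr_files one-page files would land. */
static void il_count_first_pages(size_t nr_files,
				 unsigned long counts[IL_NR_NODES])
{
	struct il_sb sb = { 0 };
	size_t i;

	for (i = 0; i < IL_NR_NODES; i++)
		counts[i] = 0;

	for (i = 0; i < nr_files; i++) {
		struct il_inode inode;

		il_inode_init(&sb, &inode);
		counts[il_interleave_node(&inode, 0)]++;
	}
}
```

With a zero offset for every inode, all first pages would land on node 0; with the per-inode bias, many one-page files spread evenly across the nodes.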
Cc: Christoph Lameter <cl@linux.com>
Cc: Nick Piggin <npiggin@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Nathan T Zimmer <nzimmer@sgi.com>
---
include/linux/mm.h | 6 ++++++
include/linux/shmem_fs.h | 2 ++
mm/mempolicy.c | 4 ++++
mm/shmem.c | 14 ++++++++++++++
4 files changed, 26 insertions(+), 0 deletions(-)
Index: linux/include/linux/mm.h
===================================================================
--- linux.orig/include/linux/mm.h 2012-07-02 10:38:25.090169183 -0500
+++ linux/include/linux/mm.h 2012-07-02 10:38:30.714072182 -0500
@@ -238,6 +238,12 @@ struct vm_operations_struct {
*/
struct mempolicy *(*get_policy)(struct vm_area_struct *vma,
unsigned long addr);
+
+ /*
+ * If the policy is interleave allow the vma to suggest a node.
+ */
+ unsigned long (*interleave)(struct vm_area_struct *vma, unsigned long addr);
+
int (*migrate)(struct vm_area_struct *vma, const nodemask_t *from,
const nodemask_t *to, unsigned long flags);
#endif
Index: linux/include/linux/shmem_fs.h
===================================================================
--- linux.orig/include/linux/shmem_fs.h 2012-07-02 10:38:25.090169183 -0500
+++ linux/include/linux/shmem_fs.h 2012-07-02 10:38:30.714072182 -0500
@@ -17,6 +17,7 @@ struct shmem_inode_info {
char *symlink; /* unswappable short symlink */
};
struct shared_policy policy; /* NUMA memory alloc policy */
+ unsigned long node_offset; /* bias for interleaved nodes */
struct list_head swaplist; /* chain of maybes on swap */
struct list_head xattr_list; /* list of shmem_xattr */
struct inode vfs_inode;
@@ -32,6 +33,7 @@ struct shmem_sb_info {
kgid_t gid; /* Mount gid for root directory */
umode_t mode; /* Mount mode for root directory */
struct mempolicy *mpol; /* default memory policy for mappings */
+ unsigned long next_pref_node; /* next interleave bias to suggest for inodes */
};
static inline struct shmem_inode_info *SHMEM_I(struct inode *inode)
Index: linux/mm/mempolicy.c
===================================================================
--- linux.orig/mm/mempolicy.c 2012-07-02 10:38:25.090169183 -0500
+++ linux/mm/mempolicy.c 2012-07-02 10:38:30.738071768 -0500
@@ -1663,6 +1663,10 @@ static inline unsigned interleave_nid(st
{
if (vma) {
unsigned long off;
+ if (vma->vm_ops && vma->vm_ops->interleave) {
+ off = vma->vm_ops->interleave(vma, addr);
+ return offset_il_node(pol, vma, off);
+ }
/*
* for small pages, there is no difference between
Index: linux/mm/shmem.c
===================================================================
--- linux.orig/mm/shmem.c 2012-07-02 10:38:25.090169183 -0500
+++ linux/mm/shmem.c 2012-07-02 10:40:44.635767155 -0500
@@ -922,6 +922,7 @@ static struct page *shmem_swapin(swp_ent
pvma.vm_start = 0;
pvma.vm_pgoff = index;
pvma.vm_policy = spol;
+ pvma.vm_private_data = (void *)info->node_offset;
if( pvma.vm_policy )
pvma.vm_ops = &shmem_vm_ops;
else
@@ -938,6 +939,7 @@ static struct page *shmem_alloc_page(gfp
pvma.vm_start = 0;
pvma.vm_pgoff = index;
pvma.vm_policy = mpol_shared_policy_lookup(&info->policy, index);
+ pvma.vm_private_data = (void *)info->node_offset;
if( pvma.vm_policy )
pvma.vm_ops = &shmem_vm_ops;
else
@@ -1314,6 +1316,18 @@ static struct mempolicy *shmem_get_polic
index = ((addr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
return mpol_shared_policy_lookup(&SHMEM_I(inode)->policy, index);
}
+
+static unsigned long shmem_interleave(struct vm_area_struct *vma, unsigned long addr)
+{
+ unsigned long offset;
+
+ /* Use the vm_file's preferred node as the initial offset */
+ offset = (unsigned long)vma->vm_private_data;
+
+ offset += ((addr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
+
+ return offset;
+}
#endif
int shmem_lock(struct file *file, int lock, struct user_struct *user)
@@ -2871,6 +2885,7 @@ static const struct super_operations shm
static const struct vm_operations_struct shmem_vm_ops = {
.fault = shmem_fault,
#ifdef CONFIG_NUMA
+ .interleave = shmem_interleave,
.set_policy = shmem_set_policy,
.get_policy = shmem_get_policy,
#endif
* Re: [PATCH 0/2 v4][rfc] tmpfs not interleaving properly
2012-07-02 20:26 [PATCH 0/2 v4][rfc] tmpfs not interleaving properly Nathan Zimmer
2012-07-02 20:28 ` [PATCH 1/2 v4][rfc] shmem: provide vm_ops when also providing a mem policy Nathan Zimmer
2012-07-02 20:28 ` [PATCH 2/2 v4][rfc] tmpfs: interleave the starting node of /dev/shmem Nathan Zimmer
@ 2012-07-02 20:54 ` Nathan Zimmer
2012-07-03 10:32 ` Cong Wang
2 siblings, 1 reply; 6+ messages in thread
From: Nathan Zimmer @ 2012-07-02 20:54 UTC
To: linux-mm, linux-kernel
Cc: Christoph Lameter, Nick Piggin, Hugh Dickins, Lee Schermerhorn,
KOSAKI Motohiro, Rik van Riel
On 07/02/2012 03:26 PM, Nathan Zimmer wrote:
> When tmpfs has the memory policy interleaved, it always starts allocating each
> file at node 0. When there are many small files, the lower nodes fill up
> disproportionately.
> This patch spreads out node usage by starting files at nodes other than 0.
> The tmpfs superblock grants an offset to each inode as it is created. Each
> inode then uses that offset to provide a preferred first node for its
> interleave in shmem_interleave().
>
> v2: passed preferred node via addr
> v3: using current->cpuset_mem_spread_rotor instead of random_node
> v4: Switching the rotor and attempting to provide an interleave function
> Also splitting the patch into two sections.
>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Nick Piggin <npiggin@gmail.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Nathan T Zimmer <nzimmer@sgi.com>
> ---
>
> include/linux/mm.h | 6 ++++++
> include/linux/shmem_fs.h | 2 ++
> mm/mempolicy.c | 4 ++++
> mm/shmem.c | 33 ++++++++++++++++++++++++++++++---
> 4 files changed, 42 insertions(+), 3 deletions(-)
>
I apologize, it seems I have sent the patch before running checkpatch.