public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Paolo Zeppegno <zeppegno.paolo@seat.it>
To: Hugh Dickins <hugh@veritas.com>
Cc: Matthew Dobson <colpatch@us.ibm.com>,
	linux-kernel@vger.kernel.org,
	"Martin J. Bligh" <mbligh@aracnet.com>,
	Andrew Morton <akpm@digeo.com>,
	Christoph Hellwig <hch@infradead.org>, Andi Kleen <ak@muc.de>,
	lse-tech <lse-tech@lists.sourceforge.net>
Subject: Re: [rfc][patch] Memory Binding Take 2 (1/1)
Date: Thu, 03 Apr 2003 15:25:26 +0200	[thread overview]
Message-ID: <3E8C3646.3010906@seat.it> (raw)
In-Reply-To: <Pine.LNX.4.44.0304031317290.1718-100000@localhost.localdomain>

I suggested mbind for consistency with mmap, munmap, mremap and msync, 
that is IFF the mbind operation is in some ways related with these other 
syscalls.


Hugh Dickins wrote:

>On Wed, 2 Apr 2003, Matthew Dobson wrote:
>+/*
>+ * membind -  Bind a range of a process' VM space to a set of memory blocks according to
>
>membind or mbind?  Me, I like mbind (modulo remarks below), but you may
>find Linus does not (he was rather caustic when I suggested that fremap
>should be mseek, and it ended up as remap_file_pages instead).
>
>+ *            a predefined policy.
>+ * @start:    beginning address of memory region to bind
>+ * @len:      length of memory region to bind
>
>Oh really? len is unused in the code below.  If you were to use it,
>you'd need to loop over vmas, splitting where necessary.
>
>+ * @mask_ptr: pointer to bitmask of cpus
>+ * @mask_len: length of the bitmask
>+ * @policy:   flag specifying the policy to use for the segment
>
>I think you already remarked that policy is currently unused,
>fair enough.
>
>+ */
>+asmlinkage unsigned long sys_mbind(unsigned long start, unsigned long len, 
>+		unsigned long *mask_ptr, unsigned int mask_len, unsigned long policy)
>+{
>+	DECLARE_BITMAP(cpu_mask, NR_CPUS);
>+	DECLARE_BITMAP(node_mask, MAX_NUMNODES);
>+	struct vm_area_struct *vma = NULL;
>+	struct address_space *mapping;
>+	int copy_len, error = 0;
>+
>+	/* Deal with getting cpu_mask from userspace & translating to node_mask */
>+	copy_len = min(mask_len, (unsigned int)NR_CPUS);
>+	CLEAR_BITMAP(cpu_mask, NR_CPUS);
>+	CLEAR_BITMAP(node_mask, MAX_NUMNODES);
>+	if (copy_from_user(cpu_mask, mask_ptr, (copy_len+7)/8)) {
>+		error = -EFAULT;
>+		goto out;
>+	}
>
>Shouldn't there be some capability restriction?  Is it right that
>anyone who can mmap a file for reading can determine its binding
>(until the next does it differently)?
>
>+	cpumask_to_nodemask(cpu_mask, node_mask);
>+
>+	vma = find_vma(current->mm, start);
>
>You must not scan the vma list without at least
>down_read(&current->mm->mmap_sem).
>
>+	if (!(vma && vma->vm_file && vma->vm_ops && 
>+		vma->vm_ops->nopage == shmem_nopage)) {
>+		/* This isn't a shm segment.  For now, we bail. */
>
>So you're allowing this on any file on tmpfs,
>but on no file on any other filesystem: curious.
>
>+		error = -EINVAL;
>+		goto out;
>+	}
>+
>+	mapping = vma->vm_file->f_dentry->d_inode->i_mapping;
>+	mapping->binding = alloc_binding(node_mask);
>
>Your NUMA machines clearly have more memory than is good for you:
>nowhere is there an equivalent free_binding: which in particular
>would need to be called first here if binding is already set (or
>else old structure reused), and when inode is freed.
>
>So... mapping->binding conditions every page_cache_alloc for that
>inode.  Hmm, what on earth does this have to do with mbind or membind?
>It looks to me like fbind, except that you've dressed up the interface
>to use an address in the caller's address space: presumably because you
>couldn't get a file handle on SysV shared memory, and that's what you
>were really wanting to bind, hence the shmem_nopage test?
>
>I think this interface is confused (but it probably thinks I am).
>
>+	if (!mapping->binding)
>+		error = -EFAULT;
>+
>+out:
>+	return error;
>+}
>diff -Nur --exclude-from=/usr/src/.dontdiff linux-2.5.66-pre_membind/mm/swap_state.c linux-2.5.66-membind/mm/swap_state.c
>--- linux-2.5.66-pre_membind/mm/swap_state.c	Mon Mar 24 14:00:21 2003
>+++ linux-2.5.66-membind/mm/swap_state.c	Tue Apr  1 17:12:00 2003
>@@ -47,6 +47,9 @@
> 	.i_shared_sem	= __MUTEX_INITIALIZER(swapper_space.i_shared_sem),
> 	.private_lock	= SPIN_LOCK_UNLOCKED,
> 	.private_list	= LIST_HEAD_INIT(swapper_space.private_list),
>+#ifdef CONFIG_NUMA
>+	.binding	= NULL,
>+#endif
> };
> 
> #define INC_CACHE_INFO(x)	do { swap_cache_info.x++; } while (0)
>
>Please leave swap_state.c out of it: this patch does nothing but add
>an ugly #ifdef to initialize something to 0 which would be 0 anyway.
>
>Hugh
>
>  
>



  reply	other threads:[~2003-04-03 13:32 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-04-03  5:50 [rfc][patch] Memory Binding Take 2 (0/1) Matthew Dobson
2003-04-03  5:56 ` [rfc][patch] Memory Binding Take 2 (1/1) Matthew Dobson
2003-04-03  6:37   ` Andrew Morton
2003-04-03 23:30     ` Matthew Dobson
2003-04-03 12:20   ` Hugh Dickins
2003-04-03 13:25     ` Paolo Zeppegno [this message]
2003-04-03 23:57     ` Matthew Dobson
2003-04-04 13:40   ` Christoph Hellwig
2003-04-04 13:34 ` [rfc][patch] Memory Binding Take 2 (0/1) Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3E8C3646.3010906@seat.it \
    --to=zeppegno.paolo@seat.it \
    --cc=ak@muc.de \
    --cc=akpm@digeo.com \
    --cc=colpatch@us.ibm.com \
    --cc=hch@infradead.org \
    --cc=hugh@veritas.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lse-tech@lists.sourceforge.net \
    --cc=mbligh@aracnet.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox