public inbox for linux-kernel@vger.kernel.org
From: Matthew Dobson <colpatch@us.ibm.com>
To: Andrew Morton <akpm@digeo.com>
Cc: linux-kernel@vger.kernel.org, mbligh@aracnet.com,
	hch@infradead.org, zeppegno.paolo@seat.it, ak@muc.de,
	lse-tech@lists.sourceforge.net, Hugh Dickins <hugh@veritas.com>
Subject: Re: [rfc][patch] Memory Binding Take 2 (1/1)
Date: Thu, 03 Apr 2003 15:30:37 -0800
Message-ID: <3E8CC41D.90003@us.ibm.com>
In-Reply-To: 20030402223736.1277755f.akpm@digeo.com

Andrew Morton wrote:
> Matthew Dobson <colpatch@us.ibm.com> wrote:
> 
>>+#define __NR_mbind		223
> 
> 
> What was wrong with "membind"?
Well, there was nothing wrong with it, per se; I just liked Paolo's
suggestion to align the naming with mmap, munmap, mremap, etc.  The
syscalls that manipulate a process's address space tend to be called
m(something).  Right now it isn't as generic as I'd like it to be, but
all in good time.

>>+/* Translate a cpumask to a nodemask */
>>+static inline void cpumask_to_nodemask(bitmap_t cpumask, bitmap_t nodemask)
>>+{
>>+	int i;
>>+
>>+	for (i = 0; i < NR_CPUS; i++)
>>+		if (test_bit(i, cpumask))
> 
> 
> That's a bit weird.  test_bit is only permitted on longs, so why introduce
> bitmap_t?
Erm...  Good point.  I really wanted to try and maintain the abstraction 
of a bitmap type.  I hoped that we could, via macros and typedefs, keep 
the underlying data type obscured, and have a good facsimile of variable 
length bitmaps.  It's proving too difficult to hide the fact that 
they're just unsigned long[]'s, so I'll give up the ghost and pass them 
as unsigned long *'s.
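
A minimal userspace sketch of what passing the masks as plain
unsigned long *'s might look like.  This is not the code from the patch:
the sizes, the four-CPUs-per-node cpu_to_node() mapping, and the
non-atomic test_bit()/set_bit() stand-ins are all assumptions made for
illustration.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical sizes for illustration; the kernel's come from its config. */
#define NR_CPUS       16
#define MAX_NUMNODES  4

#define BITS_PER_LONG    (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Userspace stand-ins for the kernel's test_bit()/set_bit() (non-atomic). */
static int test_bit(unsigned int nr, const unsigned long *addr)
{
	return (addr[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL;
}

static void set_bit(unsigned int nr, unsigned long *addr)
{
	addr[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
}

/* Assumed topology: four consecutive CPUs per node. */
static int cpu_to_node(int cpu)
{
	return cpu / 4;
}

/* Translate a cpumask to a nodemask; both are plain unsigned long arrays. */
static void cpumask_to_nodemask(const unsigned long *cpumask,
				unsigned long *nodemask)
{
	int i;

	assert(cpumask && nodemask);
	memset(nodemask, 0, BITS_TO_LONGS(MAX_NUMNODES) * sizeof(unsigned long));
	for (i = 0; i < NR_CPUS; i++)
		if (test_bit(i, cpumask))
			set_bit(cpu_to_node(i), nodemask);
}
```

With the type visible, test_bit() operates on exactly what it expects,
and callers size their arrays with BITS_TO_LONGS().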

>>+/* Top-level function for allocating a binding for a region of memory */
>>+static inline struct binding *alloc_binding(bitmap_t nodemask)
>>+{
>>+	struct binding *binding;
>>+	int node, zone_num;
>>+
>>+	binding = (struct binding *)kmalloc(sizeof(struct binding), GFP_KERNEL);
>>+	if (!binding)
>>+		return NULL;
>>+	memset(binding, 0, sizeof(struct binding));
>>+
>>+	/* Build binding zonelist */
>>+	for (node = 0, zone_num = 0; node < MAX_NUMNODES; node++)
>>+		if (test_bit(node, nodemask) && node_online(node))
>>+			zone_num = add_node(NODE_DATA(node), 
>>+				&binding->zonelist, zone_num);
>>+	binding->zonelist.zones[zone_num] = NULL;
>>+
>>+	if (zone_num == 0) {
>>+		/* No zones were added to the zonelist.  Let the caller know. */
>>+		kfree(binding);
>>+		binding = NULL;
>>+	}
>>+	return binding;
>>+} 
> 
> It looks like this function needs to be able to return a real errno (see
> below).
True.  EFAULT is a sorta decent catchall, but it isn't appropriate for
something like running out of memory.
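
One way alloc_binding() could hand back a real errno is the kernel's
usual pointer-encoded error idiom.  The sketch below is a userspace
imitation under stated assumptions: ERR_PTR()/PTR_ERR()/IS_ERR() are
re-created locally, struct binding is reduced to a single field, the
zones_found parameter stands in for the zonelist walk, and the choice of
EINVAL for an empty zonelist is a guess, not something the patch
specifies.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace imitations of the kernel's ERR_PTR()/PTR_ERR()/IS_ERR() helpers. */
#define MAX_ERRNO 4095

static void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Simplified stand-in for the patch's struct binding. */
struct binding {
	int nr_zones;
};

/*
 * Hypothetical variant of alloc_binding(): zones_found stands in for the
 * number of zones the nodemask walk produced.
 */
static struct binding *alloc_binding(int zones_found)
{
	struct binding *binding;

	assert(zones_found >= 0);
	binding = calloc(1, sizeof(*binding));
	if (!binding)
		return ERR_PTR(-ENOMEM);	/* real memory exhaustion */

	if (zones_found == 0) {
		/* No online node matched the mask: treated here as a bad argument. */
		free(binding);
		return ERR_PTR(-EINVAL);
	}
	binding->nr_zones = zones_found;
	return binding;
}
```

The caller would then check IS_ERR() on the result and propagate
PTR_ERR() as the syscall's return value instead of a blanket -EFAULT.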

>>+	struct vm_area_struct *vma = NULL;
>>+	struct address_space *mapping;
>>+	int copy_len, error = 0;
>>+
>>+	/* Deal with getting cpu_mask from userspace & translating to node_mask */
>>+	copy_len = min(mask_len, (unsigned int)NR_CPUS);
>>+	CLEAR_BITMAP(cpu_mask, NR_CPUS);
>>+	CLEAR_BITMAP(node_mask, MAX_NUMNODES);
>>+	if (copy_from_user(cpu_mask, mask_ptr, (copy_len+7)/8)) {
>>+		error = -EFAULT;
>>+		goto out;
>>+	}
>>+	cpumask_to_nodemask(cpu_mask, node_mask);
>>+
>>+	vma = find_vma(current->mm, start);
>>+	if (!(vma && vma->vm_file && vma->vm_ops && 
>>+		vma->vm_ops->nopage == shmem_nopage)) {
>>+		/* This isn't a shm segment.  For now, we bail. */
>>+		error = -EINVAL;
>>+		goto out;
>>+	}
>>+
>>+	mapping = vma->vm_file->f_dentry->d_inode->i_mapping;
>>+	mapping->binding = alloc_binding(node_mask);
>>+	if (!mapping->binding)
>>+		error = -EFAULT;
> 
> 
> It returns EFAULT on memory exhaustion?
No longer...  That'll be fixed in version 3.

> btw, can you remind me again why this is only available to tmpfs pagecache?
I can try! ;)  I originally wanted to do just a shared memory binding 
call, but people (correctly) suggested a more generic memory binding 
would be more useful.  So I've basically just set up a lot of the 
infrastructure for a more generic call, but haven't fully implemented 
it.  This patch is intended to be a starting point, from which it will 
be easy to incrementally add more functionality and power to the binding 
call.  The underlying code (syscalls, structures, .c files, allocator 
changes) won't have to change too much.  So this patch works for any 
shared memory segment.  It'd be straightforward to extend this to any 
file-backed vma (because it already has a struct address_space, with a 
struct binding in it), so I hope to grow this into something more.

Cheers!

-Matt

