From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754588Ab0IOPpF (ORCPT ); Wed, 15 Sep 2010 11:45:05 -0400
Received: from relay3.sgi.com ([192.48.152.1]:41610 "EHLO relay.sgi.com"
	rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
	id S1752938Ab0IOPpD (ORCPT ); Wed, 15 Sep 2010 11:45:03 -0400
Date: Wed, 15 Sep 2010 10:44:58 -0500
From: Robin Holt
To: Linus Torvalds
Cc: Christopher Yeoh, Avi Kivity, linux-kernel@vger.kernel.org,
	Linux Memory Management List, Ingo Molnar
Subject: Re: [RFC][PATCH] Cross Memory Attach
Message-ID: <20100915154458.GE3013@sgi.com>
References: <20100915104855.41de3ebf@lilo> <4C90A6C7.9050607@redhat.com>
	<20100916001232.0c496b02@lilo>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To:
User-Agent: Mutt/1.5.20 (2009-06-14)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

> > 3. ability to map part of another process's address space directly into
> >   the current one. Would have setup/tear down overhead, but this would
> >   be useful specifically for reduction operations where we don't even
> >   need to really copy the data once at all, but use it directly in
> >   arithmetic/logical operations on the receiver.
>
> Don't even think about this. If you want to map another tasks memory,
> use shared memory. The shared memory code knows about that. The races
> for anything else are crazy.

SGI has a similar, but significantly more difficult, problem to solve
and has written a fairly complex driver to handle exactly the scenario
IBM is proposing. In our case, not only are we trying to directly access
one process's memory, we are doing it from a completely different
operating system instance operating on the same NUMA fabric.

In our case (I have not looked at IBM's patch), we are actually using
get_user_pages() to get extra references on struct pages. We are
judicious about reference counting the mm and we use get_task_mm in all
places with the exception of process teardown (ignorable detail for now).
We have a fault handler inserting PFNs as appropriate. You can guess at
the complexity. Even with all its complexity, we still need to caveat
certain functionality as not being supported.

If we were to try and get that driver included in the kernel, how would
you suggest we expand the shared memory code to include support for the
coordination needed between those separate operating system instances?
I am genuinely interested and not trying to be argumentative. This has
been on my "Get done before Aug-1" list for months and I have not had
any time to pursue it.

Thanks,
Robin