From: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Roland Dreier <roland@kernel.org>,
	qemu-devel@nongnu.org,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	Yishai Hadas <yishaih@mellanox.com>,
	LKML <linux-kernel@vger.kernel.org>,
	"Michael R. Hines" <mrhines@linux.vnet.ibm.com>,
	Hal Rosenstock <hal.rosenstock@gmail.com>,
	Sean Hefty <sean.hefty@intel.com>,
	Christoph Lameter <cl@linux.com>
Subject: Re: [Qemu-devel] [PATCH] rdma: don't make pages writeable if not requiested
Date: Thu, 21 Mar 2013 14:09:22 -0600
Message-ID: <20130321200922.GA8109@obsidianresearch.com>
In-Reply-To: <20130321191541.GB5272@redhat.com>

On Thu, Mar 21, 2013 at 09:15:41PM +0200, Michael S. Tsirkin wrote:
> On Thu, Mar 21, 2013 at 12:41:35PM -0600, Jason Gunthorpe wrote:
> > On Thu, Mar 21, 2013 at 08:16:33PM +0200, Michael S. Tsirkin wrote:
> > 
> > > This is the one I find redundant. Since the write will be done by
> > > the adaptor under direct control by the application, why does it
> > > make sense to declare this beforehand?  If you don't want to allow
> > > local write access to memory, just do not post any receive WRs with
> > > this address.  If you posted and regret it, reset the QP to cancel.
> > 
> > This is to support your COW scenario - the app declares before hand to
> > the kernel that it will write to the memory and the kernel ensures
> > pages are dedicated to the app at registration time. Or the app says
> > it will only read and the kernel could leave them shared.
> 
> Someone here is confused. LOCAL_WRITE/absence of it does not address
> COW, it breaks COW anyway.  Are you now saying we should change rdma so
> without LOCAL_WRITE it will not break COW?

I am talking about 'from a spec' perspective - not what Linux does
today. The absence of LOCAL_WRITE is part of the specification to
support shared pages.

Pages can only be kept shared if all the ACCESS WRITE bits are clear -
today Linux always breaks the COW, but if you patch in the ability to
keep things shared then it must only happen when *all* the ACCESS
WRITE bits are clear.
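
As a rough illustration of that rule, here is a minimal sketch of the
predicate such a patch would need. The SK_* constants and
sk_may_keep_shared() are hypothetical names, modelled on the
ibv_access_flags bits rather than taken from any kernel code. Which bits
count as "WRITE bits" is an assumption here (local write, remote write,
remote atomic); a conservative implementation might include others.

```c
#include <stdbool.h>

/* Hypothetical flags, modelled on libibverbs' enum ibv_access_flags.
 * Defined locally so the sketch stands alone. */
enum sk_access {
    SK_ACCESS_LOCAL_WRITE   = 1 << 0,
    SK_ACCESS_REMOTE_WRITE  = 1 << 1,
    SK_ACCESS_REMOTE_READ   = 1 << 2,
    SK_ACCESS_REMOTE_ATOMIC = 1 << 3,
};

/* Assumption: these are the bits that let the adaptor modify memory. */
#define SK_ACCESS_ANY_WRITE \
    (SK_ACCESS_LOCAL_WRITE | SK_ACCESS_REMOTE_WRITE | SK_ACCESS_REMOTE_ATOMIC)

/* True only when every write-capable bit is clear, i.e. the one case
 * in which registration could leave COW pages shared. */
static bool sk_may_keep_shared(int access)
{
    return (access & SK_ACCESS_ANY_WRITE) == 0;
}
```

A read-only registration (no bits, or only SK_ACCESS_REMOTE_READ) passes
the check; setting any single write bit fails it.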

> > The adaptor enforces the access control to prevent a naughty app from
> > writing to shared memory - think about mmap'ing libc.so and then using
> > RDMA to write to the shared pages. It is necessary to ensure that is
> > impossible.

> That's why it's redundant: we can't trust an application to tell us
> 'this page is writeable', we must get this info from kernel.  And so
> there's apparently no need for application to tell adaptor about
> LOCAL_WRITE.

The API design gives user space maximum flexibility: if it wants to
create an enforced no-write MR in otherwise writable pages, it can do
so by skipping LOCAL_WRITE.
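
To make the enforcement concrete, here is a toy model of the check an
adaptor performs before it writes into an MR on the local side. struct
sk_mr and sk_hw_allows_local_write() are hypothetical; real hardware does
this against its own translation tables, not a C struct.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag, modelled on IBV_ACCESS_LOCAL_WRITE. */
#define SK_LOCAL_WRITE (1 << 0)

/* Toy stand-in for what the adaptor records at registration time. */
struct sk_mr {
    uintptr_t start;   /* first registered address */
    size_t    length;  /* registered length in bytes */
    int       access;  /* access bits given at registration */
};

/* The adaptor may write [addr, addr + len) only if the range lies
 * inside the MR and the MR was registered with local write access.
 * The CPU page permissions play no part in this check. */
static bool sk_hw_allows_local_write(const struct sk_mr *mr,
                                     uintptr_t addr, size_t len)
{
    bool in_range = addr >= mr->start && len <= mr->length &&
                    addr - mr->start <= mr->length - len;

    return in_range && (mr->access & SK_LOCAL_WRITE) != 0;
}
```

An MR registered with access 0 rejects every local write in this model,
even when the underlying pages are writable by the process.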

The kernel's role in this should be to deny ibv_reg_mr with WRITE bits
set if the pages are not writable by the app. I don't know if it does
this today, but it isn't critically important as long as the pages are
unshared.
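
A sketch of what that registration-time check could look like.
sk_check_reg_access() is hypothetical, the write mask is an assumption,
and -EACCES is just a plausible error choice; this is not what the kernel
does today.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical mask: local write | remote write | remote atomic,
 * using the same bit positions as libibverbs' access flags. */
#define SK_WRITE_BITS ((1 << 0) | (1 << 1) | (1 << 3))

/* Returns 0 if the registration may proceed, -EACCES if the app asks
 * for write access to a mapping it cannot itself write to. */
static int sk_check_reg_access(int access, bool mapping_writable)
{
    if ((access & SK_WRITE_BITS) != 0 && !mapping_writable)
        return -EACCES;
    return 0;
}
```

With this, a request for local write on a read-only mapping such as
libc.so's text is refused, while a remote-read-only registration of the
same pages goes through.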

Jason
