From: Jason Gunthorpe <jgg@ziepe.ca>
To: Jerome Glisse <jglisse@redhat.com>
Cc: "Leon Romanovsky" <leon@kernel.org>,
	"Kenneth Lee" <liguozhu@hisilicon.com>,
	"Tim Sell" <timothy.sell@unisys.com>,
	linux-doc@vger.kernel.org,
	"Alexander Shishkin" <alexander.shishkin@linux.intel.com>,
	"Zaibo Xu" <xuzaibo@huawei.com>,
	zhangfei.gao@foxmail.com, linuxarm@huawei.com,
	haojian.zhuang@linaro.org, "Christoph Lameter" <cl@linux.com>,
	"Hao Fang" <fanghao11@huawei.com>,
	"Gavin Schenk" <g.schenk@eckelmann.de>,
	"RDMA mailing list" <linux-rdma@vger.kernel.org>,
	"Zhou Wang" <wangzhou1@hisilicon.com>,
	"Doug Ledford" <dledford@redhat.com>,
	"Uwe Kleine-König" <u.kleine-koenig@pengutronix.de>,
	"David Kershner" <david.kershner@unisys.com>,
	"Kenneth Lee" <nek.in.cn@gmail.com>,
	"Johan Hovold" <johan@kernel.org>,
	"Cyrille Pitchen" <cyrille.pitchen@free-electrons.com>
Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
Date: Mon, 19 Nov 2018 13:11:56 -0700
Message-ID: <20181119201156.GG4890@ziepe.ca>
In-Reply-To: <20181119194631.GE4593@redhat.com>

On Mon, Nov 19, 2018 at 02:46:32PM -0500, Jerome Glisse wrote:

> > ?? How can O_DIRECT be fine but RDMA not? They use exactly the same
> > get_user_pages flow, right? Can we do what O_DIRECT does in RDMA and
> > be fine too?
> > 
> > AFAIK the only difference is the length of the race window. You'd have
> > to fork and fault during the shorter time O_DIRECT has get_user_pages
> > open.
> 
> Well in O_DIRECT case there is only one page table, the CPU
> page table and it gets updated during fork() so there is an
> ordering there and the race window is small.

Not really. In the O_DIRECT case there is another 'page table'; we just
call it a DMA scatter/gather list (sgl), and it is sent directly to the
block device's DMA HW. The sgl plays exactly the same role as the various
HW page list data structures that underlie RDMA MRs.

What matters here is not the page table, but whether the DMA address of
the page is still active for DMA on the HW.

As you say, the only difference is that the race window is hopefully
small with O_DIRECT (though it is not always small: NVMe-oF, for
instance, has windows as large as connection timeouts, if you try hard
enough).

So we probably can trigger this trouble with O_DIRECT and fork(), and
I would call it a bug :(

> > Why? Keep track in each mm if there are any active get_user_pages
> > FOLL_WRITE pages in the mm, if yes then sweep the VMAs and fix the
> > issue for the FOLL_WRITE pages.
> 
> This has a cost and you don't want to do it for O_DIRECT. I am pretty
> sure that any such patch to modify fork() code path would be rejected.
> At least i would not like it and vote against.

I was thinking the incremental cost on top of what John is already
doing would be very small in the common case and only be triggered in
cases that matter (which apps should avoid anyhow).
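
The bookkeeping proposed in the quoted text above can be modelled in a
few lines of userspace C. This is only a sketch under stated
assumptions, not kernel code: struct model_mm, struct model_page,
model_pin_write(), model_unpin_write() and model_fork() are hypothetical
names, and the "fix" chosen here (the child gets a private copy of every
write-pinned page, so the parent keeps the page the HW is using) is one
assumed resolution, not something this thread settles on. The point is
only that fork pays for the sweep when the per-mm counter is non-zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NPAGES 8 /* hypothetical size of the modelled address space */

/* Hypothetical stand-in for one page mapped in an mm. */
struct model_page {
	bool write_pinned; /* a FOLL_WRITE-style pin is active */
	bool cow_shared;   /* write-protected and shared after fork */
};

/* Hypothetical stand-in for struct mm_struct. */
struct model_mm {
	unsigned long nr_write_pins; /* active write pins in this mm */
	struct model_page pages[MODEL_NPAGES];
};

/* Model of get_user_pages(FOLL_WRITE) on a single page. */
static void model_pin_write(struct model_mm *mm, size_t idx)
{
	assert(idx < MODEL_NPAGES && !mm->pages[idx].write_pinned);
	mm->pages[idx].write_pinned = true;
	mm->nr_write_pins++;
}

/* Model of dropping that pin once the DMA has finished. */
static void model_unpin_write(struct model_mm *mm, size_t idx)
{
	assert(idx < MODEL_NPAGES && mm->pages[idx].write_pinned);
	mm->pages[idx].write_pinned = false;
	mm->nr_write_pins--;
}

/*
 * Model of fork(): every page becomes COW-shared. Only when the parent
 * has active write pins are the pages swept; each pinned page is then
 * un-shared (assumption: the child gets its own copy).
 * Returns the number of pages fixed up by the sweep.
 */
static size_t model_fork(struct model_mm *parent, struct model_mm *child)
{
	size_t i, fixed = 0;

	child->nr_write_pins = 0;
	for (i = 0; i < MODEL_NPAGES; i++) {
		parent->pages[i].cow_shared = true;
		child->pages[i] = (struct model_page){ false, true };
	}

	/* Common case: no pins, so no extra work at all. */
	if (parent->nr_write_pins == 0)
		return 0;

	/* Rare case: sweep and fix only the write-pinned pages. */
	for (i = 0; i < MODEL_NPAGES; i++) {
		if (!parent->pages[i].write_pinned)
			continue;
		parent->pages[i].cow_shared = false;
		child->pages[i].cow_shared = false;
		fixed++;
	}
	return fixed;
}
```

With no pins outstanding, model_fork() returns 0 without sweeping; after
model_pin_write(&parent, 3) it returns 1 and page 3 is no longer
COW-shared in either process.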

Jason
