public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
* Ummunotify: progress at last!
@ 2010-03-23 16:06 Jeff Squyres
       [not found] ` <1CDB2AA4-A8DF-4169-943E-4EA190814596-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Jeff Squyres @ 2010-03-23 16:06 UTC (permalink / raw)
  To: Linux RDMA List; +Cc: Brad Benton

IBM has found a resource that they think will be able to progress Roland's ummunotify work.

After a few discussions in Sonoma last week and some off-list emails, here's what we decided:

1. Take Roland's last code drop (Roland: can you re-send the last copy of your code?).

2. Do not convert it to the perf events kernel framework as the Linux kernel community requested.  Instead, migrate the functionality into the ibv code base.  Roland thinks that most of the code should be adaptable without too many changes.  Here's the highlights of the new functionality:

   a. Add a new flag to ibv_reg_mr() that does the same function as UMMUNOTIFY_REGISTER_REGION
   b. ibv_dereg_mr() always performs the equivalent of UMMUNOTIFY_UNREGISTER_REGION (if necessary)
   c. Make a new device somewhere (under /dev/infiniband?) that performs the same functions as /dev/ummunotify (open it to mmap the counter into user space, and read events when something "interesting" happens)

3. I will update the Open MPI prototype to use this slightly modified scheme (i.e., no longer have explicit ioctl's to UMMUNOTIFY_DE/REGISTER_REGION, but instead do it during ibv_reg_mr(), etc.).

Roland's initial goal was to have this functionality be separate from verbs because it's applicable outside of this specific use case.  The above proposals makes the ummunotify functionality tied to the ibv API, but given that Roland simply does not have the cycles to make forward progress on the generalized ummunotify mechanism, it seems like a reasonable way to make forward progress.

-- 
Jeff Squyres
jsquyres-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Ummunotify: progress at last!
       [not found] ` <1CDB2AA4-A8DF-4169-943E-4EA190814596-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
@ 2010-03-23 16:59   ` Jason Gunthorpe
       [not found]     ` <20100323165920.GH29129-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Jason Gunthorpe @ 2010-03-23 16:59 UTC (permalink / raw)
  To: Jeff Squyres; +Cc: Linux RDMA List, Brad Benton

On Tue, Mar 23, 2010 at 12:06:50PM -0400, Jeff Squyres wrote:
> IBM has found a resource that they think will be able to progress Roland's ummunotify work.
> 
> After a few discussions in Sonoma last week and some off-list emails, here's what we decided:
> 
> 1. Take Roland's last code drop (Roland: can you re-send the last copy of your code?).
> 
> 2. Do not convert it to the perf events kernel framework as the Linux kernel community requested.  Instead, migrate the functionality into the ibv code base.  Roland thinks that most of the code should be adaptable without too many changes.  Here's the highlights of the new functionality:
> 
>    a. Add a new flag to ibv_reg_mr() that does the same function as UMMUNOTIFY_REGISTER_REGION
>    b. ibv_dereg_mr() always performs the equivalent of
>    UMMUNOTIFY_UNREGISTER_REGION (if necessary)

>    c. Make a new device somewhere (under /dev/infiniband?) that
>    performs the same functions as /dev/ummunotify (open it to mmap
>    the counter into user space, and read events when something
>    "interesting" happens)

I would prefer to do this by adding a new verbs call that returns a fd
directly. Ie use ib_uverbs_alloc_event_file and act like
ibv_create_comp_channel.

The main reason for the new FD is so it can be polled on..

You can also avoid the mmap scheme by doing what perf events does,
pass in a pointer from userspace and have the kernel pin that page it
is on.

So, I'd suggest

fd = ibv_create_mmu_monitor(verbs, &counter);
[..]
poll(fd);
[..]
if (counter != last_counter)
  [..]
close(fd);

Refuse to create more than one mmu_monitor for each verbs for now.

I looked at this for a little while at Sonoma and I think it is quite
straightforward, I'm happy to look over anything.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Ummunotify: progress at last!
       [not found]     ` <20100323165920.GH29129-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2010-03-23 17:17       ` Jeff Squyres
       [not found]         ` <D7DCABC9-4BB2-4FBA-A7B3-C06F4A848467-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
  2010-03-24  5:55       ` Roland Dreier
  1 sibling, 1 reply; 16+ messages in thread
From: Jeff Squyres @ 2010-03-23 17:17 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Linux RDMA List, Brad Benton

On Mar 23, 2010, at 12:59 PM, Jason Gunthorpe wrote:

> The main reason for the new FD is so it can be polled on..

What do you poll on the fd for?

With ummunotify, you only read() from the fd when (counter != last_counter).  Were you thinking that the poll() would be for something else?

-- 
Jeff Squyres
jsquyres-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Ummunotify: progress at last!
       [not found]         ` <D7DCABC9-4BB2-4FBA-A7B3-C06F4A848467-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
@ 2010-03-23 17:29           ` Jason Gunthorpe
       [not found]             ` <20100323172953.GI29129-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2010-03-24  5:53           ` Roland Dreier
  1 sibling, 1 reply; 16+ messages in thread
From: Jason Gunthorpe @ 2010-03-23 17:29 UTC (permalink / raw)
  To: Jeff Squyres; +Cc: Linux RDMA List, Brad Benton

On Tue, Mar 23, 2010 at 01:17:40PM -0400, Jeff Squyres wrote:
> On Mar 23, 2010, at 12:59 PM, Jason Gunthorpe wrote:
> 
> > The main reason for the new FD is so it can be polled on..
> 
> What do you poll on the fd for?
> 
> With ummunotify, you only read() from the fd when (counter !=
> last_counter).  Were you thinking that the poll() would be for
> something else?

poll() is for apps that want to get the notifications without
spinning on the counter. If you don't think that is worth doing it
does simplify things alot, just add two new verbs calls:

ibv_set_mmu_counter(verbs, &my_counter);
ibv_get_mmu_notifications(verbs, &my_list, sizeof(my_list));

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Ummunotify: progress at last!
       [not found]             ` <20100323172953.GI29129-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2010-03-23 19:17               ` Jeff Squyres
       [not found]                 ` <5F80899D-F989-4162-B050-7E4D6B389876-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Jeff Squyres @ 2010-03-23 19:17 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Linux RDMA List, Brad Benton

On Mar 23, 2010, at 1:29 PM, Jason Gunthorpe wrote:

> > What do you poll on the fd for?
> 
> poll() is for apps that want to get the notifications without
> spinning on the counter.

Ah, ok.

I think even with the ummunotify interface, that would work, too.  Meaning: since you have to read() from the fd to get event details, poll() would *also* tell you if there was something to read (in addition to checking if (last_counter != counter)).  The counter is a fast way of checking -- e.g., if you need to check in your fast path (which MPI's likely will).  poll() could be used if you don't care if the check is slow.

> If you don't think that is worth doing it
> does simplify things alot, just add two new verbs calls:
> 
> ibv_set_mmu_counter(verbs, &my_counter);
> ibv_get_mmu_notifications(verbs, &my_list, sizeof(my_list));

I have no real opinion on whether the mmap/read should be hidden by the above ibv calls or not.  Either is fine with me.  I would *assume* that ibv_get_mmu_notifications() is non-blocking, right?  E.g., if you ask for N and only M are available (where M < N), then the call returns with only M items filled (and M could be 0).  Perhaps you need another parameter to indicate how many items in my_list were actually filled?  Or is that the return value?

-- 
Jeff Squyres
jsquyres-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Ummunotify: progress at last!
       [not found]                 ` <5F80899D-F989-4162-B050-7E4D6B389876-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
@ 2010-03-23 19:52                   ` Jason Gunthorpe
       [not found]                     ` <20100323195251.GJ29129-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Jason Gunthorpe @ 2010-03-23 19:52 UTC (permalink / raw)
  To: Jeff Squyres; +Cc: Linux RDMA List, Brad Benton

On Tue, Mar 23, 2010 at 03:17:12PM -0400, Jeff Squyres wrote:

> > If you don't think that is worth doing it
> > does simplify things alot, just add two new verbs calls:
> > 
> > ibv_set_mmu_counter(verbs, &my_counter);
> > ibv_get_mmu_notifications(verbs, &my_list, sizeof(my_list));
> 
> I have no real opinion on whether the mmap/read should be hidden by
> the above ibv calls or not.  Either is fine with me.  I would
> *assume* that ibv_get_mmu_notifications() is non-blocking, right?
> E.g., if you ask for N and only M are available (where M < N), then
> the call returns with only M items filled (and M could be 0).
> Perhaps you need another parameter to indicate how many items in
> my_list were actually filled?  Or is that the return value?
 
Right, non-blocking, return value indicates length, like read().

These are not hiding mmap/read, they are new uverbs 'syscalls' that
get the kernel to perform that operation.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Ummunotify: progress at last!
       [not found]                     ` <20100323195251.GJ29129-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2010-03-23 20:01                       ` Jeff Squyres
       [not found]                         ` <3B848E1F-C9B6-416A-9E6E-99604E71902A-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Jeff Squyres @ 2010-03-23 20:01 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Linux RDMA List, Brad Benton

On Mar 23, 2010, at 3:52 PM, Jason Gunthorpe wrote:

> > > ibv_set_mmu_counter(verbs, &my_counter);
> > > ibv_get_mmu_notifications(verbs, &my_list, sizeof(my_list));
> 
> These are not hiding mmap/read, they are new uverbs 'syscalls' that
> get the kernel to perform that operation.

Oh -- so there's 2 mechanisms to get the counter info (for example):

1. the above uverb
2. mmap

Right?

I don't really have an opinion here -- I'm not really an "owner" of the ibv API.  As long as there is a fast/mmap way for me to get the counter without an extra function call, I'm happy.  :-)

-- 
Jeff Squyres
jsquyres-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Ummunotify: progress at last!
       [not found]                         ` <3B848E1F-C9B6-416A-9E6E-99604E71902A-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
@ 2010-03-23 20:11                           ` Jason Gunthorpe
       [not found]                             ` <20100323201124.GK29129-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Jason Gunthorpe @ 2010-03-23 20:11 UTC (permalink / raw)
  To: Jeff Squyres; +Cc: Linux RDMA List, Brad Benton

On Tue, Mar 23, 2010 at 04:01:21PM -0400, Jeff Squyres wrote:
> On Mar 23, 2010, at 3:52 PM, Jason Gunthorpe wrote:
> 
> > > > ibv_set_mmu_counter(verbs, &my_counter);
> > > > ibv_get_mmu_notifications(verbs, &my_list, sizeof(my_list));
> > 
> > These are not hiding mmap/read, they are new uverbs 'syscalls' that
> > get the kernel to perform that operation.
> 
> Oh -- so there's 2 mechanisms to get the counter info (for example):
> 
> 1. the above uverb
> 2. mmap
> 
> Right?

No, there is no mmap. Like this:

u64 my_counter = 0;

ibv_set_mmu_counter(verbs, &my_counter);
[..]
while (my_counter != last_my_counter) {
    last_my_counter = my_counter;
    ibv_get_mmu_notifications(verbs, ...);   // <- I am a memory barrier as well
}

The kernel 'syscall' ibv_set_mmu_counter would bind the given verbs to
the 8 byte counter you specified without having to the mmap thing. As
I understand it this is what perfevents does.

Integrating with the verbs api avoids the need for another device
file. That is good. Eliminating poll() from the API can remove the
dedicated fd entirely. Within verbs I guess we could replace poll()
with something like a completion channel (??) if anyone cares.

So the user space visible functionality boils down to 2 new 'syscalls'
and the new flag to ibv_reg_mr.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Ummunotify: progress at last!
       [not found]         ` <D7DCABC9-4BB2-4FBA-A7B3-C06F4A848467-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
  2010-03-23 17:29           ` Jason Gunthorpe
@ 2010-03-24  5:53           ` Roland Dreier
  1 sibling, 0 replies; 16+ messages in thread
From: Roland Dreier @ 2010-03-24  5:53 UTC (permalink / raw)
  To: Jeff Squyres; +Cc: Jason Gunthorpe, Linux RDMA List, Brad Benton

 > What do you poll on the fd for?

 > With ummunotify, you only read() from the fd when (counter !=
 > last_counter).  Were you thinking that the poll() would be for
 > something else?

The ummunotify fd supported poll() (and SIGIO etc).  You don't have to
use it if you don't want to, but I made it as much of a normal fd as
possible.

 - R.
-- 
Roland Dreier  <rolandd-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Ummunotify: progress at last!
       [not found]     ` <20100323165920.GH29129-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2010-03-23 17:17       ` Jeff Squyres
@ 2010-03-24  5:55       ` Roland Dreier
       [not found]         ` <adapr2u2rib.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
  1 sibling, 1 reply; 16+ messages in thread
From: Roland Dreier @ 2010-03-24  5:55 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Jeff Squyres, Linux RDMA List, Brad Benton

 > I would prefer to do this by adding a new verbs call that returns a fd
 > directly. Ie use ib_uverbs_alloc_event_file and act like
 > ibv_create_comp_channel.
 > 
 > The main reason for the new FD is so it can be polled on..

Agree, we don't want a new device node I don't think -- too hard to
associate an fd you get from a separate open() with a uverbs context.

 > You can also avoid the mmap scheme by doing what perf events does,
 > pass in a pointer from userspace and have the kernel pin that page it
 > is on.

I wonder, is that a win?  I guess you don't even have to pin it, just do
copy_to_user() to update the counter, but mmap doesn't seem so bad.

I'll have to look at the perf code.
-- 
Roland Dreier  <rolandd-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Ummunotify: progress at last!
       [not found]         ` <adapr2u2rib.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
@ 2010-03-24  5:59           ` Jason Gunthorpe
       [not found]             ` <20100324055913.GA9769-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Jason Gunthorpe @ 2010-03-24  5:59 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Jeff Squyres, Linux RDMA List, Brad Benton

On Tue, Mar 23, 2010 at 10:55:08PM -0700, Roland Dreier wrote:
>  > I would prefer to do this by adding a new verbs call that returns a fd
>  > directly. Ie use ib_uverbs_alloc_event_file and act like
>  > ibv_create_comp_channel.
>  > 
>  > The main reason for the new FD is so it can be polled on..
> 
> Agree, we don't want a new device node I don't think -- too hard to
> associate an fd you get from a separate open() with a uverbs context.

Right, that would suck..
 
>  > You can also avoid the mmap scheme by doing what perf events does,
>  > pass in a pointer from userspace and have the kernel pin that page it
>  > is on.
> 
> I wonder, is that a win?  I guess you don't even have to pin it, just do
> copy_to_user() to update the counter, but mmap doesn't seem so bad.

Not sure you can call copy_to_user from a mmu_notifier callback? What
if it faults?

I think the idea would be that by letting user space select the
counter location it could place it in a sensible cachline/etc and
probably avoid a pointer indirection.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Ummunotify: progress at last!
       [not found]                             ` <20100323201124.GK29129-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2010-03-24  5:59                               ` Roland Dreier
       [not found]                                 ` <adaljdi2rap.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
  2010-04-07 19:37                               ` Roland Dreier
  1 sibling, 1 reply; 16+ messages in thread
From: Roland Dreier @ 2010-03-24  5:59 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Jeff Squyres, Linux RDMA List, Brad Benton

 > No, there is no mmap. Like this:
 > 
 > u64 my_counter = 0;
 > 
 > ibv_set_mmu_counter(verbs, &my_counter);
 > [..]
 > while (my_counter != last_my_counter) {
 >     last_my_counter = my_counter;
 >     ibv_get_mmu_notifications(verbs, ...);   // <- I am a memory barrier as well
 > }
 > 
 > The kernel 'syscall' ibv_set_mmu_counter would bind the given verbs to
 > the 8 byte counter you specified without having to the mmap thing. As
 > I understand it this is what perfevents does.
 > 
 > Integrating with the verbs api avoids the need for another device
 > file. That is good. Eliminating poll() from the API can remove the
 > dedicated fd entirely. Within verbs I guess we could replace poll()
 > with something like a completion channel (??) if anyone cares.

That is all definitely doable.  I wonder if it's better to get rid of
the dedicated fd though.  After all, having the fd means a fancy app can
do poll() or sigio or whatever internally.  Being able to integrate into
an fd-driven event loop seems like a pretty big thing to give up.

Also, having a uverbs "syscall" that is exactly like read() seems a bit
of a stretch, even within the ugliness that is the uverbs interface.
(We love all our children, but sometimes we have to be realistic)

 - R.
-- 
Roland Dreier  <rolandd-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Ummunotify: progress at last!
       [not found]             ` <20100324055913.GA9769-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2010-03-24  6:17               ` Roland Dreier
  0 siblings, 0 replies; 16+ messages in thread
From: Roland Dreier @ 2010-03-24  6:17 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Jeff Squyres, Linux RDMA List, Brad Benton

 > > I wonder, is that a win?  I guess you don't even have to pin it, just do
 > > copy_to_user() to update the counter, but mmap doesn't seem so bad.
 > 
 > Not sure you can call copy_to_user from a mmu_notifier callback? What
 > if it faults?

Good point... we could defer it to a context where we could block but...

 > I think the idea would be that by letting user space select the
 > counter location it could place it in a sensible cachline/etc and
 > probably avoid a pointer indirection.

OK, I have to look at the perf stuff.  That makes sense but OTOH
userspace could (unknowingly) pick a highmen page and then we end up
having to do kmap_atomic or something from an mmu notifier callback.
-- 
Roland Dreier  <rolandd-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Ummunotify: progress at last!
       [not found]                                 ` <adaljdi2rap.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
@ 2010-03-24 17:43                                   ` Jason Gunthorpe
  0 siblings, 0 replies; 16+ messages in thread
From: Jason Gunthorpe @ 2010-03-24 17:43 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Jeff Squyres, Linux RDMA List, Brad Benton

On Tue, Mar 23, 2010 at 10:59:42PM -0700, Roland Dreier wrote:

> That is all definitely doable.  I wonder if it's better to get rid of
> the dedicated fd though.  After all, having the fd means a fancy app can
> do poll() or sigio or whatever internally.  Being able to integrate into
> an fd-driven event loop seems like a pretty big thing to give up.

Not sure, adding the fd is only a small amount of work. But on the
other hand integrating with a completion channel might bet better
suited to verbs's API, and even less work..

> Also, having a uverbs "syscall" that is exactly like read() seems a bit
> of a stretch, even within the ugliness that is the uverbs interface.
> (We love all our children, but sometimes we have to be realistic)

I agree, but if the FD is eliminated then that seems like the best bet
- it wouldn't be exactly like read, it would be a multiplexed uverb
command executed over the uverbs fd.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Ummunotify: progress at last!
       [not found]                             ` <20100323201124.GK29129-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2010-03-24  5:59                               ` Roland Dreier
@ 2010-04-07 19:37                               ` Roland Dreier
       [not found]                                 ` <ada39z72gww.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
  1 sibling, 1 reply; 16+ messages in thread
From: Roland Dreier @ 2010-04-07 19:37 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Jeff Squyres, Linux RDMA List, Brad Benton

 > No, there is no mmap. Like this:
 > 
 > u64 my_counter = 0;
 > 
 > ibv_set_mmu_counter(verbs, &my_counter);
 > [..]
 > while (my_counter != last_my_counter) {
 >     last_my_counter = my_counter;
 >     ibv_get_mmu_notifications(verbs, ...);   // <- I am a memory barrier as well
 > }
 > 
 > The kernel 'syscall' ibv_set_mmu_counter would bind the given verbs to
 > the 8 byte counter you specified without having to the mmap thing. As
 > I understand it this is what perfevents does.

I was trying to look at how perf events handles this, and AFAICT it
looks like kernel/perf_event.c just supports mmap().  Can you expand on
what you meant here?

(I was trying to figure out how one would handle the case where
userspace gives us a counter in highmem -- doing kmap_atomic() seems to
be to only option but then I'm not sure if I want to deal with that...)
-- 
Roland Dreier <rolandd-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org> || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Ummunotify: progress at last!
       [not found]                                 ` <ada39z72gww.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
@ 2010-04-07 20:05                                   ` Jason Gunthorpe
  0 siblings, 0 replies; 16+ messages in thread
From: Jason Gunthorpe @ 2010-04-07 20:05 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Jeff Squyres, Linux RDMA List, Brad Benton

On Wed, Apr 07, 2010 at 12:37:03PM -0700, Roland Dreier wrote:
>  > No, there is no mmap. Like this:
>  > 
>  > u64 my_counter = 0;
>  > 
>  > ibv_set_mmu_counter(verbs, &my_counter);
>  > [..]
>  > while (my_counter != last_my_counter) {
>  >     last_my_counter = my_counter;
>  >     ibv_get_mmu_notifications(verbs, ...);   // <- I am a memory barrier as well
>  > }
>  > 
>  > The kernel 'syscall' ibv_set_mmu_counter would bind the given verbs to
>  > the 8 byte counter you specified without having to the mmap thing. As
>  > I understand it this is what perfevents does.
> 
> I was trying to look at how perf events handles this, and AFAICT it
> looks like kernel/perf_event.c just supports mmap().  Can you expand on
> what you meant here?
> 
> (I was trying to figure out how one would handle the case where
> userspace gives us a counter in highmem -- doing kmap_atomic() seems to
> be to only option but then I'm not sure if I want to deal with that...)

I think I was mistaken here, disregard..

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2010-04-07 20:05 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-03-23 16:06 Ummunotify: progress at last! Jeff Squyres
     [not found] ` <1CDB2AA4-A8DF-4169-943E-4EA190814596-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2010-03-23 16:59   ` Jason Gunthorpe
     [not found]     ` <20100323165920.GH29129-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2010-03-23 17:17       ` Jeff Squyres
     [not found]         ` <D7DCABC9-4BB2-4FBA-A7B3-C06F4A848467-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2010-03-23 17:29           ` Jason Gunthorpe
     [not found]             ` <20100323172953.GI29129-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2010-03-23 19:17               ` Jeff Squyres
     [not found]                 ` <5F80899D-F989-4162-B050-7E4D6B389876-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2010-03-23 19:52                   ` Jason Gunthorpe
     [not found]                     ` <20100323195251.GJ29129-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2010-03-23 20:01                       ` Jeff Squyres
     [not found]                         ` <3B848E1F-C9B6-416A-9E6E-99604E71902A-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2010-03-23 20:11                           ` Jason Gunthorpe
     [not found]                             ` <20100323201124.GK29129-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2010-03-24  5:59                               ` Roland Dreier
     [not found]                                 ` <adaljdi2rap.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
2010-03-24 17:43                                   ` Jason Gunthorpe
2010-04-07 19:37                               ` Roland Dreier
     [not found]                                 ` <ada39z72gww.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
2010-04-07 20:05                                   ` Jason Gunthorpe
2010-03-24  5:53           ` Roland Dreier
2010-03-24  5:55       ` Roland Dreier
     [not found]         ` <adapr2u2rib.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
2010-03-24  5:59           ` Jason Gunthorpe
     [not found]             ` <20100324055913.GA9769-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2010-03-24  6:17               ` Roland Dreier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox