public inbox for linux-kernel@vger.kernel.org
* [PATCH][RFC][0/4] InfiniBand userspace verbs implementation
@ 2005-04-04 22:09 Roland Dreier
  2005-04-04 22:09 ` [PATCH][RFC][1/4] IB: core changes for userspace verbs Roland Dreier
  2005-04-11 14:22 ` [PATCH][RFC][0/4] InfiniBand userspace verbs implementation Troy Benjegerdes
  0 siblings, 2 replies; 145+ messages in thread
From: Roland Dreier @ 2005-04-04 22:09 UTC (permalink / raw)
  To: linux-kernel, openib-general

Here is an initial implementation of InfiniBand userspace verbs.  I
plan to commit this code to the OpenIB repository shortly, and submit
it for inclusion during the 2.6.13 cycle, so I am posting it early for
comments.

This code, in conjunction with the libibverbs and libmthca userspace
libraries available from the subversion trees at

    https://openib.org/svn/gen2/branches/roland-uverbs/src/userspace/libibverbs
    https://openib.org/svn/gen2/branches/roland-uverbs/src/userspace/libmthca

enables userspace processes to access InfiniBand HCAs directly.

For those not familiar with the InfiniBand architecture, this
so-called "userspace verbs" support allows userspace to post data path
commands directly to the HCA.  Resource allocation and other control
path operations still go through the kernel driver.

Please take a look at this code if you have a chance.  I would
appreciate high-level criticism of the design and implementation as
well as nitpicky complaints about coding style and typos.

In particular, the memory pinning code in uverbs_mem.c could stand
a looking over.  In addition, a sanity check of the write()-based
scheme for passing commands into the kernel in uverbs_main.c and
uverbs_cmd.c is probably worthwhile.

Thanks,
  Roland



* Re: [openib-general] Re: [PATCH][RFC][0/4] InfiniBand userspace verbs implementation
@ 2005-04-22 13:10 Bodo Eggert <harvested.in.lkml@posting.7eggert.dyndns.org>
  2005-04-22 17:01 ` [openib-general] Re: [PATCH][RFC][0/4] InfiniBand userspace verbsimplementation Fab Tillier
  0 siblings, 1 reply; 145+ messages in thread
From: Bodo Eggert <harvested.in.lkml@posting.7eggert.dyndns.org> @ 2005-04-22 13:10 UTC (permalink / raw)
  To: Andy Isaacson, Timur Tabi, Troy Benjegerdes, Bernhard Fischer,
	Arjan van de Ven, linux-kernel, openib-general

Andy Isaacson <adi@hexapodia.org> wrote:
> On Wed, Apr 20, 2005 at 10:07:45PM -0500, Timur Tabi wrote:

>> I don't know if VM_REGISTERED is a good idea or not, but it should be
>> absolutely impossible for the kernel to reclaim "registered" (aka pinned)
>> memory, no matter what. For RDMA services (such as Infiniband, iWARP, etc),
>> it's normal for non-root processes to pin hundreds of megabytes of memory,
>> and that memory better be locked to those physical pages until the
>> application deregisters them.
> 
> If you take the hardline position that "the app is the only thing that
> matters", your code is unlikely to get merged.  Linux is a
> general-purpose OS.

All userspace hardware drivers with DMA will require pinned pages (and some
of them will require contiguous memory). Since this memory may be scheduled
to be accessed by DMA, reclaiming those pages may (a.k.a. will) result in
"random" memory corruption unless done by the driver itself.

You can't even set a time limit: the driver may have allocated all DMA
memory to queued transfers, and some medium may still need to be plugged in
by the lazy robot. As soon as the robot arrives - boom. (For the same reason,
this memory MUST NOT be freed if the application terminates abnormally,
e.g. when it is killed by the OOM killer.)

In other words, you need to make this memory as inaccessible as the
framebuffer on a graphics card. If that causes a lockup, you had better
have prevented it at allocation time.

> In a Linux context, I doubt that fullblown SA is necessary or
> appropriate.  Rather, I'd suggest two new signals, SIGMEMLOW and
> SIGMEMCRIT.  The userland comms library registers handlers for both.
> When the kernel decides that it needs to reclaim some memory from the
> app, it sends SIGMEMLOW.  The comms library then has the responsibility
> to un-reserve some memory in an orderly fashion.  If a reasonable [1]
> time has expired since SIGMEMLOW and the kernel is still hungry, the
> kernel sends SIGMEMCRIT.  At this point, the comms lib *must* unregister
> some memory [2] even if it has to drop state to do so; if it returns
> from the signal handler without having unregistered the memory, the
> kernel will SIGKILL.

Choosing data loss over a finitely stalled system may sometimes be a bad
decision.

If I designed an application that might get a "gimme memory or die",
I'd reserve an extra chunk of memory with the sole purpose of releasing
it in this situation. If the kernel had done that instead, this
part of memory could have been used e.g. as a read-only disk cache in
the meantime (of course, provided somebody cared to implement that).

> [2] Is there a way for the kernel to pass down to userspace how many
>     pages it wants, maybe in the sigcontext?

Then you'd need only one signal.
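
A rough userspace sketch of that single-signal idea, using only existing
POSIX calls. Everything named here is an assumption for illustration:
there is no SIGMEMLOW in Linux, so a real-time signal stands in for it
(SIG_MEMPRESSURE is a made-up name), and sigqueue() plays the part of the
kernel, carrying the requested page count in si_value.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in: no memory-pressure signal exists, so borrow
 * the first real-time signal for the demonstration. */
#define SIG_MEMPRESSURE SIGRTMIN

/* Page count from the most recent request, set by the handler. */
static volatile sig_atomic_t pages_requested;

/* SA_SIGINFO handler: the sender's integer payload is in si_value. */
static void on_mem_pressure(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    pages_requested = info->si_value.sival_int;
}

/* Register the handler; returns 0 on success, -1 on error. */
int install_mem_pressure_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_mem_pressure;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIG_MEMPRESSURE, &sa, NULL);
}

/* Simulates the requesting side: ask process `pid` for `npages` pages. */
int request_pages(pid_t pid, int npages)
{
    union sigval v;

    v.sival_int = npages;
    return sigqueue(pid, SIG_MEMPRESSURE, v);
}

/* Called from the main loop: fetch and clear the pending request. */
int take_requested_pages(void)
{
    int n = pages_requested;

    pages_requested = 0;
    return n;
}
```

The handler only records the number; the actual unregistering would
happen later in the library's main loop, since very little is safe to do
inside a signal handler.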

I think this interface is useful; it would e.g. allow a picture viewer
to cache as many decoded and scaled pictures as the RAM permits, freeing
them when RAM fills up and swap would otherwise have to be used.
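
As an illustration only, such a cache could look roughly like the sketch
below. The names (struct cache, cache_put, cache_shed) and the fixed slot
count are invented for this example; a real viewer would store decoded
images rather than anonymous buffers.

```c
#include <stdlib.h>

#define CACHE_SLOTS 8   /* arbitrary size for the sketch */

/* Ring buffer of cached buffers; `head` indexes the oldest entry. */
struct cache {
    void  *buf[CACHE_SLOTS];
    size_t len[CACHE_SLOTS];
    size_t head;
    size_t count;
};

/* Drop the oldest entry and return its size (0 if the cache is empty). */
static size_t cache_evict_oldest(struct cache *c)
{
    size_t freed;

    if (c->count == 0)
        return 0;
    free(c->buf[c->head]);
    freed = c->len[c->head];
    c->head = (c->head + 1) % CACHE_SLOTS;
    c->count--;
    return freed;
}

/* Cache a new buffer of `nbytes`, evicting the oldest one if full.
 * Returns 0 on success, -1 if the allocation fails. */
int cache_put(struct cache *c, size_t nbytes)
{
    size_t idx;
    void *p;

    if (c->count == CACHE_SLOTS)
        cache_evict_oldest(c);
    p = malloc(nbytes);
    if (p == NULL)
        return -1;
    idx = (c->head + c->count) % CACHE_SLOTS;
    c->buf[idx] = p;
    c->len[idx] = nbytes;
    c->count++;
    return 0;
}

/* On a memory request: free oldest entries until at least `want` bytes
 * are released or the cache is empty. Returns the bytes released. */
size_t cache_shed(struct cache *c, size_t want)
{
    size_t freed = 0;

    while (c->count > 0 && freed < want)
        freed += cache_evict_oldest(c);
    return freed;
}
```

The application would call cache_shed() with the amount requested by the
kernel, so the cache grows while memory is plentiful and shrinks on demand.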

-- 
"When the pin is pulled, Mr. Grenade is not our friend."
-U.S. Marine Corps

