public inbox for linux-rdma@vger.kernel.org
From: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
To: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	'Doug Ledford' <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	'Ram Amrani' <Ram.Amrani-YGCgFSpz5w/QT0dZR+AlfA@public.gmane.org>,
	'Ira Weiny' <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	'Benjamin Drung'
	<benjamin.drung-EIkl63zCoXaH+58JC4qpiA@public.gmane.org>,
	'Jarod Wilson' <jarod-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Subject: Re: [PATCH rdma-core 0/5] Common systemd/udev based boot support
Date: Tue, 25 Jul 2017 16:02:10 -0600
Message-ID: <20170725220210.GA15663@obsidianresearch.com>
In-Reply-To: <016901d30590$3eee9910$bccbcb30$@opengridcomputing.com>

On Tue, Jul 25, 2017 at 04:52:01PM -0500, Steve Wise wrote:

> > This sort of hotplug that cxbg4 does is quite strange, what happens
> > when 'ip link set X down' is done? Does it remove the RDMA device?
> > Does 'ip link set down' block until all users go away?
> 
> No.  iw_cxgb4 just triggers on the first 'up', to add the rdma provider instance
> for that device.  The Low Level Driver (LLD), cxgb4, passes the CXGB4_STATE_UP
> event to all registered upper level drivers (ULDs) when the first port is
> enabled (see cxgb_up).  Any rdma connections that are active when a link goes
> down still function, as any TCP connection would function if the interface was
> brought down;  eg: tcp retransmits if there is pending data until it gives up
> and aborts the connection.  So Netdev link down/up transitions are hidden from
> the rdma application.   

I think you should change this to create the RDMA device when the
module is installed and the hardware is present.

> > This is going to make it harder for cxgb users to get a reliably
> > bootup at this time, we need more kernel autoloading for things to be
> > reliable, and I'm sure iwpmd.service needs some dependency adjusting,
> > I just don't know enough about it to do it right. :\
> 
> I don't understand?

At present, udev starts running rules at link-up time, which happens
sometime around 'network.target'.

However, systemd continues processing without knowing what udev is
doing.

So, if you have an RDMA-enabled daemon, and you make it start after the
RDMA device is plugged, we have some races:

- udev is creating /dev/ nodes and telling systemd to start module loading
  units, and run iwpmd
- systemd may have already started loading the RDMA daemon before udev
  gets to any of this (racy), eg the /dev/ nodes may not exist yet, or
  the modules may still be in the process of being loaded
- systemd may have started iwpmd, but it is not yet ready when systemd
  starts the RDMA daemon (racy in a different way, this is helped with
  sd_notify)
- The RDMA daemon now needs explicit dependencies on the RDMA device
  to order properly, something simple like sysinit.target isn't going to work
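
To make the sd_notify point concrete, here is a minimal sketch of the
readiness notification a daemon such as iwpmd could send once it is
actually able to serve requests. It assumes the unit uses Type=notify,
so that systemd sets NOTIFY_SOCKET; the helper below is a hypothetical
pure-Python stand-in for libsystemd's sd_notify(), not code from
rdma-core.

```python
import os
import socket


def sd_notify(state: str = "READY=1") -> bool:
    """Send a state string to systemd's notification socket.

    Returns False when NOTIFY_SOCKET is unset, i.e. the process was not
    started by systemd with notification enabled.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading '@' denotes a Linux abstract-namespace socket.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), path)
    return True
```

The daemon would call sd_notify() only after its own setup is complete,
so systemd does not consider it started, and does not start units
ordered After= it, until then.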

Basically, the more hotpluggy things are, the harder it is to start an
RDMA daemon without it racing with something and randomly failing to
start properly.
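
As an illustration of what a daemon is pushed into when nothing orders
it after the RDMA device, here is a rough sketch of a defensive polling
loop. It assumes RDMA devices show up as entries under
/sys/class/infiniband; the function name, timeout and interval are
hypothetical, and this is a workaround sketch, not a substitute for
proper ordering.

```python
import os
import time


def wait_for_rdma_device(sysfs_dir="/sys/class/infiniband",
                         timeout=30.0, interval=0.5):
    """Poll until at least one RDMA device is registered.

    Returns the sorted list of device names, or an empty list if none
    appeared before the timeout expired.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            devices = sorted(os.listdir(sysfs_dir))
        except FileNotFoundError:
            # The class directory does not exist yet, e.g. the RDMA
            # core module has not been loaded.
            devices = []
        if devices:
            return devices
        if time.monotonic() >= deadline:
            return []
        time.sleep(interval)
```

Every daemon would need its own copy of a loop like this, and it still
says nothing about whether iwpmd is ready.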

The existing RDMA stuff largely relies on things happening in sequence,
eg loading the RDMA module is enough to create the RDMA device, and that
more reliably happens before sysinit.target, so we can create some
predictable ordering in the system.

This is also why I have been so insistent that the only way to make
all of this work properly and reliably is to have robust kernel auto
loading.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
