From: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
To: "Weiny, Ira" <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: "linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
"roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org"
<roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
"hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org"
<hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
Subject: Re: [PATCH] infiniband-diags: add rdma-ndd daemon
Date: Sun, 9 Nov 2014 22:11:55 -0700
Message-ID: <20141110051155.GD31256@obsidianresearch.com>
In-Reply-To: <2807E5FD2F6FDA4886F6618EAC48510E0CBA8FCD-8k97q/ur5Z2krb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
On Mon, Nov 10, 2014 at 12:28:27AM +0000, Weiny, Ira wrote:
> >
> > Sadly, I think the proper way to address this is the same way netdev addresses
> > it - do not activate the interface on module load, wait for an explicit
> > enablement so userspace can configure before it tries to link up.
> >
> > (Not to say that is even possible considering RDMA's history, but still..)
>
> I don't think this is really possible. When would you consider the
> "interface" to be "up"? Even before the port goes active the SM and
> other diags can query this value. This is why I propose that the
> default of the drivers, without user space involvement, should
> change. The user space daemon is only trying to react to other user
> space activity.
Same as net dev, after the module load the physical layer is
disabled. Nothing can see the NodeDescription because the link is kept
down.
Then everything gets configured, and only after that is the physical
layer allowed to come up.
I don't know if this is practical, but it is the only race free way to
properly address all of this.
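
To make the ordering concrete, here is a toy model of that sequence. It is
only a sketch: the Port class, its fields and bring_up() are hypothetical
names for illustration, not a kernel or library API. It shows that a
description set while the link is held down produces no trap, while one
set after link up does.

```python
class Port:
    """Toy model of an RDMA port; every name here is hypothetical."""

    def __init__(self, default_desc):
        self.node_desc = default_desc
        self.link_up = False   # physical layer is held down after module load
        self.traps = []        # change notifications the fabric would see

    def set_node_desc(self, desc):
        if desc == self.node_desc:
            return
        self.node_desc = desc
        if self.link_up:
            # The change is visible on the fabric, so it must be announced.
            self.traps.append(desc)

    def enable(self):
        # Explicit enablement: only now may the physical layer come up.
        self.link_up = True


def bring_up(port, desired_desc):
    """Configure first, then enable: the race-free ordering."""
    port.set_node_desc(desired_desc)
    port.enable()
    return port
```

With this ordering bring_up(Port("default"), "node1 HCA-1") ends with an
empty trap list; calling enable() before set_node_desc() records one trap.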
> > TBH, I'd rather see just this daemon and drop the kernel side... If the daemon is
> > started before the modules are loaded then it might be able to set the
> > description while the port is still down.
>
> The problem is this sequence.
>
> 1) rdma-ndd daemon started (hostname == localhost, no rdma devices set)
> 2) dhcp started (or other hostname change, rdma-ndd triggered but no rdma devices set; not loaded yet)
> 3) rdma devices loaded, default node_desc
4) daemon detects the new rdma device immediately and programs the
admin-desired node name in the 100ms before the physical link comes
up. Since the link is down when the change occurs, no traps are scheduled.
Or in the 10% case where the race is lost you get a trap. Oh well.
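
A rough sketch of step 4, under stated assumptions: devices show up as
directories under /sys/class/infiniband, each with a writable node_desc
file, and the desired name is "<hostname> <device>". The function
scan_once() and the polling approach are illustrative only; a real daemon
would more likely react to hotplug events than poll.

```python
from pathlib import Path

# Assumed sysfs location of RDMA devices.
SYSFS_ROOT = Path("/sys/class/infiniband")
# NodeDescription is a 64-byte field, so longer names are truncated.
MAX_DESC = 64


def scan_once(root, seen, hostname):
    """Program node_desc on devices not seen before; return their names."""
    handled = []
    if not root.is_dir():
        # Modules not loaded yet (steps 1 and 2): nothing to do.
        return handled
    for dev in sorted(root.iterdir()):
        if dev.name in seen:
            continue
        # Assumed naming policy: "<hostname> <device>".
        raw = f"{hostname} {dev.name}".encode()[:MAX_DESC]
        desc = raw.decode(errors="ignore")
        (dev / "node_desc").write_text(desc + "\n")
        seen.add(dev.name)
        handled.append(dev.name)
    return handled
```

Calling scan_once(SYSFS_ROOT, seen, hostname) in a tight loop with a
shared seen set approximates "detects new rdma device instantly"; how
often the race is actually won depends on how fast the link trains.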
> What I have proposed will work no matter what order things are
> started/loaded.
Well, not really - it works in the sense that the kernel default node
name works properly, but it doesn't allow the admin to set a custom
name without hitting all these same issues (well, except perhaps by
module parameter..).
> Finally, I contemplated the alternative of having the rdma-ndd
> daemon detect new rdma devices.
Well, without that all it does is link two parts of the kernel
together through user space with no way to set policy in userspace
(policy in this case is the node description string). Seems very
strange.
Jason