From: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
To: stuarts <stuarts-dK3M3PVJaX4iXRBKUn1UN0EOCMrvLtNR@public.gmane.org>
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: Multicast joins failing on 1.5-rc1?
Date: Wed, 21 Oct 2009 14:23:46 -0600
Message-ID: <20091021202346.GO14520@obsidianresearch.com>
In-Reply-To: <C28CB83A-CF52-4603-91DF-D56865CBEA98-dK3M3PVJaX4iXRBKUn1UN0EOCMrvLtNR@public.gmane.org>

On Wed, Oct 21, 2009 at 02:16:47PM -0500, stuarts wrote:
> I did a lot more tracing on the sender side. I think I see what is
> happening: The sender uses the IP_ADD_MEMBERSHIP socket op. The IP
> stack (via the dev->mc_list multicast list) tries to create the
> following MGIDs:
> ff12:401b:ffff:0000:0000:0000:0100:0025
> ff12:601b:ffff:0000:0000:0000:0000:00fb
> ff12:601b:ffff:0000:0000:0001:ff03:2431
> ff12:601b:ffff:0000:0000:0000:0000:0001
> ff12:401b:ffff:0000:0000:0000:0000:0001
> ff12:401b:ffff:0000:0000:0000:0000:00fb
>
> The first one is mine, and the others are in the admin band (***1 is
> all-hosts, for example).
>
> This looks like it is valid, BUT, the call to
> ipoib_mcast_addr_is_valid occurs BEFORE the pkey is folded in from the
> ipoib_dev_priv structure. Printing out the pre-fold-in values shows:
> 00ffffffff12601b0000000000000000000000fb
>
> (This is the dev_mc_list -> dmi_addr value)
>
> Oops, that pkey is "wrong" (0 vs ffff). Out this address goes!

Hmm, I created ipoib_mcast_addr_is_valid last month and it seemed
correct in my testing. I'm surprised to see this.

The intention was to catch groups that don't have the right pkey
set. Everything should be completely consistent by this point in the
code, and the dmi_addr should have the pkey included in it. If this is
not true then the ip tools and other diagnostics will not function
properly.

What does ip say for your setup? Mine reports this:

$ ip link show dev ib0
4: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast state UP qlen 256
link/infiniband 80:2e:00:48:fe:80:00:00:00:00:00:00:00:02:c9:03:00:00:14:a5 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
$ ~/work/iproute2.git/ip/ip maddr show dev ib0
4: ib0
link 33:33:ff:fe:f9:2d:00:00:00:00:00:00:00:00:00:e2:e4:f5:00:df static
link 00:ff:ff:ff:ff:12:60:1b:ff:ff:00:00:00:00:00:01:ff:00:14:a5
link 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:00:00:00:01
link 00:ff:ff:ff:ff:12:60:1b:ff:ff:00:00:00:00:00:00:00:00:00:01
So:
  brd  00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
  link 00:ff:ff:ff:ff:12:60:1b:ff:ff:00:00:00:00:00:00:00:00:00:01
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Seems OK to me.
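
The comparison above can also be done mechanically. What follows is
only a rough user-space sketch, not the kernel's
ipoib_mcast_addr_is_valid: it parses a 20-byte link-layer address in
the colon-separated form printed by ip and checks that bytes 8 and 9
(the P_Key position used by the mapping function quoted in this mail)
match the broadcast address. The names parse_hwaddr, pkey_matches,
IB_HWADDR_LEN and PKEY_OFFSET are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

#define IB_HWADDR_LEN 20  /* 4 bytes of flags/QPN followed by a 16-byte GID */
#define PKEY_OFFSET   8   /* assumption: P_Key sits in bytes 8..9 */

/* Convert one hex digit to its value, or return -1 if it is not hex. */
static int hexval(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Parse "00:ff:..." (colons optional) into exactly IB_HWADDR_LEN bytes.
 * Returns 0 on success, -1 on malformed or wrongly sized input.
 */
static int parse_hwaddr(const char *s, unsigned char out[IB_HWADDR_LEN])
{
    size_t n = 0;

    while (*s) {
        int hi, lo;

        if (*s == ':') {
            s++;
            continue;
        }
        if (n == IB_HWADDR_LEN)
            return -1;              /* too many bytes */
        hi = hexval(s[0]);
        if (hi < 0)
            return -1;
        lo = hexval(s[1]);          /* s[1] == '\0' is rejected here */
        if (lo < 0)
            return -1;
        out[n++] = (unsigned char)((hi << 4) | lo);
        s += 2;
    }
    return n == IB_HWADDR_LEN ? 0 : -1;
}

/* Nonzero when the P_Key bytes of addr equal those of the broadcast address. */
static int pkey_matches(const unsigned char *addr,
                        const unsigned char *broadcast)
{
    return memcmp(addr + PKEY_OFFSET, broadcast + PKEY_OFFSET, 2) == 0;
}
```

Fed the brd and link lines shown above, pkey_matches should report a
match; an address whose bytes 8 and 9 are zero, like the pre-fold
value in the quoted trace, should not.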
All mcast groups are created in the IP stack using this function:

static inline void ip_ib_mc_map(__be32 naddr, const unsigned char *broadcast, char *buf)
{
[..]
        buf[8]  = broadcast[8];         /* P_Key */
        buf[9]  = broadcast[9];
}

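
To make the effect of those two assignments concrete, here is a
user-space sketch of an IPv4 multicast mapping. It is not the kernel
function, and it does not try to reproduce the part elided above: the
byte layout is an assumption inferred from the addresses shown in this
mail (and is what I understand the IPoIB mapping to be), the group is
taken in host byte order for simplicity, and sketch_ipv4_mc_map is a
made-up name.

```c
#include <stdint.h>
#include <string.h>

#define IB_HWADDR_LEN 20

/*
 * Hypothetical sketch: build a 20-byte IPoIB multicast link-layer
 * address for an IPv4 group (host byte order), taking scope and
 * P_Key from the interface broadcast address.
 */
static void sketch_ipv4_mc_map(uint32_t group,
                               const unsigned char *broadcast,
                               unsigned char *buf)
{
    memset(buf, 0, IB_HWADDR_LEN);

    buf[1] = 0xff;                          /* multicast QPN */
    buf[2] = 0xff;
    buf[3] = 0xff;
    buf[4] = 0xff;                          /* start of the MGID */
    buf[5] = 0x10 | (broadcast[5] & 0x0f);  /* scope from broadcast */
    buf[6] = 0x40;                          /* IPv4 signature */
    buf[7] = 0x1b;
    buf[8] = broadcast[8];                  /* P_Key, as in the quoted code */
    buf[9] = broadcast[9];

    /* Low 28 bits of the group address fill the tail of the MGID. */
    buf[16] = (group >> 24) & 0x0f;
    buf[17] = (group >> 16) & 0xff;
    buf[18] = (group >> 8) & 0xff;
    buf[19] = group & 0xff;
}
```

Under these assumptions, mapping 224.0.0.1 (0xE0000001) with the
broadcast address from the ip link output gives the
00:ff:ff:ff:ff:12:40:1b:ff:ff:...:01 entry in the ip maddr listing, so
the P_Key is already in place by the time the address reaches the
device multicast list.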
So I can't see how you can possibly get a mismatching pkey.

Are you using an upstream kernel or a backport to some RH kernel? What
does your ip_ib_mc_map function look like? It is a bit of a problem
for backports because it is inlined and built into the main kernel
code. If the original RH source for their kernel does not include the
above, then it is broken, and backporting ipoib_mcast_addr_is_valid
just catches a pre-existing bug (as it was intended to, actually).

Can you point me to where you see the 'pkey folding'? Is that present
in the mainline kernel?

Jason