* [Lustre-devel] faking LNET scale
2009-04-17 18:33 ` Liang Zhen
@ 2009-04-18 12:35 ` Nic Henke
2009-04-30 1:21 ` Eric Barton
2009-06-02 17:12 ` Nicholas Henke
2 siblings, 0 replies; 8+ messages in thread
From: Nic Henke @ 2009-04-18 12:35 UTC (permalink / raw)
To: lustre-devel
Liang Zhen wrote:
> Nic,
> It's very late night for me now, my head is not clear enough for me to
> make sure whether I'm saying something crazy, :)
Liang,
Thanks for the notes - at worst this is crazy interesting :-) This
looks very doable - I've not dug into the code to see if there are any
implementation gotchas - but it looks like it should work. I'll let you
know what I come up with.
Cheers,
Nic
> LNet always treats the target as a remote network (one that needs a
> router) if it can't find an NI with the same network ID. For example, if
> the local NI is (ptl0) and the caller wants to send a message to (ptl1),
> LNet will:
> 1. Try to find a local NI for ptl1; when that fails:
> 2. Check whether ptl1 is a remote network and whether there is a router
> for this network (ptl1).
>
> So if you want your server to have only one NI instance that can talk to
> a set of different networks and, at the same time, talk to other remote
> networks via routers, I would suggest:
> 1. Create a new command, for example: lctl add_local_net ptl0
> ptl[1-N], which means LNet should allow NI (ptl0) to access networks
> ptl[1-N] as local networks.
> 2. Add a new structure in LNet, i.e.:
> typedef struct {
> struct list_head ln_list; /* chain on lnet_t::ln_local_nets */
> __u32 ln_net; /* network treated as local */
> lnet_ni_t *ln_localni; /* local NI used to reach ln_net */
> ......
> } lnet_localnet_t;
> As you can see, it's very like the current structure lnet_remotenet_t,
> which is chained on lnet_t::ln_remote_nets; we can create an
> lnet_localnet_t object and add it to a global list (i.e.
> lnet_t::ln_local_nets) with the command mentioned above: lctl add_local_net.
> 3. When an upper-layer caller sends a message, lnet_send() should check
> lnet_t::ln_local_nets first (before assuming it's a remote network
> and checking lnet_t::ln_remote_nets); if the network is on
> lnet_t::ln_local_nets, then we can use the local NI in
> lnet_localnet_t::ln_localni.
> 4. We need to add a new flag for LNDs; only an LND with the flag can
> support the lctl add_local_net command.
> 5. Make the LND not reject messages from different networks.
> Again, I hope I'm answering what you are asking, :)
>
> Regards
> Liang
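For anyone following along, the lookup order Liang proposes can be sketched in plain C as below. This is a minimal illustration under stated assumptions, not LNet code: the sketch_* names, the singly linked alias list (instead of struct list_head) and the net-number packing helper are all hypothetical stand-ins, and locking is ignored entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for lnet_ni_t; only the field this sketch needs. */
struct sketch_ni {
    uint32_t ni_net;                    /* network the NI was configured on */
};

/* Stand-in for the proposed lnet_localnet_t (one alias entry). */
struct sketch_localnet {
    struct sketch_localnet *ln_next;    /* plain pointer instead of list_head */
    uint32_t                ln_net;     /* aliased network, e.g. ptl1 */
    struct sketch_ni       *ln_localni; /* NI that serves the alias */
};

enum sketch_path {
    SKETCH_LOCAL_NI,     /* an NI is configured on the destination network */
    SKETCH_LOCAL_ALIAS,  /* proposed: network is on the local-net alias list */
    SKETCH_TRY_ROUTERS   /* fall through to the remote-net/router lookup */
};

/* Hypothetical helper: pack an LND type and instance number into a net. */
static uint32_t sketch_mknet(uint32_t lnd_type, uint32_t num)
{
    return (lnd_type << 16) | num;
}

/* Pick the NI for dst_net: real NIs first, then aliases, else routers. */
static enum sketch_path
sketch_pick_ni(struct sketch_ni *nis, size_t n_nis,
               struct sketch_localnet *aliases,
               uint32_t dst_net, struct sketch_ni **out)
{
    struct sketch_localnet *ln;
    size_t i;

    assert(out != NULL);

    for (i = 0; i < n_nis; i++) {
        if (nis[i].ni_net == dst_net) {
            *out = &nis[i];
            return SKETCH_LOCAL_NI;
        }
    }
    for (ln = aliases; ln != NULL; ln = ln->ln_next) {
        if (ln->ln_net == dst_net) {
            *out = ln->ln_localni;
            return SKETCH_LOCAL_ALIAS;
        }
    }
    *out = NULL;        /* caller would now consult the remote nets */
    return SKETCH_TRY_ROUTERS;
}
```

The point of keeping the alias walk between the two existing steps is that a server with one NI on ptl0 can still reach genuinely remote networks through routers; only networks explicitly added as aliases skip the router lookup.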
* [Lustre-devel] faking LNET scale
From: Eric Barton @ 2009-04-30 1:21 UTC (permalink / raw)
To: lustre-devel
Why not just instantiate all the NIs on the server? LNDs that
support multiple NIs typically have a single set of global tables,
so it should still stress the LND just fine. Also having n different
targets (one for each LNET) on the server actually simplifies
client configuration too - if you only have a single target, lustre
would, by default, only use one client NID to get to it.
Cheers,
Eric
> -----Original Message-----
> From: lustre-devel-bounces at lists.lustre.org [mailto:lustre-devel-bounces at lists.lustre.org] On Behalf Of Liang Zhen
> Sent: 17 April 2009 7:34 PM
> To: Nicholas Henke
> Cc: lustre-devel at lists.lustre.org
> Subject: Re: [Lustre-devel] faking LNET scale
>
> Nicholas Henke wrote:
> > Greetings -
> >
> > I was looking into ways to simulate scale at the LNET level. It would allow us
> > to test the LNDs better with less hardware, not to mention things like LNet
> > SelfTest and friends.
> >
> > With the work in bug 15332 to add multiple nets per NIC, it seemed fairly close
> > that we could use that to generate multiple LND connections from a single NIC.
> > Ideally we'd have a server or router that would have just one LND instance
> > (ptl0) and the client nodes with multiple interfaces (ptl1, ptl2, ...). This
> > would increase the load on those server nodes to something interesting.
> >
> > However, doing this requires either hacking up lnet_ptlcompat_matchXXX to look at another
> > flag besides the_lnet.ln_ptlcompat, or finding some other way of allowing a server with a
> > single NET (ptl0) to accept requests from a variety of nets (ptl1, ptl2, etc).
> > One cannot use multiple interfaces for the same net type with ln_ptlcompat enabled.
> >
> > Is there a better way to do this? What would be the least abusive of the rules?
> >
> > Cheers,
> > Nic
> > _______________________________________________
> > Lustre-devel mailing list
> > Lustre-devel at lists.lustre.org
> > http://lists.lustre.org/mailman/listinfo/lustre-devel
> >
* [Lustre-devel] faking LNET scale
From: Nicholas Henke @ 2009-06-02 17:12 UTC (permalink / raw)
To: lustre-devel
This is almost working - I'm running into one problem: lnet_accept wants to
match the ni->ni_nid against the requested NID. It is failing as the nets don't
match (ptl1 vs ptl0).
It looks like there are a fair number of places like this, most using
lnet_ptlcompat_match{net,nid}.
How should I handle those? Add another clause like ptlcompat (like ln_aliases)
and if that is set (we have aliases set), do a search to find the alias and see
if there is an alias that would allow NIDNET(lnet_net) == NIDNET(ptl_net)?
Is there a cleaner way?
Nic
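To make the failing comparison concrete, here is a small C sketch of an accept-time check that falls back to an alias search when the strict NID compare fails. It is an illustration only: the demo_* names are hypothetical, the alias list is a bare array of net numbers, and requiring the address part to match is an assumption of this sketch. The NID layout (net in the high 32 bits, address in the low 32) is modelled on the LNET_NIDNET/LNET_NIDADDR convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t demo_nid_t;  /* net in the high 32 bits, address in the low 32 */

static uint32_t demo_nidnet(demo_nid_t nid)
{
    return (uint32_t)(nid >> 32);
}

static uint32_t demo_nidaddr(demo_nid_t nid)
{
    return (uint32_t)(nid & 0xffffffffu);
}

static demo_nid_t demo_mknid(uint32_t net, uint32_t addr)
{
    return ((demo_nid_t)net << 32) | addr;
}

/* Return 1 if net appears in the alias array, 0 otherwise. */
static int demo_net_is_alias(const uint32_t *aliases, size_t n, uint32_t net)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (aliases[i] == net)
            return 1;
    }
    return 0;
}

/*
 * Accept-time check: the strict compare is tried first; if it fails,
 * the request is still accepted when the address matches and the
 * requested net is a configured alias of the local NI's net.
 */
static int demo_accept_ok(demo_nid_t ni_nid, demo_nid_t req_nid,
                          const uint32_t *aliases, size_t n_aliases)
{
    assert(aliases != NULL || n_aliases == 0);

    if (ni_nid == req_nid)
        return 1;
    if (demo_nidaddr(ni_nid) != demo_nidaddr(req_nid))
        return 0;       /* assumption: an alias never changes the address */
    return demo_net_is_alias(aliases, n_aliases, demo_nidnet(req_nid));
}
```

With no aliases configured the function reduces to the strict compare, so nodes that never use the feature would see no change in behaviour.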
* [Lustre-devel] faking LNET scale
From: Liang Zhen @ 2009-06-05 8:57 UTC (permalink / raw)
To: lustre-devel
Hi Nic,
For incoming requests, I think we can share the same network aliases
with outgoing messages (i.e. lnet_t::ln_local_nets in my previous
mail). Matching on the alias list could be embedded in
lnet_ptlcompat_match{net,nid} and lnet_net2ni_locked, so we don't need
to worry about changing code everywhere.
Regards
Liang
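One way to read this suggestion is that each net is resolved through the alias table to a canonical local net inside the shared match helper, so existing callers pick up alias support without being edited. The C sketch below illustrates that idea under stated assumptions: the ex_* names are hypothetical, the table is a flat array rather than lnet_t::ln_local_nets, and the two functions only stand in for the roles of lnet_ptlcompat_matchnet and lnet_net2ni_locked; they do not reproduce them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One alias entry: ea_net is treated as if it were ea_local_net. */
struct ex_alias {
    uint32_t ea_net;
    uint32_t ea_local_net;
};

/* Stand-in for lnet_ni_t; only the field this sketch needs. */
struct ex_ni {
    uint32_t ni_net;
};

/* Map a net to the local net it aliases, or to itself if not aliased. */
static uint32_t ex_canonical_net(const struct ex_alias *tbl, size_t n,
                                 uint32_t net)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (tbl[i].ea_net == net)
            return tbl[i].ea_local_net;
    }
    return net;
}

/* Central net comparison: equal once both sides are canonicalised. */
static int ex_matchnet(const struct ex_alias *tbl, size_t n,
                       uint32_t net1, uint32_t net2)
{
    return ex_canonical_net(tbl, n, net1) == ex_canonical_net(tbl, n, net2);
}

/* NI lookup built on the same helper, so it honours aliases as well. */
static const struct ex_ni *
ex_net2ni(const struct ex_ni *nis, size_t n_nis,
          const struct ex_alias *tbl, size_t n, uint32_t net)
{
    size_t i;

    assert(nis != NULL || n_nis == 0);

    for (i = 0; i < n_nis; i++) {
        if (ex_matchnet(tbl, n, nis[i].ni_net, net))
            return &nis[i];
    }
    return NULL;
}
```

Because both the send-side lookup and the receive-side compare go through ex_matchnet, a single alias table covers both directions, which is the sharing described above.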