* [Lustre-devel] Testing LNET
From: Scott Atchley @ 2009-07-16 21:10 UTC
  To: lustre-devel

Hi all,

I have implemented handling of hosts with different PAGE_SIZE in  
MXLND. I am running tests to make sure that I did not accidentally  
break something else. So far, I have been using lctl and pinging back  
and forth as well as with obdecho (using loadgen).

When running loadgen tests or lctl test_brw with loadgen's echosrv  
running, if I kill a host (either client or server) and bring it back  
up, LNET seems happy (MXLND reconnects normally), but loadgen does not  
resume. When using lctl test_brw and I restart a server, the client  
reconnects but then fails an assertion when it connects to a new  
server with:

LustreError: 6295:0:(echo_client.c:1341:echo_client_cleanup())  
ASSERTION(eco->eco_refcount == 0) failed

Is this the proper way to test a LND? What other methods can you  
suggest?

Thanks,

Scott


* [Lustre-devel] Testing LNET
From: Nicholas Henke @ 2009-07-16 21:20 UTC
  To: lustre-devel

Scott Atchley wrote:
> Hi all,
>
> I have implemented handling of hosts with different PAGE_SIZE in  
> MXLND. I am running tests to make sure that I did not accidentally  
> break something else. So far, I have been using lctl and pinging back  
> and forth as well as with obdecho (using loadgen).
>
> When running loadgen tests or lctl test_brw with loadgen's echosrv  
> running, if I kill a host (either client or server) and bring it back  
> up, LNET seems happy (MXLND reconnects normally), but loadgen does not  
> resume. When using lctl test_brw and I restart a server, the client  
> reconnects but then fails an assertion when it connects to a new  
> server with:
>
> LustreError: 6295:0:(echo_client.c:1341:echo_client_cleanup())  
> ASSERTION(eco->eco_refcount == 0) failed
>
> Is this the proper way to test a LND? What other methods can you  
> suggest?
>   

I've been using LNet SelfTest for my LND development and testing.  It is 
ok once you get things working - but it is painful to script around and 
to trace LST errors back to LND transactions.

Nic


* [Lustre-devel] Testing LNET
From: Nicholas Henke @ 2009-07-16 21:27 UTC
  To: lustre-devel

Nicholas Henke wrote:
> Scott Atchley wrote:
>>
>
> I've been using LNet SelfTest for my LND development and testing.  It 
> is ok once you get things working - but it is painful to script around 
> and to trace LST errors back to LND transactions.
>
> Nic
>

FWIW: Here are scripts I've been using for LST runs. 'sim_tests.sh' runs 
tests in serial, parallel is...

The 'sim_config' is an easy way to manage different machine configs - 
just set the CONFIG environment variable to the particular file you'd 
like to use.
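
A minimal sketch of how a script might pick up such a config file. The real
contents of 'sim_config' are not shown in this thread, so the function name
load_config and the fallback to ./sim_config are assumptions for illustration,
not what the attached scripts necessarily do:

```shell
#!/bin/sh
# Hypothetical helper: source the file named by $CONFIG, or ./sim_config
# if CONFIG is unset. The default path is an assumption.
load_config() {
    cfg="${CONFIG:-./sim_config}"
    # A bare file name would be looked up in PATH by '.', so anchor it
    # to the current directory.
    case "$cfg" in
        */*) ;;
        *)   cfg="./$cfg" ;;
    esac
    if [ ! -r "$cfg" ]; then
        echo "cannot read config file: $cfg" >&2
        return 1
    fi
    # Source the file so its variable assignments land in this shell.
    . "$cfg"
}
```

Usage would then be something like 'CONFIG=/path/to/cluster_a ./sim_tests.sh
read 1m 100', where the path is made up.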

The CLI format is "test size loops" where:
    - test is one of 'read', 'write' or 'ping'
    - size is anything <= 1m, e.g. '243k', '2342', '1m'
    - loops is the number of requests to send.
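
For illustration only, the argument rules above could be checked with
something like the following. This is not taken from the attached scripts;
the function names are made up, and treating 'k' as 1024 and 'm' as 1048576
bytes is an assumption, since the thread does not say which multiplier is
used:

```shell
#!/bin/sh
# Hypothetical: convert a size such as '2342', '243k' or '1m' to bytes.
# Fails for non-numeric input or anything above 1m (assumed 1048576 bytes).
size_to_bytes() {
    case "$1" in
        *[kK]) num="${1%?}"; mult=1024 ;;
        *[mM]) num="${1%?}"; mult=1048576 ;;
        *)     num="$1";     mult=1 ;;
    esac
    # Reject empty or non-numeric values.
    case "$num" in
        ''|*[!0-9]*) return 1 ;;
    esac
    bytes=$((num * mult))
    [ "$bytes" -le 1048576 ] || return 1
    echo "$bytes"
}

# Hypothetical: validate a "test size loops" argument triple.
check_args() {
    [ "$#" -eq 3 ] || return 1
    case "$1" in
        read|write|ping) ;;
        *) return 1 ;;
    esac
    size_to_bytes "$2" >/dev/null || return 1
    # loops must be a plain non-negative integer.
    case "$3" in
        ''|*[!0-9]*) return 1 ;;
    esac
}
```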

There is a 'NCONN' environment variable that controls the --concurrency 
for LST tests, basically how many RPCs to keep on the wire.
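
As a rough sketch of how NCONN could feed into an 'lst add_test' call: the
function below only prints the command line, it does not run lst. The batch
and group names ('sim', 'clients', 'servers'), the default of 1 when NCONN is
unset, and passing the size straight through as 'size=' are all assumptions;
the size syntax lst accepts may differ between Lustre versions.

```shell
#!/bin/sh
# Hypothetical: print (not run) an lst add_test command for one test.
# Arguments: test ('read', 'write' or 'ping'), size, loops.
lst_add_test_cmd() {
    test_name="$1"; size="$2"; loops="$3"
    # NCONN sets --concurrency; defaulting to 1 is an assumption.
    nconn="${NCONN:-1}"
    case "$test_name" in
        ping)       what="ping" ;;
        read|write) what="brw $test_name size=$size" ;;
        *)          return 1 ;;
    esac
    echo "lst add_test --batch sim --loop $loops --concurrency $nconn --from clients --to servers $what"
}
```

A real run would still need an LST session, the groups and the batch to be
set up first (lst new_session, lst add_group, lst add_batch) before the
batch is started with lst run.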

YMMV,
Nic
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sim_tests.sh
Type: application/x-sh
Size: 1143 bytes
Desc: not available
URL: <http://lists.lustre.org/pipermail/lustre-devel-lustre.org/attachments/20090716/133019f2/attachment.sh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: sim_config
URL: <http://lists.lustre.org/pipermail/lustre-devel-lustre.org/attachments/20090716/133019f2/attachment.asc>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: parallel_test.sh
Type: application/x-sh
Size: 1207 bytes
Desc: not available
URL: <http://lists.lustre.org/pipermail/lustre-devel-lustre.org/attachments/20090716/133019f2/attachment-0001.sh>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lnet_selftest_framework.sh
Type: application/x-sh
Size: 6366 bytes
Desc: not available
URL: <http://lists.lustre.org/pipermail/lustre-devel-lustre.org/attachments/20090716/133019f2/attachment-0002.sh>
