From: Boaz Harrosh <bharrosh@panasas.com>
To: Idan Kedar <idank@tonian.com>
Cc: Johannes Schild <JSchild@gmx.de>, <osd-dev@open-osd.org>,
<linux-nfs@vger.kernel.org>
Subject: Re: Questions about Exofs
Date: Tue, 15 May 2012 20:20:52 +0300
Message-ID: <4FB29074.6010606@panasas.com>
In-Reply-To: <CABpMAy+OWcp7VAOGK4nYWfcY4iZ9a1WQHPUbh5pxotYL9yy+KA@mail.gmail.com>
On 05/15/2012 07:21 PM, Idan Kedar wrote:
<snip>
>> 8 OSDs with a mirror ? what was the mkfs.exofs command line you used?
> Something along the lines of
> # LD_LIBRARY_PATH=lib ./usr/mkfs.exofs --pid=0x10000 --format
> --mirrors=1 --group_width=2 --group_depth=2 --dev=/dev/osd0
> --osdname=$(uuid) --dev=/dev/osd1 --osdname=$(uuid) --dev=/dev/osd2
> --osdname=$(uuid) ...
>
> I don't remember exactly at the moment, but I will bump this thread
> when I'll start using RAID again.
Certainly missing the --format thing. --osdname= without a --format
is ignored.
Again, if you never set an OSD_NAME on the devices in the past, this will
not work.
But perhaps you did use --format and forgot.
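For reference, a command line like the one quoted above can be assembled from a device list. This is only a minimal sketch that prints the command instead of running it. It reuses the flags exactly as they appear in the quoted command; the helper name, the dev:name pair syntax and the example OSD names are assumptions for illustration. Whether --format has to be repeated per device is not settled in this thread, so the sketch simply mirrors the quoted order.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) a mkfs.exofs command line.
# The flags are the ones quoted in this thread; the geometry values are
# copied from the quoted command and the dev:name syntax is an assumption.
build_mkfs_cmd() {
    pid=$1
    shift
    cmd="mkfs.exofs --pid=$pid --format --mirrors=1 --group_width=2 --group_depth=2"
    # Every remaining argument is a "device:osdname" pair.
    for pair in "$@"; do
        dev=${pair%%:*}     # text before the first ':'
        name=${pair#*:}     # text after the first ':'
        cmd="$cmd --dev=$dev --osdname=$name"
    done
    printf '%s\n' "$cmd"
}

# Example: print the command for two devices with made-up names.
build_mkfs_cmd 0x10000 /dev/osd0:name0 /dev/osd1:name1
```

Printing first makes it easy to check that every --dev is followed by its own --osdname before anything touches the devices.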
>>
>> And did you use one otgtd with 8 targets, or 8 targets (8 IP addresses)
>> with one target each, or a combination?
> one target with 8 LUNs
>>
>> What is the otgtd platform? what file system? what HW and HD environment?
> osc-osd over ext4, 64 bit VirtualBox VM over x86_64.
OK, that's a fishy setup.
The otgtd is sensitive to timeouts, which I never investigated properly.
It looks like the OSD_VM-to-host connection, probably a single link, is very
slow, and imagine 8 initiators actually banging on the same single slow link.
One of the commands probably times out, but that is just a guess.
The best VM setup that I have is:
VM1 - exofs+MDS
VM2 - pnfs client.
(On my dev machine I have VM2+VM1 combined, and mount on localhost)
HOST - Run otgtd natively on the host for best results.
It is best to spread the targets across as many physical HD devices as possible.
Note that multiple targets from the same OSD-host only make
"performance" sense if they each serve a different spindle,
unless it is a simulated test environment like yours.
Which reminds me that upstream tgtd has a few fixes in this area that I should
integrate.
>>
>> And yes otgtd has some instabilities.
>>
>> There are two I can think of:
>> * Over xfs the --format command crashes the otgtd (aborted exit no
>> crash dump) Debugging welcome.
>>
>> * When lots of pnfs clients do heavy writing to the same otgtd, it
>> times-out and disconnects.
> it was a single client performing git-clone of the kernel tree.
>> At Panasas we have a watch-dog that reloads it in a loop.
>> I have only seen this on FreeBSD, in Linux it never happened
>> to me.
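The watchdog mentioned above is not shown in this thread. A minimal sketch of the idea, assuming a plain restart loop: the function name, the WATCHDOG_DELAY variable and the restart limit are all hypothetical, and this is not the Panasas implementation. The command to supervise is passed in as arguments, so no otgtd options are assumed.

```shell
#!/bin/sh
# Hypothetical watchdog: reruns a command every time it exits.
# Usage: watchdog MAX_RESTARTS COMMAND [ARGS...]
# MAX_RESTARTS=0 means loop forever.
watchdog() {
    max=$1
    shift
    count=0
    while :; do
        "$@"                    # run the supervised command in the foreground
        status=$?
        count=$((count + 1))
        echo "watchdog: '$1' exited with status $status (run $count)" >&2
        if [ "$max" -gt 0 ] && [ "$count" -ge "$max" ]; then
            return "$status"    # give up and report the last exit status
        fi
        sleep "${WATCHDOG_DELAY:-1}"   # short pause before reloading
    done
}
```

With this sourced, something like `watchdog 0 otgtd` followed by your usual otgtd arguments would keep reloading the target. It only hides the disconnects; it does not explain the timeouts.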
>>
>> Please give me more details on what you did before it exited
>> like that.
> Nothing special, just git-clone. at some point it hanged (at a
> different place every time), and when investigated a bit I saw that
> otgtd is dead.
>>
>>
>> Anyway, I pushed a tree I tested with at:
>> git://git.open-osd.org/linux-open-osd.git
>>
>> checkout the *merge_and_compile-3.3* branch. But in principle they are the
>> same:
>> fs/exofs - Added autologin support
>> fs/nfs/objlayout - Added autologin support
>> fs/nfsd - Same
>> fs/nfs - Few fixes that are in benny's tree are not in linux-open-osd
> Thanks, I will try it soon.
>>
>> So it should all be the same. For a proper cluster setup you will probably
>> need my do-ect scripts which take a cluster descriptor file and does
>> generic loops on everything.
> Please note that I didn't try a cluster setup, just a single DS with 8
> LUNs, single MDS, and single pNFS client, all 3 different VMs on the
> same host.
Just semantics, so we speak the same language: yes, you do have a cluster.
In the pnfs-objects world there is no such thing as DSs; there is an MDS and
there are OSDs (objects). The OSDs are the equivalent of DSs in "files"
and LUNs in "blocks".
A non-cluster is when you have a single OSD (no striping, no multiple
devices; what I call the trivial layout).
>>
>> Thanks
>> Boaz
>
If you want, there is a new tree at:
git://git.open-osd.org/tests.git
There is one script that does everything: ./do-ect (exofs cluster test).
You edit the ect.conf file (or run ./do-ect -f alternate.conf).
Inside ect.conf you edit your topology and setup. It also points
to a device-table file (osds_list=XXX.olst); see the many *.olst example
files.
List of operations:
* Read the scripts and study what they do. They are just a convenience,
not a black-box application.
* Edit a new XXX.olst file
* Edit ect.conf with your setup.
[On MDS]
* ./do-ect login2 - login to all devices specified by osds_list=
* ./do-ect format2 - Set up an FS as specified by ect.conf
* ./do-ect mount2 - mount the exofs file system
* ./do-ect seturi - If you have the autologin version and want autologin support.
[On pnfs client]
* ./do-ect login2 - Only if autologin is not enabled. You will need the
/sbin/osd_login script, which is part of the newest nfs-utils
(Tell me if you can't find it)
* ./do-ect pnfs_start - mount the pnfs server on pnfs_dir as specified in ect.conf
And there are other facilities as well. In principle, the commands that end with "2"
are those that perform an action on the XXX.olst file and take the OSDs list as an
optional parameter. For example:
./do-ect login2
- log in to the default list specified in ect.conf
./do-ect -f clusterXYZ.conf format2 device_table7.olst
- Format according to the setup in the clusterXYZ.conf file, but override the OSDs list
and use device_table7.olst instead.
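The order of operations listed above can be sketched as a dry run. This only prints the do-ect commands for each machine and runs nothing. The do-ect subcommands are the ones listed in this mail; the function name, the "mds"/"client" role names and the yes/no autologin switch are hypothetical.

```shell
#!/bin/sh
# Dry-run sketch: prints the do-ect steps in order instead of running them.
# Usage: ect_steps mds|client [yes|no]   (second argument: autologin enabled?)
ect_steps() {
    role=$1
    autologin=${2:-no}
    case "$role" in
    mds)
        echo "./do-ect login2"      # log in to all devices in osds_list=
        echo "./do-ect format2"     # set up the FS described by ect.conf
        echo "./do-ect mount2"      # mount the exofs file system
        if [ "$autologin" = yes ]; then
            echo "./do-ect seturi"  # only for the autologin version
        fi
        ;;
    client)
        if [ "$autologin" != yes ]; then
            echo "./do-ect login2"  # manual login when autologin is off
        fi
        echo "./do-ect pnfs_start"  # mount the pnfs server on pnfs_dir
        ;;
    *)
        echo "usage: ect_steps mds|client [yes|no]" >&2
        return 1
        ;;
    esac
    return 0
}
```

For example, `ect_steps client yes` prints only the pnfs_start step, since the login is skipped when autologin is enabled.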
Again, please read the scripts before using them. The .conf and .olst files are yours; I
just have them in git as a history of the tests I conducted.
But if you have any changes to do-ect and fn-osd.sh, please send me a patch.
There are other interesting scripts in there, for example the target/ dir as a
way of controlling lots of OSD hosts in a single command using the closh script (CLuster Output SHell),
which also operates on the .olst and .clst files. Have fun.
Cheers
Boaz