* [Lustre-devel] Kernel crash from "mkfs.lustre --index" setting
From: Wendy Cheng @ 2013-10-11 17:11 UTC
To: lustre-devel

This panic seems to be generic regardless of the platform, though I'm
actually on Intel Xeon Phi Lustre (client) nodes.

New to Lustre, I mistakenly thought the "index" option of mkfs.lustre
was for software RAID, so I formatted one of the server disks as
follows:

server> mkfs.lustre --reformat --fsname=lus1 --mgs --mdt --index=1 /dev/sdd1
server> mkfs.lustre --reformat --ost --fsname=lus1 --mgsnode=192.168.20.46@o2ib0 --index=1 /dev/sde1

The client mount immediately crashed at lmv_get_info(). The attached
patch fixed that particular panic, but the client then crashed at an
assertion further down the path. I'll be travelling next week, so I
might give up pursuing this issue. The disks have since been
re-formatted with index=0; things seem to work fine and performance
numbers have been collected. Three questions here:

1. What is this "index" option all about?
2. Is the problem worth fixing? Or is it a user error?
3. The performance numbers (again, NOT Xeon Phi specific) surprise me.
   Would this list be a good place to ask questions?

-- Wendy

-------------- next part --------------
A non-text attachment was scrubbed...
Name: index.patch
Type: application/octet-stream
Size: 1884 bytes
Desc: not available
URL: <http://lists.lustre.org/pipermail/lustre-devel-lustre.org/attachments/20131011/978dfb1b/attachment.obj>

* [Lustre-devel] Kernel crash from "mkfs.lustre --index" setting
From: White, Cliff @ 2013-10-11 17:41 UTC
To: lustre-devel

On 10/11/13 10:11 AM, "Wendy Cheng" <s.wendy.cheng@gmail.com> wrote:

>This panic seems to be generic regardless of the platform, though I'm
>actually on Intel Xeon Phi Lustre (client) nodes.
>
>New to Lustre, I mistakenly thought the "index" option of mkfs.lustre
>was for software RAID, so I formatted one of the server disks as
>follows:
>
>server> mkfs.lustre --reformat --fsname=lus1 --mgs --mdt --index=1 /dev/sdd1
>server> mkfs.lustre --reformat --ost --fsname=lus1 --mgsnode=192.168.20.46@o2ib0 --index=1 /dev/sde1
>
>The client mount immediately crashed at lmv_get_info(). The attached
>patch fixed that particular panic, but the client then crashed at an
>assertion further down the path. I'll be travelling next week, so I
>might give up pursuing this issue. The disks have since been
>re-formatted with index=0; things seem to work fine and performance
>numbers have been collected. Three questions here:
>
>1. What is this "index" option all about?
>2. Is the problem worth fixing? Or is it a user error?
>3. The performance numbers (again, NOT Xeon Phi specific) surprise me.
>Would this list be a good place to ask questions?
>
>-- Wendy
>

1. --index is used to enumerate OSTs and MDTs, when using DNE.
The index MUST be unique, and indexes must not have gaps.
So, you should do this:

server> mkfs.lustre --reformat --fsname=lus1 --mgs --mdt --index=0 /dev/sdd1    # first MDT
server> mkfs.lustre --reformat --ost --fsname=lus1 --mgsnode=192.168.20.46@o2ib0 --index=0 /dev/sde1    # first OST

If you add a second OST partition:

server> mkfs.lustre --reformat --ost --fsname=lus1 --mgsnode=192.168.20.46@o2ib0 --index=1 /dev/sdfoo    # second OST

And a third:

server> mkfs.lustre --reformat --ost --fsname=lus1 --mgsnode=192.168.20.46@o2ib0 --index=2 /dev/sdbar    # third OST

2. You must fix this, or things won't work. I would suggest starting
again and doing a reformat, etc.

3. Surprise you how?

HPDD-discuss is likely a better list for these sorts of questions;
lustre-devel is for code development.

Cliffw

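The rule stated in the reply above (each index unique, no gaps) can be
checked against a planned layout before any disk is formatted. What
follows is only a minimal sketch in POSIX shell: check_indexes is a
hypothetical helper, not part of Lustre or mkfs.lustre, and it assumes
the indexes of one target type (all OSTs, or all MDTs) are passed as
non-negative integers and are expected to start at 0, as in the
examples in this thread.

```shell
#!/bin/sh
# check_indexes: hypothetical helper, NOT a Lustre tool.
# Verifies that the given indexes are unique and contiguous from 0,
# following the rule as described in this thread.
check_indexes() {
    if [ "$#" -eq 0 ]; then
        echo "no indexes given"
        return 1
    fi
    # Sort numerically so duplicates and gaps show up in order.
    sorted=$(printf '%s\n' "$@" | sort -n)
    expected=0
    for idx in $sorted; do
        if [ "$idx" -lt "$expected" ]; then
            echo "duplicate index: $idx"
            return 1
        elif [ "$idx" -gt "$expected" ]; then
            echo "gap: expected index $expected, found $idx"
            return 1
        fi
        expected=$((expected + 1))
    done
    echo "ok: $# indexes, 0..$((expected - 1))"
}

# Planned OST indexes from the reply above: prints "ok: 3 indexes, 0..2"
check_indexes 0 1 2

# The original single OST formatted with --index=1 is reported as a gap.
check_indexes 1 || echo "fix the plan before running mkfs.lustre"
```

The check only looks at the numbers it is given; it does not inspect
any device or talk to the MGS, so it is a planning aid rather than a
safeguard against the crash itself.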
* [Lustre-devel] Kernel crash from "mkfs.lustre --index" setting
From: Wendy Cheng @ 2013-10-11 17:59 UTC
To: lustre-devel

On Fri, Oct 11, 2013 at 10:41 AM, White, Cliff <cliff.white@intel.com> wrote:

> 1.
> --index is used to enumerate OSTs and MDTs, when using DNE.
> The index MUST be unique, and indexes must not have gaps.

I see ... indexes must not have gaps. However, a user error could crash
the kernel. Does that sound right?

>
> 3. Surprise you how?
>
> HPDD-discuss is likely a better list for these sorts of questions;
> lustre-devel is for code development.

Thanks. I'll move the discussion there sometime next week. It looks to
me like Lustre is doing sync to the disks all the time, whereas other
network filesystems (e.g. NFS) cache quite aggressively.

-- Wendy

* [Lustre-devel] Kernel crash from "mkfs.lustre --index" setting
From: White, Cliff @ 2013-10-11 18:30 UTC
To: lustre-devel

On 10/11/13 10:59 AM, "Wendy Cheng" <s.wendy.cheng@gmail.com> wrote:

>On Fri, Oct 11, 2013 at 10:41 AM, White, Cliff <cliff.white@intel.com>
>wrote:
>
>> 1.
>> --index is used to enumerate OSTs and MDTs, when using DNE.
>> The index MUST be unique, and indexes must not have gaps.
>
>I see ... indexes must not have gaps. However, a user error could crash
>the kernel. Does that sound right?

Well, creating the filesystem is normally done by admins, not users,
but yes, it shouldn't crash. Lustre-devel is the place for your patch;
sorry I wasn't clear. -discuss is more for the 'why are there indexes'
type of questions. :)

>
>>
>> 3. Surprise you how?
>>
>> HPDD-discuss is likely a better list for these sorts of questions;
>> lustre-devel is for code development.
>
>Thanks. I'll move the discussion there sometime next week. It looks to
>me like Lustre is doing sync to the disks all the time, whereas other
>network filesystems (e.g. NFS) cache quite aggressively.

Yes, Lustre by design does direct IO to disk, and does not cache data
on the servers. Some caching can be enabled, but in general no, you
should not see the servers caching. However, the clients should be
using the normal Linux block cache; if the clients are not caching,
there may be an issue with your setup.

Cliffw

>
>-- Wendy
>

* [Lustre-devel] Kernel crash from "mkfs.lustre --index" setting
From: Wendy Cheng @ 2013-10-12 0:47 UTC
To: lustre-devel

On Fri, Oct 11, 2013 at 11:30 AM, White, Cliff <cliff.white@intel.com> wrote:

> Yes, Lustre by design does direct IO to disk, and does not cache data on
> the servers.

I see ... direct IO. The data makes sense now :) Thanks!

-- Wendy