From: Jeremy Filizetti <jeremy.filizetti@gmail.com>
To: lustre-devel@lists.lustre.org
Subject: [Lustre-devel] lustre 1.8+ issues with automounter
Date: Fri, 04 Mar 2011 01:12:47 -0500
Message-ID: <4D7082DF.2040705@gmail.com>
In-Reply-To: <FD878C30-2103-400A-A02D-C3745DBEBF1F@xyratex.com>
An example is below, with some comments added and part of the log
removed. I don't actually have this many OSTs; I just created a lot
of them to reproduce the problem easily in a VM. autofs is set up to
mount Lustre. It attempted to mount the file system when I typed
"ls -l /lustre/xen1/tmp/testfile", where testfile is allocated on the
192nd OST IIRC.
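As a quick check of the "192nd OST" figure: the target names in the log below appear to carry a four-digit hexadecimal index (OST0099 is followed by OST009a), and the last one marked active is xen1-OST00bf. A minimal sketch of that conversion, assuming the naming pattern seen in this log; ost_index_from_name is a hypothetical helper, not a Lustre function:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extract the zero-based OST index from a target
 * name such as "xen1-OST00bf_UUID". The four characters after "-OST"
 * are read as hexadecimal (an assumption based on the names in the log).
 * Returns -1 if the name does not match the pattern. */
int ost_index_from_name(const char *name)
{
    const char *p;
    unsigned int idx;

    assert(name != NULL);
    p = strstr(name, "-OST");
    if (p == NULL)
        return -1;
    if (sscanf(p + 4, "%4x", &idx) != 1)
        return -1;
    return (int)idx;
}
```

With this reading, xen1-OST00bf_UUID has index 0xbf = 191, and counting from zero that makes it the 192nd OST.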
The mount below was kicked off by the automounter in response to the above command.
00000020:01200004:2:1298954011.295906:0:8398:0:(obd_mount.c:2001:lustre_fill_super())
VFS Op: sb ffff8801e7e22c00
00000020:01000004:2:1298954011.295920:0:8398:0:(obd_mount.c:2015:lustre_fill_super())
Mounting client xen1-client
00000080:00200000:2:1298954011.301889:0:8398:0:(llite_lib.c:1017:ll_fill_super())
VFS Op: sb ffff8801e7e22c00
00000080:01000000:2:1298954011.431273:0:8398:0:(llite_lib.c:1115:ll_fill_super())
Found profile xen1-client: mdc=xen1-MDT0000-mdc osc=xen1-clilov
00000080:00000010:2:1298954011.431274:0:8398:0:(llite_lib.c:1118:ll_fill_super())
kmalloced 'osc': 29 at ffff8801e7efd9a0.
00000080:00000010:2:1298954011.431276:0:8398:0:(llite_lib.c:1124:ll_fill_super())
kmalloced 'mdc': 34 at ffff8801dcb56ec0.
00000080:00000010:2:1298954011.431277:0:8398:0:(llite_lib.c:267:client_common_fill_super())
kmalloced 'data': 72 at ffff8801e9deedc0.
00000080:00100000:2:1298954011.432116:0:8398:0:(llite_lib.c:409:client_common_fill_super())
ocd_connect_flags: 0xe1440478 ocd_version: 17302784 ocd_grant: 0
00020000:01000000:1:1298954011.432928:0:11545:0:(lov_obd.c:570:lov_set_osc_active())
Marking OSC xen1-OST0000_UUID active
00020000:01000000:1:1298954011.432977:0:11545:0:(lov_obd.c:570:lov_set_osc_active())
Marking OSC xen1-OST0002_UUID active
00020000:01000000:1:1298954011.433025:0:11545:0:(lov_obd.c:570:lov_set_osc_active())
Marking OSC xen1-OST0004_UUID active
.
.
.
00020000:01000000:2:1298954011.455806:0:11545:0:(lov_obd.c:570:lov_set_osc_active())
Marking OSC xen1-OST0094_UUID active
00020000:01000000:2:1298954011.455924:0:11545:0:(lov_obd.c:570:lov_set_osc_active())
Marking OSC xen1-OST0095_UUID active
00020000:01000000:2:1298954011.456042:0:11545:0:(lov_obd.c:570:lov_set_osc_active())
Marking OSC xen1-OST0096_UUID active
00020000:01000000:2:1298954011.456161:0:11545:0:(lov_obd.c:570:lov_set_osc_active())
Marking OSC xen1-OST0097_UUID active
00020000:01000000:2:1298954011.457417:0:11545:0:(lov_obd.c:570:lov_set_osc_active())
Marking OSC xen1-OST0098_UUID active
00000080:00000004:1:1298954011.457543:0:8398:0:(llite_lib.c:467:client_common_fill_super())
rootfid 16:[0x10:0xababf859:0x4000]
00020000:01000000:2:1298954011.457573:0:11545:0:(lov_obd.c:570:lov_set_osc_active())
Marking OSC xen1-OST0099_UUID active
00020000:01000000:2:1298954011.457705:0:11545:0:(lov_obd.c:570:lov_set_osc_active())
Marking OSC xen1-OST009a_UUID active
00000080:00000010:1:1298954011.457830:0:8398:0:(super25.c:57:ll_alloc_inode())
slab-alloced '(lli)': 928 at ffff8801e0de4bc0.
00020000:01000000:2:1298954011.457855:0:11545:0:(lov_obd.c:570:lov_set_osc_active())
Marking OSC xen1-OST009b_UUID active
00000080:00000010:1:1298954011.457938:0:8398:0:(llite_lib.c:528:client_common_fill_super())
kfreed 'data': 72 at ffff8801e9deedc0.
00000080:00000010:1:1298954011.457977:0:8398:0:(llite_lib.c:1151:ll_fill_super())
kfreed 'mdc': 34 at ffff8801dcb56ec0.
00000080:00000010:1:1298954011.457979:0:8398:0:(llite_lib.c:1153:ll_fill_super())
kfreed 'osc': 29 at ffff8801e7efd9a0.
00000080:02000400:1:1298954011.457979:0:8398:0:(llite_lib.c:1157:ll_fill_super())
Client xen1-client has started
00000020:00000004:1:1298954011.457980:0:8398:0:(obd_mount.c:2053:lustre_fill_super())
Mount 192.168.66.2 at tcp8:/xen1 complete
We have just returned from filling the super block, so the file system is
now accessible, but as the lov_set_osc_active messages show, not all OSCs
have been set active yet.
00020000:01000000:2:1298954011.457981:0:11545:0:(lov_obd.c:570:lov_set_osc_active())
Marking OSC xen1-OST009c_UUID active
00020000:01000000:2:1298954011.458108:0:11545:0:(lov_obd.c:570:lov_set_osc_active())
Marking OSC xen1-OST009d_UUID active
.
.
.
00020000:01000000:2:1298954011.460053:0:11545:0:(lov_obd.c:570:lov_set_osc_active())
Marking OSC xen1-OST00ac_UUID active
00020000:01000000:2:1298954011.460187:0:11545:0:(lov_obd.c:570:lov_set_osc_active())
Marking OSC xen1-OST00ad_UUID active
00000080:00000010:1:1298954011.461272:0:8395:0:(super25.c:57:ll_alloc_inode())
slab-alloced '(lli)': 928 at ffff8801e0de4800.
00020000:01000000:2:1298954011.461487:0:11545:0:(lov_obd.c:570:lov_set_osc_active())
Marking OSC xen1-OST00ae_UUID active
00000080:00000010:1:1298954011.461589:0:8395:0:(super25.c:57:ll_alloc_inode())
slab-alloced '(lli)': 928 at ffff8801e0de4440.
00000080:00010000:1:1298954011.461624:0:8395:0:(file.c:965:ll_glimpse_size())
Glimpsing inode 218
00000080:00020000:1:1298954011.461636:0:8395:0:(file.c:995:ll_glimpse_size())
obd_enqueue returned rc -5, returning -EIO
The glimpse above is for the inode allocated on xen1-OST00bf, which is
not yet active, so the set is empty and the glimpse returns -EIO.
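The failure can be modelled in a few lines of userspace C. This is only a sketch of the ordering problem, not Lustre code: struct model_lov, model_activate_upto and model_glimpse are hypothetical names, and the model assumes targets become active strictly in index order, as they do in this log.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model, not Lustre code: one flag per OST,
 * set to true as each "Marking OSC ... active" event arrives. */
struct model_lov {
    size_t ntargets;
    bool   active[256];
};

/* Mark targets 0..upto-1 active, mimicking how far the connection
 * thread has progressed at a given moment. */
void model_activate_upto(struct model_lov *lov, size_t upto)
{
    assert(upto <= lov->ntargets);
    for (size_t i = 0; i < upto; i++)
        lov->active[i] = true;
}

/* Model of a size glimpse on an object that lives on a single OST:
 * return 0 if that target is active, otherwise -EIO. */
int model_glimpse(const struct model_lov *lov, size_t ost_idx)
{
    if (ost_idx >= lov->ntargets || !lov->active[ost_idx])
        return -EIO;
    return 0;
}
```

At the time of the failed glimpse the log shows OST00ae as the most recently activated target, so in the model model_activate_upto(&lov, 0xaf) followed by model_glimpse(&lov, 0xbf) returns -EIO, while the same glimpse succeeds once all 192 targets are active.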
00020000:01000000:2:1298954011.461644:0:11545:0:(lov_obd.c:570:lov_set_osc_active())
Marking OSC xen1-OST00af_UUID active
00020000:01000000:2:1298954011.461782:0:11545:0:(lov_obd.c:570:lov_set_osc_active())
Marking OSC xen1-OST00b0_UUID active
.
.
.
00020000:01000000:2:1298954011.463766:0:11545:0:(lov_obd.c:570:lov_set_osc_active())
Marking OSC xen1-OST00be_UUID active
00020000:01000000:2:1298954011.463911:0:11545:0:(lov_obd.c:570:lov_set_osc_active())
Marking OSC xen1-OST00bf_UUID active
Finally the last OSC is set active. This is the point where
client_common_fill_super, ll_fill_super, and lustre_fill_super should
return from the mount syscall, because the whole file system is now accessible.
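To put a number on the window, the timestamps can be pulled straight out of the log lines. The sketch below assumes only what is visible in the log above: the fourth colon-separated field is "seconds.microseconds". log_timestamp_usec is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parse the timestamp (fourth colon-separated
 * field, "seconds.microseconds") from a debug log line and return it
 * in microseconds, or -1 if the line does not match. */
long long log_timestamp_usec(const char *line)
{
    long long sec, usec;

    assert(line != NULL);
    /* Skip the first three fields, then read "sec.usec". */
    if (sscanf(line, "%*[^:]:%*[^:]:%*[^:]:%lld.%6lld", &sec, &usec) != 2)
        return -1;
    return sec * 1000000LL + usec;
}
```

Applied to the lines above, the "Mount ... complete" message is at .457980, the failed glimpse at .461636, and the last "Marking OSC ... active" at .463911. So the glimpse failed 3656 microseconds after the mount returned, and the last OSC only became active 5931 microseconds after the mount returned.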
I will take a look at your suggestion below tomorrow to see if it will
handle this situation.
Thanks,
Jeremy
> Your patch is wrong in the case where some OSC targets are inaccessible (in maintenance, or because of network trouble).
> In that case lov_connect will wait forever, which is not the expected behavior.
> Can you provide more details about the situation that confuses automount?
> Or try to move
>>>
> err = obd_statfs(obd, &osfs, cfs_time_current_64() - HZ, 0);
> if (err)
> GOTO(out_mdc, err);
>>>
> from its current location to somewhere after the root fid is fetched.
>
> If the FS is mounted without the lazystatfs option, obd_statfs will block until all connection requests have finished,
> so you will get the same behavior but without changes to the obd_connect() code.
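
The objection quoted above (waiting forever when a target is down) can be illustrated with a small userspace model. Everything here is hypothetical and is not Lustre code: model_wait_all_connected, model_poll and struct model_progress are made-up names, and the poll limit is an assumption added only to show the difference between a wait that can finish and one that cannot.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback: reports how many targets are connected now. */
typedef size_t (*connected_fn)(void *ctx);

/* Poll until all ntargets are connected or max_polls is used up.
 * Returns 0 when every target connected, -ETIMEDOUT otherwise. */
int model_wait_all_connected(connected_fn poll, void *ctx,
                             size_t ntargets, unsigned int max_polls)
{
    assert(poll != NULL);
    for (unsigned int i = 0; i < max_polls; i++) {
        if (poll(ctx) >= ntargets)
            return 0;
    }
    return -ETIMEDOUT;
}

/* Demo progress source: on each poll, `step` more targets finish
 * connecting, capped at `reachable` (a target that is down never
 * connects, so the count can stall below the total). */
struct model_progress {
    size_t connected;
    size_t step;
    size_t reachable;
};

size_t model_poll(void *ctx)
{
    struct model_progress *p = ctx;

    p->connected += p->step;
    if (p->connected > p->reachable)
        p->connected = p->reachable;
    return p->connected;
}
```

With all 192 targets reachable the wait completes after a few polls. With only 191 reachable the count stalls, and the loop ends only because of the poll limit; a wait without any limit would never return, which is the behavior the quoted reply warns about.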