From mboxrd@z Thu Jan 1 00:00:00 1970
From: david m. richter
Date: Fri, 5 Dec 2008 12:35:06 -0500
Subject: [Cluster-devel] gfs uevent and sysfs changes
In-Reply-To: <1228470705.3579.12.camel@localhost.localdomain>
References: <20081201173137.GA25171@redhat.com>
	<1d07ca700812041032o6f82fecew3fb93545fe64ed2d@mail.gmail.com>
	<20081204210754.GA19571@redhat.com>
	<1d07ca700812041359i2fe5443by7ac229485ec36f71@mail.gmail.com>
	<20081204223820.GB19571@redhat.com>
	<1228470705.3579.12.camel@localhost.localdomain>
Message-ID: <1d07ca700812050935q2d37c53lda8ad5ce4af6459a@mail.gmail.com>
List-Id:
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

On Fri, Dec 5, 2008 at 4:51 AM, Steven Whitehouse wrote:
> Hi,
>
> On Thu, 2008-12-04 at 16:38 -0600, David Teigland wrote:
>> On Thu, Dec 04, 2008 at 04:59:23PM -0500, david m. richter wrote:
>> > ah, so just to make sure i'm with you here: (1) gfs_controld is
>> > generating this "id"-which-is-the-mountgroup-id, and (2) gfs_kernel
>> > will no longer receive this in the hostdata string, so (3) i can just
>> > rip out my in-kernel hostdata-parsing gunk and instead send in the
>> > mountgroup id on my own (i have my own up/downcall channel)?  if i've
>> > got it right, then everything's a cinch and i'll shut up :)
>>
>> Yep.  Generally, the best way to uniquely identify and refer to a gfs
>> filesystem is using the fsname string (specified during mkfs with -t
>> and saved in the superblock).  But sometimes it's just a lot easier to
>> have a numerical identifier instead.  I expect this is why you're
>> using the id, and it's why we were using it for communicating about
>> plocks.
>>
>> In cluster1 and cluster2 the cluster infrastructure dynamically
>> selected a unique id when needed, and it never worked great.  In
>> cluster3 the id is just a crc of the fsname string.
>>
>> Now that I think about this a bit more, there may be a reason to keep
>> the id in the string.  There was some interest on linux-kernel about
>> better using the statfs fsid field, and this id is what gfs should be
>> putting there.
>>
> In that case gfs2 should be able to generate the id itself from the
> fsname, and it still doesn't need it passed in, even if it continues to
> expose the id in sysfs.
>
> Perhaps better still, it should be possible for David to generate the
> id directly from the fsname if he really needs it.
>
> Since we also have a UUID now, for recently created filesystems, it
> might be worth exposing that via sysfs and/or uevents too.
>
>> > say, one tangential question (i won't be offended if you skip it -
>> > heh): is there a particular reason that you folks went with the
>> > uevent mechanism for doing upcalls?  i'm just curious, given the
>> > seeming complexity and possible overhead of using the whole layered
>> > netlink apparatus vs. something like Trond Myklebust's rpc_pipefs
>> > (don't let the "rpc" fool you; it's a barebones, dead-simple pipe).
>> > -- and no, i'm not selling anything :)  my boss was asking for a
>> > list of differences between rpc_pipefs and uevents, and the best i
>> > could come up with is that the former is bidirectional.  Trond
>> > mentioned the netlink overhead and i wondered if that was actually
>> > a significant factor or just lost in the noise in most cases.
>>
>> The uevents looked pretty simple when I was initially designing how
>> the kernel/user interactions would work, and they fit well with the
>> sysfs files which I was using too.  I don't think the overhead of
>> using uevents is too bad.  Sysfs files and uevents definitely don't
>> work great if you need any kind of sophisticated bi-directional
>> interface.
>>
>> Dave
>>
> I think uevents are a reasonable choice, as they are easy enough to
> parse that it could be done by scripts, etc., and easy to extend as
> well.  We do intend to use netlink in the future (bz #337691) for
> quota messages, but in that case we would be using an existing
> standard for sending those messages.
>
> Netlink can be extended fairly easily, but you do have to be careful
> when designing the message format.  I've not come across rpc_pipefs
> before, so I can't comment on that yet.  I don't think we need to
> worry about the overhead of sending the messages (if you have so much
> recovery message traffic that it's a problem, you probably have bigger
> things to worry about!), and I don't see that netlink should have any
> more overhead than any other method of sending messages.

thanks!  again, i appreciate learning from other people's experiences.

fyi, the rpc_pipefs stuff is only currently used in two places, i
believe -- by rpc.idmapd and rpc.gssd; just another of a surprisingly
wide variety of ways to do kernel<->userland communication.

thanks again,

d.

>
> Steve.
>
>
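p.s. since Dave mentioned that in cluster3 the id is just a crc of the
fsname string, below is the sort of thing i had in mind for generating
it myself in userspace.  it's a completely untested sketch -- i haven't
checked which crc32 variant (polynomial/seed) gfs_controld actually
uses, or whether it hashes the whole "cluster:fsname" table name or
just the fs part, so treat all of that as guesses rather than a
description of the real code:

/* id_from_fsname.c -- hypothetical sketch, not the real gfs_controld
 * code.  Derives a numeric filesystem id by taking a CRC32 of the
 * fsname string, on the assumption that this matches what cluster3
 * does.  Build with: gcc id_from_fsname.c -lz
 */
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <zlib.h>		/* crc32() */

static uint32_t id_from_fsname(const char *fsname)
{
	/* zlib's crc32: start from crc32(0, Z_NULL, 0), then feed the
	 * bytes.  Whether gfs_controld uses this exact variant is an
	 * assumption; the real code may differ in polynomial or seed. */
	uLong crc = crc32(0L, Z_NULL, 0);

	crc = crc32(crc, (const Bytef *)fsname, strlen(fsname));
	return (uint32_t)crc;
}

int main(int argc, char **argv)
{
	const char *fsname = argc > 1 ? argv[1] : "mycluster:myfs";

	printf("%s -> 0x%08x\n", fsname, id_from_fsname(fsname));
	return 0;
}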
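also, for anyone else weighing uevents against rpc_pipefs: this is
roughly what the userland end of the uevent path looks like -- a
minimal, generic listener on a NETLINK_KOBJECT_UEVENT socket (the same
stream udev reads).  it isn't gfs-specific and i'm not claiming it
matches what gfs_controld does internally; it just shows that the
messages are plain NUL-separated KEY=VALUE strings and pretty trivial
to parse:

/* uevent_listen.c -- minimal sketch of reading kernel uevents from
 * userspace.  Prints whatever arrives; any gfs-specific variable names
 * are whatever the kernel side chooses to put in the event.
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

int main(void)
{
	struct sockaddr_nl nls;
	char buf[4096];
	ssize_t len;
	int fd;

	memset(&nls, 0, sizeof(nls));
	nls.nl_family = AF_NETLINK;
	nls.nl_pid = getpid();
	nls.nl_groups = 1;	/* kernel uevent multicast group */

	fd = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_KOBJECT_UEVENT);
	if (fd < 0 || bind(fd, (struct sockaddr *)&nls, sizeof(nls)) < 0) {
		perror("uevent socket");
		return 1;
	}

	while ((len = recv(fd, buf, sizeof(buf) - 1, 0)) > 0) {
		/* each message is "ACTION@devpath" followed by
		 * NUL-separated KEY=VALUE strings */
		ssize_t i = 0;

		buf[len] = '\0';
		while (i < len) {
			printf("%s\n", buf + i);
			i += strlen(buf + i) + 1;
		}
		printf("--\n");
	}
	return 0;
}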