From: "Siddhartha Jain" <sid@netmagicsolutions.com>
To: <linux-fsdevel@vger.kernel.org>
Subject: RE: Replicating directories - Intercepting write/modify system calls
Date: Fri, 13 Feb 2004 14:29:16 +0530
Message-ID: <AGEEIPMDCDHNNJHJCJEGAEDOCAAA.sid@netmagicsolutions.com>
In-Reply-To: <000001c3f208$edd70de0$0301a8c0@family>
Thanks, your explanation was very succinct and to the point.
>
> >> Too complicated. Too invasive.
>
> > In what way? Will it slow down the FS too much? What other approach
> > can be adopted to keep the implementation FS-independent and still
> > track the changes in real-time?
>
> The changes you proposed in your original email would require changes
> to too many functions, files, and other segments of code and design.
> Such changes would never be adopted by the Linux "powers that be." As
> such, you'd have to maintain the code yourself and manually patch each
> new kernel with your customized replication code. The goal, therefore,
> is to make changes that get adopted into the official Linux kernel to
> save yourself future maintenance work and allow others to both debug
> and benefit from your code.
>
> A rule of thumb -- the change that accomplishes the task in the least
> amount of code wins.
I totally agree.
>
> As for another approach, see my next point.
>
> >> Replicating the entire partition rather than just a specific directory
> >> is more feasible.
>
> > How would I do that? Will the implementation still be FS-independent?
>
> To be brief, I don't know that it can be done -- the code may have to be
> written first -- and no implementation will be FS independent.
>
> The VFS (Virtual File System) is merely a set of function pointers. A
> function pointer when called simply says, "Use this other function." In
> other words, a function pointer is a variable that stores which function
> to use, but a function pointer is not in and of itself a function. It
> cannot do any grunt work of its own. Hence, implementing your proposal at
> the VFS level would be impossible because all the VFS does is call the
> appropriate FS-specific function.
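To make the function-pointer description above concrete, here is a minimal
user-space sketch of a generic layer dispatching through a table of
function pointers. The names (struct fs_ops, ext_like_write,
nfs_like_write, generic_write) are made up for illustration and are not
the kernel's real structures or functions.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a per-filesystem operations table.
 * Illustrative only; not the kernel's actual definitions. */
struct fs_ops {
    const char *name;
    int (*write)(const char *path, const char *data);
};

/* Two pretend filesystem-specific implementations. */
int ext_like_write(const char *path, const char *data)
{
    printf("[ext-like] %s: %zu bytes\n", path, strlen(data));
    return (int)strlen(data);
}

int nfs_like_write(const char *path, const char *data)
{
    printf("[nfs-like] %s: %zu bytes\n", path, strlen(data));
    return (int)strlen(data);
}

/* Generic layer: it does no filesystem work of its own, it only calls
 * whichever function the table points at. */
int generic_write(const struct fs_ops *ops, const char *path, const char *data)
{
    return ops->write(path, data);
}
```

A caller would fill in a struct fs_ops with one of the two write functions
and hand it to generic_write(); which code actually runs is decided by the
pointer stored in the table, not by the generic function.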
OK, to give an example (with reference to
http://www.faqs.org/docs/kernel_2_4/lki-3.html):
the open() call really ends up in sys_open() in fs/open.c.
sys_open looks like:
asmlinkage long sys_open(const char __user * filename, int flags, int mode)
{
        char * tmp;
        int fd, error;

#if BITS_PER_LONG != 32
        flags |= O_LARGEFILE;
#endif
        tmp = getname(filename);
        fd = PTR_ERR(tmp);
        if (!IS_ERR(tmp)) {
                fd = get_unused_fd();
                if (fd >= 0) {
                        struct file *f = filp_open(tmp, flags, mode);
                        error = PTR_ERR(f);
                        if (IS_ERR(f))
                                goto out_error;
                        fd_install(fd, f);
                }
out:
                putname(tmp);
        }
        return fd;

out_error:
        put_unused_fd(fd);
        fd = error;
        goto out;
}
Change it to include:
1. Check whether the filename path falls under any master directory defined
in a config file.
2. If it does, strip the master path from the filename path and append the
remainder to the mirror path to get mirror_file.
3. Repeat every call within sys_open with mirror_file as the parameter.
The same goes for sys_write and other functions that modify a file (ioctl?).
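Steps 1 and 2 can be sketched in plain user-space C as below. This is only
an illustration of the path arithmetic, not kernel code; the function name
map_to_mirror and its parameters are my own invention, and reading the
master/mirror pairs from a config file is left out.

```c
#include <stdio.h>
#include <string.h>

/* Map `path`, which should lie under `master`, to the same relative
 * location under `mirror`. Hypothetical helper, user-space only.
 * Returns 0 on success, or -1 if `path` is not inside `master` or the
 * result does not fit in `out`. */
int map_to_mirror(const char *path, const char *master, const char *mirror,
                  char *out, size_t outlen)
{
    size_t mlen = strlen(master);

    /* Ignore trailing slashes so "/var/mail/" and "/var/mail" behave alike. */
    while (mlen > 0 && master[mlen - 1] == '/')
        mlen--;

    /* Step 1: does the path start with the master prefix? */
    if (strncmp(path, master, mlen) != 0)
        return -1;

    /* The match must end on a component boundary, so that "/var/mail"
     * does not match "/var/mailbox". */
    if (path[mlen] != '\0' && path[mlen] != '/')
        return -1;

    /* Step 2: mirror path + (filename path - master path). */
    int n = snprintf(out, outlen, "%s%s", mirror, path + mlen);
    if (n < 0 || (size_t)n >= outlen)
        return -1;

    return 0;
}
```

With master "/var/mail" and mirror "/mnt/mirror", the path
"/var/mail/user/inbox" maps to "/mnt/mirror/user/inbox", while
"/var/mailbox/x" is rejected because it only shares a string prefix with
the master path.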
Of course, what I should be doing is trying it out and seeing if it works
instead of increasing mail traffic. But I do not know the repercussions on
other parts of the kernel, and that is where I need your input.
>
> Your proposal would be easier to implement in C++ which uses virtual
> functions rather than function pointers, but converting the Linux kernel
> to C++ is a whole different topic. FYI: the core kernel developers have
> already outright refused to do that.
C++?? Now now, I am a humble being. I don't want to be another Linus. One's
enough for M$ and the like :)
>
> >> Does it need to be replicated in real-time, or would an
> >> incremental synchronization be acceptable (i.e. synced
> >> every 15 minutes or so)?
>
> > Synced every fifteen minutes is ok but anything that scans the
> > whole partition/directory file by file would be nightmarish given
> > that the mail toasters I am mirroring have tens of thousands of
> > files. Therefore, IMHO, the solution needs to track changes as
> > and when they happen to the file and mirror them immediately.
> >
> > For example, an rsync/mirrordir run takes up to half an hour to sync
> > about 100,000 files. Not to mention the 40% CPU load on the file server
> > while the rsync runs [trimmed].
>
> You can scan ~100,000 files to see if they have changed in less than 90
> seconds -- at least on a locally mounted partition. The part that takes
> forever is the actual syncing of data.
Yep, scanning might not take long. But the whole process and its overhead
are just not practical. I guess rsync/mirrordir weren't written with this
application in mind.
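For reference, the kind of scan being discussed (walk a tree and check
each file's modification time against the last sync) could look roughly
like the sketch below. It is a simplified illustration, not how rsync or
mirrordir actually work; the function name count_changed_since is made up,
and it only counts changed files without copying anything.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <time.h>

/* Recursively count regular files under `dir` whose mtime is later than
 * `since`. Hypothetical helper for illustration.
 * Returns -1 if `dir` itself cannot be opened. */
long count_changed_since(const char *dir, time_t since)
{
    DIR *d = opendir(dir);
    if (d == NULL)
        return -1;

    long changed = 0;
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
            continue;

        char path[4096];
        int n = snprintf(path, sizeof path, "%s/%s", dir, e->d_name);
        if (n < 0 || (size_t)n >= sizeof path)
            continue;   /* path too long for this sketch; skip it */

        struct stat st;
        if (lstat(path, &st) != 0)
            continue;   /* entry vanished between readdir and lstat */

        if (S_ISDIR(st.st_mode)) {
            /* Descend; unreadable subdirectories are simply skipped. */
            long sub = count_changed_since(path, since);
            if (sub > 0)
                changed += sub;
        } else if (S_ISREG(st.st_mode) && st.st_mtime > since) {
            changed++;
        }
    }
    closedir(d);
    return changed;
}
```

Even in this form every entry costs one lstat() call per pass, so the work
grows with the total number of files rather than with the number of files
that actually changed.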
>
> See my next point for a continued explanation.
>
> >> Does it need to be a two-way synchronization, or is a one-way
> >> synchronization acceptable? In other words, do changes on the
> >> replica need to be synced back to the original?
>
> > For my purpose, one-way is fine. But two-way would be cool and
> > in the greater good of humankind ;)
>
> Is this for load balancing, or a sophisticated way of doing a backup?
>
> If it's just for a backup, I'll tell you right now that your project isn't
> worth the effort.
>
> If it's for load balancing, you'll have to do a quasi-cost/benefit
> analysis on whether or not real-time replication really results in a
> balanced load or if the network bandwidth and CPU cycles consumed by
> real-time replication offset the load balancing benefits. After all,
> one of the pillars of load balancing is to reduce the amount of work in
> the here and now, and real-time replication -- as opposed to syncing
> every 15 or so minutes -- is counterproductive to that end.
See, I have a NetApp filer. All mail toasters mount the same volume from the
filer. Now, what if the filer dies, as in a hardware failure? I would have
to bring up another file server and restore the last tape backup to it. The
loss of mail between the last tape backup and the time of the disaster is
not acceptable. Another option is to buy another filer and run (buy)
NetApp's proprietary code to do the replication between the two filers.
Unfortunately, I have money for neither.
So what I can do is keep another Linux box as a filer and export a volume
over NFS to all the toasters. The toasters then replicate the directory
mounted from the NetApp filer to the directory mounted from the Linux filer.
Initially, one-way is good. Later, if I can get two-way to work, I can
load-balance between the NFS filers with intelligent DNS.
>
> >> Are you willing to commission someone (i.e. pay them money) to
> >> do this work? 8-D
>
> [paraphrased] Yep, I will give you a free mail account on hotmail....
>
> Uh... gee, thanks, but I was hoping for something along the lines of
> actual cash or employment. You see, I'm a computer programmer with my BA
> in Computer Science and even a 3.0 GPA, but no one's willing to hire me
> without 2-3 years of experience. (Catch 22. How do I get the experience
> if no one is willing to give it?) BTW, I could also be a system
> administrator; my degree covered that too. It's just that I haven't
> gotten far enough down my career path to be fixed on one road or the
> other.
I don't want to mock you. I know you live in the US or some developed
nation. But if you want a job, I can offer you one. It's in India and we pay
reasonably well by Indian standards. On what we pay, you can rent
a place, buy a car, spend nights at a disc/pub/PS2/X-box and still save a
bit. We are a small company that enjoys working with people who have new
ideas and we encourage whatever cool stuff people want to do (as long as it
makes some money for us). Basically, no corporate crap and policy and stuff.
Check out www.netmagicsolutions.com
>
> > Btw, I am willing to do all the dirty work of wrestling with C code
> > and testing but I need guidance.
>
> At the risk of sounding rude, without the motivation of actual cash or
> employment, the most I'm willing to do is direct you to some good albeit
> expensive books on the topic.
Thanks for the same. I don't expect anyone to do my work for me. Good luck
with your little project. May it find its place in the official Linux kernel
tree :)