From: Boaz Harrosh <bharrosh@panasas.com>
To: Avishay Traeger <avishay@gmail.com>,
Jeff Garzik <jeff@garzik.org>,
Andrew Morton <akpm@linux-foundation.org>,
Al Viro <viro@ZenIV.linux.org.uk>,
linux-fsdevel <linux-fsdevel@vger.ker
Cc: linux-kernel <linux-kernel@vger.kernel.org>
Subject: [PATCH 8/9] exofs: Documentation
Date: Tue, 16 Dec 2008 17:36:44 +0200 [thread overview]
Message-ID: <4947CB0C.3000406@panasas.com> (raw)
In-Reply-To: <4947BFAA.4030208@panasas.com>
Added some documentation in exofs.txt, as well as a BUGS file.
For further reading, operation instructions, example scripts
and up to date infomation and code please see:
http://open-osd.org
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
---
Documentation/filesystems/exofs.txt | 173 +++++++++++++++++++++++++++++++++++
fs/exofs/BUGS | 6 +
2 files changed, 179 insertions(+), 0 deletions(-)
create mode 100644 Documentation/filesystems/exofs.txt
create mode 100644 fs/exofs/BUGS
diff --git a/Documentation/filesystems/exofs.txt b/Documentation/filesystems/exofs.txt
new file mode 100644
index 0000000..ec5040c
--- /dev/null
+++ b/Documentation/filesystems/exofs.txt
@@ -0,0 +1,173 @@
+===============================================================================
+WHAT IS EXOFS?
+===============================================================================
+
+exofs is a file system that uses an OSD and exports the API of a normal Linux
+file system. Users access exofs like any other local file system, and exofs
+will in turn issue commands to the local initiator.
+
+===============================================================================
+ENVIRONMENT
+===============================================================================
+
+To use this file system, you need to have an object store to run it on. You
+may download a target from:
+http://open-osd.org
+
+See drivers/scsi/osd/README for how to setup a working osd environment.
+
+===============================================================================
+USAGE
+===============================================================================
+
+1. Download and compile exofs and open-osd initiator:
+ You need an external Kernel source tree or kernel headers from your
+ distribution. (anything based on 2.6.26 or later).
+
+ a. download open-osd including exofs source using:
+ [parent-directory]$ git clone git://git.open-osd.org/open-osd.git
+
+ b. Build the library module like this:
+ [parent-directory]$ make -C KSRC=$(KER_DIR) open-osd
+
+ This will build both the open-osd initiator as well as the exofs kernel
+ module. Use whatever parameters you compiled your Kernel with and
+ $(KER_DIR) above pointing to the Kernel you compile against. See the file
+ open-osd/top-level-Makefile for an example.
+
+2. Get the OSD initiator and target set up properly, and login to the target.
+ See drivers/scsi/osd/README for farther instructions. Also see ./do-osd-test
+ for example script that does all these steps.
+
+3. Insmod the exofs.ko module:
+ [exofs]$ insmod exofs.ko
+
+4. Make sure the directory where you want to mount exists. If not, create it.
+ (For example, /mnt/exofs)
+
+5. At first run you will need to invoke the mkexofs.c routine
+
+ As an example, this will create the file system on:
+ /dev/osd0 partition ID 65540, max capacity 1024 Mg bytes
+
+ mount -t exofs -o pid=65540,mkfs=1,format=1024 /dev/osd0 /mnt/exofs/
+
+ The format=1024 is optional if not specified no OSD_FORMAT will be preformed
+ and a clean file system will be created in the specified pid, in the
+ available space of the target.
+ If pid already exist it will be deleted and a new one will be created in it's
+ place. Be careful.
+
+6. Mount the file system. The above command left the filesystem mounted,
+ but on subsequent runs the mkfs=1 should not be invoked.
+
+ For example, to mount /dev/osd0, partition ID 65540 on /mnt/exofs:
+
+ mount -t exofs -o pid=65540 /dev/osd0 /mnt/exofs/
+
+7. For reference (under fs/exofs/):
+ do-exofs start - an example of how to perform the above steps.
+ do-exofs stop - an example of how to unmount the file system.
+
+8. Extra compilation flags (uncomment in fs/exofs/Kbuild):
+ EXOFS_DEBUG - for debug messages and extra checks.
+
+===============================================================================
+exofs mount options
+===============================================================================
+Similar to any mount command:
+ mount -t exofs -o exofs_options /dev/osdX mount_exofs_directory
+
+Where:
+ -t exofs: specifies the exofs file system
+
+ /dev/osdX: X is a decimal number. /dev/osdX was created after a successful
+ login into an OSD target.
+
+ mount_exofs_directory: The directory to mount the file system on
+
+ exofs_options: Options are separated by commas (,)
+ pid=<integer> - The partition number to mount/create as
+ container of the filesystem.
+ This option is mandatory
+ mkfs=<1/0> - If mkfs=1 make a new filesystem before mount.
+ Default is 0 - don't make. If mkfs=0 pid must
+ exist and an mkfs=1 was previously preformed
+ on it.
+ format=<integer>- If mkfs=1 is specified then the format=
+ parameter will also invoke an OSD_FORMAT
+ command prior to creation of the filesystem
+ partition (mkfs). The integer specified is in
+ Mega bytes. If not specified or set to 0 then
+ no format is executed, and a partition is
+ created in the available space.
+ If mkfs=0 this option is ignored.
+ to=<integer> - Timeout in ticks for a single command
+ default is (60 * HZ) [for debugging only]
+
+===============================================================================
+DESIGN
+===============================================================================
+
+* The file system control block (AKA on-disk superblock) resides in an object
+ with a special ID (defined in common.h).
+ Information included in the file system control block is used to fill the
+ in-memory superblock structure at mount time. This object is created before
+ the file system is used by mkexofs.c It contains information such as:
+ - The file system's magic number
+ - The next inode number to be allocated
+
+* Each file resides in its own object and contains the data (and it will be
+ possible to extend the file over multiple objects, though this has not been
+ implemented yet).
+
+* A directory is treated as a file, and essentially contains a list of <file
+ name, inode #> pairs for files that are found in that directory. The object
+ IDs correspond to the files' inode numbers and will be allocated according to
+ a bitmap (stored in a separate object). Now they are allocated using a
+ counter.
+
+* Each file's control block (AKA on-disk inode) is stored in its object's
+ attributes. This applies to both regular files and other types (directories,
+ device files, symlinks, etc.).
+
+* Credentials are generated per object (inode and superblock) when they is
+ created in memory (read off disk or created). The credential works for all
+ operations and is used as long as the object remains in memory.
+
+* Async OSD operations are used whenever possible, but the target may execute
+ them out of order. The operations that concern us are create, delete,
+ readpage, writepage, update_inode, and truncate. The following pairs of
+ operations should execute in the order written, and we need to prevent them
+ from executing in reverse order:
+ - The following are handled with the OBJ_CREATED and OBJ_2BCREATED
+ flags. OBJ_CREATED is set when we know the object exists on the OSD -
+ in create's callback function, and when we successfully do a read_inode.
+ OBJ_2BCREATED is set in the beginning of the create function, so we
+ know that we should wait.
+ - create/delete: delete should wait until the object is created
+ on the OSD.
+ - create/readpage: readpage should be able to return a page
+ full of zeroes in this case. If there was a write already
+ en-route (i.e. create, writepage, readpage) then the page
+ would be locked, and so it would really be the same as
+ create/writepage.
+ - create/writepage: if writepage is called for a sync write, it
+ should wait until the object is created on the OSD.
+ Otherwise, it should just return.
+ - create/truncate: truncate should wait until the object is
+ created on the OSD.
+ - create/update_inode: update_inode should wait until the
+ object is created on the OSD.
+ - Handled by VFS locks:
+ - readpage/delete: shouldn't happen because of page lock.
+ - writepage/delete: shouldn't happen because of page lock.
+ - readpage/writepage: shouldn't happen because of page lock.
+
+===============================================================================
+LICENSE/COPYRIGHT
+===============================================================================
+The exofs file system is based on ext2 v0.5b (distributed with the Linux kernel
+version 2.6.10). All files include the original copyrights, and the license
+is GPL version 2 (only version 2, as is true for the Linux kernel). The
+Linux kernel can be downloaded from www.kernel.org.
diff --git a/fs/exofs/BUGS b/fs/exofs/BUGS
new file mode 100644
index 0000000..6d6e1f9
--- /dev/null
+++ b/fs/exofs/BUGS
@@ -0,0 +1,6 @@
+- Some mount time options should have been 64-bit, but are declared as 32-bit
+ because that's what the kernel's parsing methods support at this time.
+
+- Out-of-space may cause a severe problem if the object (and directory entry)
+ were written, but the inode attributes failed. Then if the filesystem was
+ unmounted and mounted the kernel can get into an endless loop doing a readdir.
--
1.6.0.1
next prev parent reply other threads:[~2008-12-16 15:36 UTC|newest]
Thread overview: 73+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-12-16 14:48 [PATCHSET 0/9] exofs (was osdfs) Boaz Harrosh
2008-12-16 14:52 ` [PATCH 1/9] exofs: osd Swiss army knife Boaz Harrosh
2008-12-29 20:29 ` Andrew Morton
2008-12-31 15:33 ` Boaz Harrosh
2008-12-31 19:26 ` Andrew Morton
2009-01-01 14:44 ` Boaz Harrosh
2009-01-02 16:52 ` Pavel Machek
2009-01-04 8:43 ` Boaz Harrosh
2009-01-04 20:03 ` Pavel Machek
2009-01-05 9:01 ` Boaz Harrosh
2009-01-05 9:36 ` Pavel Machek
2008-12-16 15:15 ` Boaz Harrosh
2008-12-16 15:17 ` [PATCH 2/9] exofs: file and file_inode operations Boaz Harrosh
2008-12-29 20:34 ` Andrew Morton
2008-12-31 15:36 ` Boaz Harrosh
2008-12-16 15:21 ` [PATCH 3/9] exofs: symlink_inode and fast_symlink_inode operations Boaz Harrosh
2008-12-29 20:35 ` Andrew Morton
2008-12-16 15:22 ` [PATCH 4/9] exofs: address_space_operations Boaz Harrosh
2008-12-29 20:45 ` Andrew Morton
2008-12-31 15:35 ` Boaz Harrosh
2008-12-16 15:28 ` [PATCH 5/9] exofs: dir_inode and directory operations Boaz Harrosh
2008-12-29 20:47 ` Andrew Morton
2008-12-31 15:33 ` Boaz Harrosh
2008-12-16 15:31 ` [PATCH 6/9] exofs: super_operations and file_system_type Boaz Harrosh
2008-12-17 22:23 ` Marcin Slusarz
2008-12-18 8:41 ` Boaz Harrosh
2008-12-29 20:50 ` Andrew Morton
2008-12-16 15:33 ` [PATCH 7/9] exofs: mkexofs Boaz Harrosh
2008-12-29 20:14 ` Andrew Morton
2008-12-31 15:19 ` Boaz Harrosh
2008-12-31 15:57 ` James Bottomley
2009-01-01 9:22 ` [osd-dev] " Benny Halevy
2009-01-01 9:54 ` Jeff Garzik
2009-01-01 14:23 ` Benny Halevy
2009-01-01 14:28 ` Matthew Wilcox
2009-01-01 18:12 ` Jörn Engel
2009-01-01 23:26 ` J. Bruce Fields
2009-01-02 7:14 ` Benny Halevy
2009-01-04 15:20 ` Boaz Harrosh
2009-01-04 15:38 ` Christoph Hellwig
2009-01-12 18:12 ` James Bottomley
2009-01-12 19:23 ` Jeff Garzik
2009-01-12 19:56 ` James Bottomley
[not found] ` <1231790190.15161.29.camel@localhost.localdomain>
2009-01-12 20:22 ` Jeff Garzik
2009-01-12 23:25 ` James Bottomley
2009-01-13 13:03 ` [osd-dev] " Benny Halevy
2009-01-13 13:24 ` Jeff Garzik
2009-01-13 13:32 ` Benny Halevy
2009-01-13 13:44 ` Jeff Garzik
2009-01-13 14:03 ` Alan Cox
2009-01-13 14:17 ` Jeff Garzik
2009-01-13 16:14 ` Alan Cox
2009-01-13 17:21 ` Boaz Harrosh
2009-01-21 18:13 ` Jeff Garzik
2009-01-21 18:44 ` Boaz Harrosh
2009-01-12 22:48 ` Jamie Lokier
2009-01-06 8:40 ` Andreas Dilger
2008-12-31 19:25 ` Andrew Morton
2009-01-01 13:33 ` Boaz Harrosh
2009-01-02 22:46 ` James Bottomley
2009-01-04 8:59 ` Boaz Harrosh
2008-12-16 15:36 ` Boaz Harrosh [this message]
2008-12-18 7:47 ` [PATCH 8/9] exofs: Documentation Pavel Machek
2008-12-18 8:32 ` Boaz Harrosh
2008-12-16 15:38 ` [PATCH 9/9] fs: Add exofs to Kernel build Boaz Harrosh
[not found] ` <4947C624.3050602@panasas.com>
2009-01-07 15:47 ` [osd-dev] [PATCH 1/9] exofs: osd Swiss army knife Benny Halevy
2009-01-13 13:55 ` Alan Cox
2009-01-13 14:43 ` Boaz Harrosh
2009-01-13 14:52 ` Boaz Harrosh
2009-01-13 15:09 ` Jamie Lokier
2009-01-13 15:17 ` Jeff Garzik
2009-01-13 15:28 ` Benny Halevy
-- strict thread matches above, loose matches on Subject: below --
2009-04-01 14:05 [PATCHSET 0/9 ver7] exofs for 2.6.30 (really) Boaz Harrosh
2009-04-01 14:29 ` [PATCH 8/9] exofs: Documentation Boaz Harrosh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4947CB0C.3000406@panasas.com \
--to=bharrosh@panasas.com \
--cc=akpm@linux-foundation.org \
--cc=avishay@gmail.com \
--cc=jeff@garzik.org \
--cc=linux-fsdevel@vger.ker \
--cc=linux-kernel@vger.kernel.org \
--cc=viro@ZenIV.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).