From: Christoph Hellwig <hch@infradead.org>
To: Joe Thornber <joe@fib011235813.fsnet.co.uk>
Cc: Linus Torvalds <torvalds@transmeta.com>,
Kernel Mailing List <linux-kernel@vger.kernel.org>,
Alan Cox <alan@redhat.com>, Jens Axboe <axboe@suse.de>
Subject: Re: Linux v2.5.42
Date: Mon, 14 Oct 2002 20:21:58 +0100 [thread overview]
Message-ID: <20021014202158.A27076@infradead.org> (raw)
In-Reply-To: <20021014100150.GC2518@fib011235813.fsnet.co.uk>; from joe@fib011235813.fsnet.co.uk on Mon, Oct 14, 2002 at 11:01:50AM +0100
On Mon, Oct 14, 2002 at 11:01:50AM +0100, Joe Thornber wrote:
> Yes, there has been a lot more discussion of EVMS than device-mapper
> in the last couple of weeks, however not much of it was complimentary.
> I feel like adding some obvious design flaws to device-mapper so that
> Christoph will give me some free publicity too ;)
Haven't found that big design issues yet, but a number of small
and medium implementation issues and some style nitpicking :)
Comments are against dm_2002-10-09.tar.bz2:
00.patch
Looks fine. Useful for other code (like EVMS..), too.
01.patch
Looks fine, but I wonder whether we really want the
zeroing in kernel mode (yes, I know userspace calloc
does it)
02.patch
It starts to get interesting:
+#define NUM_BUCKETS 64
+#define MASK_BUCKETS (NUM_BUCKETS - 1)
+#define HASH_MULT 2654435387U
+static struct list_head *_dev_buckets;
+static struct list_head *_name_buckets;
+static struct list_head *_uuid_buckets;
+
+/*
+ * Guards access to all three tables.
+ */
+static DECLARE_RWSEM(_hash_lock);
Your heavy _ prefix looks a bit strange from the normal
kernel coding style perspective. It's fine with me as long
as it's consistant with itself (and it is).
+/*-----------------------------------------------------------------
+ * Init/exit code
+ *---------------------------------------------------------------*/
+void dm_hash_exit(void)
+{
+ if (_dev_buckets)
+ kfree(_dev_buckets);
+
+ if (_name_buckets)
+ kfree(_name_buckets);
+
+ if (_uuid_buckets)
+ kfree(_uuid_buckets);
+}
kfree(NULL) is fine.
+/*-----------------------------------------------------------------
+ * Code for looking up the device by kdev_t.
+ *---------------------------------------------------------------*/
+static struct hash_cell *__get_dev_cell(kdev_t dev)
+{
+ struct list_head *tmp;
+ struct hash_cell *hc;
+ unsigned int h = hash_dev(dev);
+
+ list_for_each (tmp, _dev_buckets + h) {
+ hc = list_entry(tmp, struct hash_cell, list);
+ if (kdev_same(hc->md->dev, dev))
+ return hc;
+ }
+
+ return NULL;
+}
As the argument is purely a hash value I'd suggest to
use a dev_t. Maybe pass in a struct block_device for
consistency.
+
+struct mapped_device *dm_get_r(kdev_t dev)
+{
+ struct hash_cell *hc;
+ struct mapped_device *md = NULL;
+
+ down_read(&_hash_lock);
+ hc = __get_dev_cell(dev);
+ if (hc && dm_flag(hc->md, DMF_VALID)) {
+ md = hc->md;
+ down_read(&md->lock);
+ }
+ up_read(&_hash_lock);
+
+ return md;
+}
Dito (and some more).
+/*
+ * Convert a device path to a kdev_t.
+ */
+int lookup_device(const char *path, kdev_t *dev)
+{
+ int r;
+ struct nameidata nd;
+ struct inode *inode;
+
+ if ((r = path_lookup(path, LOOKUP_FOLLOW, &nd)))
+ return r;
+
+ inode = nd.dentry->d_inode;
+ if (!inode) {
+ r = -ENOENT;
+ goto out;
+ }
+
+ if (!S_ISBLK(inode->i_mode)) {
+ r = -EINVAL;
+ goto out;
+ }
+
+ *dev = inode->i_rdev;
+
+ out:
+ path_release(&nd);
+ return r;
+}
What about resolving directly to a struct block_device?
And yes, this name -> struct block_Device thing is duplicated
a few times. Al & I need to look into factoring out.
+ * Open a device so we can use it as a map destination.
+ */
+static int open_dev(struct dm_dev *d)
+{
+ int r;
+
+ if (d->bdev)
+ BUG();
+
+ if (!(d->bdev = bdget(kdev_t_to_nr(d->dev))))
+ return -ENOMEM;
+
+ r = blkdev_get(d->bdev, d->mode, 0, BDEV_RAW);
+ if (r) {
+ bdput(d->bdev);
+ return r;
+ }
+
+ return 0;
+}
bd_claim is missing..
+/*
+ * Close a device that we've been using.
+ */
+static void close_dev(struct dm_dev *d)
+{
+ if (!d->bdev)
+ return;
+
+ blkdev_put(d->bdev, BDEV_RAW);
+ d->bdev = NULL;
+}
And bd_unclaim here.
+
+ if (sscanf(path, "%x:%x", &major, &minor) == 2) {
+ /* Extract the major/minor numbers */
+ dev = mk_kdev(major, minor);
+ } else {
+ /* convert the path to a device */
+ if ((r = lookup_device(path, &dev)))
+ return r;
+ }
What do you need the major/minor version for?
+static int __init dm_init(void)
+{
+ const int count = sizeof(_inits) / sizeof(*_inits);
Use ARRAY_SIZE()?
+static int dm_blk_ioctl(struct inode *inode, struct file *file,
+ uint command, unsigned long a)
+{
+ int r;
+ sector_t size;
+ long l_size;
+ unsigned long long ll_size;
+
+ r = get_device_size(inode->i_rdev, &size);
+ if (r)
+ return r;
+
+ switch (command) {
+ case BLKGETSIZE:
+ l_size = (long) size;
+ if (copy_to_user((void *) a, &l_size, sizeof(long)))
+ return -EFAULT;
+ break;
+
+ case BLKGETSIZE64:
+ ll_size = (unsigned long long) size << 9;
+ if (put_user(ll_size, (u64 *) a))
+ return -EFAULT;
+ break;
These two are in generic code and odn't need to be implemented
by a lowlevel driver (won't ever be called).
+{
+ struct clone_info ci;
+
+ ci.md = md;
+ ci.bio = bio;
+ ci.io = alloc_io();
+ ci.io->error = 0;
Some indentation issues here (and in a lot other places). I'd
suggest you run the file through unexpand(1)
+ ci.io->io_count = (atomic_t) ATOMIC_INIT(1);
This cast looks bogus to me.
+/*
+ * Sets or clears the read-only flag for the device. Write lock
+ * must be held.
+ */
+void dm_set_ro(struct mapped_device *md, int ro)
+{
+ if (ro)
+ dm_set_flag(md, DMF_RO);
+ else
+ dm_clear_flag(md, DMF_RO);
+
+ set_device_ro(md->dev, ro);
+}
Split this into dm_set_ro and dm_set_rw? Yes, set_device_ro
has the saem braindead API, but no need to do the same
+/*
+ * We need to be able to change a mapping table under a mounted
+ * filesystem. For example we might want to move some data in
+ * the background. Before the table can be swapped with
+ * dm_bind_table, dm_suspend must be called to flush any in
+ * flight bios and ensure that any further io gets
+ * deferred. Write lock must be held.
+ */
+int dm_suspend(kdev_t dev)
Pass in a struct block_Device here?
+ add_wait_queue(&md->wait, &wait);
+ while (1) {
+ set_current_state(TASK_INTERRUPTIBLE);
+
+ if (!atomic_read(&md->pending))
+ break;
+
+ yield();
+ }
+
+ current->state = TASK_RUNNING;
+ remove_wait_queue(&md->wait, &wait);
Hmm, the yield() looks strange and INTERRUPTIBLE without
a check for signals, too. Switch to wait_event_interruptible?
+int dm_resume(kdev_t dev)
struct block_Device?
+#include <linux/config.h>
+#include <linux/version.h>
+#include <linux/major.h>
+#include <linux/iobuf.h>
You don't actually use, do you?
+#include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+#include <linux/compatmac.h>
What do you need from this one?
+#include <linux/cache.h>
+#include <linux/devfs_fs_kernel.h>
+#include <linux/ctype.h>
+#include <linux/device-mapper.h>
+#include <linux/list.h>
+#include <linux/init.h>
+#include <linux/blkdev.h>
IMHO many of these includes should go into the individual sources
instead. I doubt you need them in the header.
+static inline void dm_put_r(struct mapped_device *md) {
+ up_read(&md->lock);
+}
static inline void dm_put_r(struct mapped_device *md)
{
up_read(&md->lock);
}
+
+static inline void dm_put_w(struct mapped_device *md) {
+ up_write(&md->lock);
+}
Dito.
+static inline char *dm_strdup(const char *str)
+{
+ char *r = kmalloc(strlen(str) + 1, GFP_KERNEL);
+ if (r)
+ strcpy(r, str);
+ return r;
+}
What about the following in kernel.h instead?:
static inline char *kstrdup(const char *str, unsigned int gfp_mask)
{
char *r = kmalloc(strlen(str) + 1, gfp_mask);
if (likely(r))
strcpy(r, str);
return r;
}
+
+static inline int dm_flag(struct mapped_device *md, int flag)
+{
+ return (md->flags & (1 << flag));
+}
+
+static inline void dm_set_flag(struct mapped_device *md, int flag)
+{
+ md->flags |= (1 << flag);
+}
+
+static inline void dm_clear_flag(struct mapped_device *md, int flag)
+{
+ md->flags &= ~(1 << flag);
+}
Are these performance-critial or is there another reason to not
use the generic linux/bitops.h variants?
+int __init dm_interface_init(void);
__init is not needed for the prototypes. That way you don't need init.h
in the header.
03.patch
04.patch
Look fine. I wonder whether they want to be separate modules?
05.patch
+
+/*-----------------------------------------------------------------
+ * Implementation of open/close/ioctl on the special char
+ * device.
+ *---------------------------------------------------------------*/
+static int ctl_open(struct inode *inode, struct file *file)
+{
+ /* only root can open this */
+ if (!capable(CAP_SYS_ADMIN))
+ return -EACCES;
Do you really want this in ->open and not the åctual ioctl
commands?
+
+ MOD_INC_USE_COUNT;
Not needed - that's what THIS_MODULE in struct file_operations
is for.
+static int ctl_close(struct inode *inode, struct file *file)
+{
+ MOD_DEC_USE_COUNT;
+ return 0;
+}
Method not needed at all.
+ r = devfs_generate_path(_dm_misc.devfs_handle, rname + 3,
+ sizeof rname - 3);
+ if (r == -ENOSYS)
+ return 0; /* devfs not present */
+
+ if (r < 0) {
+ DMERR("devfs_generate_path failed for control device");
+ goto failed;
+ }
+
+ strncpy(rname + r, "../", 3);
+ r = devfs_mk_symlink(NULL, DM_DIR "/control",
+ DEVFS_FL_DEFAULT, rname + r, &_ctl_handle, NULL);
Looks a bit crude. Why do you need this symlink?
+ __kernel_dev_t dev; /* in/out */
Hmm. Can't you just do every ioctl on the actually affected
block device node instead of the character ones? Unlike LVM1
the nodes are there in /dev/mapper and must not be created..
And I must admit I don't really like the ioctl interface. But at least
it's separated out properly.
next prev parent reply other threads:[~2002-10-14 19:16 UTC|newest]
Thread overview: 51+ messages / expand[flat|nested] mbox.gz Atom feed top
2002-10-12 4:59 Linux v2.5.42 Linus Torvalds
2002-10-12 5:53 ` Adrian Bunk
2002-10-12 7:23 ` Adrian Bunk
2002-10-12 7:52 ` Andres Salomon
2002-10-12 9:23 ` David S. Miller
2002-10-12 9:11 ` Adrian Bunk
2002-10-12 9:24 ` Adrian Bunk
2002-10-12 9:41 ` Sam Ravnborg
2002-10-12 10:21 ` Adrian Bunk
2002-10-12 9:50 ` Matthias Andree
2002-10-12 11:11 ` jw schultz
2002-10-12 11:29 ` Andres Salomon
2002-10-12 11:46 ` Alan Cox
2002-10-12 11:40 ` jw schultz
2002-10-12 17:47 ` Jon Portnoy
2002-10-12 18:10 ` Rik van Riel
2002-10-13 11:58 ` venom
2002-10-13 12:52 ` Michael Clark
2002-10-12 12:37 ` Ed Tomlinson
2002-10-12 13:32 ` Christoph Hellwig
2002-10-12 19:39 ` Andres Salomon
2002-10-12 13:43 ` Christoph Hellwig
2002-10-13 17:10 ` Dipankar Sarma
2002-10-14 10:01 ` Joe Thornber
2002-10-14 19:21 ` Christoph Hellwig [this message]
2002-10-14 19:32 ` Alexander Viro
2002-10-14 22:28 ` Joe Thornber
-- strict thread matches above, loose matches on Subject: below --
2002-10-12 17:14 Mark Peloquin
2002-10-12 19:34 ` Alan Cox
2002-10-12 19:37 ` jbradford
2002-10-13 23:55 ` Rob Landley
2002-10-13 12:41 ` [Evms-devel] " Michael Clark
2002-10-13 13:49 ` Christoph Hellwig
2002-10-13 15:16 ` Michael Clark
2002-10-13 15:35 ` Christoph Hellwig
2002-10-13 16:11 ` Brian Jackson
2002-10-13 16:26 ` Arjan van de Ven
2002-10-13 17:06 ` Brian Jackson
2002-10-13 19:58 ` Mark Hahn
2002-10-13 19:57 ` Rik van Riel
2002-10-13 20:26 ` Sean Neakums
2002-10-24 11:45 ` Alexander Kellett
2002-10-13 19:59 ` Andrew Morton
2002-10-13 20:24 ` Bernd Eckenfels
2002-10-14 15:11 ` Christoph Hellwig
2002-10-14 22:27 ` Bernd Eckenfels
2002-10-13 17:46 ` Robert Love
2002-10-13 18:34 ` Brian Jackson
2002-10-14 14:45 ` Christoph Hellwig
2002-10-13 13:41 ` Christoph Hellwig
2002-10-12 20:20 Dieter Nützel
2002-10-12 22:19 ` Hans Reiser
2002-10-14 17:37 Ben Rafanello
2002-10-15 2:47 Paul McKenney
2002-10-15 17:43 Mark Peloquin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20021014202158.A27076@infradead.org \
--to=hch@infradead.org \
--cc=alan@redhat.com \
--cc=axboe@suse.de \
--cc=joe@fib011235813.fsnet.co.uk \
--cc=linux-kernel@vger.kernel.org \
--cc=torvalds@transmeta.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox