* [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
@ 2001-05-19 6:23 Ben LaHaise
2001-05-19 9:42 ` Christer Weinigel
` (4 more replies)
0 siblings, 5 replies; 27+ messages in thread
From: Ben LaHaise @ 2001-05-19 6:23 UTC (permalink / raw)
To: torvalds; +Cc: viro, linux-kernel, linux-fsdevel
Hey folks,
The work-in-progress patch for-demonstration-purposes-only below consists
of 3 major components, and is meant to start discussion about the future
direction of device naming and its interaction with the block layer. The main
motivations here are the wasting of minor numbers for partitions, and the
duplication of code between user and kernel space in areas such as
partition detection, uuid location, lvm setup, mount by label, journal
replay, and so on...
1. Generic lookup method and argument parsing (fs/lookupargs.c)
This code implements a lookup function which is for demonstration
purposes used in fs/block_dev.c. The general idea is to pass
additional parameters to device drivers on open via a comma
separated list of options following the device's name. Sample
uses:
/dev/sda/raw -> open sda in raw mode.
/dev/sda/limit=102400 -> open sda with a limit of 100K
/dev/sda/offset=1024,limit=2048
-> open a device that gives a view of sda at an
offset of 1KB to 2KB
The arguments are defined in a table (fs/block_dev.c:660), which
defines the name and type of argument to parse. This table is
used at lookup time to determine if an option name is valid
(resulting in a positive dentry) or invalid. Potential uses for
this are numerous: opening a control channel to a device,
specifying a graphics mode for a framebuffer on open, replacing
ioctls, and so on. Please separate comments on this
portion from the other parts of the patch.
2. Restricted block device (drivers/block/blkrestrict.c)
This is a quick-n-dirty implementation of a simple md-like block
device that adds an offset to sector requests and limits the
maximum offset on the device. The idea here is to replace the
special case minor numbers used for the partitioning code with
a generic runtime allocated translation node. The idea will work
best once its data can be stored in a kdev_t structure. The API
for use is simple:
kdev_t restrict_create_dev(kdev_t dev,
unsigned long long offset,
unsigned long long limit)
The associated cleanup of the startup code is not addressed here.
Comments on this part (I know the implementation is ugly, talk
about the ideas please)?
3. Userspace partition code proposal
Given the above two bits, here's a brief explanation of a
proposal to move management of the partitioning scheme into
userspace, along with portions of raid startup, lvm, uuid and
mount by label code needed for mounting the root filesystem.
Consider that the device node currently known as /dev/hda5 can
also be viewed as /dev/hda at offset 512000 with a limit of 10GB.
With the extensions in fs/block_dev.c, you could replace /dev/hda5
with /dev/hda/offset=512000,limit=10240000000. Now, by putting
the partition parsing code into a libpart and binding mount to a
libpart, the root filesystem mounting code can be run out of an
initrd image. The use of mount gives us the ability to mount
filesystems by UUID, by label or other exotic schemes without
having to add any additional code to the kernel.
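To make the naming scheme in parts 1 and 3 concrete, here is a minimal userspace sketch of how a trailing component such as "offset=1024,limit=2048" could be split into options. It is not the code from fs/lookupargs.c: the names parse_one and parse_blkdev_args are hypothetical, the struct only mirrors the patch's struct blkdev_param, and backslash escapes are deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace mirror of the patch's struct blkdev_param. */
struct blkdev_param {
	unsigned long long offset;	/* bytes */
	unsigned long long limit;	/* bytes; ~0ULL means "no limit" */
	int raw;
};

/* Parse one "name" or "name=value" token of length len into p.
 * Returns 0 on success, -EINVAL on an unknown name or a bad value. */
static int parse_one(const char *tok, size_t len, struct blkdev_param *p)
{
	const char *eq = memchr(tok, '=', len);
	size_t name_len = eq ? (size_t)(eq - tok) : len;

	/* Boolean option: present means true, and it takes no value. */
	if (!eq && name_len == 3 && !memcmp(tok, "raw", 3)) {
		p->raw = 1;
		return 0;
	}

	if (eq && ((name_len == 6 && !memcmp(tok, "offset", 6)) ||
		   (name_len == 5 && !memcmp(tok, "limit", 5)))) {
		char buf[32];
		char *end;
		size_t val_len = len - name_len - 1;
		unsigned long long v;

		if (val_len == 0 || val_len >= sizeof(buf))
			return -EINVAL;
		memcpy(buf, eq + 1, val_len);
		buf[val_len] = '\0';

		/* strtoull() would accept a sign or leading blanks; reject them. */
		if (buf[0] < '0' || buf[0] > '9')
			return -EINVAL;
		errno = 0;
		v = strtoull(buf, &end, 10);
		if (errno || *end != '\0')
			return -EINVAL;

		if (name_len == 6)
			p->offset = v;
		else
			p->limit = v;
		return 0;
	}

	return -EINVAL;
}

/* Parse a comma separated option list. Defaults match the patch:
 * offset 0, limit ~0ULL, raw 0. Returns 0 or -EINVAL. */
int parse_blkdev_args(const char *s, struct blkdev_param *p)
{
	assert(s && p);

	p->offset = 0;
	p->limit = ~0ULL;
	p->raw = 0;

	while (*s) {
		size_t len = strcspn(s, ",");
		int ret = parse_one(s, len, p);

		if (ret)
			return ret;
		s += len;
		if (*s == ',')
			s++;
	}
	return 0;
}
```

An unknown option name is rejected outright, which corresponds to the negative dentry case described in part 1.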
I'm going to stop writing this now. I need sleep...
Folks, please let me know your opinions on the ideas presented herein, and
do attempt to keep the bits of code that are useful. Cheers,
-ben
[23:34:07] <viro> bcrl: you are sick.
[23:41:13] <viro> bcrl: you _are_ sick.
[23:43:24] <viro> bcrl: you are _fscking_ sick.
here starts v2.4.5-pre3_bdev_naming-A0.diff
diff -urN kernels/2.4/v2.4.5-pre3/Makefile bdev_naming/Makefile
--- kernels/2.4/v2.4.5-pre3/Makefile Thu May 17 18:09:42 2001
+++ bdev_naming/Makefile Sat May 19 01:33:39 2001
@@ -1,7 +1,7 @@
VERSION = 2
PATCHLEVEL = 4
SUBLEVEL = 5
-EXTRAVERSION =-pre3
+EXTRAVERSION =-pre3-sick-test
KERNELRELEASE=$(VERSION).$(PATCHLEVEL).$(SUBLEVEL)$(EXTRAVERSION)
diff -urN kernels/2.4/v2.4.5-pre3/arch/i386/boot/install.sh bdev_naming/arch/i386/boot/install.sh
--- kernels/2.4/v2.4.5-pre3/arch/i386/boot/install.sh Tue Jan 3 06:57:26 1995
+++ bdev_naming/arch/i386/boot/install.sh Fri May 18 20:24:36 2001
@@ -21,6 +21,7 @@
# User may have a custom install script
+if [ -x ~/bin/installkernel ]; then exec ~/bin/installkernel "$@"; fi
if [ -x /sbin/installkernel ]; then exec /sbin/installkernel "$@"; fi
# Default install - same as make zlilo
diff -urN kernels/2.4/v2.4.5-pre3/drivers/block/Makefile bdev_naming/drivers/block/Makefile
--- kernels/2.4/v2.4.5-pre3/drivers/block/Makefile Fri Dec 29 17:07:21 2000
+++ bdev_naming/drivers/block/Makefile Sat May 19 00:29:08 2001
@@ -12,7 +12,7 @@
export-objs := ll_rw_blk.o blkpg.o loop.o DAC960.o
-obj-y := ll_rw_blk.o blkpg.o genhd.o elevator.o
+obj-y := ll_rw_blk.o blkpg.o genhd.o elevator.o blkrestrict.o
obj-$(CONFIG_MAC_FLOPPY) += swim3.o
obj-$(CONFIG_BLK_DEV_FD) += floppy.o
diff -urN kernels/2.4/v2.4.5-pre3/drivers/block/blkrestrict.c bdev_naming/drivers/block/blkrestrict.c
--- kernels/2.4/v2.4.5-pre3/drivers/block/blkrestrict.c Wed Dec 31 19:00:00 1969
+++ bdev_naming/drivers/block/blkrestrict.c Sat May 19 01:17:36 2001
@@ -0,0 +1,105 @@
+/* drivers/block/blkrestrict.c - written by Benjamin LaHaise
+ * Block device limit enforcer. Designed to implement partition
+ * tables under control of other code.
+ *
+ * Copyright 2001 Red Hat, Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+#include <linux/fs.h>
+#include <linux/blkdev.h>
+#include <linux/init.h>
+
+static char major_name[] = "restrict";
+static int major_nr; /* signed so the < 0 check on register_blkdev() works */
+static unsigned int minor_nr; /* next free minor number */
+
+static struct restrict_info {
+ unsigned long offset;
+ unsigned long limit;
+ kdev_t dev;
+} restrict_info[256]; /* FIXME: stupid */
+
+static int restrict_blk_size[256]; /* grr */
+
+kdev_t restrict_create_dev(kdev_t dev, unsigned long long offset, unsigned long long limit)
+{
+ unsigned int minor = minor_nr++; /* FIXME: overflow/smp/fish */
+ struct restrict_info *info = &restrict_info[minor];
+
+ info->offset = offset / 512;
+ info->limit = limit / 512;
+ info->dev = dev;
+
+ restrict_blk_size[minor] = (info->limit - info->offset) >> 1; /* sectors -> 1K units used by blk_size[] */
+
+ printk("restrict_create_dev: (0x%02x, 0x%02x) offset=0x%lx limit=0x%lx on (0x%04x)\n", major_nr, minor, info->offset, info->limit, info->dev); /* FIXME: duh */
+
+ return MKDEV(major_nr, minor);
+}
+
+static int restrict_open(struct inode *inode, struct file *file)
+{
+ return 0;
+}
+
+static int restrict_release(struct inode *inode, struct file *file)
+{
+ return 0;
+}
+
+static int restrict_make_req(request_queue_t *q, int rw, struct buffer_head *bh)
+{
+ struct restrict_info *info = &restrict_info[MINOR(bh->b_rdev)];
+ unsigned long new_sector = bh->b_rsector + info->offset;
+
+ if (new_sector >= info->limit || new_sector < bh->b_rsector) {
+ printk("restrict_make_req: 0x%lx beyond limit on 0x%x (0x%lx,0x%lx)\n", bh->b_rsector, bh->b_rdev, info->offset, info->limit);
+ buffer_IO_error(bh);
+ return 0;
+ }
+
+ bh->b_rdev = info->dev;
+ bh->b_rsector += info->offset;
+
+ return 1;
+}
+
+static struct block_device_operations restrict_bdops = {
+ open: restrict_open,
+ release: restrict_release,
+};
+
+static int __init blkrestrict_init(void)
+{
+ major_nr = register_blkdev(0, major_name, &restrict_bdops);
+ if (major_nr < 0)
+ return major_nr;
+
+ printk("blkrestrict_init: got major %u\n", major_nr);
+
+ blk_queue_make_request(BLK_DEFAULT_QUEUE(major_nr), restrict_make_req);
+ blk_size[major_nr] = restrict_blk_size;
+
+ return 0;
+}
+
+static void __exit blkrestrict_exit(void)
+{
+ unregister_blkdev(major_nr, major_name);
+}
+
+module_init(blkrestrict_init);
+module_exit(blkrestrict_exit);
diff -urN kernels/2.4/v2.4.5-pre3/fs/Makefile bdev_naming/fs/Makefile
--- kernels/2.4/v2.4.5-pre3/fs/Makefile Thu Apr 5 11:53:44 2001
+++ bdev_naming/fs/Makefile Fri May 18 18:49:49 2001
@@ -12,7 +12,7 @@
obj-y := open.o read_write.o devices.o file_table.o buffer.o \
super.o block_dev.o stat.o exec.o pipe.o namei.o fcntl.o \
- ioctl.o readdir.o select.o fifo.o locks.o \
+ ioctl.o readdir.o select.o fifo.o locks.o lookupargs.o \
dcache.o inode.o attr.o bad_inode.o file.o iobuf.o dnotify.o \
filesystems.o
diff -urN kernels/2.4/v2.4.5-pre3/fs/block_dev.c bdev_naming/fs/block_dev.c
--- kernels/2.4/v2.4.5-pre3/fs/block_dev.c Thu May 17 18:09:42 2001
+++ bdev_naming/fs/block_dev.c Sat May 19 01:31:51 2001
@@ -14,9 +14,12 @@
#include <linux/major.h>
#include <linux/devfs_fs_kernel.h>
#include <linux/smp_lock.h>
+#include <linux/lookupargs.h>
#include <asm/uaccess.h>
+extern kdev_t restrict_create_dev(kdev_t dev, unsigned long long offset, unsigned long long limit);
+
extern int *blk_size[];
extern int *blksize_size[];
@@ -648,10 +651,52 @@
return ret;
}
+struct blkdev_param {
+ unsigned long long offset,
+ limit;
+ int raw;
+};
+
+arg_format_t blkdev_arg_fmt[] = {
+ { "offset", Arg_ull, offsetof(struct blkdev_param, offset) },
+ { "limit", Arg_ull, offsetof(struct blkdev_param, limit) },
+ { "raw", Arg_bool, offsetof(struct blkdev_param, raw) },
+ { NULL }
+};
+
+static struct dentry *blkdev_lookup(struct inode *inode, struct dentry *dentry)
+{
+ return generic_parse_lookup(inode, dentry, blkdev_arg_fmt);
+}
+
int blkdev_open(struct inode * inode, struct file * filp)
{
- int ret = -ENXIO;
+ int ret;
struct block_device *bdev = inode->i_bdev;
+ struct dentry *dentry = filp->f_dentry;
+ struct blkdev_param param = { 0ULL, ~0ULL, 0 };
+
+ if (dentry && dentry->d_parent &&
+ dentry->d_inode == dentry->d_parent->d_inode) {
+ printk("blkdev_open: args='%.*s'\n", (int)dentry->d_name.len, dentry->d_name.name);
+ ret = generic_parse_args(&dentry->d_name, blkdev_arg_fmt, &param);
+ if (ret)
+ return ret;
+ printk("blkdev_open: offset=0x%Lx limit=0x%Lx raw=%d\n",
+ param.offset, param.limit, param.raw);
+
+ if (param.offset || ~param.limit) {
+ struct inode *old_inode = inode;
+ inode = get_empty_inode();
+ inode->i_rdev = restrict_create_dev(old_inode->i_rdev, param.offset, param.limit);
+ bdev = inode->i_bdev = bdget(inode->i_rdev);
+ filp->f_dentry = d_alloc_root(inode);
+ /* FIXME: error handling, dangling dentry/inode */
+ }
+ }
+
+ ret = -ENXIO;
+
down(&bdev->bd_sem);
lock_kernel();
if (!bdev->bd_op)
@@ -721,6 +766,10 @@
write: block_write,
fsync: block_fsync,
ioctl: blkdev_ioctl,
+};
+
+struct inode_operations def_blk_iops = {
+ lookup: blkdev_lookup,
};
const char * bdevname(kdev_t dev)
diff -urN kernels/2.4/v2.4.5-pre3/fs/devices.c bdev_naming/fs/devices.c
--- kernels/2.4/v2.4.5-pre3/fs/devices.c Sun Oct 1 23:35:16 2000
+++ bdev_naming/fs/devices.c Fri May 18 18:41:00 2001
@@ -205,6 +205,7 @@
inode->i_rdev = to_kdev_t(rdev);
} else if (S_ISBLK(mode)) {
inode->i_fop = &def_blk_fops;
+ inode->i_op = &def_blk_iops;
inode->i_rdev = to_kdev_t(rdev);
inode->i_bdev = bdget(rdev);
} else if (S_ISFIFO(mode))
diff -urN kernels/2.4/v2.4.5-pre3/fs/lookupargs.c bdev_naming/fs/lookupargs.c
--- kernels/2.4/v2.4.5-pre3/fs/lookupargs.c Wed Dec 31 19:00:00 1969
+++ bdev_naming/fs/lookupargs.c Sat May 19 00:26:31 2001
@@ -0,0 +1,156 @@
+/* fs/lookupargs.c - written by Benjamin LaHaise
+ * Support for comma separated argument lists via a lookup method.
+ * Useful for device drivers and other filesystem entities.
+ *
+ * Copyright 2001 Red Hat, Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+#include <linux/fs.h>
+#include <linux/lookupargs.h>
+
+/* Returns the format if arg is in the list of options */
+static const struct parsed_arg_format *find_arg_fmt(
+ const struct parsed_arg *arg, arg_format_t *fmt)
+{
+ if (fmt)
+ for (; fmt->name; fmt++) {
+ const char *opt = fmt->name;
+
+ if (!memcmp(arg->arg_start, opt, arg->arg_len) &&
+ strlen(opt) == arg->arg_len) {
+ return fmt;
+ }
+ }
+
+ return NULL;
+}
+
+/* TODO: fix it to actually validate the argument */
+int generic_check_arg(const struct parsed_arg *arg, arg_format_t *fmt)
+{
+ return !find_arg_fmt(arg, fmt);
+}
+
+static int parse_arg(const struct qstr *qstr, int offset, struct parsed_arg *arg)
+{
+ const char *str = qstr->name + offset;
+ int left = qstr->len - offset;
+
+ arg->arg_start = NULL;
+ arg->arg_len = 0;
+ arg->option_start = NULL;
+ arg->option_len = 0;
+
+ if (offset < 0)
+ return -1;
+
+ if (left <= 0)
+ return -1;
+
+ /* First off, scan for the argument name -> ends at end of string,
+ * an equals sign or comma.
+ */
+ arg->arg_start = str;
+ for (; left > 0 && (*str != '=') && (*str != ',');
+ left--,str++)
+ ;
+
+ arg->arg_len = str - arg->arg_start;
+
+ /* This argument ends if there's nothing left or we've hit a comma. */
+ if (left <= 0)
+ goto out;
+
+ left--;
+ if (*str++ == ',')
+ goto out;
+
+ /* Second part: scan the option looking for the end: ends at
+ * end of string or a comma.
+ */
+ arg->option_start = str;
+ for (; left > 0 && (*str != ',');
+ left--,str++) {
+ /* Eat the escaped character */
+ if (*str == '\\' && left > 1)
+ left--, str++;
+ }
+
+ arg->option_len = str - arg->option_start;
+
+out:
+ return str - (const char *)qstr->name;
+}
+
+/* TODO: FIXME: proper range checking!!! */
+static int fill_arg_data(const struct parsed_arg *arg, arg_format_t *fmt, char *data)
+{
+ char *end;
+
+ data += fmt->offset;
+
+ switch (fmt->type) {
+ case Arg_bool:
+ *(int *)data = 1;
+ return 0;
+ case Arg_ull:
+ if (!arg->option_start || !arg->option_len)
+ break;
+ *(unsigned long long *)data = simple_strtoull(arg->option_start, &end, 10);
+ return 0;
+ }
+ return -EINVAL;
+}
+
+int generic_parse_args(const struct qstr *str, arg_format_t *fmt_list, void *data)
+{
+ int ret = 0;
+ for_each_parsed_arg(str) {
+ arg_format_t *fmt = find_arg_fmt(&arg, fmt_list);
+ ret = -EINVAL;
+ if (!fmt)
+ break;
+ ret = fill_arg_data(&arg, fmt, (char *)data);
+ if (ret)
+ break;
+ }
+ return ret;
+}
+
+struct dentry *generic_parse_lookup(
+ struct inode *inode,
+ struct dentry *dentry,
+ arg_format_t *fmt_list)
+{
+ /* Application compatibility: report -ENOTDIR on "." and ".." */
+ if (dentry->d_name.name[0] == '.' &&
+ ((dentry->d_name.len == 1) ||
+ (dentry->d_name.name[1] == '.' && dentry->d_name.len == 2)))
+ return ERR_PTR(-ENOTDIR);
+
+ /* Make sure all the arguments are okay */
+ { for_each_parsed_arg(&dentry->d_name) {
+ arg_format_t *fmt = find_arg_fmt(&arg, fmt_list);
+ if (!fmt || generic_check_arg(&arg, fmt)) {
+ inode = NULL;
+ break;
+ }
+ }}
+
+ d_add(dentry, inode);
+ return NULL;
+}
+
diff -urN kernels/2.4/v2.4.5-pre3/fs/namei.c bdev_naming/fs/namei.c
--- kernels/2.4/v2.4.5-pre3/fs/namei.c Thu May 3 11:22:16 2001
+++ bdev_naming/fs/namei.c Fri May 18 22:38:50 2001
@@ -470,7 +470,8 @@
* to be able to know about the current root directory and
* parent relationships.
*/
- if (this.name[0] == '.') switch (this.len) {
+ if (this.name[0] == '.' && S_ISDIR(nd->dentry->d_inode->i_mode))
+ switch (this.len) {
default:
break;
case 2:
@@ -538,7 +539,8 @@
last_component:
if (lookup_flags & LOOKUP_PARENT)
goto lookup_parent;
- if (this.name[0] == '.') switch (this.len) {
+ if (this.name[0] == '.' && S_ISDIR(nd->dentry->d_inode->i_mode))
+ switch (this.len) {
default:
break;
case 2:
@@ -593,7 +595,7 @@
lookup_parent:
nd->last = this;
nd->last_type = LAST_NORM;
- if (this.name[0] != '.')
+ if (this.name[0] != '.' || !S_ISDIR(nd->dentry->d_inode->i_mode))
goto return_base;
if (this.len == 1)
nd->last_type = LAST_DOT;
diff -urN kernels/2.4/v2.4.5-pre3/include/linux/fs.h bdev_naming/include/linux/fs.h
--- kernels/2.4/v2.4.5-pre3/include/linux/fs.h Thu May 17 18:09:42 2001
+++ bdev_naming/include/linux/fs.h Fri May 18 20:10:50 2001
@@ -984,6 +984,7 @@
extern void bdput(struct block_device *);
extern int blkdev_open(struct inode *, struct file *);
extern struct file_operations def_blk_fops;
+extern struct inode_operations def_blk_iops;
extern struct file_operations def_fifo_fops;
extern int ioctl_by_bdev(struct block_device *, unsigned, unsigned long);
extern int blkdev_get(struct block_device *, mode_t, unsigned, int);
diff -urN kernels/2.4/v2.4.5-pre3/include/linux/lookupargs.h bdev_naming/include/linux/lookupargs.h
--- kernels/2.4/v2.4.5-pre3/include/linux/lookupargs.h Wed Dec 31 19:00:00 1969
+++ bdev_naming/include/linux/lookupargs.h Fri May 18 23:06:56 2001
@@ -0,0 +1,48 @@
+/* include/linux/lookupargs.h
+ */
+struct parsed_arg {
+ const char *arg_start;
+ const char *option_start;
+ int arg_len;
+ int option_len;
+};
+
+enum parsed_arg_type {
+ Arg_bool, /* really an int */
+ Arg_ull,
+#if 0
+ //Arg_str, /* really a char */
+ Arg_c,
+ Arg_uc,
+ Arg_s,
+ Arg_us,
+ Arg_i,
+ Arg_ui,
+ Arg_l,
+ Arg_ul,
+ Arg_ll,
+ Arg_u32,
+ Arg_u64,
+#endif
+};
+
+typedef const struct parsed_arg_format {
+ const char *name;
+ enum parsed_arg_type type;
+ size_t offset;
+} arg_format_t;
+
+#define for_each_parsed_arg(str)\
+ struct parsed_arg arg; \
+ int __offset = 0; \
+ while ((__offset = parse_arg((str), __offset, &arg)) > 0)
+
+struct dentry;
+struct inode;
+struct qstr;
+
+extern int generic_parse_args(
+ const struct qstr *str, arg_format_t *fmt, void *data);
+extern struct dentry *generic_parse_lookup(
+ struct inode *inode, struct dentry *dentry, arg_format_t *fmt);
+
diff -urN kernels/2.4/v2.4.5-pre3/include/linux/raid/md_k.h bdev_naming/include/linux/raid/md_k.h
--- kernels/2.4/v2.4.5-pre3/include/linux/raid/md_k.h Thu May 17 18:09:42 2001
+++ bdev_naming/include/linux/raid/md_k.h Sat May 19 01:13:18 2001
@@ -36,6 +36,7 @@
case RAID5: return 5;
}
panic("pers_to_level()");
+ return 0;
}
extern inline int level_to_pers (int level)
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
2001-05-19 6:23 Ben LaHaise
@ 2001-05-19 9:42 ` Christer Weinigel
2001-05-19 9:51 ` Christer Weinigel
` (3 subsequent siblings)
4 siblings, 0 replies; 27+ messages in thread
From: Christer Weinigel @ 2001-05-19 9:42 UTC (permalink / raw)
To: bcrl; +Cc: linux-kernel
In article <Pine.LNX.4.33.0105190138150.6079-100000@toomuch.toronto.redhat.com> you write:
>3. Userspace partition code proposal
>
> Given the above two bits, here's a brief explanation of a
> proposal to move management of the partitioning scheme into
> userspace, along with portions of raid startup, lvm, uuid and
> mount by label code needed for mounting the root filesystem.
>
> Consider that the device node currently known as /dev/hda5 can
> also be viewed as /dev/hda at offset 512000 with a limit of 10GB.
> With the extensions in fs/block_dev.c, you could replace /dev/hda5
> with /dev/hda/offset=512000,limit=10240000000. Now, by putting
> the partition parsing code into a libpart and binding mount to a
> libpart, the root filesystem mounting code can be run out of an
> initrd image. The use of mount gives us the ability to mount
> filesystems by UUID, by label or other exotic schemes without
> having to add any additional code to the kernel.
The only problem I can see with this is that it removes one useful thing,
the ability to give a user access to a whole partition.
chown wingel /dev/hda5
won't work anymore since there is no such device node.
/Christer
--
"Just how much can I get away with and still go to heaven?"
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
2001-05-19 6:23 Ben LaHaise
2001-05-19 9:42 ` Christer Weinigel
@ 2001-05-19 9:51 ` Christer Weinigel
2001-05-19 11:37 ` Eric W. Biederman
` (2 subsequent siblings)
4 siblings, 0 replies; 27+ messages in thread
From: Christer Weinigel @ 2001-05-19 9:51 UTC (permalink / raw)
To: linux-kernel
In article <20010519094224.AD5A236DDC@hog.ctrl-c.liu.se> I wrote:
>The only problem I can see with this is that it removes one useful thing,
>the ability to give a user access to a whole partition.
>
> chown wingel /dev/hda5
>
>won't work anymore since there is no such device node.
Apologies, this should have gone to linux-fsdevel, I entered the mail
address by hand and by reflex typed the wrong thing.
*going back to sleep*
/Christer
--
"Just how much can I get away with and still go to heaven?"
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
@ 2001-05-19 11:09 Andries.Brouwer
2001-05-19 11:43 ` Andrew Morton
0 siblings, 1 reply; 27+ messages in thread
From: Andries.Brouwer @ 2001-05-19 11:09 UTC (permalink / raw)
To: bcrl, torvalds; +Cc: linux-fsdevel, linux-kernel, viro
From: Ben LaHaise <bcrl@redhat.com>
3. Userspace partition code proposal
Given the above two bits, here's a brief explanation of a
proposal to move management of the partitioning scheme into
userspace, along with portions of raid startup, lvm, uuid and
mount by label code needed for mounting the root filesystem.
Consider that the device node currently known as /dev/hda5 can
also be viewed as /dev/hda at offset 512000 with a limit of 10GB.
With the extensions in fs/block_dev.c, you could replace /dev/hda5
with /dev/hda/offset=512000,limit=10240000000. Now, by putting
the partition parsing code into a libpart and binding mount to a
libpart, the root filesystem mounting code can be run out of an
initrd image. The use of mount gives us the ability to mount
filesystems by UUID, by label or other exotic schemes without
having to add any additional code to the kernel.
I'm going to stop writing this now. I need sleep...
Hmm. You know that I wrote this long ago?
And that it has been part of the kernel for a long time?
And that there are user space utilities that use it?
In util-linux, look at the partx subdirectory.
In the kernel, read drivers/block/blkpg.c.
Andries
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
2001-05-19 6:23 Ben LaHaise
2001-05-19 9:42 ` Christer Weinigel
2001-05-19 9:51 ` Christer Weinigel
@ 2001-05-19 11:37 ` Eric W. Biederman
2001-05-19 14:25 ` Daniel Phillips
2001-05-19 13:53 ` Daniel Phillips
2001-05-19 18:31 ` Linus Torvalds
4 siblings, 1 reply; 27+ messages in thread
From: Eric W. Biederman @ 2001-05-19 11:37 UTC (permalink / raw)
To: Ben LaHaise; +Cc: torvalds, viro, linux-kernel, linux-fsdevel
Ben LaHaise <bcrl@redhat.com> writes:
> Hey folks,
>
> The work-in-progress patch for-demonstration-purposes-only below consists
> of 3 major components, and is meant to start discussion about the future
> direction of device naming and its interaction with the block layer. The main
> motivations here are the wasting of minor numbers for partitions, and the
> duplication of code between user and kernel space in areas such as
> partition detection, uuid location, lvm setup, mount by label, journal
> replay, and so on...
>
> 1. Generic lookup method and argument parsing (fs/lookupargs.c)
>
> This code implements a lookup function which is for demonstration
> purposes used in fs/block_dev.c. The general idea is to pass
> additional parameters to device drivers on open via a comma
> separated list of options following the device's name. Sample
> uses:
>
> /dev/sda/raw -> open sda in raw mode.
> /dev/sda/limit=102400 -> open sda with a limit of 100K
> /dev/sda/offset=1024,limit=2048
> -> open a device that gives a view of sda at an
> offset of 1KB to 2KB
GAhh!!!!!!
Ben please think /proc/sys. One value per ``file''.
> 3. Userspace partition code proposal
>
> Given the above two bits, here's a brief explanation of a
> proposal to move management of the partitioning scheme into
> userspace, along with portions of raid startup, lvm, uuid and
> mount by label code needed for mounting the root filesystem.
>
> Consider that the device node currently known as /dev/hda5 can
> also be viewed as /dev/hda at offset 512000 with a limit of 10GB.
> With the extensions in fs/block_dev.c, you could replace /dev/hda5
> with /dev/hda/offset=512000,limit=10240000000. Now, by putting
> the partition parsing code into a libpart and binding mount to a
> libpart, the root filesystem mounting code can be run out of an
> initrd image. The use of mount gives us the ability to mount
> filesystems by UUID, by label or other exotic schemes without
> having to add any additional code to the kernel.
But you need to use uclibc or a similar library to get the code size down
small enough, so you don't quadruple the size of your boot image.
As for wasting minors: if you are going to rework partitions, they
should have dynamic device numbers that are assigned when the
partition is discovered by the system. I admit a hot-plug partition
sounds incongruous, but it should be fairly simple to implement.
If your real root is on a ``hot-plug'' device then it does look
like you need an initrd to help select your root partition. Hmm. The
code is simple enough that keeping it in the kernel shouldn't be bad. And
the interface can be simple as well.
Have:
/dev/sda/partitions/1
/dev/sda/partitions/2
/dev/sda/partitions/3
/dev/sda/partitions/4
/dev/sda/partitions/5
and also
/dev/sda/partitions/1/uuid
/dev/sda/partitions/1/label
/dev/sda/partitions/1/offset
/dev/sda/partitions/1/limit
to expose what the kernel found in its initial scan of the partitions.
For creating partitions you might want to do:
echo 1024 2048 > /dev/sda/newpartition
Though if you could do it with create, followed by writes to offset
and limit, that would be a little nicer.
Al, would it work to have the lookup method for /dev/sda automatically
mount an instance of scsifs on /dev/sda (from an internal mount), and
then have dput drop that mount? I skimmed the code and it looks
possible.
Soft mounting a fs isn't strictly necessary for the case above, but
it looks like the simplest way to keep the list of partitions permanently
in the dcache. We would also need to modify permission() to take a vfsmnt
argument so your permissions to a device file could vary depending on
which device file you start with.
Eric
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
2001-05-19 11:09 [RFD w/info-PATCH] device arguments from lookup, partion code in userspace Andries.Brouwer
@ 2001-05-19 11:43 ` Andrew Morton
2001-05-19 12:00 ` Alexander Viro
0 siblings, 1 reply; 27+ messages in thread
From: Andrew Morton @ 2001-05-19 11:43 UTC (permalink / raw)
To: Andries.Brouwer; +Cc: bcrl, torvalds, linux-fsdevel, linux-kernel, viro
Andries.Brouwer@cwi.nl wrote:
>
> Hmm. You know that I wrote this long ago?
Well, let's not get too hung up on the disk thing (yeah,
I started it...).
Ben's intent here is to *demonstrate* how argv-style
info can be passed into device nodes. It seems neat,
and nice.
We can also make use of a strong argument parsing library
in the kernel - there are a great number of open-coded
string bashing functions which could be rationalised
and regularised.
So. When am I going to be able to:
open("/bin/ls,-l,/etc/passwd", O_RDONLY);
?
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
2001-05-19 11:43 ` Andrew Morton
@ 2001-05-19 12:00 ` Alexander Viro
2001-05-19 12:06 ` [RFD w/info-PATCH] device arguments from lookup, partion codein userspace Andrew Morton
2001-05-19 15:56 ` [RFD w/info-PATCH] device arguments from lookup, partion code in userspace Ben LaHaise
0 siblings, 2 replies; 27+ messages in thread
From: Alexander Viro @ 2001-05-19 12:00 UTC (permalink / raw)
To: Andrew Morton
Cc: Andries.Brouwer, bcrl, torvalds, linux-fsdevel, linux-kernel
On Sat, 19 May 2001, Andrew Morton wrote:
> So. When am I going to be able to:
>
> open("/bin/ls,-l,/etc/passwd", O_RDONLY);
You are not. Think for a minute and you'll see why.
Linus' idea of /dev/tty/<parameters> is marginally sane - it makes sense
to consider that as configuring-upon-open. You _are_ going to do IO on
that file.
Ben's /dev/md0/<living_horror> is ugly - it's open just for side effects,
with no IO supposed to happen.
His idea of passing file descriptor instead of name makes these side effects
even messier.
The stuff you've proposed is a perversion worthy of Albert. You've introduced
an additional metacharacter into filenames, you will need some form of quoting
to be able to pass literal commas and you will need to quote slashes. It's
way past ugly.
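The quoting problem can be illustrated with a small sketch. It assumes the backslash convention that parse_arg() in the posted patch appears to use (a backslash makes the next character literal); the function name escape_option_value is hypothetical. Note that a '/' cannot be rescued by any escape, because the VFS splits the path on it before a lookup method ever sees the component.

```c
#include <assert.h>
#include <stddef.h>

/* Escape an option value so it can be embedded in a comma separated
 * argument list: ',' and '\\' get a backslash prefix.
 * Returns the number of bytes written (excluding the trailing NUL),
 * or -1 if the value contains '/' or does not fit in out. */
int escape_option_value(const char *in, char *out, size_t out_size)
{
	size_t n = 0;

	assert(in && out);

	for (; *in; in++) {
		/* A single path component can never contain a slash. */
		if (*in == '/')
			return -1;

		if (*in == ',' || *in == '\\') {
			if (n + 1 >= out_size)
				return -1;
			out[n++] = '\\';
		}

		/* Keep one byte free for the trailing NUL. */
		if (n + 1 >= out_size)
			return -1;
		out[n++] = *in;
	}

	if (n >= out_size)
		return -1;
	out[n] = '\0';
	return (int)n;
}
```

With this convention the value "a,b" becomes the four bytes a, backslash, comma, b, while a value such as "etc/passwd" is simply not representable.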
* Re: [RFD w/info-PATCH] device arguments from lookup, partion codein userspace
2001-05-19 12:00 ` Alexander Viro
@ 2001-05-19 12:06 ` Andrew Morton
2001-05-19 15:56 ` [RFD w/info-PATCH] device arguments from lookup, partion code in userspace Ben LaHaise
1 sibling, 0 replies; 27+ messages in thread
From: Andrew Morton @ 2001-05-19 12:06 UTC (permalink / raw)
To: Alexander Viro
Cc: Andries.Brouwer, bcrl, torvalds, linux-fsdevel, linux-kernel
Alexander Viro wrote:
>
> It's way past ugly.
I knew you'd like it.
It kind of makes sense, because it puts the two primary stream-of-bytes
objects in Unix into the same namespace, with the same accessors.
So if some random application is expecting a filename well heck, you
just give it a path-to-executable with args. It won't care, although
it may have trouble lseek()ing on it.
It wasn't very serious at all.
-
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
2001-05-19 6:23 Ben LaHaise
` (2 preceding siblings ...)
2001-05-19 11:37 ` Eric W. Biederman
@ 2001-05-19 13:53 ` Daniel Phillips
2001-05-19 18:31 ` Linus Torvalds
4 siblings, 0 replies; 27+ messages in thread
From: Daniel Phillips @ 2001-05-19 13:53 UTC (permalink / raw)
To: Ben LaHaise, torvalds; +Cc: viro, linux-kernel, linux-fsdevel
On Saturday 19 May 2001 08:23, Ben LaHaise wrote:
> /dev/sda/offset=1024,limit=2048
> -> open a device that gives a view of sda at an
> offset of 1KB to 2KB
Whatever we end up with, can we express it in terms of base, size,
please?
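For comparison, here is a minimal sketch of the (base, size) form and how it relates to (offset, limit). It assumes limit is the absolute end of the view in bytes, which is how restrict_make_req() in the posted patch appears to treat it; struct window and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512ULL

/* A window onto a parent device, in 512-byte sectors. */
struct window {
	uint64_t base;	/* first sector of the window on the parent */
	uint64_t size;	/* number of sectors in the window */
};

/* Convert a byte-based (offset, limit) pair, with limit taken as the
 * absolute end of the view, into (base, size).
 * Returns 0 on success, -1 if limit lies before offset. */
int window_from_offset_limit(uint64_t offset, uint64_t limit,
			     struct window *w)
{
	assert(w);

	if (limit < offset)
		return -1;
	w->base = offset / SECTOR_SIZE;
	w->size = limit / SECTOR_SIZE - w->base;
	return 0;
}

/* Map a window-relative sector to a sector on the parent device.
 * Returns 0 and stores the result in *out, or -1 if the sector is
 * outside the window or the addition would overflow. */
int window_remap(const struct window *w, uint64_t sector, uint64_t *out)
{
	assert(w && out);

	if (sector >= w->size)
		return -1;
	if (w->base > UINT64_MAX - sector)
		return -1;
	*out = w->base + sector;
	return 0;
}
```

With (base, size) the bounds check compares the incoming sector against size directly, so the range test and the overflow test stay separate.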
--
Daniel
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
2001-05-19 11:37 ` Eric W. Biederman
@ 2001-05-19 14:25 ` Daniel Phillips
2001-05-21 8:14 ` Lars Marowsky-Bree
0 siblings, 1 reply; 27+ messages in thread
From: Daniel Phillips @ 2001-05-19 14:25 UTC (permalink / raw)
To: Eric W. Biederman, Ben LaHaise
Cc: torvalds, viro, linux-kernel, linux-fsdevel
On Saturday 19 May 2001 13:37, Eric W. Biederman wrote:
> For creating partitions you might want to do:
> echo 1024 2048 > /dev/sda/newpartition
How about:
# mkpart /dev/sda /dev/mypartition -o size=1024k,type=swap
# ls /dev/mypartition
base size device type
# cat /dev/mypartition/size
1048576
# cat /dev/mypartition/device
/dev/sda
# mke2fs /dev/mypartition
The information that was specified is persistent in /dev. We can
rearrange our physical devices any way we want without affecting
the name we chose in /dev. When the kernel enumerates devices
at startup, our persistent information had better match or we will have
to take some corrective action.
Generally, we shouldn't care which order the kernel enumerates
devices in or which device number gets assigned internally. If we
did need to care, we'd just do:
# echo 666 >/dev/mypartition/number
setting a persistent device minor number. The major number is
inherited via the partition's /device property.
To set the minor number back to 'don't care':
# rm /dev/mypartition/number
By taking the physical device off the top of the food chain we
gain the flexibility of being able to move the device from bus to
bus for example, and only the partition's device property
changes, nothing in our fstab. It's no great leap to set things
up so that not even the /device property would need to
change.
Note that we can have a hierarchy of partitions this way if
we want to, since /dev/mypartition is just another block
device.
--
Daniel
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
2001-05-19 12:00 ` Alexander Viro
2001-05-19 12:06 ` [RFD w/info-PATCH] device arguments from lookup, partion codein userspace Andrew Morton
@ 2001-05-19 15:56 ` Ben LaHaise
2001-05-19 16:25 ` [RFD w/info-PATCH] device arguments from lookup, partion code Alan Cox
1 sibling, 1 reply; 27+ messages in thread
From: Ben LaHaise @ 2001-05-19 15:56 UTC (permalink / raw)
To: Alexander Viro
Cc: Andrew Morton, Andries.Brouwer, torvalds, linux-fsdevel,
linux-kernel
On Sat, 19 May 2001, Alexander Viro wrote:
> Ben's /dev/md0/<living_horror> is ugly - it's open just for side effects,
> with no IO supposed to happen.
Now that I'm awake and refreshed, yeah, that's awful. But
echo "hot-add,slot=5,device=/dev/sda" >/dev/md0/control *is* sane. Heck,
the system can even send back result codes that way.
-ben
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code
2001-05-19 15:56 ` [RFD w/info-PATCH] device arguments from lookup, partion code in userspace Ben LaHaise
@ 2001-05-19 16:25 ` Alan Cox
2001-05-19 16:36 ` Alexander Viro
` (4 more replies)
0 siblings, 5 replies; 27+ messages in thread
From: Alan Cox @ 2001-05-19 16:25 UTC (permalink / raw)
To: Ben LaHaise
Cc: Alexander Viro, Andrew Morton, Andries.Brouwer, torvalds,
linux-fsdevel, linux-kernel
> Now that I'm awake and refreshed, yeah, that's awful. But
> echo "hot-add,slot=5,device=/dev/sda" >/dev/md0/control *is* sane. Heck,
> the system can even send back result codes that way.
Only to an English speaker. I suspect Quebec City canadians would prefer a
different command set.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code
2001-05-19 16:25 ` [RFD w/info-PATCH] device arguments from lookup, partion code Alan Cox
@ 2001-05-19 16:36 ` Alexander Viro
2001-05-19 16:44 ` Matthew Wilcox
` (3 subsequent siblings)
4 siblings, 0 replies; 27+ messages in thread
From: Alexander Viro @ 2001-05-19 16:36 UTC (permalink / raw)
To: Alan Cox
Cc: Ben LaHaise, Andrew Morton, Andries.Brouwer, torvalds,
linux-fsdevel, linux-kernel
On Sat, 19 May 2001, Alan Cox wrote:
> > Now that I'm awake and refreshed, yeah, that's awful. But
> > echo "hot-add,slot=5,device=/dev/sda" >/dev/md0/control *is* sane. Heck,
> > the system can even send back result codes that way.
>
> Only to an English speaker. I suspect Quebec City canadians would prefer a
> different command set.
Alan, I'm not a native speaker and I have worked with a system that had the
names of its utilities translated into Russian.
It was hell. And I don't think that replacing compress (or пакуй) with
42A1769 would make it better.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code
2001-05-19 16:25 ` [RFD w/info-PATCH] device arguments from lookup, partion code Alan Cox
2001-05-19 16:36 ` Alexander Viro
@ 2001-05-19 16:44 ` Matthew Wilcox
2001-05-19 18:01 ` Nicolas Pitre
` (2 subsequent siblings)
4 siblings, 0 replies; 27+ messages in thread
From: Matthew Wilcox @ 2001-05-19 16:44 UTC (permalink / raw)
To: Alan Cox
Cc: Ben LaHaise, Alexander Viro, Andrew Morton, Andries.Brouwer,
torvalds, linux-fsdevel, linux-kernel
On Sat, May 19, 2001 at 05:25:22PM +0100, Alan Cox wrote:
> Only to an English speaker. I suspect Quebec City canadians would prefer a
> different command set.
Should we support `pas387' as well as `no387' as a kernel boot parameter
then? Face it, a sysadmin has to know the limited subset of English
which is used to configure a kernel.
--
Revolutions do not require corporate support.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code
2001-05-19 16:25 ` [RFD w/info-PATCH] device arguments from lookup, partion code Alan Cox
2001-05-19 16:36 ` Alexander Viro
2001-05-19 16:44 ` Matthew Wilcox
@ 2001-05-19 18:01 ` Nicolas Pitre
2001-05-19 18:34 ` Linus Torvalds
2001-05-20 19:53 ` Pavel Machek
4 siblings, 0 replies; 27+ messages in thread
From: Nicolas Pitre @ 2001-05-19 18:01 UTC (permalink / raw)
To: Alan Cox
Cc: Ben LaHaise, Alexander Viro, Andrew Morton, Andries.Brouwer,
torvalds, linux-fsdevel, linux-kernel
On Sat, 19 May 2001, Alan Cox wrote:
> > Now that I'm awake and refreshed, yeah, that's awful. But
> > echo "hot-add,slot=5,device=/dev/sda" >/dev/md0/control *is* sane. Heck,
> > the system can even send back result codes that way.
>
> Only to an English speaker. I suspect Quebec City canadians would prefer a
> different command set.
Well... Around here we've been used to Microsoft translations like:
ETES-VOUS CERTAIN [O/N] ?
... and of course pressing 'o' doesn't work while 'y' does. :-)
Wanting to localize such low-level keywords is utopian. Otherwise you'll
want to translate command names like free, rm, mv, etc., and even
programming-language keywords such as those of C. And then you reach a point
where nothing is interoperable any more.
Nicolas
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
2001-05-19 6:23 Ben LaHaise
` (3 preceding siblings ...)
2001-05-19 13:53 ` Daniel Phillips
@ 2001-05-19 18:31 ` Linus Torvalds
4 siblings, 0 replies; 27+ messages in thread
From: Linus Torvalds @ 2001-05-19 18:31 UTC (permalink / raw)
To: Ben LaHaise; +Cc: viro, linux-kernel, linux-fsdevel
On Sat, 19 May 2001, Ben LaHaise wrote:
>
> 1. Generic lookup method and argument parsiing (fs/lookupargs.c)
Looks sane.
> 2. Restricted block device (drivers/block/blkrestrict.c)
This is not very user-friendly, but along with symlinks this makes perfect
sense. It would make partition handling a _lot_ simpler.
Note, however, that I think the "restricted block device" is a much more
generic issue than just block devices. I've already discussed with Alan
the possibility of making _all_ file descriptors have the notion of
"restrictions", notably the "start, end" kind of things.
It is very useful for other things too - imagine opening /dev/mem, and
wanting to pass a restricted portion of it to other processes with the
standard file descriptor passing facilities (think "secure DGA" for the X
server, but also think untrusted users that can read parts of shared files
etc - a suid program that opens a file, restricts it, drops privileges and
knows that the program can only access a specific part of the file)
> 3. Userspace partition code proposal
Yes and no.
I absolutely think the idea of users actually _using_ these names is a
horrible one, and fraught with potential for much too easy mistakes that
end up being disastrous.
But having symlinks that are created by a special program would be ok.
[ Also, note how symlinks would make the point of initrd completely
moot. You don't have to have initrd to initialize the thing, you can
initialize the thing at installation time and when doing fdisk, and the
symlinks would act as the permanent markers. ]
HOWEVER, you have to realize that there are serious security and
maintenance issues here, and I think your idea breaks down completely
because of that.
The thing is, you only have permissions on a "per-object" basis, and it's
common practice to have different permissions for different partitions.
Your scheme does not allow this. Which means that it is fundamentally
broken. Sorry.
So don't go overboard. The name-based thing is useful, but it's useful for
only certain things. And you must _never_ forget the security and
management issues.
For example, if you can open a serial port in the first place, you can set
its baud-rate. So it's ok to make baud-rate part of the name. And once you
have permission to read /dev/fd0 it doesn't make sense to limit you to one
particular format. So it's ok to have the disk format be part of the name.
But it's not possible to make the partition be a "name" issue. Because
while you obviously need different names, you _also_ need different
permissions.
Linus
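
An illustrative aside, not part of the original message: the "start, end"
restriction on a file descriptor can be modelled in userspace as a
read-only window onto a seekable object. This is only a sketch of the
semantics. The class name RestrictedView is hypothetical, and unlike a
kernel-enforced restriction it gives no security, since anyone holding the
backing object can bypass it.

```python
import io


class RestrictedView:
    """Read-only window [start, end) onto a seekable binary file object."""

    def __init__(self, backing, start, end):
        if not 0 <= start <= end:
            raise ValueError("need 0 <= start <= end")
        self._backing = backing
        self._start = start
        self._end = end
        self._pos = 0  # position relative to the start of the window

    def seek(self, offset):
        # Clamp so the position can never leave the window.
        self._pos = max(0, min(offset, self._end - self._start))
        return self._pos

    def read(self, size=-1):
        remaining = self._end - self._start - self._pos
        if size < 0 or size > remaining:
            size = remaining
        self._backing.seek(self._start + self._pos)
        data = self._backing.read(size)
        self._pos += len(data)
        return data


backing = io.BytesIO(b"0123456789ABCDEF")
view = RestrictedView(backing, start=4, end=10)
first = view.read(4)  # bytes 4..7 of the backing object
rest = view.read()    # bytes 8..9; the window ends at offset 10
```

A reader of the view sees offsets relative to the window, so data outside
[start, end) is simply not addressable through it.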
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code
2001-05-19 16:25 ` [RFD w/info-PATCH] device arguments from lookup, partion code Alan Cox
` (2 preceding siblings ...)
2001-05-19 18:01 ` Nicolas Pitre
@ 2001-05-19 18:34 ` Linus Torvalds
2001-05-19 22:34 ` Ingo Oeser
2001-05-20 19:53 ` Pavel Machek
4 siblings, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2001-05-19 18:34 UTC (permalink / raw)
To: Alan Cox
Cc: Ben LaHaise, Alexander Viro, Andrew Morton, Andries.Brouwer,
linux-fsdevel, linux-kernel
On Sat, 19 May 2001, Alan Cox wrote:
>
> > Now that I'm awake and refreshed, yeah, that's awful. But
> > echo "hot-add,slot=5,device=/dev/sda" >/dev/md0/control *is* sane. Heck,
> > the system can even send back result codes that way.
>
> Only to an English speaker. I suspect Quebec City canadians would prefer a
> different command set.
I was waiting for the "anglo-saxon" argument.
I don't think it's a valid argument. You already have "/dev". You already
have english names for the numbers in ioctl's (and let's not be mentally
dishonest and say "numbers are cross-cultural", because NOBODY MUST EVER
USE THE RAW NUMBERS - you have to use the anglo-saxon #define'd names
because the numbers aren't even cross-platform on Linux, much less
portable to other systems).
So the "English is bad" argument is a complete non-argument.
Linus
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code
2001-05-19 18:34 ` Linus Torvalds
@ 2001-05-19 22:34 ` Ingo Oeser
2001-05-19 23:42 ` Alexander Viro
2001-05-20 17:10 ` Padraig Brady
0 siblings, 2 replies; 27+ messages in thread
From: Ingo Oeser @ 2001-05-19 22:34 UTC (permalink / raw)
To: Linus Torvalds
Cc: Alan Cox, Ben LaHaise, Alexander Viro, Andrew Morton,
Andries.Brouwer, linux-fsdevel, linux-kernel
On Sat, May 19, 2001 at 11:34:48AM -0700, Linus Torvalds wrote:
[Reasons]
> So the "English is bad" argument is a complete non-argument.
Jepp, I have to agree.
English is used more or less as a communication protocol in
computer science and for operating computers.
Once you know how to operate a computer in English, you can
operate nearly every computer in the world, because they have
English as default locale.
Let's not repeat Babel please :-(
PS: English is neither my native language nor Linus's. Why do
the English natives complain instead of us? ;-)
<off topic side note>
And be glad that it's not German that has this role. English
sentences are WAY easier for computers to parse, because English
doesn't use many suffixes and prefixes on words and has very
few exceptions. Also these exceptions are eliminated from
command languages WITHOUT influencing readability and
comprehensibility.
</off topic side note>
Regards
Ingo Oeser
--
To the systems programmer,
users and applications serve only to provide a test load.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code
2001-05-19 22:34 ` Ingo Oeser
@ 2001-05-19 23:42 ` Alexander Viro
2001-05-20 0:11 ` Alan Cox
2001-05-20 17:10 ` Padraig Brady
1 sibling, 1 reply; 27+ messages in thread
From: Alexander Viro @ 2001-05-19 23:42 UTC (permalink / raw)
To: Ingo Oeser
Cc: Linus Torvalds, Alan Cox, Ben LaHaise, Andrew Morton,
Andries.Brouwer, linux-fsdevel, linux-kernel
On Sun, 20 May 2001, Ingo Oeser wrote:
> PS: English is neither mine, nor Linus native language. Why do
> the English natives complain instead of us? ;-)
Because we had some experience with, erm, localized systems and for
Alan it's most likely pure theory? ;-)
Al, still shuddering at the memories
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code
2001-05-19 23:42 ` Alexander Viro
@ 2001-05-20 0:11 ` Alan Cox
0 siblings, 0 replies; 27+ messages in thread
From: Alan Cox @ 2001-05-20 0:11 UTC (permalink / raw)
To: Alexander Viro
Cc: Ingo Oeser, Linus Torvalds, Alan Cox, Ben LaHaise, Andrew Morton,
Andries.Brouwer, linux-fsdevel, linux-kernel
> On Sun, 20 May 2001, Ingo Oeser wrote:
> > PS: English is neither mine, nor Linus native language. Why do
> > the English natives complain instead of us? ;-)
>
> Because we had some experience with, erm, localized systems and for
> Alan it's most likely pure theory? ;-)
I think it's important that it's considered. I do like the idea of a sensible
ioctl encoding (potentially including ASCII) and being able to ship ioctls
over the network.
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code
2001-05-19 22:34 ` Ingo Oeser
2001-05-19 23:42 ` Alexander Viro
@ 2001-05-20 17:10 ` Padraig Brady
1 sibling, 0 replies; 27+ messages in thread
From: Padraig Brady @ 2001-05-20 17:10 UTC (permalink / raw)
To: Ingo Oeser; +Cc: linux-kernel
Obviously there has to be some standard base
with which to work, especially for computer language
keywords as these can't be converted due to name
clashes. What would be cool is to pick a better base
language than English that everyone would have to
learn to "use computers". This is especially important
for opensource as it would greatly ease the operation
of the collective brain. Something easily parseable
would be an obvious criterion and would allow us
to interact with computers by voice(-recognition)
with no ambiguity, etc. etc...
tada: http://www.lojban.org/
will everything be changed over in the 2.5 timeframe? :-)
Padraig.
Ingo Oeser wrote:
>On Sat, May 19, 2001 at 11:34:48AM -0700, Linus Torvalds wrote:
>[Reasons]
>
>>So the "English is bad" argument is a complete non-argument.
>>
>
>Jepp, I have to agree.
>
>English is used more or less as an communication protocol in
>computer science and for operating computers.
>
>Once you know how to operate an computer in English, you can
>operate nearly every computer in the world, because they have
>English as default locale.
>
>Let's not repeat Babel please :-(
>
>PS: English is neither mine, nor Linus native language. Why do
> the English natives complain instead of us? ;-)
>
><off topic side note>
> And be glad that's not German, that has this role. English
> sentences are WAY easier to parse by computers, because it
> doesn't use much suffixes and prefixes on words and has very
> few exceptions. Also these exceptions are eleminated from
> command languages WITHOUT influencing readability and
> comprehensability.
></off topic side note>
>
>
>Regards
>
>Ingo Oeser
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code
2001-05-19 16:25 ` [RFD w/info-PATCH] device arguments from lookup, partion code Alan Cox
` (3 preceding siblings ...)
2001-05-19 18:34 ` Linus Torvalds
@ 2001-05-20 19:53 ` Pavel Machek
4 siblings, 0 replies; 27+ messages in thread
From: Pavel Machek @ 2001-05-20 19:53 UTC (permalink / raw)
To: Alan Cox, Ben LaHaise; +Cc: linux-fsdevel, linux-kernel
Hi!
> > Now that I'm awake and refreshed, yeah, that's awful. But
> > echo "hot-add,slot=5,device=/dev/sda" >/dev/md0/control *is* sane. Heck,
> > the system can even send back result codes that way.
>
> Only to an English speaker. I suspect Quebec City canadians would prefer a
> different command set.
Alan, bad idea.
This is less evil than magic numbers, and *users* should not be
touching this anyway. They should have nice gui tools that do it for
them.
English is *way* better than magic numbers. It makes sense at least
for someone.
Pavel
--
I'm pavel@ucw.cz. "In my country we have almost anarchy and I don't care."
Panos Katsaloulis describing me w.r.t. patents at discuss@linmodems.org
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
2001-05-19 14:25 ` Daniel Phillips
@ 2001-05-21 8:14 ` Lars Marowsky-Bree
2001-05-22 9:07 ` Daniel Phillips
0 siblings, 1 reply; 27+ messages in thread
From: Lars Marowsky-Bree @ 2001-05-21 8:14 UTC (permalink / raw)
To: Daniel Phillips
Cc: Eric W. Biederman, Ben LaHaise, torvalds, viro, linux-kernel,
linux-fsdevel
On 2001-05-19T16:25:47,
Daniel Phillips <phillips@bonn-fries.net> said:
> How about:
>
> # mkpart /dev/sda /dev/mypartition -o size=1024k,type=swap
> # ls /dev/mypartition
> base size device type
> # cat /dev/mypartition/size
> 1048576
> # cat /dev/mypartition/device
> /dev/sda
> # mke2fs /dev/mypartition
Ek. You want to run mke2fs on a _directory_ ?
If anything, /dev/mypartition/realdev
Sincerely,
Lars Marowsky-Brée <lmb@suse.de>
--
Perfection is our goal, excellence will be tolerated. -- J. Yahl
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
@ 2001-05-21 12:43 Andries.Brouwer
2001-05-21 16:08 ` Daniel Phillips
0 siblings, 1 reply; 27+ messages in thread
From: Andries.Brouwer @ 2001-05-21 12:43 UTC (permalink / raw)
To: bcrl, phillips; +Cc: linux-fsdevel, linux-kernel, torvalds, viro
How about:
# mkpart /dev/sda /dev/mypartition -o size=1024k,type=swap
# ls /dev/mypartition
base size device type
Generally, we shouldn't care which order the kernel enumerates
devices in or which device number gets assigned internally. If we
did need to care, we'd just do:
# echo 666 >/dev/mypartition/number
Only a single thing is of interest.
What is the communication between user space and kernel
that transports device identities?
Note that there is user (human) / user space (programs) / kernel.
This user has interesting machinery in his hands,
but his programs have only strings (path names, fake or not)
to give to the kernel in open() and mount() calls.
Now the device path is so complicated that the user is unable to
describe it using a path name. devfs made an attempt listing controller,
lun, etc etc but /dev/ide/host0/bus1/target1/lun0/disc is not very
attractive, and things only get worse.
When I go to a bookshop to buy a book, I can do so without specifying
all of Author, Editors, Title, Publisher, Date, ISBN, nr of pages, ...
A few items suffice. Often the Title alone will do.
We want an interface where the kernel exports what it has to offer
and the user can pick. Yes, that Zip drive - never mind the bus.
But he can still distinguish - Yes, that USB Zip drive, not the one
on the parallel port.
The five minute hack would number devices 1, 2, 3 in order of detection,
offer the detection message in /devices/<nr>/detectionmessage
and a corresponding device node in /devices/<nr>/devicenode.
The sysadmin figures out what is what, makes a collection of
symlinks with his favorite names, and everybody is happy.
Until the next reboot. Or until device removal and addition.
There must be a way to give permanence to an association
between name and device. Symlinks into a virtual filesystem
like /devices are not good enough. Turning the five minute
hack into a ten minute hack we take the md5sum of the part
of the bootmessage that is expected to be the same the next time
we encounter this device and use that as device number.
I think a system somewhat in this style could be made to work well.
Andries
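
An illustrative aside, not part of the original message: the "ten minute
hack" can be sketched as hashing the stable part of each detection message
and using the digest, rather than the detection order, as the device
identifier. The function names and the detection messages below are
hypothetical, and the digest is truncated only for readability.

```python
import hashlib


def stable_device_id(detection_message):
    """Derive an identifier from the stable part of a detection message."""
    digest = hashlib.md5(detection_message.encode("utf-8")).hexdigest()
    return digest[:8]  # shortened for readability; collisions are possible


def enumerate_devices(detection_messages):
    """Map stable id -> detection message, independent of detection order."""
    return {stable_device_id(msg): msg for msg in detection_messages}


# Hypothetical detection messages; real kernel output differs.
boot_one = ["Iomega ZIP 100 on USB", "Iomega ZIP 100 on parallel port"]
boot_two = list(reversed(boot_one))  # same hardware, detected in another order

# The sysadmin's chosen name refers to the stable id, not to a position.
names = {"zip-usb": stable_device_id("Iomega ZIP 100 on USB")}
for boot in (boot_one, boot_two):
    devices = enumerate_devices(boot)
    print(devices[names["zip-usb"]])
```

Because the identifier depends only on the message text, the name
"zip-usb" resolves to the same device in both boots. Two identical devices
with identical messages would still collide, so a real scheme would need
some further distinguishing input.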
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
2001-05-21 12:43 Andries.Brouwer
@ 2001-05-21 16:08 ` Daniel Phillips
0 siblings, 0 replies; 27+ messages in thread
From: Daniel Phillips @ 2001-05-21 16:08 UTC (permalink / raw)
To: Andries.Brouwer, bcrl, phillips
Cc: linux-fsdevel, linux-kernel, torvalds, viro
On Monday 21 May 2001 14:43, Andries.Brouwer@cwi.nl wrote:
> How about:
>
> # mkpart /dev/sda /dev/mypartition -o size=1024k,type=swap
> # ls /dev/mypartition
> base size device type
>
> Generally, we shouldn't care which order the kernel enumerates
> devices in or which device number gets assigned internally. If
> we did need to care, we'd just do:
>
> # echo 666 >/dev/mypartition/number
>
> Only a single thing is of interest.
> What is the communication between user space and kernel
> that transports device identities?
It doesn't change, the same symbolic names still work. What's
happening in my example is, we've gotten rid of the
can't-get-there-from-here device naming hierarchy. It should
be clear by now that we can't capture 'physical device location'
and 'device function' in one tree. So instead, 'physical device'
is a property of 'logical device'. The tree is now optional.
> Note that there is user (human) / user space (programs) / kernel.
>
> This user has interesting machinery in his hands,
> but his programs have only strings (path names, fake or not)
> to give to the kernel in open() and mount() calls.
>
> Now the device path is so complicated that the user is unable to
> describe it using a path name. devfs made an attempt listing
> controller, lun, etc etc but /dev/ide/host0/bus1/target1/lun0/disc is
> not very attractive, and things only get worse.
Yes, we flatten that by making host, bus, target and lun all
properties of /proc/ide/hda.
Our mistake up to now is that we've tried to carry the logical
view and physical view of the device in one name, or equivalently,
in path+name. Let the physical device be a property of the logical
device and we no longer have our thumb tied to our nose.
> When I go to a bookshop to buy a book, I can do so without specifying
> all of Author, Editors, Title, Publisher, Date, ISBN, nr of pages,
> ... A few items suffice. Often the Title alone will do.
>
> We want an interface where the kernel exports what it has to offer
> and the user can pick. Yes, that Zip drive - never mind the bus.
> But can distinguish - Yes, that USB Zip drive, not the one
> on the parallel port.
100% agreed. IOW, when the device *does* move we can usually
deduce where it's moved to, so let's update hda's bus location
automatically whenever we can (log a message!) and only bother
the user about it if it's ambiguous. For good measure, have a
system setting that says 'on a scale of 0 to 5, this is how interested
I am in being bothered about the fact that a device seems to have
moved'.
> The five minute hack would number devices 1, 2, 3 in order of
> detection, offer the detection message in
> /devices/<nr>/detectionmessage and a corresponding device node in
> /devices/<nr>/devicenode. The sysadmin figures out what is what,
> makes a collection of symlinks with his favorite names, and everybody
> is happy.
>
> Until the next reboot. Or until device removal and addition.
> There must be a way to give permanence to an association
> between name and device. Symlinks into a virtual filesystem
> like /devices are not good enough. Turning the five minute
> hack into a ten minute hack we take the md5sum of the part
> of the bootmessage that is expected to be the same the next time
> we encounter this device and use that as device number.
>
> I think a system somewhat in this style could be made to work well.
Yes, we are advocating the same thing. I didn't mention that the
device properties are supposed to be persistent, did I? If you
accept the idea of persistent device properties then the obvious
thing to do is to match them up against the detected devices.
I didn't want to bring up the persistency thing right away because
it raises the question of where you store the persistent data for the
root device. Until the namespace issue is resolved this is mainly
a distraction.
--
Daniel
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
2001-05-21 8:14 ` Lars Marowsky-Bree
@ 2001-05-22 9:07 ` Daniel Phillips
0 siblings, 0 replies; 27+ messages in thread
From: Daniel Phillips @ 2001-05-22 9:07 UTC (permalink / raw)
To: Lars Marowsky-Bree
Cc: Eric W. Biederman, Ben LaHaise, torvalds, viro, linux-kernel,
linux-fsdevel
On Monday 21 May 2001 10:14, Lars Marowsky-Bree wrote:
> On 2001-05-19T16:25:47,
>
> Daniel Phillips <phillips@bonn-fries.net> said:
> > How about:
> >
> > # mkpart /dev/sda /dev/mypartition -o size=1024k,type=swap
> > # ls /dev/mypartition
> > base size device type
> > # cat /dev/mypartition/size
> > 1048576
> > # cat /dev/mypartition/device
> > /dev/sda
> > # mke2fs /dev/mypartition
>
> Ek. You want to run mke2fs on a _directory_ ?
Could you be specific about what is wrong with that? Assuming that
this device directory lives on a special purpose filesystem?
> If anything, /dev/mypartition/realdev
Then every fstab in the world has to change, not to mention adding
verbosity to interactive commands.
--
Daniel
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [RFD w/info-PATCH] device arguments from lookup, partion code in userspace
@ 2001-05-22 18:45 Andries.Brouwer
0 siblings, 0 replies; 27+ messages in thread
From: Andries.Brouwer @ 2001-05-22 18:45 UTC (permalink / raw)
To: Andries.Brouwer, bcrl, phillips
Cc: linux-fsdevel, linux-kernel, torvalds, viro
>> What is the communication between user space and kernel
>> that transports device identities?
> It doesn't change, the same symbolic names still work.
But today, unless you think of devfs or so, device identities
are not transported by symbolic names. They are given by
device numbers.
[Yes, symbolic names have a certain secondary role, e.g. in error
messages, or perhaps to indicate the boot device.]
Andries
^ permalink raw reply [flat|nested] 27+ messages in thread