public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* catch-22 in current partitioning layer
@ 2002-10-30 11:26 Peter T. Breuer
  2002-10-30 19:45 ` Luc Van Oostenryck
  0 siblings, 1 reply; 2+ messages in thread
From: Peter T. Breuer @ 2002-10-30 11:26 UTC (permalink / raw)
  To: linux kernel

Kernel 2.5.44

1) a module which needs to create a device needs to call add_disk
   in it's init, or it seems the kernel won't allow opens of the
   device node.

   * But it can't call add_disk normally without invoking
   * check_partitions (escape, do so with capacity=0 or minors=1)
   * which will stall as it needs to read the device or its
   * backer on disk, and we're still in init ...

   This hits NBD. Pavel's kernel NBD escaped (luckily?-) because his
   device is not partitionable, and so he called add_disk with
   minors=1 even though capacity was set to maximum. But he narrowly
   missed hanging in init.

   It's not clear how we're supposed to /subsequently/ go out and find
   some partitions over NBD, or how to avoid the races in doing so,
   since we are already sort of open for business.  HEEEEELP. My own
   ENBD is being hit by this because it is partitionable.

   Can someone document the new genhd and partitions interfaces?
   I'm confused as anything, and fed up hunting for disappeared things
   that now seem to be available via strings of pointer references off
   of new structs (how do I get to rescan_partitions, for example!).

2) Partitions, ah yes, partitions.

   Well, seems the kernel is similarly not permitting opens of minors
   until we've run the partition check in add_disk. That's good
   encapsulation if you are planning on doing i/o to them, but I
   was planning on doing ioctls! And I don't think it's fair to
   stop me doing ioctls to the minors. Well, we can argue that, but it
   will break my userspace tools, and I'd rather not introduce 
   incompatibilities between kernel versions.
   
   In any case, I don't think it's fair to equate minors with
   partitions. Partitions are ONE use of minors. I use minors as
   indices, to indicate which channel to a device should be used.
   The device has its partition structure, and any partition can
   be served (I'm speaking of requests) via any channel, i.e.,
   any minor.

   * My catch 22 is that I can't start using the channels/minors to
   * talk to the remote device (ENBD) until I've opened the device
   * and done the partition search via add_disk, which won't work 
   * until the channels are open ...


Help. I'm still fairly sane, but this is hurting.

Ideally, what I would like, is to be able to open the minors for 
ioctls before allowing them to be open for r/w as partitions. Then
I don't need to do any changes. I open NOBLOCK for ioctls, if
that's any help.

Now, what's with this register_region business?  Can I maybe open 256
devices on the same major, with 1 minor each, just to enable
communications to the remote, then read the remote partition table, then
unregister them all, then register them all again this time with 16
minors per device, as I originally wanted?  No!  The partition check
will fail, because there are too many partitions (>0) !

Help.

OK, I know what the userspace solution is, but I don't want to do it:
Make all ioctls go the the whole disk minor. And keep them there. And
even then, how am I supposed to rescan the device for partitions and
avoid the races that must be in progress for access to those partitions
at startup?




Peter

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: catch-22 in current partitioning layer
  2002-10-30 11:26 catch-22 in current partitioning layer Peter T. Breuer
@ 2002-10-30 19:45 ` Luc Van Oostenryck
  0 siblings, 0 replies; 2+ messages in thread
From: Luc Van Oostenryck @ 2002-10-30 19:45 UTC (permalink / raw)
  To: Kernel mailing list

Peter T. Breuer wrote:
> Kernel 2.5.44
> 
> 1) a module which needs to create a device needs to call add_disk
>    in it's init, or it seems the kernel won't allow opens of the
>    device node.
> 
>    * But it can't call add_disk normally without invoking
>    * check_partitions (escape, do so with capacity=0 or minors=1)
>    * which will stall as it needs to read the device or its
>    * backer on disk, and we're still in init ...
> 
>    This hits NBD. Pavel's kernel NBD escaped (luckily?-) because his
>    device is not partitionable, and so he called add_disk with
>    minors=1 even though capacity was set to maximum. But he narrowly
>    missed hanging in init.
> 
>    It's not clear how we're supposed to /subsequently/ go out and find
>    some partitions over NBD, or how to avoid the races in doing so,
>    since we are already sort of open for business.  HEEEEELP. My own
>    ENBD is being hit by this because it is partitionable.
> 
>    Can someone document the new genhd and partitions interfaces?
>    I'm confused as anything, and fed up hunting for disappeared things
>    that now seem to be available via strings of pointer references off
>    of new structs (how do I get to rescan_partitions, for example!).
> 
> 2) Partitions, ah yes, partitions.
> 
>    Well, seems the kernel is similarly not permitting opens of minors
>    until we've run the partition check in add_disk. That's good
>    encapsulation if you are planning on doing i/o to them, but I
>    was planning on doing ioctls! And I don't think it's fair to
>    stop me doing ioctls to the minors. Well, we can argue that, but it
>    will break my userspace tools, and I'd rather not introduce 
>    incompatibilities between kernel versions.
>    
>    In any case, I don't think it's fair to equate minors with
>    partitions. Partitions are ONE use of minors. I use minors as
>    indices, to indicate which channel to a device should be used.
>    The device has its partition structure, and any partition can
>    be served (I'm speaking of requests) via any channel, i.e.,
>    any minor.
> 
>    * My catch 22 is that I can't start using the channels/minors to
>    * talk to the remote device (ENBD) until I've opened the device
>    * and done the partition search via add_disk, which won't work 
>    * until the channels are open ...
> 
> 
> Help. I'm still fairly sane, but this is hurting.
> 
> Ideally, what I would like, is to be able to open the minors for 
> ioctls before allowing them to be open for r/w as partitions. Then
> I don't need to do any changes. I open NOBLOCK for ioctls, if
> that's any help.
> 
> Now, what's with this register_region business?  Can I maybe open 256
> devices on the same major, with 1 minor each, just to enable
> communications to the remote, then read the remote partition table, then
> unregister them all, then register them all again this time with 16
> minors per device, as I originally wanted?  No!  The partition check
> will fail, because there are too many partitions (>0) !
> 
> Help.
> 
> OK, I know what the userspace solution is, but I don't want to do it:
> Make all ioctls go the the whole disk minor. And keep them there. And
> even then, how am I supposed to rescan the device for partitions and
> avoid the races that must be in progress for access to those partitions
> at startup?
> 
> 
> 
> 
> Peter
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

I suppose this is related to the fact that with 2.5.44 (also in -ac),
trying to mount an unformated floppy give the following oops:

Unable to handle kernel NULL pointer dereference at virtual address 00000290
  printing eip:
c01483bc
*pde = 00000000
Oops: 0000
CPU:    1
EIP:    0060:[<c01483bc>]    Not tainted
EFLAGS: 00010202
EIP is at do_open+0x2bc/0x470
eax: c039ba08   ebx: f7f78c00   ecx: 000002a0   edx: f78c4a80
esi: f7881ec0   edi: f7881e60   ebp: c03114cc   esp: f16f9d54
ds: 0068   es: 0068   ss: 0068
Process mount (pid: 159, threadinfo=f16f8000 task=f7921980)
Stack: 00000000 f7881ec0 f16f9e14 00000003 f7881ee0 00026c6c f7f78c00 00000000
        00000000 021c9df4 00000000 0000001c c01485de f7881ec0 f78c4a80 f16f9e14
        00000000 00000000 00000003 f7881ec0 00000000 00000000 f78c4a80 00000000
Call Trace:
  [<c01485de>] blkdev_get+0x6e/0x80
  [<c0146dba>] get_sb_bdev+0xaa/0x220
  [<c017ee2f>] vfat_get_sb+0x1f/0x30
  [<c017ed50>] vfat_fill_super+0x0/0xc0
  [<c01470d0>] do_kern_mount+0x50/0xc0
  [<c015b1b5>] do_add_mount+0x65/0x150
  [<c015b4aa>] do_mount+0x16a/0x190
  [<c015b2ed>] copy_mount_options+0x4d/0xa0
  [<c015b9c5>] sys_mount+0xb5/0x130
  [<c01072ef>] syscall_call+0x7/0xb

Code: 8b 41 f0 0b 41 f4 75 23 8b 46 44 ff 48 54 8b 46 44 8d 48 20



^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2002-10-30 19:40 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-10-30 11:26 catch-22 in current partitioning layer Peter T. Breuer
2002-10-30 19:45 ` Luc Van Oostenryck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox