public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Partition check considered as error is breaking mounting in 2.6.27
@ 2008-09-12 16:56 Herton Ronaldo Krzesinski
  2008-09-12 23:34 ` Andrew Morton
  0 siblings, 1 reply; 28+ messages in thread
From: Herton Ronaldo Krzesinski @ 2008-09-12 16:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: inux-usb, bogdano, Luiz Fernando N. Capitulino

Hi,

Recently I found a problem with a buggy camera that doesn't mount anymore with 
2.6.27 (its memory is available via usb-storage), since commit 
04ebd4aee52b06a2c38127d9208546e5b96f3a19

The camera is an Olympus X-840. The original issue comes from the camera 
itself: its format program creates a partition with an off by one error, 
while the device reports that its memory has 42079 sectors, the partition 
table reports also that the only partition on the disk has the size of 42079, 
but it fails to account for the first sector in the memory that contains the 
partition table, so in the end the partition exceeds the limit of the device 
size (42080, first sector plus 42079 from the first partition).

In previous kernels (2.6.26 and before), I still could mount and access the 
device (/dev/sdb1), although with the following errors:

sd 6:0:0:0: [sdb] Assuming drive cache: write through
 sdb: sdb1
 sdb: p1 exceeds device capacity
sd 6:0:0:0: [sdb] Attached SCSI removable disk
sd 6:0:0:0: Attached scsi generic sg2 type 0
usb-storage: device scan complete
attempt to access beyond end of device
sdb: rw=0, want=42080, limit=42079
__ratelimit: 16 messages suppressed
Buffer I/O error on device sdb1, logical block 42078
attempt to access beyond end of device
sdb: rw=0, want=42080, limit=42079

If you note the log snippet above the first notable thing is "p1 exceeds 
device capacity", so looking at commit 
04ebd4aee52b06a2c38127d9208546e5b96f3a19 it is clear why sdb1 isn't created 
anymore.

After formatting the camera is this you get from fdisk (display units in 
sectors):

Disk /dev/sdb: 21 MB, 21544448 bytes
6 heads, 16 sectors/track, 438 cylinders, total 42079 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1       42079       21039+   1  FAT12
Partition 1 has different physical/logical endings:
     phys=(328, 5, 16) logical=(438, 1, 16)

Note the bogus reported CHS values, both physical and logical, but they don't 
affect anything here.

I don't know if this change of behaviour in 2.6.27 is desired (not creating 
partition nodes if its size exceeds media size). Anyway the device is buggy 
itself. Until now the only way I found to mount device inside 2.6.27 is using 
hexedit to edit directly the partition table on the device, decreasing by 1 
the length of the first partition (42079 -> 42078). Clearing the partition 
table and using fdisk to partition again is not an option, the camera 
firmware seems to be lost after this, it starts to report weird media size 
(not by 1 sector error only):

attempt to access beyond end of device
sdb: rw=0, want=42042, limit=42000
attempt to access beyond end of device
sdb: rw=0, want=42042, limit=42000

Disk /dev/sdb: 21 MB, 21504000 bytes
1 heads, 42 sectors/track, 1000 cylinders, total 42000 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x57ea65f0

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1              42       42041       21000    1  FAT12

(after zeroing out the partition table I only run fdisk on the device, created 
the a new partition with maximum size allowed and change the type to FAT12)

On both cases respectively this is relevant info that I get with usb-storage 
debugging turned on:

* With memory formatted by camera firmware:
usb-storage: queuecommand called
usb-storage: *** thread awakened.
usb-storage: Command READ_CAPACITY (10 bytes)
usb-storage:  25 00 00 00 00 00 00 00 00 00
usb-storage: Bulk Command S 0x43425355 T 0x3 L 8 F 128 Trg 0 LUN 0 CL 10
usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
usb-storage: Status code 0; transferred 31/31
usb-storage: -- transfer complete
usb-storage: Bulk command transfer result=0
usb-storage: usb_stor_bulk_transfer_sglist: xfer 8 bytes, 1 entries
usb-storage: Status code 0; transferred 8/8
usb-storage: -- transfer complete
usb-storage: Bulk data transfer result 0x0
usb-storage: Attempting to get CSW...
usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
usb-storage: Status code 0; transferred 13/13
usb-storage: -- transfer complete
usb-storage: Bulk status result = 0
usb-storage: Bulk Status S 0x53425355 T 0x3 R 0 Stat 0x0
usb-storage: scsi cmd done, result=0x0
usb-storage: *** thread sleeping.
sd 4:0:0:0: [sdb] 42079 512-byte hardware sectors (22 MB)

* With partition table recreated with fdisk and formated by hand:
usb-storage: queuecommand called
usb-storage: *** thread awakened.
usb-storage: Command READ_CAPACITY (10 bytes)
usb-storage:  25 00 00 00 00 00 00 00 00 00
usb-storage: Bulk Command S 0x43425355 T 0x3 L 8 F 128 Trg 0 LUN 0 CL 10
usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
usb-storage: Status code 0; transferred 31/31
usb-storage: -- transfer complete
usb-storage: Bulk command transfer result=0
usb-storage: usb_stor_bulk_transfer_sglist: xfer 8 bytes, 1 entries
usb-storage: Status code 0; transferred 8/8
usb-storage: -- transfer complete
usb-storage: Bulk data transfer result 0x0
usb-storage: Attempting to get CSW...
usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
usb-storage: Status code 0; transferred 13/13
usb-storage: -- transfer complete
usb-storage: Bulk status result = 0
usb-storage: Bulk Status S 0x53425355 T 0x3 R 0 Stat 0x0
usb-storage: scsi cmd done, result=0x0
usb-storage: *** thread sleeping.
sd 6:0:0:0: [sdb] 42000 512-byte hardware sectors (22 MB)

After looking at all this I'm in doubt on what fix could be made in this case. 
May be reverting the fatal error when partition exceeds media size? Or adding 
a new usb-storage quirk to report media size with one more sector (dangerous, 
but physically I don't know what's the real media size as firmware reports a 
way different value of media size after using fdisk), or quirk to force 
something in partition creation to automatically trim partition size by -1.

Any ideas?

More info about the device, may be it's useful:

T:  Bus=03 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#=  2 Spd=12  MxCh= 0
D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
P:  Vendor=07b4 ProdID=0109 Rev= 1.00
S:  Manufacturer=OLYMPUS
S:  Product=FE310,X840,C530
S:  SerialNumber=Y31013411
C:* #Ifs= 1 Cfg#= 1 Atr=c0 MxPwr=  0mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage
E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms

Bus 003 Device 003: ID 07b4:0109 Olympus Optical Co., Ltd C-370Z/D-535Z/X-450
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0        64
  idVendor           0x07b4 Olympus Optical Co., Ltd
  idProduct          0x0109 C-370Z/D-535Z/X-450
  bcdDevice            1.00
  iManufacturer           1 OLYMPUS
  iProduct                2 FE310,X840,C530
  iSerial                 3 Y31013411
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           32
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0xc0
      Self Powered
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     80 Bulk (Zip)
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               0
Device Status:     0x0001
  Self Powered

-- 
[]'s
Herton

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Partition check considered as error is breaking mounting in 2.6.27
@ 2008-09-12 17:01 Herton Ronaldo Krzesinski
  2008-09-12 17:36 ` Alan Stern
  0 siblings, 1 reply; 28+ messages in thread
From: Herton Ronaldo Krzesinski @ 2008-09-12 17:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-usb, bogdano, Luiz Fernando N. Capitulino

[ Sent again now with correct CC to linux-usb ]

Hi,

Recently I found a problem with a buggy camera that doesn't mount anymore with 
2.6.27 (its memory is available via usb-storage), since commit 
04ebd4aee52b06a2c38127d9208546e5b96f3a19

The camera is an Olympus X-840. The original issue comes from the camera 
itself: its format program creates a partition with an off by one error, 
while the device reports that its memory has 42079 sectors, the partition 
table reports also that the only partition on the disk has the size of 42079, 
but it fails to account for the first sector in the memory that contains the 
partition table, so in the end the partition exceeds the limit of the device 
size (42080, first sector plus 42079 from the first partition).

In previous kernels (2.6.26 and before), I still could mount and access the 
device (/dev/sdb1), although with the following errors:

sd 6:0:0:0: [sdb] Assuming drive cache: write through
 sdb: sdb1
 sdb: p1 exceeds device capacity
sd 6:0:0:0: [sdb] Attached SCSI removable disk
sd 6:0:0:0: Attached scsi generic sg2 type 0
usb-storage: device scan complete
attempt to access beyond end of device
sdb: rw=0, want=42080, limit=42079
__ratelimit: 16 messages suppressed
Buffer I/O error on device sdb1, logical block 42078
attempt to access beyond end of device
sdb: rw=0, want=42080, limit=42079

If you note the log snippet above the first notable thing is "p1 exceeds 
device capacity", so looking at commit 
04ebd4aee52b06a2c38127d9208546e5b96f3a19 it is clear why sdb1 isn't created 
anymore.

After formatting the camera is this you get from fdisk (display units in 
sectors):

Disk /dev/sdb: 21 MB, 21544448 bytes
6 heads, 16 sectors/track, 438 cylinders, total 42079 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1       42079       21039+   1  FAT12
Partition 1 has different physical/logical endings:
     phys=(328, 5, 16) logical=(438, 1, 16)

Note the bogus reported CHS values, both physical and logical, but they don't 
affect anything here.

I don't know if this change of behaviour in 2.6.27 is desired (not creating 
partition nodes if its size exceeds media size). Anyway the device is buggy 
itself. Until now the only way I found to mount device inside 2.6.27 is using 
hexedit to edit directly the partition table on the device, decreasing by 1 
the length of the first partition (42079 -> 42078). Clearing the partition 
table and using fdisk to partition again is not an option, the camera 
firmware seems to be lost after this, it starts to report weird media size 
(not by 1 sector error only):

attempt to access beyond end of device
sdb: rw=0, want=42042, limit=42000
attempt to access beyond end of device
sdb: rw=0, want=42042, limit=42000

Disk /dev/sdb: 21 MB, 21504000 bytes
1 heads, 42 sectors/track, 1000 cylinders, total 42000 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x57ea65f0

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1              42       42041       21000    1  FAT12

(after zeroing out the partition table I only run fdisk on the device, created 
the a new partition with maximum size allowed and change the type to FAT12)

On both cases respectively this is relevant info that I get with usb-storage 
debugging turned on:

* With memory formatted by camera firmware:
usb-storage: queuecommand called
usb-storage: *** thread awakened.
usb-storage: Command READ_CAPACITY (10 bytes)
usb-storage:  25 00 00 00 00 00 00 00 00 00
usb-storage: Bulk Command S 0x43425355 T 0x3 L 8 F 128 Trg 0 LUN 0 CL 10
usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
usb-storage: Status code 0; transferred 31/31
usb-storage: -- transfer complete
usb-storage: Bulk command transfer result=0
usb-storage: usb_stor_bulk_transfer_sglist: xfer 8 bytes, 1 entries
usb-storage: Status code 0; transferred 8/8
usb-storage: -- transfer complete
usb-storage: Bulk data transfer result 0x0
usb-storage: Attempting to get CSW...
usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
usb-storage: Status code 0; transferred 13/13
usb-storage: -- transfer complete
usb-storage: Bulk status result = 0
usb-storage: Bulk Status S 0x53425355 T 0x3 R 0 Stat 0x0
usb-storage: scsi cmd done, result=0x0
usb-storage: *** thread sleeping.
sd 4:0:0:0: [sdb] 42079 512-byte hardware sectors (22 MB)

* With partition table recreated with fdisk and formated by hand:
usb-storage: queuecommand called
usb-storage: *** thread awakened.
usb-storage: Command READ_CAPACITY (10 bytes)
usb-storage:  25 00 00 00 00 00 00 00 00 00
usb-storage: Bulk Command S 0x43425355 T 0x3 L 8 F 128 Trg 0 LUN 0 CL 10
usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
usb-storage: Status code 0; transferred 31/31
usb-storage: -- transfer complete
usb-storage: Bulk command transfer result=0
usb-storage: usb_stor_bulk_transfer_sglist: xfer 8 bytes, 1 entries
usb-storage: Status code 0; transferred 8/8
usb-storage: -- transfer complete
usb-storage: Bulk data transfer result 0x0
usb-storage: Attempting to get CSW...
usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
usb-storage: Status code 0; transferred 13/13
usb-storage: -- transfer complete
usb-storage: Bulk status result = 0
usb-storage: Bulk Status S 0x53425355 T 0x3 R 0 Stat 0x0
usb-storage: scsi cmd done, result=0x0
usb-storage: *** thread sleeping.
sd 6:0:0:0: [sdb] 42000 512-byte hardware sectors (22 MB)

After looking at all this I'm in doubt on what fix could be made in this case. 
May be reverting the fatal error when partition exceeds media size? Or adding 
a new usb-storage quirk to report media size with one more sector (dangerous, 
but physically I don't know what's the real media size as firmware reports a 
way different value of media size after using fdisk), or quirk to force 
something in partition creation to automatically trim partition size by -1.

Any ideas?

More info about the device, may be it's useful:

T:  Bus=03 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#=  2 Spd=12  MxCh= 0
D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
P:  Vendor=07b4 ProdID=0109 Rev= 1.00
S:  Manufacturer=OLYMPUS
S:  Product=FE310,X840,C530
S:  SerialNumber=Y31013411
C:* #Ifs= 1 Cfg#= 1 Atr=c0 MxPwr=  0mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage
E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms

Bus 003 Device 003: ID 07b4:0109 Olympus Optical Co., Ltd C-370Z/D-535Z/X-450
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0        64
  idVendor           0x07b4 Olympus Optical Co., Ltd
  idProduct          0x0109 C-370Z/D-535Z/X-450
  bcdDevice            1.00
  iManufacturer           1 OLYMPUS
  iProduct                2 FE310,X840,C530
  iSerial                 3 Y31013411
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           32
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0xc0
      Self Powered
    MaxPower                0mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     80 Bulk (Zip)
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               0
Device Status:     0x0001
  Self Powered

-- 
[]'s
Herton

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
@ 2008-09-12 17:32 Toralf Förster
  0 siblings, 0 replies; 28+ messages in thread
From: Toralf Förster @ 2008-09-12 17:32 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 426 bytes --]

On Fri, 12 Sep 2008 14:01:39 -0300, Herton Ronaldo Krzesinski wrote:
>The camera is an Olympus X-840. The original issue comes from the camera 
>itself: its format program creates a partition with an off by one error, 

Interesting,

b/c I had same issue with Panasonic TZ3 : http://lkml.org/lkml/2008/4/20/44

-- 
MfG/Sincerely

Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
  2008-09-12 17:01 Herton Ronaldo Krzesinski
@ 2008-09-12 17:36 ` Alan Stern
  2008-09-12 17:59   ` Bob Copeland
  2008-09-12 18:02   ` Herton Ronaldo Krzesinski
  0 siblings, 2 replies; 28+ messages in thread
From: Alan Stern @ 2008-09-12 17:36 UTC (permalink / raw)
  To: Herton Ronaldo Krzesinski
  Cc: linux-kernel, linux-usb, bogdano, Luiz Fernando N. Capitulino

On Fri, 12 Sep 2008, Herton Ronaldo Krzesinski wrote:

> Hi,
> 
> Recently I found a problem with a buggy camera that doesn't mount anymore with 
> 2.6.27 (its memory is available via usb-storage), since commit 
> 04ebd4aee52b06a2c38127d9208546e5b96f3a19
> 
> The camera is an Olympus X-840. The original issue comes from the camera 
> itself: its format program creates a partition with an off by one error, 
> while the device reports that its memory has 42079 sectors, the partition 
> table reports also that the only partition on the disk has the size of 42079, 
> but it fails to account for the first sector in the memory that contains the 
> partition table, so in the end the partition exceeds the limit of the device 
> size (42080, first sector plus 42079 from the first partition).
> 
> In previous kernels (2.6.26 and before), I still could mount and access the 
> device (/dev/sdb1), although with the following errors:
> 
> sd 6:0:0:0: [sdb] Assuming drive cache: write through
>  sdb: sdb1
>  sdb: p1 exceeds device capacity
> sd 6:0:0:0: [sdb] Attached SCSI removable disk
> sd 6:0:0:0: Attached scsi generic sg2 type 0
> usb-storage: device scan complete
> attempt to access beyond end of device
> sdb: rw=0, want=42080, limit=42079
> __ratelimit: 16 messages suppressed
> Buffer I/O error on device sdb1, logical block 42078
> attempt to access beyond end of device
> sdb: rw=0, want=42080, limit=42079
> 
> If you note the log snippet above the first notable thing is "p1 exceeds 
> device capacity", so looking at commit 
> 04ebd4aee52b06a2c38127d9208546e5b96f3a19 it is clear why sdb1 isn't created 
> anymore.
> 
> After formatting the camera is this you get from fdisk (display units in 
> sectors):
> 
> Disk /dev/sdb: 21 MB, 21544448 bytes
> 6 heads, 16 sectors/track, 438 cylinders, total 42079 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Disk identifier: 0x00000000
> 
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdb1   *           1       42079       21039+   1  FAT12
> Partition 1 has different physical/logical endings:
>      phys=(328, 5, 16) logical=(438, 1, 16)
> 
> Note the bogus reported CHS values, both physical and logical, but they don't 
> affect anything here.
> 
> I don't know if this change of behaviour in 2.6.27 is desired (not creating 
> partition nodes if its size exceeds media size).

I have to believe that it _is_ desired.  Why else would that commit 
have been merged?

And why didn't you CC: the author of that commit?

> Anyway the device is buggy 
> itself. Until now the only way I found to mount device inside 2.6.27 is using 
> hexedit to edit directly the partition table on the device, decreasing by 1 
> the length of the first partition (42079 -> 42078). Clearing the partition 
> table and using fdisk to partition again is not an option, the camera 
> firmware seems to be lost after this, it starts to report weird media size 
> (not by 1 sector error only):

Why clear the partition table?  Just replace the bogus entry with a 
correct entry.  (Although using hexedit may be easier...)

> attempt to access beyond end of device
> sdb: rw=0, want=42042, limit=42000
> attempt to access beyond end of device
> sdb: rw=0, want=42042, limit=42000
> 
> Disk /dev/sdb: 21 MB, 21504000 bytes
> 1 heads, 42 sectors/track, 1000 cylinders, total 42000 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Disk identifier: 0x57ea65f0
> 
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdb1              42       42041       21000    1  FAT12

This doesn't resemble the original partition table at all.  Did you 
create this strange table or did the camera's firmware change the data 
you entered?

> (after zeroing out the partition table I only run fdisk on the device, created 
> the a new partition with maximum size allowed and change the type to FAT12)

That is not a valid procedure.  You have to go into Expert mode and set 
the number of sectors, heads, and tracks first.  Most likely you'll 
want to set them to the same values used by the firmware; it looks like 
the firmware thinks there are 16 sectors/track, 8 heads, and 329 
tracks.

> On both cases respectively this is relevant info that I get with usb-storage 
> debugging turned on:
> 
> * With memory formatted by camera firmware:
> usb-storage: queuecommand called
> usb-storage: *** thread awakened.
> usb-storage: Command READ_CAPACITY (10 bytes)
> usb-storage:  25 00 00 00 00 00 00 00 00 00
> usb-storage: Bulk Command S 0x43425355 T 0x3 L 8 F 128 Trg 0 LUN 0 CL 10
> usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
> usb-storage: Status code 0; transferred 31/31
> usb-storage: -- transfer complete
> usb-storage: Bulk command transfer result=0
> usb-storage: usb_stor_bulk_transfer_sglist: xfer 8 bytes, 1 entries
> usb-storage: Status code 0; transferred 8/8
> usb-storage: -- transfer complete
> usb-storage: Bulk data transfer result 0x0
> usb-storage: Attempting to get CSW...
> usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
> usb-storage: Status code 0; transferred 13/13
> usb-storage: -- transfer complete
> usb-storage: Bulk status result = 0
> usb-storage: Bulk Status S 0x53425355 T 0x3 R 0 Stat 0x0
> usb-storage: scsi cmd done, result=0x0
> usb-storage: *** thread sleeping.
> sd 4:0:0:0: [sdb] 42079 512-byte hardware sectors (22 MB)
> 
> * With partition table recreated with fdisk and formated by hand:
> usb-storage: queuecommand called
> usb-storage: *** thread awakened.
> usb-storage: Command READ_CAPACITY (10 bytes)
> usb-storage:  25 00 00 00 00 00 00 00 00 00
> usb-storage: Bulk Command S 0x43425355 T 0x3 L 8 F 128 Trg 0 LUN 0 CL 10
> usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
> usb-storage: Status code 0; transferred 31/31
> usb-storage: -- transfer complete
> usb-storage: Bulk command transfer result=0
> usb-storage: usb_stor_bulk_transfer_sglist: xfer 8 bytes, 1 entries
> usb-storage: Status code 0; transferred 8/8
> usb-storage: -- transfer complete
> usb-storage: Bulk data transfer result 0x0
> usb-storage: Attempting to get CSW...
> usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
> usb-storage: Status code 0; transferred 13/13
> usb-storage: -- transfer complete
> usb-storage: Bulk status result = 0
> usb-storage: Bulk Status S 0x53425355 T 0x3 R 0 Stat 0x0
> usb-storage: scsi cmd done, result=0x0
> usb-storage: *** thread sleeping.
> sd 6:0:0:0: [sdb] 42000 512-byte hardware sectors (22 MB)

If you don't mind losing 79 sectors, you could just live with this.

> After looking at all this I'm in doubt on what fix could be made in this case. 
> May be reverting the fatal error when partition exceeds media size? Or adding 
> a new usb-storage quirk to report media size with one more sector (dangerous, 
> but physically I don't know what's the real media size as firmware reports a 
> way different value of media size after using fdisk), or quirk to force 
> something in partition creation to automatically trim partition size by -1.
> 
> Any ideas?

Adding quirks to alter partition sizes sounds very dangerous.  Your 
best bet is simply to write a valid partition table.

Alan Stern


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
  2008-09-12 17:36 ` Alan Stern
@ 2008-09-12 17:59   ` Bob Copeland
  2008-09-12 18:21     ` Alan Stern
  2008-09-12 18:02   ` Herton Ronaldo Krzesinski
  1 sibling, 1 reply; 28+ messages in thread
From: Bob Copeland @ 2008-09-12 17:59 UTC (permalink / raw)
  To: Alan Stern
  Cc: Herton Ronaldo Krzesinski, linux-kernel, linux-usb, bogdano,
	Luiz Fernando N. Capitulino

On Fri, Sep 12, 2008 at 1:36 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
>> I don't know if this change of behaviour in 2.6.27 is desired (not creating
>> partition nodes if its size exceeds media size).
>
> I have to believe that it _is_ desired.  Why else would that commit
> have been merged?

Several people objected to similar patches when they were posted on
LKML in the past, usually for the reason "this breaks my forensics use
case."  I guess no one objected this time around.

http://marc.info/?l=linux-kernel&m=116069718613683&w=2

(I don't have an opinion, I'm just providing collective memory.)

-- 
Bob Copeland %% www.bobcopeland.com

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
  2008-09-12 17:36 ` Alan Stern
  2008-09-12 17:59   ` Bob Copeland
@ 2008-09-12 18:02   ` Herton Ronaldo Krzesinski
  2008-09-12 18:40     ` Alan Stern
  1 sibling, 1 reply; 28+ messages in thread
From: Herton Ronaldo Krzesinski @ 2008-09-12 18:02 UTC (permalink / raw)
  To: Alan Stern
  Cc: linux-kernel, linux-usb, bogdano, Luiz Fernando N. Capitulino,
	Abdel Benamrouche, Damien Lallement

On Friday 12 September 2008 14:36:47 Alan Stern wrote:
> On Fri, 12 Sep 2008, Herton Ronaldo Krzesinski wrote:
> > Hi,
> >
> > Recently I found a problem with a buggy camera that doesn't mount anymore
> > with 2.6.27 (its memory is available via usb-storage), since commit
> > 04ebd4aee52b06a2c38127d9208546e5b96f3a19
> >
> > The camera is an Olympus X-840. The original issue comes from the camera
> > itself: its format program creates a partition with an off by one error,
> > while the device reports that its memory has 42079 sectors, the partition
> > table reports also that the only partition on the disk has the size of
> > 42079, but it fails to account for the first sector in the memory that
> > contains the partition table, so in the end the partition exceeds the
> > limit of the device size (42080, first sector plus 42079 from the first
> > partition).
> >
> > In previous kernels (2.6.26 and before), I still could mount and access
> > the device (/dev/sdb1), although with the following errors:
> >
> > sd 6:0:0:0: [sdb] Assuming drive cache: write through
> >  sdb: sdb1
> >  sdb: p1 exceeds device capacity
> > sd 6:0:0:0: [sdb] Attached SCSI removable disk
> > sd 6:0:0:0: Attached scsi generic sg2 type 0
> > usb-storage: device scan complete
> > attempt to access beyond end of device
> > sdb: rw=0, want=42080, limit=42079
> > __ratelimit: 16 messages suppressed
> > Buffer I/O error on device sdb1, logical block 42078
> > attempt to access beyond end of device
> > sdb: rw=0, want=42080, limit=42079
> >
> > If you note the log snippet above the first notable thing is "p1 exceeds
> > device capacity", so looking at commit
> > 04ebd4aee52b06a2c38127d9208546e5b96f3a19 it is clear why sdb1 isn't
> > created anymore.
> >
> > After formatting the camera is this you get from fdisk (display units in
> > sectors):
> >
> > Disk /dev/sdb: 21 MB, 21544448 bytes
> > 6 heads, 16 sectors/track, 438 cylinders, total 42079 sectors
> > Units = sectors of 1 * 512 = 512 bytes
> > Disk identifier: 0x00000000
> >
> >    Device Boot      Start         End      Blocks   Id  System
> > /dev/sdb1   *           1       42079       21039+   1  FAT12
> > Partition 1 has different physical/logical endings:
> >      phys=(328, 5, 16) logical=(438, 1, 16)
> >
> > Note the bogus reported CHS values, both physical and logical, but they
> > don't affect anything here.
> >
> > I don't know if this change of behaviour in 2.6.27 is desired (not
> > creating partition nodes if its size exceeds media size).
>
> I have to believe that it _is_ desired.  Why else would that commit
> have been merged?
>
> And why didn't you CC: the author of that commit?

I forgot, CC'ing him now.

>
> > Anyway the device is buggy
> > itself. Until now the only way I found to mount device inside 2.6.27 is
> > using hexedit to edit directly the partition table on the device,
> > decreasing by 1 the length of the first partition (42079 -> 42078).
> > Clearing the partition table and using fdisk to partition again is not an
> > option, the camera firmware seems to be lost after this, it starts to
> > report weird media size (not by 1 sector error only):
>
> Why clear the partition table?  Just replace the bogus entry with a
> correct entry.  (Although using hexedit may be easier...)

Clearing partition table seemed easier than fixing values with fdisk, to make 
fdisk recreate partition table automatically and set right CHS values.

>
> > attempt to access beyond end of device
> > sdb: rw=0, want=42042, limit=42000
> > attempt to access beyond end of device
> > sdb: rw=0, want=42042, limit=42000
> >
> > Disk /dev/sdb: 21 MB, 21504000 bytes
> > 1 heads, 42 sectors/track, 1000 cylinders, total 42000 sectors
> > Units = sectors of 1 * 512 = 512 bytes
> > Disk identifier: 0x57ea65f0
> >
> >    Device Boot      Start         End      Blocks   Id  System
> > /dev/sdb1              42       42041       21000    1  FAT12
>
> This doesn't resemble the original partition table at all.  Did you
> create this strange table or did the camera's firmware change the data
> you entered?

No, was what fdisk did automatically.

>
> > (after zeroing out the partition table I only run fdisk on the device,
> > created the a new partition with maximum size allowed and change the type
> > to FAT12)
>
> That is not a valid procedure.  You have to go into Expert mode and set
> the number of sectors, heads, and tracks first.  Most likely you'll
> want to set them to the same values used by the firmware; it looks like
> the firmware thinks there are 16 sectors/track, 8 heads, and 329
> tracks.

8 heads? it reports 6 from the fdisk output:
phys=(328, 5, 16) logical=(438, 1, 16)

>
> > On both cases respectively this is relevant info that I get with
> > usb-storage debugging turned on:
> >
> > * With memory formatted by camera firmware:
> > usb-storage: queuecommand called
> > usb-storage: *** thread awakened.
> > usb-storage: Command READ_CAPACITY (10 bytes)
> > usb-storage:  25 00 00 00 00 00 00 00 00 00
> > usb-storage: Bulk Command S 0x43425355 T 0x3 L 8 F 128 Trg 0 LUN 0 CL 10
> > usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
> > usb-storage: Status code 0; transferred 31/31
> > usb-storage: -- transfer complete
> > usb-storage: Bulk command transfer result=0
> > usb-storage: usb_stor_bulk_transfer_sglist: xfer 8 bytes, 1 entries
> > usb-storage: Status code 0; transferred 8/8
> > usb-storage: -- transfer complete
> > usb-storage: Bulk data transfer result 0x0
> > usb-storage: Attempting to get CSW...
> > usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
> > usb-storage: Status code 0; transferred 13/13
> > usb-storage: -- transfer complete
> > usb-storage: Bulk status result = 0
> > usb-storage: Bulk Status S 0x53425355 T 0x3 R 0 Stat 0x0
> > usb-storage: scsi cmd done, result=0x0
> > usb-storage: *** thread sleeping.
> > sd 4:0:0:0: [sdb] 42079 512-byte hardware sectors (22 MB)
> >
> > * With partition table recreated with fdisk and formated by hand:
> > usb-storage: queuecommand called
> > usb-storage: *** thread awakened.
> > usb-storage: Command READ_CAPACITY (10 bytes)
> > usb-storage:  25 00 00 00 00 00 00 00 00 00
> > usb-storage: Bulk Command S 0x43425355 T 0x3 L 8 F 128 Trg 0 LUN 0 CL 10
> > usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
> > usb-storage: Status code 0; transferred 31/31
> > usb-storage: -- transfer complete
> > usb-storage: Bulk command transfer result=0
> > usb-storage: usb_stor_bulk_transfer_sglist: xfer 8 bytes, 1 entries
> > usb-storage: Status code 0; transferred 8/8
> > usb-storage: -- transfer complete
> > usb-storage: Bulk data transfer result 0x0
> > usb-storage: Attempting to get CSW...
> > usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
> > usb-storage: Status code 0; transferred 13/13
> > usb-storage: -- transfer complete
> > usb-storage: Bulk status result = 0
> > usb-storage: Bulk Status S 0x53425355 T 0x3 R 0 Stat 0x0
> > usb-storage: scsi cmd done, result=0x0
> > usb-storage: *** thread sleeping.
> > sd 6:0:0:0: [sdb] 42000 512-byte hardware sectors (22 MB)
>
> If you don't mind losing 79 sectors, you could just live with this.
>
> > After looking at all this I'm in doubt on what fix could be made in this
> > case. May be reverting the fatal error when partition exceeds media size?
> > Or adding a new usb-storage quirk to report media size with one more
> > sector (dangerous, but physically I don't know what's the real media size
> > as firmware reports a way different value of media size after using
> > fdisk), or quirk to force something in partition creation to
> > automatically trim partition size by -1.
> >
> > Any ideas?
>
> Adding quirks to alter partition sizes sounds very dangerous.  Your
> best bet is simply to write a valid partition table.

Yes, but this is breaking other devices, not only this olympus camera. Bogdano 
reported what could be the same case with his cel. phone (Bogdano can you 
give more details?), also on IRC today Damien Lallement complained about what 
looks the same issue of "p* exceeds device capacity" (Damien can you give 
more info too?). But I'm afraid of how much devices this partition error 
fatal check now can affect, after all it's not "user friendly" to instruct 
the user to use hexedit or fdisk to fix the device's partitions.

>
> Alan Stern

-- 
[]'s
Herton

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
  2008-09-12 17:59   ` Bob Copeland
@ 2008-09-12 18:21     ` Alan Stern
  0 siblings, 0 replies; 28+ messages in thread
From: Alan Stern @ 2008-09-12 18:21 UTC (permalink / raw)
  To: Bob Copeland
  Cc: Herton Ronaldo Krzesinski, linux-kernel, linux-usb, bogdano,
	Luiz Fernando N. Capitulino

On Fri, 12 Sep 2008, Bob Copeland wrote:

> On Fri, Sep 12, 2008 at 1:36 PM, Alan Stern <stern@rowland.harvard.edu> wrote:
> >> I don't know if this change of behaviour in 2.6.27 is desired (not creating
> >> partition nodes if its size exceeds media size).
> >
> > I have to believe that it _is_ desired.  Why else would that commit
> > have been merged?
> 
> Several people objected to similar patches when they were posted on
> LKML in the past, usually for the reason "this breaks my forensics use
> case."  I guess no one objected this time around.
> 
> http://marc.info/?l=linux-kernel&m=116069718613683&w=2
> 
> (I don't have an opinion, I'm just providing collective memory.)

Well, Herton can object and ask to have the offending commit reverted.

Alan Stern


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
  2008-09-12 18:02   ` Herton Ronaldo Krzesinski
@ 2008-09-12 18:40     ` Alan Stern
  2008-09-12 20:14       ` Herton Ronaldo Krzesinski
  0 siblings, 1 reply; 28+ messages in thread
From: Alan Stern @ 2008-09-12 18:40 UTC (permalink / raw)
  To: Herton Ronaldo Krzesinski
  Cc: linux-kernel, linux-usb, bogdano, Luiz Fernando N. Capitulino,
	Abdel Benamrouche, Damien Lallement

On Fri, 12 Sep 2008, Herton Ronaldo Krzesinski wrote:

> > > After formatting the camera is this you get from fdisk (display units in
> > > sectors):
> > >
> > > Disk /dev/sdb: 21 MB, 21544448 bytes
> > > 6 heads, 16 sectors/track, 438 cylinders, total 42079 sectors
> > > Units = sectors of 1 * 512 = 512 bytes
> > > Disk identifier: 0x00000000
> > >
> > >    Device Boot      Start         End      Blocks   Id  System
> > > /dev/sdb1   *           1       42079       21039+   1  FAT12
> > > Partition 1 has different physical/logical endings:
> > >      phys=(328, 5, 16) logical=(438, 1, 16)

> > > (after zeroing out the partition table I only run fdisk on the device,
> > > created the a new partition with maximum size allowed and change the type
> > > to FAT12)
> >
> > That is not a valid procedure.  You have to go into Expert mode and set
> > the number of sectors, heads, and tracks first.  Most likely you'll
> > want to set them to the same values used by the firmware; it looks like
> > the firmware thinks there are 16 sectors/track, 8 heads, and 329
> > tracks.
> 
> 8 heads? it reports 6 from the fdisk output:
> phys=(328, 5, 16) logical=(438, 1, 16)

No.  phys=(328, 5, 16) doesn't mean that the device has 329 tracks, 6
heads, and 16 sectors.  It means that the last sector of the partition
(as recorded in the table) is track 328, head 5, sector 16.  
Note that partitions don't have to end on cylinder boundaries.

Guessing that the firmware intends there to be 16 sectors per track,
and also guessing that the 42079 value is too low by one, we get a
total of 42080 / 16 = 2630 as the product of heads and tracks.  Since
2630 = 8 * 328.75 this seems to say that the firmware thinks there are
approximately 329 tracks and 8 heads.  That explains the "328" above.

Continuing this reasoning, the physical end (328,5,16) corresponds to
sector (328*8*16) + (5*16) + 15 = 42079 (the 15 is because sector
numbers start at 1 instead of at 0).  Which would be valid if the first
sector of the partition was sector 1, or phys=(0,0,2), and if the
capacity was 42080.  However you didn't tell us what value was stored
in the table for the start of the partition.

> > Adding quirks to alter partition sizes sounds very dangerous.  Your
> > best bet is simply to write a valid partition table.
> 
> Yes, but this is breaking other devices, not only this olympus camera. Bogdano 
> reported what could be the same case with his cel. phone (Bogdano can you 
> give more details?), also on IRC today Damien Lallement complained about what 
> looks the same issue of "p* exceeds device capacity" (Damien can you give 
> more info too?). But I'm afraid of how much devices this partition error 
> fatal check now can affect, after all it's not "user friendly" to instruct 
> the user to use hexedit or fdisk to fix the device's partitions.

You should complain to the people who wrote and accepted the
troublesome commit.  Tell them it caused a regression; that always gets
people's attention!

Alan Stern


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
  2008-09-12 18:40     ` Alan Stern
@ 2008-09-12 20:14       ` Herton Ronaldo Krzesinski
  2008-09-12 20:17         ` Herton Ronaldo Krzesinski
  2008-09-12 20:27         ` Bob Copeland
  0 siblings, 2 replies; 28+ messages in thread
From: Herton Ronaldo Krzesinski @ 2008-09-12 20:14 UTC (permalink / raw)
  To: Alan Stern
  Cc: linux-kernel, linux-usb, bogdano, Luiz Fernando N. Capitulino,
	Abdel Benamrouche, Damien Lallement, pterjan@mandriva.com,
	Jens Axboe

[-- Attachment #1: Type: text/plain, Size: 4681 bytes --]

On Friday 12 September 2008 15:40:41 Alan Stern wrote:
> On Fri, 12 Sep 2008, Herton Ronaldo Krzesinski wrote:
> > > > After formatting the camera is this you get from fdisk (display units
> > > > in sectors):
> > > >
> > > > Disk /dev/sdb: 21 MB, 21544448 bytes
> > > > 6 heads, 16 sectors/track, 438 cylinders, total 42079 sectors
> > > > Units = sectors of 1 * 512 = 512 bytes
> > > > Disk identifier: 0x00000000
> > > >
> > > >    Device Boot      Start         End      Blocks   Id  System
> > > > /dev/sdb1   *           1       42079       21039+   1  FAT12
> > > > Partition 1 has different physical/logical endings:
> > > >      phys=(328, 5, 16) logical=(438, 1, 16)
> > > >
> > > > (after zeroing out the partition table I only run fdisk on the
> > > > device, created the a new partition with maximum size allowed and
> > > > change the type to FAT12)
> > >
> > > That is not a valid procedure.  You have to go into Expert mode and set
> > > the number of sectors, heads, and tracks first.  Most likely you'll
> > > want to set them to the same values used by the firmware; it looks like
> > > the firmware thinks there are 16 sectors/track, 8 heads, and 329
> > > tracks.
> >
> > 8 heads? it reports 6 from the fdisk output:
> > phys=(328, 5, 16) logical=(438, 1, 16)
>
> No.  phys=(328, 5, 16) doesn't mean that the device has 329 tracks, 6
> heads, and 16 sectors.  It means that the last sector of the partition
> (as recorded in the table) is track 328, head 5, sector 16.
> Note that partitions don't have to end on cylinder boundaries.
>
> Guessing that the firmware intends there to be 16 sectors per track,
> and also guessing that the 42079 value is too low by one, we get a
> total of 42080 / 16 = 2630 as the product of heads and tracks.  Since
> 2630 = 8 * 328.75 this seems to say that the firmware thinks there are
> approximately 329 tracks and 8 heads.  That explains the "328" above.
>
> Continuing this reasoning, the physical end (328,5,16) corresponds to
> sector (328*8*16) + (5*16) + 15 = 42079 (the 15 is because sector
> numbers start at 1 instead of at 0).  Which would be valid if the first
> sector of the partition was sector 1, or phys=(0,0,2), and if the
> capacity was 42080.  However you didn't tell us what value was stored
> in the table for the start of the partition.

Ok, thanks for the explanation, indeed you are right, it's about the physical 
end, I confused things, I'm not used to CHS... The start sector is 1, 
(0,0,2), so what you said is valid. I attach the dump of first sector of the 
memory here just for reference that shows this.

>
> > > Adding quirks to alter partition sizes sounds very dangerous.  Your
> > > best bet is simply to write a valid partition table.
> >
> > Yes, but this is breaking other devices, not only this olympus camera.
> > Bogdano reported what could be the same case with his cel. phone (Bogdano
> > can you give more details?), also on IRC today Damien Lallement
> > complained about what looks the same issue of "p* exceeds device
> > capacity" (Damien can you give more info too?). But I'm afraid of how
> > much devices this partition error fatal check now can affect, after all
> > it's not "user friendly" to instruct the user to use hexedit or fdisk to
> > fix the device's partitions.
>
> You should complain to the people who wrote and accepted the
> troublesome commit.  Tell them it caused a regression; that always gets
> people's attention!

Ok, here goes a patch that revert the relevant part of the commit and 
goes back to behaviour of previous kernels, and fixes the regression here.
If considered a good solution, can be applied.

---

fs/partition/check.c: revert part of commit 04ebd4aee52b06a2c38127d9208546e5b96f3a19

Fix regression introduced by commit 04ebd4aee52b06a2c38127d9208546e5b96f3a19,
where kernel changed behaviour making fatal the error when some partition
exceeds the limit of the device size. Some buggy devices become inacessible
because of errors in their partition table if the error is fatal.

This closes http://bugzilla.kernel.org/show_bug.cgi?id=11554

diff --git a/fs/partitions/check.c b/fs/partitions/check.c
index 7d6b34e..15c70df 100644
--- a/fs/partitions/check.c
+++ b/fs/partitions/check.c
@@ -499,9 +499,9 @@ int rescan_partitions(struct gendisk *disk, struct block_device *bdev)
 		if (!size)
 			continue;
 		if (from + size > get_capacity(disk)) {
-			printk(KERN_ERR " %s: p%d exceeds device capacity\n",
+			printk(KERN_WARNING
+				" %s: p%d exceeds device capacity\n",
 				disk->disk_name, p);
-			continue;
 		}
 		res = add_partition(disk, p, from, size, state->parts[p].flags);
 		if (res) {


>
> Alan Stern

-- 
[]'s
Herton

[-- Attachment #2: firstsector --]
[-- Type: application/octet-stream, Size: 512 bytes --]

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
  2008-09-12 20:14       ` Herton Ronaldo Krzesinski
@ 2008-09-12 20:17         ` Herton Ronaldo Krzesinski
  2008-09-12 20:27         ` Bob Copeland
  1 sibling, 0 replies; 28+ messages in thread
From: Herton Ronaldo Krzesinski @ 2008-09-12 20:17 UTC (permalink / raw)
  To: Alan Stern
  Cc: linux-kernel, linux-usb, bogdano, Luiz Fernando N. Capitulino,
	Abdel Benamrouche, Damien Lallement, pterjan@mandriva.com,
	Jens Axboe

On Friday 12 September 2008 17:14:04 Herton Ronaldo Krzesinski wrote:
> On Friday 12 September 2008 15:40:41 Alan Stern wrote:
> > On Fri, 12 Sep 2008, Herton Ronaldo Krzesinski wrote:
> > > > > After formatting the camera is this you get from fdisk (display units
> > > > > in sectors):
> > > > >
> > > > > Disk /dev/sdb: 21 MB, 21544448 bytes
> > > > > 6 heads, 16 sectors/track, 438 cylinders, total 42079 sectors
> > > > > Units = sectors of 1 * 512 = 512 bytes
> > > > > Disk identifier: 0x00000000
> > > > >
> > > > >    Device Boot      Start         End      Blocks   Id  System
> > > > > /dev/sdb1   *           1       42079       21039+   1  FAT12
> > > > > Partition 1 has different physical/logical endings:
> > > > >      phys=(328, 5, 16) logical=(438, 1, 16)
> > > > >
> > > > > (after zeroing out the partition table I only run fdisk on the
> > > > > device, created the a new partition with maximum size allowed and
> > > > > change the type to FAT12)
> > > >
> > > > That is not a valid procedure.  You have to go into Expert mode and set
> > > > the number of sectors, heads, and tracks first.  Most likely you'll
> > > > want to set them to the same values used by the firmware; it looks like
> > > > the firmware thinks there are 16 sectors/track, 8 heads, and 329
> > > > tracks.
> > >
> > > 8 heads? it reports 6 from the fdisk output:
> > > phys=(328, 5, 16) logical=(438, 1, 16)
> >
> > No.  phys=(328, 5, 16) doesn't mean that the device has 329 tracks, 6
> > heads, and 16 sectors.  It means that the last sector of the partition
> > (as recorded in the table) is track 328, head 5, sector 16.
> > Note that partitions don't have to end on cylinder boundaries.
> >
> > Guessing that the firmware intends there to be 16 sectors per track,
> > and also guessing that the 42079 value is too low by one, we get a
> > total of 42080 / 16 = 2630 as the product of heads and tracks.  Since
> > 2630 = 8 * 328.75 this seems to say that the firmware thinks there are
> > approximately 329 tracks and 8 heads.  That explains the "328" above.
> >
> > Continuing this reasoning, the physical end (328,5,16) corresponds to
> > sector (328*8*16) + (5*16) + 15 = 42079 (the 15 is because sector
> > numbers start at 1 instead of at 0).  Which would be valid if the first
> > sector of the partition was sector 1, or phys=(0,0,2), and if the
> > capacity was 42080.  However you didn't tell us what value was stored
> > in the table for the start of the partition.
> 
> Ok, thanks for the explanation, indeed you are right, it's about the physical 
> end, I confused things, I'm not used to CHS... The start sector is 1, 
> (0,0,2), so what you said is valid. I attach the dump of first sector of the 
> memory here just for reference that shows this.
> 
> >
> > > > Adding quirks to alter partition sizes sounds very dangerous.  Your
> > > > best bet is simply to write a valid partition table.
> > >
> > > Yes, but this is breaking other devices, not only this olympus camera.
> > > Bogdano reported what could be the same case with his cel. phone (Bogdano
> > > can you give more details?), also on IRC today Damien Lallement
> > > complained about what looks the same issue of "p* exceeds device
> > > capacity" (Damien can you give more info too?). But I'm afraid of how
> > > much devices this partition error fatal check now can affect, after all
> > > it's not "user friendly" to instruct the user to use hexedit or fdisk to
> > > fix the device's partitions.
> >
> > You should complain to the people who wrote and accepted the
> > troublesome commit.  Tell them it caused a regression; that always gets
> > people's attention!
> 
> Ok, here goes a patch that revert the relevant part of the commit and 
> goes back to behaviour of previous kernels, and fixes the regression here.
> If considered a good solution, can be applied.
> 
> ---
> 
> fs/partition/check.c: revert part of commit 04ebd4aee52b06a2c38127d9208546e5b96f3a19
> 
> Fix regression introduced by commit 04ebd4aee52b06a2c38127d9208546e5b96f3a19,
> where kernel changed behaviour making fatal the error when some partition
> exceeds the limit of the device size. Some buggy devices become inacessible
> because of errors in their partition table if the error is fatal.
> 
> This closes http://bugzilla.kernel.org/show_bug.cgi?id=11554

And I forgot my Signed-off....

Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>

> 
> diff --git a/fs/partitions/check.c b/fs/partitions/check.c
> index 7d6b34e..15c70df 100644
> --- a/fs/partitions/check.c
> +++ b/fs/partitions/check.c
> @@ -499,9 +499,9 @@ int rescan_partitions(struct gendisk *disk, struct block_device *bdev)
>  		if (!size)
>  			continue;
>  		if (from + size > get_capacity(disk)) {
> -			printk(KERN_ERR " %s: p%d exceeds device capacity\n",
> +			printk(KERN_WARNING
> +				" %s: p%d exceeds device capacity\n",
>  				disk->disk_name, p);
> -			continue;
>  		}
>  		res = add_partition(disk, p, from, size, state->parts[p].flags);
>  		if (res) {
> 
> 
> >
> > Alan Stern
> 

--
[]'s
Herton

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
  2008-09-12 20:14       ` Herton Ronaldo Krzesinski
  2008-09-12 20:17         ` Herton Ronaldo Krzesinski
@ 2008-09-12 20:27         ` Bob Copeland
  2008-09-12 21:07           ` Herton Ronaldo Krzesinski
  1 sibling, 1 reply; 28+ messages in thread
From: Bob Copeland @ 2008-09-12 20:27 UTC (permalink / raw)
  To: Herton Ronaldo Krzesinski
  Cc: Alan Stern, linux-kernel, linux-usb, bogdano,
	Luiz Fernando N. Capitulino, Abdel Benamrouche, Damien Lallement,
	pterjan@mandriva.com, Jens Axboe

On Fri, Sep 12, 2008 at 4:14 PM, Herton Ronaldo Krzesinski
<herton@mandriva.com.br> wrote:
> diff --git a/fs/partitions/check.c b/fs/partitions/check.c
> index 7d6b34e..15c70df 100644
> --- a/fs/partitions/check.c
> +++ b/fs/partitions/check.c
> @@ -499,9 +499,9 @@ int rescan_partitions(struct gendisk *disk, struct block_device *bdev)
>                if (!size)
>                        continue;
>                if (from + size > get_capacity(disk)) {
> -                       printk(KERN_ERR " %s: p%d exceeds device capacity\n",
> +                       printk(KERN_WARNING
> +                               " %s: p%d exceeds device capacity\n",
>                                disk->disk_name, p);
> -                       continue;

It might make sense to comment here that we intentionally want to add the
partition anyway.  That might keep someone from re-adding the continue (same
patch has already come up at least 3 times).

-- 
Bob Copeland %% www.bobcopeland.com

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
  2008-09-12 20:27         ` Bob Copeland
@ 2008-09-12 21:07           ` Herton Ronaldo Krzesinski
  2008-09-12 23:36             ` Andrew Morton
  0 siblings, 1 reply; 28+ messages in thread
From: Herton Ronaldo Krzesinski @ 2008-09-12 21:07 UTC (permalink / raw)
  To: Bob Copeland
  Cc: Alan Stern, linux-kernel, linux-usb, bogdano,
	Luiz Fernando N. Capitulino, Abdel Benamrouche, Damien Lallement,
	pterjan@mandriva.com, Jens Axboe

On Friday 12 September 2008 17:27:54 Bob Copeland wrote:
> On Fri, Sep 12, 2008 at 4:14 PM, Herton Ronaldo Krzesinski
> <herton@mandriva.com.br> wrote:
> > diff --git a/fs/partitions/check.c b/fs/partitions/check.c
> > index 7d6b34e..15c70df 100644
> > --- a/fs/partitions/check.c
> > +++ b/fs/partitions/check.c
> > @@ -499,9 +499,9 @@ int rescan_partitions(struct gendisk *disk, struct block_device *bdev)
> >                if (!size)
> >                        continue;
> >                if (from + size > get_capacity(disk)) {
> > -                       printk(KERN_ERR " %s: p%d exceeds device capacity\n",
> > +                       printk(KERN_WARNING
> > +                               " %s: p%d exceeds device capacity\n",
> >                                disk->disk_name, p);
> > -                       continue;
> 
> It might make sense to comment here that we intentionally want to add the
> partition anyway.  That might keep someone from re-adding the continue (same
> patch has already come up at least 3 times).
> 

Yes, here goes a new version:

___

fs/partition/check.c: revert part of commit 04ebd4aee52b06a2c38127d9208546e5b96f3a19

Fix regression introduced by commit 04ebd4aee52b06a2c38127d9208546e5b96f3a19,
where kernel changed behaviour making fatal the error when some partition
exceeds the limit of the device size. Some buggy devices become inacessible
because of errors in their partition table if the error is fatal.

This closes http://bugzilla.kernel.org/show_bug.cgi?id=11554

Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>

diff --git a/fs/partitions/check.c b/fs/partitions/check.c
index 7d6b34e..2dd346d 100644
--- a/fs/partitions/check.c
+++ b/fs/partitions/check.c
@@ -499,9 +499,13 @@ int rescan_partitions(struct gendisk *disk, struct block_device *bdev)
 		if (!size)
 			continue;
 		if (from + size > get_capacity(disk)) {
-			printk(KERN_ERR " %s: p%d exceeds device capacity\n",
+			printk(KERN_WARNING
+				" %s: p%d exceeds device capacity\n",
 				disk->disk_name, p);
-			continue;
+			/* note: we don't want to break access to
+			 * devices with buggy partition tables, so we
+			 * don't want to fail here, just go on and
+			 * add partition */
 		}
 		res = add_partition(disk, p, from, size, state->parts[p].flags);
 		if (res) {



-- 
[]'s
Herton

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
  2008-09-12 16:56 Herton Ronaldo Krzesinski
@ 2008-09-12 23:34 ` Andrew Morton
  2008-09-13 15:54   ` Bill Davidsen
  2008-09-13 22:56   ` Herton Ronaldo Krzesinski
  0 siblings, 2 replies; 28+ messages in thread
From: Andrew Morton @ 2008-09-12 23:34 UTC (permalink / raw)
  To: Herton Ronaldo Krzesinski; +Cc: linux-kernel, inux-usb, bogdano, lcapitulino

On Fri, 12 Sep 2008 13:56:49 -0300
Herton Ronaldo Krzesinski <herton@mandriva.com.br> wrote:

> Recently I found a problem with a buggy camera that doesn't mount anymore with 
> 2.6.27 (its memory is available via usb-storage), since commit 
> 04ebd4aee52b06a2c38127d9208546e5b96f3a19
> 
> The camera is an Olympus X-840. The original issue comes from the camera 
> itself: its format program creates a partition with an off by one error, 
> while the device reports that its memory has 42079 sectors, the partition 
> table reports also that the only partition on the disk has the size of 42079, 
> but it fails to account for the first sector in the memory that contains the 
> partition table, so in the end the partition exceeds the limit of the device 
> size (42080, first sector plus 42079 from the first partition).
> 
> In previous kernels (2.6.26 and before), I still could mount and access the 
> device (/dev/sdb1), although with the following errors:

Yeah.

Can you test this please?

From: Andrew Morton <akpm@linux-foundation.org>

Herton Krzesinski reports that the error-checking changes in
04ebd4aee52b06a2c38127d9208546e5b96f3a19 ("block/ioctl.c and
fs/partition/check.c: check value returned by add_partition") cause his
buggy USB camera to no longer mount.  "The camera is an Olympus X-840. 
The original issue comes from the camera itself: its format program
creates a partition with an off by one error".

Buggy devices happen.  It is better for the kernel to warn and to proceed
with the mount.

Reported-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
Cc: Abdel Benamrouche <draconux@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/partitions/check.c |    1 -
 1 file changed, 1 deletion(-)

diff -puN fs/partitions/check.c~rescan_partitions-make-device-capacity-errors-non-fatal fs/partitions/check.c
--- a/fs/partitions/check.c~rescan_partitions-make-device-capacity-errors-non-fatal
+++ a/fs/partitions/check.c
@@ -540,7 +540,6 @@ int rescan_partitions(struct gendisk *di
 		if (from + size > get_capacity(disk)) {
 			printk(KERN_ERR " %s: p%d exceeds device capacity\n",
 				disk->disk_name, p);
-			continue;
 		}
 		res = add_partition(disk, p, from, size, state->parts[p].flags);
 		if (res) {
_


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
  2008-09-12 21:07           ` Herton Ronaldo Krzesinski
@ 2008-09-12 23:36             ` Andrew Morton
  2008-09-12 23:46               ` David Brownell
  2008-10-08 16:01               ` Kay Sievers
  0 siblings, 2 replies; 28+ messages in thread
From: Andrew Morton @ 2008-09-12 23:36 UTC (permalink / raw)
  To: Herton Ronaldo Krzesinski
  Cc: me, stern, linux-kernel, linux-usb, bogdano, lcapitulino,
	draconux, dlallement, pterjan, axboe

On Fri, 12 Sep 2008 18:07:27 -0300
Herton Ronaldo Krzesinski <herton@mandriva.com.br> wrote:

> Yes, here goes a new version:

Well gee.  Given a choice, I went and replied to the wrong thread. 
Here's what I think:

From: Andrew Morton <akpm@linux-foundation.org>

Herton Krzesinski reports that the error-checking changes in
04ebd4aee52b06a2c38127d9208546e5b96f3a19 ("block/ioctl.c and
fs/partition/check.c: check value returned by add_partition") cause his
buggy USB camera to no longer mount.  "The camera is an Olympus X-840. 
The original issue comes from the camera itself: its format program
creates a partition with an off by one error".

Buggy devices happen.  It is better for the kernel to warn and to proceed
with the mount.

Reported-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
Cc: Abdel Benamrouche <draconux@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/partitions/check.c |    1 -
 1 file changed, 1 deletion(-)

diff -puN fs/partitions/check.c~rescan_partitions-make-device-capacity-errors-non-fatal fs/partitions/check.c
--- a/fs/partitions/check.c~rescan_partitions-make-device-capacity-errors-non-fatal
+++ a/fs/partitions/check.c
@@ -540,7 +540,6 @@ int rescan_partitions(struct gendisk *di
 		if (from + size > get_capacity(disk)) {
 			printk(KERN_ERR " %s: p%d exceeds device capacity\n",
 				disk->disk_name, p);
-			continue;
 		}
 		res = add_partition(disk, p, from, size, state->parts[p].flags);
 		if (res) {
_


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
  2008-09-12 23:36             ` Andrew Morton
@ 2008-09-12 23:46               ` David Brownell
  2008-09-12 23:52                 ` Andrew Morton
  2008-09-13  2:22                 ` Alan Stern
  2008-10-08 16:01               ` Kay Sievers
  1 sibling, 2 replies; 28+ messages in thread
From: David Brownell @ 2008-09-12 23:46 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Herton Ronaldo Krzesinski, me, stern, linux-kernel, linux-usb,
	bogdano, lcapitulino, draconux, dlallement, pterjan, axboe

On Friday 12 September 2008, Andrew Morton wrote:
>                         printk(KERN_ERR " %s: p%d exceeds device capacity\n",
>                                 disk->disk_name, p);
> -                       continue;
>                 }

So that now deserves to be KERN_WARN not KERN_ERR, yes?


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
  2008-09-12 23:46               ` David Brownell
@ 2008-09-12 23:52                 ` Andrew Morton
  2008-09-12 23:59                   ` David Brownell
  2008-09-13  2:22                 ` Alan Stern
  1 sibling, 1 reply; 28+ messages in thread
From: Andrew Morton @ 2008-09-12 23:52 UTC (permalink / raw)
  To: David Brownell
  Cc: herton, me, stern, linux-kernel, linux-usb, bogdano, lcapitulino,
	draconux, dlallement, pterjan, axboe

On Fri, 12 Sep 2008 16:46:08 -0700
David Brownell <david-b@pacbell.net> wrote:

> On Friday 12 September 2008, Andrew Morton wrote:
> > ________________________________________________printk(KERN_ERR " %s: p%d exceeds device capacity\n",
> > ________________________________________________________________disk->disk_name, p);
> > -______________________________________________continue;
> > ________________________________}

wtf-your-email-client-is-insane.

> So that now deserves to be KERN_WARN not KERN_ERR, yes?

spose so.

I'm fairly unenthused about the recent KERN_correctness fad since it
went and broke sysrq-T output (you have to set the loglevel beforehand
to avoid getting only partial output).



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
  2008-09-12 23:52                 ` Andrew Morton
@ 2008-09-12 23:59                   ` David Brownell
  2008-09-13  0:13                     ` Andrew Morton
  0 siblings, 1 reply; 28+ messages in thread
From: David Brownell @ 2008-09-12 23:59 UTC (permalink / raw)
  To: Andrew Morton
  Cc: herton, me, stern, linux-kernel, linux-usb, bogdano, lcapitulino,
	draconux, dlallement, pterjan, axboe

> > > ________________________________________________________________disk->disk_name, p);
> > > -______________________________________________continue;
> > > ________________________________}
> 
> wtf-your-email-client-is-insane.

I have no idea where that came from.  Wasn't in the original or
in my copy.


> > So that now deserves to be KERN_WARN not KERN_ERR, yes?
> 
> spose so.
> 
> I'm fairly unenthused about the recent KERN_correctness fad since it
> went and broke sysrq-T output (you have to set the loglevel beforehand
> to avoid getting only partial output).

On development systems I generally "echo 8 > /proc/sysrq*"
to make sure KERN_DEBUG isn't hidden.

In this case it's just that I saw flamage a few minutes
earlier from someone trying to keep a distro boot from
spewing scarey messages for things that were NOT errors.
Like ... this message.  :)


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
  2008-09-12 23:59                   ` David Brownell
@ 2008-09-13  0:13                     ` Andrew Morton
  0 siblings, 0 replies; 28+ messages in thread
From: Andrew Morton @ 2008-09-13  0:13 UTC (permalink / raw)
  To: David Brownell
  Cc: herton, me, stern, linux-kernel, linux-usb, bogdano, lcapitulino,
	draconux, dlallement, pterjan, axboe

On Fri, 12 Sep 2008 16:59:20 -0700
David Brownell <david-b@pacbell.net> wrote:

> > > > ________________________________________________________________disk->disk_name, p);
> > > > -______________________________________________continue;
> > > > ________________________________}
> > 
> > wtf-your-email-client-is-insane.
> 
> I have no idea where that came from.  Wasn't in the original or
> in my copy.
> 

Your email client added it.  What I sent:

0008A0: 68 65 63 6B 2E 63 0A 40 40 20 2D 35 34 30 2C 37      >heck.c@@@ -540,7<
0008B0: 20 2B 35 34 30 2C 36 20 40 40 20 69 6E 74 20 72      > +540,6 @@ int r<
0008C0: 65 73 63 61 6E 5F 70 61 72 74 69 74 69 6F 6E 73      >escan_partitions<
0008D0: 28 73 74 72 75 63 74 20 67 65 6E 64 69 73 6B 20      >(struct gendisk <
0008E0: 2A 64 69 0A 20 09 09 69 66 20 28 66 72 6F 6D 20      >*di@ @@if (from <
0008F0: 2B 20 73 69 7A 65 20 3E 20 67 65 74 5F 63 61 70      >+ size > get_cap<
000900: 61 63 69 74 79 28 64 69 73 6B 29 29 20 7B 0A 20      >acity(disk)) {@ <
000910: 09 09 09 70 72 69 6E 74 6B 28 4B 45 52 4E 5F 45      >@@@printk(KERN_E<
000920: 52 52 20 22 20 25 73 3A 20 70 25 64 20 65 78 63      >RR " %s: p%d exc<
000930: 65 65 64 73 20 64 65 76 69 63 65 20 63 61 70 61      >eeds device capa<
000940: 63 69 74 79 5C 6E 22 2C 0A 20 09 09 09 09 64 69      >city\n",@ @@@@di<
000950: 73 6B 2D 3E 64 69 73 6B 5F 6E 61 6D 65 2C 20 70      >sk->disk_name, p<
000960: 29 3B 0A 2D 09 09 09 63 6F 6E 74 69 6E 75 65 3B      >);@-@@@continue;<
000970: 0A 20 09 09 7D 0A 20 09 09 72 65 73 20 3D 20 61      >@ @@}@ @@res = a<

your reply:

000C50: A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0      >@@@@@@@@@@@@@@@@<
000C60: A0 70 72 69 6E 74 6B 28 4B 45 52 4E 5F 45 52 52      >@printk(KERN_ERR<
000C70: 20 22 20 25 73 3A 20 70 25 64 20 65 78 63 65 65      > " %s: p%d excee<
000C80: 64 73 20 64 65 76 69 63 65 20 63 61 70 61 63 69      >ds device capaci<
000C90: 74 79 5C 6E 22 2C 0A 3E 20 A0 A0 A0 A0 A0 A0 A0      >ty\n",@> @@@@@@@<
000CA0: A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0      >@@@@@@@@@@@@@@@@<
000CB0: A0 A0 A0 A0 A0 A0 A0 A0 A0 64 69 73 6B 2D 3E 64      >@@@@@@@@@disk->d<
000CC0: 69 73 6B 5F 6E 61 6D 65 2C 20 70 29 3B 0A 3E 20      >isk_name, p);@> <
000CD0: 2D A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0      >-@@@@@@@@@@@@@@@<
000CE0: A0 A0 A0 A0 A0 A0 A0 A0 63 6F 6E 74 69 6E 75 65      >@@@@@@@@continue<

I've ceased to be amazed at the stupid tricks which MUAs inflict upon
us in recent years.

> 
> > > So that now deserves to be KERN_WARN not KERN_ERR, yes?
> > 
> > spose so.
> > 
> > I'm fairly unenthused about the recent KERN_correctness fad since it
> > went and broke sysrq-T output (you have to set the loglevel beforehand
> > to avoid getting only partial output).
> 
> On development systems I generally "echo 8 > /proc/sysrq*"
> to make sure KERN_DEBUG isn't hidden.

Yeah, but it's another step we need to walk reporters through when
diagnosing problems.  Madly machine-gunning each others' feet.

> In this case it's just that I saw flamage a few minutes
> earlier from someone trying to keep a distro boot from
> spewing scarey messages for things that were NOT errors.
> Like ... this message.  :)

Lots of our messages should just be deleted.  I think people put them
in at development time and are then reluctant to clean them up.

Proposed algorithm:

- choose the message well so it can be googled for.

- time passes

- google for it

- if nobody's reporting it: kill.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
  2008-09-12 23:46               ` David Brownell
  2008-09-12 23:52                 ` Andrew Morton
@ 2008-09-13  2:22                 ` Alan Stern
  1 sibling, 0 replies; 28+ messages in thread
From: Alan Stern @ 2008-09-13  2:22 UTC (permalink / raw)
  To: David Brownell
  Cc: Andrew Morton, Herton Ronaldo Krzesinski, me, linux-kernel,
	linux-usb, bogdano, lcapitulino, draconux, dlallement, pterjan,
	axboe

On Fri, 12 Sep 2008, David Brownell wrote:

> On Friday 12 September 2008, Andrew Morton wrote:
> >                         printk(KERN_ERR " %s: p%d exceeds device capacity\n",
> >                                 disk->disk_name, p);
> > -                       continue;
> >                 }
> 
> So that now deserves to be KERN_WARN not KERN_ERR, yes?

In addition it deserves to be commented so that somebody doesn't go and 
change it yet again.

Alan Stern


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
       [not found] ` <bbnhC-7Vd-5@gated-at.bofh.it>
@ 2008-09-13  9:24   ` Bodo Eggert
  2008-09-13 23:25     ` Herton Ronaldo Krzesinski
  0 siblings, 1 reply; 28+ messages in thread
From: Bodo Eggert @ 2008-09-13  9:24 UTC (permalink / raw)
  To: Herton Ronaldo Krzesinski, Alan Stern, linux-kernel, linux-usb,
	bogdano, Luiz Fernando N. Capitulino, Abdel Benamrouche,
	Damien Lallement, pterjan@mandriva.com, Jens Axboe

Herton Ronaldo Krzesinski <herton@mandriva.com.br> wrote:

> +++ b/fs/partitions/check.c
> @@ -499,9 +499,9 @@ int rescan_partitions(struct gendisk *disk, struct
> block_device *bdev)
>  if (!size)
>  continue;
>  if (from + size > get_capacity(disk)) {
> -                     printk(KERN_ERR " %s: p%d exceeds device capacity\n",
> +                     printk(KERN_WARNING
> +                             " %s: p%d exceeds device capacity"
                                ", trimmed\n",
>  disk->disk_name, p);

Add "size = get_capacity(disk) - from" here?
(Possibly using a cached value for get_capacity(disk)?)



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in  2.6.27
  2008-09-12 23:34 ` Andrew Morton
@ 2008-09-13 15:54   ` Bill Davidsen
  2008-09-13 22:56   ` Herton Ronaldo Krzesinski
  1 sibling, 0 replies; 28+ messages in thread
From: Bill Davidsen @ 2008-09-13 15:54 UTC (permalink / raw)
  To: linux-kernel

Andrew Morton wrote:
> On Fri, 12 Sep 2008 13:56:49 -0300
> Herton Ronaldo Krzesinski <herton@mandriva.com.br> wrote:
> 
>> Recently I found a problem with a buggy camera that doesn't mount anymore with 
>> 2.6.27 (its memory is available via usb-storage), since commit 
>> 04ebd4aee52b06a2c38127d9208546e5b96f3a19
>>
>> The camera is an Olympus X-840. The original issue comes from the camera 
>> itself: its format program creates a partition with an off by one error, 
>> while the device reports that its memory has 42079 sectors, the partition 
>> table reports also that the only partition on the disk has the size of 42079, 
>> but it fails to account for the first sector in the memory that contains the 
>> partition table, so in the end the partition exceeds the limit of the device 
>> size (42080, first sector plus 42079 from the first partition).
>>
>> In previous kernels (2.6.26 and before), I still could mount and access the 
>> device (/dev/sdb1), although with the following errors:
> 
> Yeah.
> 
> Can you test this please?
> 
> From: Andrew Morton <akpm@linux-foundation.org>
> 
> Herton Krzesinski reports that the error-checking changes in
> 04ebd4aee52b06a2c38127d9208546e5b96f3a19 ("block/ioctl.c and
> fs/partition/check.c: check value returned by add_partition") cause his
> buggy USB camera to no longer mount.  "The camera is an Olympus X-840. 
> The original issue comes from the camera itself: its format program
> creates a partition with an off by one error".
> 
> Buggy devices happen.  It is better for the kernel to warn and to proceed
> with the mount.
> 
> Reported-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
> Cc: Abdel Benamrouche <draconux@gmail.com>
> Cc: Jens Axboe <jens.axboe@oracle.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
> 
>  fs/partitions/check.c |    1 -
>  1 file changed, 1 deletion(-)
> 
> diff -puN fs/partitions/check.c~rescan_partitions-make-device-capacity-errors-non-fatal fs/partitions/check.c
> --- a/fs/partitions/check.c~rescan_partitions-make-device-capacity-errors-non-fatal
> +++ a/fs/partitions/check.c
> @@ -540,7 +540,6 @@ int rescan_partitions(struct gendisk *di
>  		if (from + size > get_capacity(disk)) {
>  			printk(KERN_ERR " %s: p%d exceeds device capacity\n",
>  				disk->disk_name, p);
> -			continue;
>  		}
>  		res = add_partition(disk, p, from, size, state->parts[p].flags);
>  		if (res) {
> _
> 
I really like the patch Herton posted, with the big comment on why you don't 
want continue, to avoid repeating this conversation in a few years.

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
  2008-09-12 23:34 ` Andrew Morton
  2008-09-13 15:54   ` Bill Davidsen
@ 2008-09-13 22:56   ` Herton Ronaldo Krzesinski
  1 sibling, 0 replies; 28+ messages in thread
From: Herton Ronaldo Krzesinski @ 2008-09-13 22:56 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, inux-usb, bogdano, lcapitulino

On Friday 12 September 2008 20:34:16 Andrew Morton wrote:
> On Fri, 12 Sep 2008 13:56:49 -0300
> Herton Ronaldo Krzesinski <herton@mandriva.com.br> wrote:
> 
> > Recently I found a problem with a buggy camera that doesn't mount anymore with 
> > 2.6.27 (its memory is available via usb-storage), since commit 
> > 04ebd4aee52b06a2c38127d9208546e5b96f3a19
> > 
> > The camera is an Olympus X-840. The original issue comes from the camera 
> > itself: its format program creates a partition with an off by one error, 
> > while the device reports that its memory has 42079 sectors, the partition 
> > table reports also that the only partition on the disk has the size of 42079, 
> > but it fails to account for the first sector in the memory that contains the 
> > partition table, so in the end the partition exceeds the limit of the device 
> > size (42080, first sector plus 42079 from the first partition).
> > 
> > In previous kernels (2.6.26 and before), I still could mount and access the 
> > device (/dev/sdb1), although with the following errors:
> 
> Yeah.
> 
> Can you test this please?

I have seen your reply on the other thread, but just to confirm yes, I tested
yesterday the same change that fixed the problem. I saw that the commit is
now on mainline, so regression fixed, thanks.

> 
> From: Andrew Morton <akpm@linux-foundation.org>
> 
> Herton Krzesinski reports that the error-checking changes in
> 04ebd4aee52b06a2c38127d9208546e5b96f3a19 ("block/ioctl.c and
> fs/partition/check.c: check value returned by add_partition") cause his
> buggy USB camera to no longer mount.  "The camera is an Olympus X-840. 
> The original issue comes from the camera itself: its format program
> creates a partition with an off by one error".
> 
> Buggy devices happen.  It is better for the kernel to warn and to proceed
> with the mount.
> 
> Reported-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
> Cc: Abdel Benamrouche <draconux@gmail.com>
> Cc: Jens Axboe <jens.axboe@oracle.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
> 
>  fs/partitions/check.c |    1 -
>  1 file changed, 1 deletion(-)
> 
> diff -puN fs/partitions/check.c~rescan_partitions-make-device-capacity-errors-non-fatal fs/partitions/check.c
> --- a/fs/partitions/check.c~rescan_partitions-make-device-capacity-errors-non-fatal
> +++ a/fs/partitions/check.c
> @@ -540,7 +540,6 @@ int rescan_partitions(struct gendisk *di
>  		if (from + size > get_capacity(disk)) {
>  			printk(KERN_ERR " %s: p%d exceeds device capacity\n",
>  				disk->disk_name, p);
> -			continue;
>  		}
>  		res = add_partition(disk, p, from, size, state->parts[p].flags);
>  		if (res) {
> _
> 
> 

-- 
[]'s
Herton

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
  2008-09-13  9:24   ` Partition check considered as error is breaking mounting in 2.6.27 Bodo Eggert
@ 2008-09-13 23:25     ` Herton Ronaldo Krzesinski
  2008-09-14 12:36       ` Bodo Eggert
  0 siblings, 1 reply; 28+ messages in thread
From: Herton Ronaldo Krzesinski @ 2008-09-13 23:25 UTC (permalink / raw)
  To: 7eggert
  Cc: Alan Stern, linux-kernel, linux-usb, bogdano,
	Luiz Fernando N. Capitulino, Abdel Benamrouche, Damien Lallement,
	pterjan@mandriva.com, Jens Axboe

On Saturday 13 September 2008 06:24:27 Bodo Eggert wrote:
> Herton Ronaldo Krzesinski <herton@mandriva.com.br> wrote:
> 
> > +++ b/fs/partitions/check.c
> > @@ -499,9 +499,9 @@ int rescan_partitions(struct gendisk *disk, struct
> > block_device *bdev)
> >  if (!size)
> >  continue;
> >  if (from + size > get_capacity(disk)) {
> > -                     printk(KERN_ERR " %s: p%d exceeds device capacity\n",
> > +                     printk(KERN_WARNING
> > +                             " %s: p%d exceeds device capacity"
>                                 ", trimmed\n",
> >  disk->disk_name, p);
> 
> Add "size = get_capacity(disk) - from" here?
> (Possibly using a cached value for get_capacity(disk)?)

Trim was already proposed here before (not accepted):
http://lkml.org/lkml/2006/10/12/364

--
[]'s
Herton

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
  2008-09-13 23:25     ` Herton Ronaldo Krzesinski
@ 2008-09-14 12:36       ` Bodo Eggert
  2008-09-15 17:01         ` Bill Davidsen
  0 siblings, 1 reply; 28+ messages in thread
From: Bodo Eggert @ 2008-09-14 12:36 UTC (permalink / raw)
  To: Herton Ronaldo Krzesinski
  Cc: 7eggert, Alan Stern, linux-kernel, linux-usb, bogdano,
	Luiz Fernando N. Capitulino, Abdel Benamrouche, Damien Lallement,
	pterjan@mandriva.com, Jens Axboe

On Sat, 13 Sep 2008, Herton Ronaldo Krzesinski wrote:
> On Saturday 13 September 2008 06:24:27 Bodo Eggert wrote:
> > Herton Ronaldo Krzesinski <herton@mandriva.com.br> wrote:

> > > +++ b/fs/partitions/check.c
> > > @@ -499,9 +499,9 @@ int rescan_partitions(struct gendisk *disk, struct
> > > block_device *bdev)
> > >  if (!size)
> > >  continue;
> > >  if (from + size > get_capacity(disk)) {
> > > -                     printk(KERN_ERR " %s: p%d exceeds device capacity\n",
> > > +                     printk(KERN_WARNING
> > > +                             " %s: p%d exceeds device capacity"
> >                                 ", trimmed\n",
> > >  disk->disk_name, p);
> > 
> > Add "size = get_capacity(disk) - from" here?
> > (Possibly using a cached value for get_capacity(disk)?)
> 
> Trim was already proposed here before (not accepted):
> http://lkml.org/lkml/2006/10/12/364

Because of - as far as I read - possible special-case-forensic-situations 
and possible meddling-with-the-jumper.

IMO:

a) If you need fancy partition handling, you can do it in userspace.
Copying a few MB to a completely different machine, expecting it to work
correctly while depending on the partition numbering to stay the same is
just stupid, and it won't work if you mount by path, anyway. It won't work 
for extended partitions, too, if they don't start in the copied area, 
because they are a chain of partition tables.

Ordinary users need a correctly-working, non-logspewing ("Access beyond
end of disk") system, including the possibility to correctly create
filesystems or to use md on the disks. Having an IO-error-area at the end
of the partition won't exactly help, but maybe destroy the 
HAL-automounted-using-the-wrong-options* filesystem.

Advanced users like me can set up a system in qemu, transfer it to a 
temporary directory, start the old system into /bin/sh, make some bind 
mounts and start the new system.

I might also copy one disk onto two disks, planning to concatenate the
split partition using dmsetup.

Even more advanced users do forensics. Do they really need us to prevent
partition renumbering, as long as there ia a hex editor to trim the
partitions? Can't they just write a small program to fix these partitions
and run it after dd finished? Or create an intrd doing what they want?
Or wouldn't they be be better of with something like "hda.size=0815"?

(*BTW: Is there any documentation for the pile-of-HAL-DBUS-KDE-and-stuff,
which would allow me to actually configure it, and possibly unmount the
devices without sudo?)


b) If there is a soft-clipped disk, find out the unclipped size (and 
unclip it on finding a large partition, if possible and if the partition 
is not "hidden").

If you need to enter a password, the partial partitions should be
read-only and be switched to read-write after entering the password.
Having a corrupted fs because of entering a wrong password is not a good
start for a day. (Yes, my suggestion to just clip it was not enough.)

If you should ever need to access a partition exceeding the true disk
size, you can always use dmsetup or losetup. And don't write to it until
you've copied all the data you want. And especially dont complain about
resulting corruptions.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in  2.6.27
  2008-09-14 12:36       ` Bodo Eggert
@ 2008-09-15 17:01         ` Bill Davidsen
  0 siblings, 0 replies; 28+ messages in thread
From: Bill Davidsen @ 2008-09-15 17:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-usb

Bodo Eggert wrote:

> Ordinary users need a correctly-working, non-logspewing ("Access beyond
> end of disk") system, including the possibility to correctly create
> filesystems or to use md on the disks. Having an IO-error-area at the end
> of the partition won't exactly help, but maybe destroy the 
> HAL-automounted-using-the-wrong-options* filesystem.
> 
It's not clear that "ordinary users" ever have this problem. It seems to happen 
*only* when using a drive which has something which looks like a partition table 
but isn't, or is a defective partition table, or will only work when diddling 
with HPA magic.

And should some naive user be trying to use such a drive, it seems better to 
tell them, to beat them over the head with the problem, because if you hide the 
problem those users are least likely to be able to recover when problems arise.

I'm not against some sensible recovery, I just don't agree that "pretend it 
isn't wrong" is a good default solution.

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
  2008-09-12 23:36             ` Andrew Morton
  2008-09-12 23:46               ` David Brownell
@ 2008-10-08 16:01               ` Kay Sievers
  2008-10-09 14:04                 ` Kay Sievers
  1 sibling, 1 reply; 28+ messages in thread
From: Kay Sievers @ 2008-10-08 16:01 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Herton Ronaldo Krzesinski, me, stern, linux-kernel, linux-usb,
	bogdano, lcapitulino, draconux, dlallement, pterjan, axboe

On Sat, Sep 13, 2008 at 1:36 AM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Fri, 12 Sep 2008 18:07:27 -0300
> Herton Ronaldo Krzesinski <herton@mandriva.com.br> wrote:
>
>> Yes, here goes a new version:
>
> Well gee.  Given a choice, I went and replied to the wrong thread.
> Here's what I think:
>
> From: Andrew Morton <akpm@linux-foundation.org>
>
> Herton Krzesinski reports that the error-checking changes in
> 04ebd4aee52b06a2c38127d9208546e5b96f3a19 ("block/ioctl.c and
> fs/partition/check.c: check value returned by add_partition") cause his
> buggy USB camera to no longer mount.  "The camera is an Olympus X-840.
> The original issue comes from the camera itself: its format program
> creates a partition with an off by one error".
>
> Buggy devices happen.  It is better for the kernel to warn and to proceed
> with the mount.
>
> Reported-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
> Cc: Abdel Benamrouche <draconux@gmail.com>
> Cc: Jens Axboe <jens.axboe@oracle.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
>  fs/partitions/check.c |    1 -
>  1 file changed, 1 deletion(-)
>
> diff -puN fs/partitions/check.c~rescan_partitions-make-device-capacity-errors-non-fatal fs/partitions/check.c
> --- a/fs/partitions/check.c~rescan_partitions-make-device-capacity-errors-non-fatal
> +++ a/fs/partitions/check.c
> @@ -540,7 +540,6 @@ int rescan_partitions(struct gendisk *di
>                if (from + size > get_capacity(disk)) {
>                        printk(KERN_ERR " %s: p%d exceeds device capacity\n",
>                                disk->disk_name, p);
> -                       continue;
>                }
>                res = add_partition(disk, p, from, size, state->parts[p].flags);
>                if (res) {

I was happy to see the original fix, as if causes real problems for
userspace, that the kernel creates invalid block devices with a size
that exceeds the physical disk. So, if we can not make that partition
to skip, like original patch did, because of broken hardware we don't
want to break, can we make it at least do the obvious thing, and limit
the partition with the broken entry to the size of the underlying
hardware. So that the kernel does no longer pretend to have devices of
a size which the hardware does not have.

It breaks all sort of userspace tools which read the "size" file in
sysfs, or do BLKGETSIZE and we get, if we are lucky, only:
  attempt to access beyond end of device
  sda: rw=0, want=1953535936, limit=976773168
in other cases it might cause corruption, or lead mkfs to create
devices which will fail when they get used.

Thanks,
Kay

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
  2008-10-08 16:01               ` Kay Sievers
@ 2008-10-09 14:04                 ` Kay Sievers
  2008-10-13  9:01                   ` Jens Axboe
  0 siblings, 1 reply; 28+ messages in thread
From: Kay Sievers @ 2008-10-09 14:04 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Herton Ronaldo Krzesinski, me, stern, linux-kernel, linux-usb,
	bogdano, lcapitulino, draconux, dlallement, pterjan, axboe

On Wed, 2008-10-08 at 18:01 +0200, Kay Sievers wrote:
> On Sat, Sep 13, 2008 at 1:36 AM, Andrew Morton
> <akpm@linux-foundation.org> wrote:
> > On Fri, 12 Sep 2008 18:07:27 -0300
> > Herton Ronaldo Krzesinski <herton@mandriva.com.br> wrote:
> >
> >> Yes, here goes a new version:
> >
> > Well gee.  Given a choice, I went and replied to the wrong thread.
> > Here's what I think:
> >
> > From: Andrew Morton <akpm@linux-foundation.org>
> >
> > Herton Krzesinski reports that the error-checking changes in
> > 04ebd4aee52b06a2c38127d9208546e5b96f3a19 ("block/ioctl.c and
> > fs/partition/check.c: check value returned by add_partition") cause his
> > buggy USB camera to no longer mount.  "The camera is an Olympus X-840.
> > The original issue comes from the camera itself: its format program
> > creates a partition with an off by one error".
> >
> > Buggy devices happen.  It is better for the kernel to warn and to proceed
> > with the mount.
> >
> > Reported-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
> > Cc: Abdel Benamrouche <draconux@gmail.com>
> > Cc: Jens Axboe <jens.axboe@oracle.com>
> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> > ---
> >
> >  fs/partitions/check.c |    1 -
> >  1 file changed, 1 deletion(-)
> >
> > diff -puN fs/partitions/check.c~rescan_partitions-make-device-capacity-errors-non-fatal fs/partitions/check.c
> > --- a/fs/partitions/check.c~rescan_partitions-make-device-capacity-errors-non-fatal
> > +++ a/fs/partitions/check.c
> > @@ -540,7 +540,6 @@ int rescan_partitions(struct gendisk *di
> >                if (from + size > get_capacity(disk)) {
> >                        printk(KERN_ERR " %s: p%d exceeds device capacity\n",
> >                                disk->disk_name, p);
> > -                       continue;
> >                }
> >                res = add_partition(disk, p, from, size, state->parts[p].flags);
> >                if (res) {
> 
> I was happy to see the original fix, as if causes real problems for
> userspace, that the kernel creates invalid block devices with a size
> that exceeds the physical disk. So, if we can not make that partition
> to skip, like original patch did, because of broken hardware we don't
> want to break, can we make it at least do the obvious thing, and limit
> the partition with the broken entry to the size of the underlying
> hardware. So that the kernel does no longer pretend to have devices of
> a size which the hardware does not have.
> 
> It breaks all sort of userspace tools which read the "size" file in
> sysfs, or do BLKGETSIZE and we get, if we are lucky, only:
>   attempt to access beyond end of device
>   sda: rw=0, want=1953535936, limit=976773168
> in other cases it might cause corruption, or lead mkfs to create
> devices which will fail when they get used.


From: Kay Sievers <kay.sievers@vrfy.org>
Subject: block: sanitize invalid partition table entries

We currently follow blindly what the partition table lies about the
disk, and let the kernel create block devices which can not be accessed.
Trying to identify the device leads to kernel logs full of:
  sdb: rw=0, want=73392, limit=28800
  attempt to access beyond end of device

Here is an example of a broken partition table, where sda2 starts
behind the end of the disk, and sdb3 is larger than the entire disk:
  Disk /dev/sdb: 14 MB, 14745600 bytes
  1 heads, 29 sectors/track, 993 cylinders, total 28800 sectors
     Device Boot      Start         End      Blocks   Id  System
  /dev/sdb1              29        7800        3886   83  Linux
  /dev/sdb2           37801       45601        3900+  83  Linux
  /dev/sdb3           15602       73402       28900+  83  Linux
  /dev/sdb4           23403       28796        2697   83  Linux

The kernel creates these completely invalid devices, which can not be
accessed, or may lead to other unpredictable failures:
  grep . /sys/class/block/sdb*/{start,size}
  /sys/class/block/sdb/size:28800
  /sys/class/block/sdb1/start:29
  /sys/class/block/sdb1/size:7772
  /sys/class/block/sdb2/start:37801
  /sys/class/block/sdb2/size:7801
  /sys/class/block/sdb3/start:15602
  /sys/class/block/sdb3/size:57801
  /sys/class/block/sdb4/start:23403
  /sys/class/block/sdb4/size:5394

With this patch, we ignore partitions which start behind the end of the disk,
and limit partitions to the end of the disk if they pretend to be larger:
  grep . /sys/class/block/sdb*/{start,size}
  /sys/class/block/sdb/size:28800
  /sys/class/block/sdb1/start:29
  /sys/class/block/sdb1/size:7772
  /sys/class/block/sdb3/start:15602
  /sys/class/block/sdb3/size:13198
  /sys/class/block/sdb4/start:23403
  /sys/class/block/sdb4/size:5394

These warnings are printed to the kernel log:
  sdb: p2 ignored, start 37801 is behind the end of the disk
  sdb: p3 size 57801 limited to end of disk

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
---

 check.c |   17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)


diff --git a/fs/partitions/check.c b/fs/partitions/check.c
index ecc3330..9545d8c 100644
--- a/fs/partitions/check.c
+++ b/fs/partitions/check.c
@@ -498,10 +498,23 @@ int rescan_partitions(struct gendisk *disk, struct block_device *bdev)
 		sector_t from = state->parts[p].from;
 		if (!size)
 			continue;
+		if (from >= get_capacity(disk)) {
+			printk(KERN_WARNING
+			       "%s: p%d ignored, start %llu is behind the end of the disk\n",
+			       disk->disk_name, p, (unsigned long long) from);
+			continue;
+		}
 		if (from + size > get_capacity(disk)) {
+			/*
+			 * we can not ignore partitions of broken tables
+			 * created by for example camera firmware, but we
+			 * limit them to the end of the disk to avoid
+			 * creating invalid block devices
+			 */
 			printk(KERN_WARNING
-				"%s: p%d exceeds device capacity\n",
-				disk->disk_name, p);
+			       "%s: p%d size %llu limited to end of disk\n",
+			       disk->disk_name, p, (unsigned long long) size);
+			size = get_capacity(disk) - from;
 		}
 		res = add_partition(disk, p, from, size, state->parts[p].flags);
 		if (res) {



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: Partition check considered as error is breaking mounting in 2.6.27
  2008-10-09 14:04                 ` Kay Sievers
@ 2008-10-13  9:01                   ` Jens Axboe
  0 siblings, 0 replies; 28+ messages in thread
From: Jens Axboe @ 2008-10-13  9:01 UTC (permalink / raw)
  To: Kay Sievers
  Cc: Andrew Morton, Herton Ronaldo Krzesinski, me, stern, linux-kernel,
	linux-usb, bogdano, lcapitulino, draconux, dlallement, pterjan

On Thu, Oct 09 2008, Kay Sievers wrote:
> On Wed, 2008-10-08 at 18:01 +0200, Kay Sievers wrote:
> > On Sat, Sep 13, 2008 at 1:36 AM, Andrew Morton
> > <akpm@linux-foundation.org> wrote:
> > > On Fri, 12 Sep 2008 18:07:27 -0300
> > > Herton Ronaldo Krzesinski <herton@mandriva.com.br> wrote:
> > >
> > >> Yes, here goes a new version:
> > >
> > > Well gee.  Given a choice, I went and replied to the wrong thread.
> > > Here's what I think:
> > >
> > > From: Andrew Morton <akpm@linux-foundation.org>
> > >
> > > Herton Krzesinski reports that the error-checking changes in
> > > 04ebd4aee52b06a2c38127d9208546e5b96f3a19 ("block/ioctl.c and
> > > fs/partition/check.c: check value returned by add_partition") cause his
> > > buggy USB camera to no longer mount.  "The camera is an Olympus X-840.
> > > The original issue comes from the camera itself: its format program
> > > creates a partition with an off by one error".
> > >
> > > Buggy devices happen.  It is better for the kernel to warn and to proceed
> > > with the mount.
> > >
> > > Reported-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
> > > Cc: Abdel Benamrouche <draconux@gmail.com>
> > > Cc: Jens Axboe <jens.axboe@oracle.com>
> > > Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> > > ---
> > >
> > >  fs/partitions/check.c |    1 -
> > >  1 file changed, 1 deletion(-)
> > >
> > > diff -puN fs/partitions/check.c~rescan_partitions-make-device-capacity-errors-non-fatal fs/partitions/check.c
> > > --- a/fs/partitions/check.c~rescan_partitions-make-device-capacity-errors-non-fatal
> > > +++ a/fs/partitions/check.c
> > > @@ -540,7 +540,6 @@ int rescan_partitions(struct gendisk *di
> > >                if (from + size > get_capacity(disk)) {
> > >                        printk(KERN_ERR " %s: p%d exceeds device capacity\n",
> > >                                disk->disk_name, p);
> > > -                       continue;
> > >                }
> > >                res = add_partition(disk, p, from, size, state->parts[p].flags);
> > >                if (res) {
> > 
> > I was happy to see the original fix, as if causes real problems for
> > userspace, that the kernel creates invalid block devices with a size
> > that exceeds the physical disk. So, if we can not make that partition
> > to skip, like original patch did, because of broken hardware we don't
> > want to break, can we make it at least do the obvious thing, and limit
> > the partition with the broken entry to the size of the underlying
> > hardware. So that the kernel does no longer pretend to have devices of
> > a size which the hardware does not have.
> > 
> > It breaks all sort of userspace tools which read the "size" file in
> > sysfs, or do BLKGETSIZE and we get, if we are lucky, only:
> >   attempt to access beyond end of device
> >   sda: rw=0, want=1953535936, limit=976773168
> > in other cases it might cause corruption, or lead mkfs to create
> > devices which will fail when they get used.
> 
> 
> From: Kay Sievers <kay.sievers@vrfy.org>
> Subject: block: sanitize invalid partition table entries
> 
> We currently follow blindly what the partition table lies about the
> disk, and let the kernel create block devices which can not be accessed.
> Trying to identify the device leads to kernel logs full of:
>   sdb: rw=0, want=73392, limit=28800
>   attempt to access beyond end of device
> 
> Here is an example of a broken partition table, where sda2 starts
> behind the end of the disk, and sdb3 is larger than the entire disk:
>   Disk /dev/sdb: 14 MB, 14745600 bytes
>   1 heads, 29 sectors/track, 993 cylinders, total 28800 sectors
>      Device Boot      Start         End      Blocks   Id  System
>   /dev/sdb1              29        7800        3886   83  Linux
>   /dev/sdb2           37801       45601        3900+  83  Linux
>   /dev/sdb3           15602       73402       28900+  83  Linux
>   /dev/sdb4           23403       28796        2697   83  Linux
> 
> The kernel creates these completely invalid devices, which can not be
> accessed, or may lead to other unpredictable failures:
>   grep . /sys/class/block/sdb*/{start,size}
>   /sys/class/block/sdb/size:28800
>   /sys/class/block/sdb1/start:29
>   /sys/class/block/sdb1/size:7772
>   /sys/class/block/sdb2/start:37801
>   /sys/class/block/sdb2/size:7801
>   /sys/class/block/sdb3/start:15602
>   /sys/class/block/sdb3/size:57801
>   /sys/class/block/sdb4/start:23403
>   /sys/class/block/sdb4/size:5394
> 
> With this patch, we ignore partitions which start behind the end of the disk,
> and limit partitions to the end of the disk if they pretend to be larger:
>   grep . /sys/class/block/sdb*/{start,size}
>   /sys/class/block/sdb/size:28800
>   /sys/class/block/sdb1/start:29
>   /sys/class/block/sdb1/size:7772
>   /sys/class/block/sdb3/start:15602
>   /sys/class/block/sdb3/size:13198
>   /sys/class/block/sdb4/start:23403
>   /sys/class/block/sdb4/size:5394
> 
> These warnings are printed to the kernel log:
>   sdb: p2 ignored, start 37801 is behind the end of the disk
>   sdb: p3 size 57801 limited to end of disk
> 
> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
> ---
> 
>  check.c |   17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
> 
> 
> diff --git a/fs/partitions/check.c b/fs/partitions/check.c
> index ecc3330..9545d8c 100644
> --- a/fs/partitions/check.c
> +++ b/fs/partitions/check.c
> @@ -498,10 +498,23 @@ int rescan_partitions(struct gendisk *disk, struct block_device *bdev)
>  		sector_t from = state->parts[p].from;
>  		if (!size)
>  			continue;
> +		if (from >= get_capacity(disk)) {
> +			printk(KERN_WARNING
> +			       "%s: p%d ignored, start %llu is behind the end of the disk\n",
> +			       disk->disk_name, p, (unsigned long long) from);
> +			continue;
> +		}
>  		if (from + size > get_capacity(disk)) {
> +			/*
> +			 * we can not ignore partitions of broken tables
> +			 * created by for example camera firmware, but we
> +			 * limit them to the end of the disk to avoid
> +			 * creating invalid block devices
> +			 */
>  			printk(KERN_WARNING
> -				"%s: p%d exceeds device capacity\n",
> -				disk->disk_name, p);
> +			       "%s: p%d size %llu limited to end of disk\n",
> +			       disk->disk_name, p, (unsigned long long) size);
> +			size = get_capacity(disk) - from;
>  		}
>  		res = add_partition(disk, p, from, size, state->parts[p].flags);
>  		if (res) {
> 
> 

Kay, I added this variant.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2008-10-13  9:01 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <bblSy-60j-19@gated-at.bofh.it>
     [not found] ` <bbnhC-7Vd-5@gated-at.bofh.it>
2008-09-13  9:24   ` Partition check considered as error is breaking mounting in 2.6.27 Bodo Eggert
2008-09-13 23:25     ` Herton Ronaldo Krzesinski
2008-09-14 12:36       ` Bodo Eggert
2008-09-15 17:01         ` Bill Davidsen
2008-09-12 17:32 Toralf Förster
  -- strict thread matches above, loose matches on Subject: below --
2008-09-12 17:01 Herton Ronaldo Krzesinski
2008-09-12 17:36 ` Alan Stern
2008-09-12 17:59   ` Bob Copeland
2008-09-12 18:21     ` Alan Stern
2008-09-12 18:02   ` Herton Ronaldo Krzesinski
2008-09-12 18:40     ` Alan Stern
2008-09-12 20:14       ` Herton Ronaldo Krzesinski
2008-09-12 20:17         ` Herton Ronaldo Krzesinski
2008-09-12 20:27         ` Bob Copeland
2008-09-12 21:07           ` Herton Ronaldo Krzesinski
2008-09-12 23:36             ` Andrew Morton
2008-09-12 23:46               ` David Brownell
2008-09-12 23:52                 ` Andrew Morton
2008-09-12 23:59                   ` David Brownell
2008-09-13  0:13                     ` Andrew Morton
2008-09-13  2:22                 ` Alan Stern
2008-10-08 16:01               ` Kay Sievers
2008-10-09 14:04                 ` Kay Sievers
2008-10-13  9:01                   ` Jens Axboe
2008-09-12 16:56 Herton Ronaldo Krzesinski
2008-09-12 23:34 ` Andrew Morton
2008-09-13 15:54   ` Bill Davidsen
2008-09-13 22:56   ` Herton Ronaldo Krzesinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox