* Re: Linux 2.6.24.atmel.1 MMC/SD
[not found] ` <47BD1760.9080007@yahoo.es>
@ 2008-02-21 11:35 ` Haavard Skinnemoen
2008-02-21 17:45 ` Hein_Tibosch
2008-03-04 19:42 ` Ext2 - ext3 unstable under 2.6.24: now solved (?) Hein_Tibosch
0 siblings, 2 replies; 4+ messages in thread
From: Haavard Skinnemoen @ 2008-02-21 11:35 UTC (permalink / raw)
To: Hein_Tibosch; +Cc: James Stewart, kernel, linux-ext4
(Adding the ext2/ext3/ext4 list to Cc)
Note that the MMC/SD card driver in question, atmel-mci, is not in
mainline, and may be the real cause of this problem. But it looks like
there might be a potential problem in the ext3 code as well?
Haavard
On Thu, 21 Feb 2008 14:17:04 +0800
Hein_Tibosch <hein_tibosch@yahoo.es> wrote:
> Hi James,
>
>
> I've had all kinds of problems with the SD-card hooked to an NGW100, just as John Voltz reported earlier:
>
> http://www.avr32linux.org/archives/kernel/2007-November/000421.html
> http://www.avr32linux.org/archives/kernel/2007-November/000425.html
>
> I debugged this problem and my conclusion is: using an SD-card may lead to both BUS-errors and a complete hanging of the system, with 2.6.23.atmel.5 as well as 2.6.24.atmel.1.
>
> Both the ext2 and ext3 drivers use this type of function to iterate through an array of directory entries:
>
> static inline ext2_dirent *ext2_next_entry(ext2_dirent *p)
> {
> 	return (ext2_dirent *)((char *)p + le16_to_cpu(p->rec_len));
> }
>
> static inline struct ext3_dir_entry_2 *
> ext3_next_entry(struct ext3_dir_entry_2 *p)
> {
> 	return (struct ext3_dir_entry_2 *)((char *)p +
> 		ext3_rec_len_from_disk(p->rec_len));
> }
>
>
> Sometimes, rec_len is checked for a zero-value, sometimes the entry is checked thoroughly for validity (like with ext2_check_page() or ext3_check_dir_entry()), but in other cases rec_len isn't checked at all! This is the case in e.g. fs/ext3/namei.c, function ext3_dx_find_entry(). This function is always enabled since 2.6.24 (CONFIG_EXT3_INDEX not used anymore).
>
> I had a card on which, in one place, rec_len turned out to be a small negative number. When iterating, the code would either loop forever (until the watchdog timer fired) or wander into invalid memory (Oops: bus error).
>
> (Strange, though, that rec_len appeared to be negative: I had just run "mkfs -t ext3" on Ubuntu. Could that be caused by the Atmel driver?)
>
> I don't yet feel qualified to make a patch for this; I only fixed it for myself. Maybe someone can pick this up: a validity check should be made before any call to xxx_next_entry().
>
>
> Regards,
>
> Hein Tibosch (HeinBali at avr32linux)
>
>
>
> James Stewart wrote:
> Hi,
>
> I'm wondering if there are any known issues with booting from SD card on the ATNGW100 using this kernel. I get a bunch of ext2 looking errors and then a stack dump immediately after mounting VFS. 2.6.23.atmel.5 runs perfectly, however.
>
> This is just compiling using atngw100_defconfig.
>
> Thanks,
>
> James
>
* Re: Linux 2.6.24.atmel.1 MMC/SD
2008-02-21 11:35 ` Linux 2.6.24.atmel.1 MMC/SD Haavard Skinnemoen
@ 2008-02-21 17:45 ` Hein_Tibosch
2008-03-04 19:42 ` Ext2 - ext3 unstable under 2.6.24: now solved (?) Hein_Tibosch
1 sibling, 0 replies; 4+ messages in thread
From: Hein_Tibosch @ 2008-02-21 17:45 UTC (permalink / raw)
To: James Stewart; +Cc: kernel, linux-ext4
My email got refused because it contained HTML. Here again:
Crashes in ext2 and ext3 filesystem:
John's dump shows that the PC was in ext2_find_entry when it crashed. I had
exactly the same type of Oops with ext3_find_entry and found it was
actually in the static function ext3_dx_find_entry, which was inlined by
the compiler. There it got stuck because of an invalid rec_len, which
made the pointer decrease (wrap around) instead of increase.
So ext3 looks as vulnerable as ext2, because in both drivers this
iteration sometimes takes place without checking the validity of the data
read from the SD card.
Whatever data the Atmel driver delivers, I think it should be checked,
including for rec_len values that cause the pointer to wrap around. Now
that I check it more thoroughly, my NGW100 boots and runs
perfectly from an SD card.
Hein Tibosch
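The kind of check described above could look roughly like the following. This is only a sketch under stated assumptions, not the patch Hein applied and not the kernel's ext3_check_dir_entry(): the name demo_next_offset, the DEMO_DIRENT_HDR threshold, and the offset-based interface are all hypothetical, chosen so the example compiles outside the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed fixed header size: inode (4) + rec_len (2) + name_len (1) + file_type (1). */
#define DEMO_DIRENT_HDR 8u

/*
 * Hypothetical helper: compute the offset of the entry that follows the
 * one at `offset` inside a directory block of `block_size` bytes.
 * Returns 0 and stores the result in *next on success, or -1 if rec_len
 * cannot be trusted.
 */
static inline int demo_next_offset(size_t block_size, size_t offset,
                                   uint32_t rec_len, size_t *next)
{
	if (offset >= block_size)
		return -1;	/* current entry is already out of range */
	if (rec_len < DEMO_DIRENT_HDR)
		return -1;	/* zero or too short: iteration would not advance properly */
	if (rec_len % 4u != 0)
		return -1;	/* directory entries are 4-byte aligned */
	if (rec_len > block_size - offset)
		return -1;	/* would step past the block, including "negative" values */
	*next = offset + rec_len;
	return 0;
}
```

Working with offsets rather than raw pointers means a bogus length such as 0xFFFFA000 is rejected by the range comparison before any pointer arithmetic can wrap around.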
John Voltz wrote:
> I believe this might be what he is talking about. I had to reformat
> my SD card to ext3 to use 2.6.23/24
>
> John
>
> Oops: Unhandled exception in kernel mode, sig: 7 [#1]
> PREEMPT FRAME_POINTER chip: 0x01f:0x1e82 rev 2
> Modules linked in: snd_pcm_oss snd_mixer_oss snd_atmel_ac97
> snd_ac97_codec snd_pcm snd_timer snd soundcore snd_page_alloc ac97_bus
> PC is at ext2_find_entry+0x9c/0x16c
* Ext2 - ext3 unstable under 2.6.24: now solved (?)
2008-02-21 11:35 ` Linux 2.6.24.atmel.1 MMC/SD Haavard Skinnemoen
2008-02-21 17:45 ` Hein_Tibosch
@ 2008-03-04 19:42 ` Hein_Tibosch
2008-03-05 0:22 ` Andreas Dilger
1 sibling, 1 reply; 4+ messages in thread
From: Hein_Tibosch @ 2008-03-04 19:42 UTC (permalink / raw)
To: James Stewart; +Cc: linux-ext4
Could someone please check the following?
The ext2 and ext3 filesystems in 2.6.24 show many Oopses and hangs.
After debugging I found the following common cause:
In a function that is new in 2.6.24, an unwanted sign extension takes place:
fs/ext2/dir.c:
static inline unsigned ext2_rec_len_from_disk(__le16 dlen)
{
	unsigned len = le16_to_cpu(dlen);

	if (len == EXT2_MAX_REC_LEN)
		return 1 << 16;
	return len;
}
include/ext3_fs.h:
static inline unsigned ext3_rec_len_from_disk(__le16 dlen)
{
	unsigned len = le16_to_cpu(dlen);

	if (len == EXT3_MAX_REC_LEN)
		return 1 << 16;
	return len;
}
The raw value 0x00A0 (0xA000 after byte swapping) is returned as 0xFFFFA000!
Much of the code that iterates through dirents (ext2_dirent,
ext3_dir_entry_2) uses the above functions to determine the start of the
next dirent. See fs/ext2/dir.c and fs/ext3/namei.c.
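One way a value like 0xFFFFA000 can arise is if the byte-swapped result passes through a signed 16-bit type before being assigned to the unsigned variable. The thread does not establish that this is what the AVR32 le16_to_cpu() actually did, so the sketch below only illustrates the mechanism; all demo_* names are hypothetical, and it assumes a 32-bit unsigned int and a compiler (such as GCC) that wraps out-of-range conversions to int16_t.

```c
#include <stdint.h>

/* Hypothetical swap whose result type is signed 16-bit. */
static inline int16_t demo_swab16_signed(uint16_t v)
{
	/* Implementation-defined narrowing; GCC wraps 0xA000 to -24576. */
	return (int16_t)((v << 8) | (v >> 8));
}

/* The same swap with an unsigned 16-bit result type. */
static inline uint16_t demo_swab16_unsigned(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

static inline unsigned demo_rec_len_buggy(uint16_t raw)
{
	/* int16_t -> unsigned: the negative value is sign-extended. */
	unsigned len = demo_swab16_signed(raw);
	return len;
}

static inline unsigned demo_rec_len_fixed(uint16_t raw)
{
	/* uint16_t -> unsigned: the value is zero-extended. */
	unsigned len = demo_swab16_unsigned(raw);
	return len;
}
```

Declaring `len` as unsigned does not help in the first variant, because the sign extension happens during the conversion, before the value is stored.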
As a test I replaced "le16_to_cpu()" with a simple:
static inline unsigned my_le16_to_cpu(__le16 value)
{
	return ((value & 0x00FF) << 8) | ((value & 0xFF00) >> 8);
}
With this change there were no more "negative" rec_len values causing
crashes, and both ext2 and ext3 now run stably.
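Note that an unconditional swap like the one above is only correct on a big-endian CPU. A different technique, shown here purely as an illustration and not as what the kernel does, is to assemble the value from its two bytes, which gives the same result on any host and cannot sign-extend. The name demo_le16_from_bytes is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: decode a little-endian 16-bit field from the two
 * bytes at p. Each byte is widened to unsigned before shifting, so the
 * result is always in the range 0..0xFFFF regardless of host endianness.
 */
static inline unsigned demo_le16_from_bytes(const unsigned char *p)
{
	return (unsigned)p[0] | ((unsigned)p[1] << 8);
}
```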
Compiler: gcc version 4.1.2 (Ubuntu 4.1.2-0ubuntu4)
Kernel: 2.6.24.atmel.1
Platform: Atmel AP7000 CPU, compiling with "ARCH=avr32
CROSS_COMPILE=avr32-linux-"
Hein Tibosch
* Re: Ext2 - ext3 unstable under 2.6.24: now solved (?)
2008-03-04 19:42 ` Ext2 - ext3 unstable under 2.6.24: now solved (?) Hein_Tibosch
@ 2008-03-05 0:22 ` Andreas Dilger
0 siblings, 0 replies; 4+ messages in thread
From: Andreas Dilger @ 2008-03-05 0:22 UTC (permalink / raw)
To: Hein_Tibosch; +Cc: James Stewart, linux-ext4
On Mar 05, 2008 03:42 +0800, Hein_Tibosch wrote:
> Could someone please check the following?
>
> The ext2 and ext3 filesystems of 2.6.24 show many Oops and hangups. After
> debugging I found the following common cause:
>
> In a new 2.6.24 function an unwanted sign-extension takes place in:
>
> fs/ext2/dir.c
>
> static inline unsigned ext2_rec_len_from_disk(__le16 dlen)
> {
> unsigned len = le16_to_cpu(dlen);
>
> if (len == EXT2_MAX_REC_LEN)
> return 1 << 16;
> return len;
> }
>
> include/ext3_fs.h :
>
> static inline unsigned ext3_rec_len_from_disk(__le16 dlen)
> {
> unsigned len = le16_to_cpu(dlen);
>
> if (len == EXT3_MAX_REC_LEN)
> return 1 << 16;
> return len;
> }
>
> 00A0 will be returned as 0xFFFFA000 !!
Presumably this is a big-endian architecture? It would appear to be a bug
in the le16_to_cpu() code rather than the functions above, since they are
always using an unsigned variable.
I suppose it would be possible to mask off the returned value, but this
seems like it is fixing the problem at the wrong level:
return (len & 0xffffU);
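For illustration, the masking workaround could be applied as in the sketch below. It is modeled on the quoted ext3_rec_len_from_disk() but is not kernel code: demo_rec_len_from_cpu and DEMO_MAX_REC_LEN are hypothetical names, the parameter stands for whatever (possibly sign-extended) value the conversion returned, and the maximum of 0xFFFF is an assumption about the constant used in the original.

```c
#include <stdint.h>

/* Assumed to mirror the MAX_REC_LEN constant in the quoted code. */
#define DEMO_MAX_REC_LEN ((1u << 16) - 1)

/*
 * Hypothetical variant of the rec_len helper: `converted` is the value
 * after byte-order conversion, which may have been sign-extended.
 */
static inline unsigned demo_rec_len_from_cpu(int converted)
{
	unsigned len = (unsigned)converted & 0xffffU;	/* drop any sign-extension bits */

	if (len == DEMO_MAX_REC_LEN)
		return 1u << 16;
	return len;
}
```

As noted above, this hides the symptom in the filesystem code rather than fixing the conversion itself.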
> Many code which iterates through dirent's, uses the above function to
> determine the start of the next dirent.(ext2_dirent, ext3_dir_entry_2)
> See fs/ext2/dir.c and fs/ext3/namei.c
>
> As a test I replaced "le16_to_cpu()" by a simple:
>
> static inline unsigned my_le16_to_cpu (__le16 value)
> {
> return ((value & 0x00FF) << 8) | ((value & 0xFF00) >> 8);
> }
>
> It showed no more "negative" rec_len values which cause the crashes, and
> both ext2/3 now run stable.
>
> Compiler: gcc version 4.1.2 (Ubuntu 4.1.2-0ubuntu4)
> Kernel: 2.6.24.atmel.1
> Platform: Atmel AP7000 CPU, compiling with "ARCH=avr32
> CROSS_COMPILE=avr32-linux-"
>
>
> Hein Tibosch
>
>
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.