public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* PROBLEM: telldir() broken on large ext3 dir_index'd directories because getdents() gives d_off==0 for the first entry
@ 2004-11-09  6:18 r6144
  2004-11-16 18:07 ` [POSSIBLE-BUG] telldir() broken on ext3 dir_index'd directories just after " r6144
  0 siblings, 1 reply; 3+ messages in thread
From: r6144 @ 2004-11-09  6:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: sct

[-- Attachment #1: Type: text/plain, Size: 4014 bytes --]

[1.] One line summary of the problem:   
telldir() broken on large ext3 dir_index'd directories because
getdents() gives d_off==0 for the first entry

[2.] Full description of the problem/report:
Create a large directory in an ext3 filesystem with dir_index enabled,
and running getdents() on it sometimes gives d_off==0 for the first
entry (usually "."), which seems to be wrong since lseek()'ing to that
offset does not make subsequent getdents() calls skip that entry.
This breaks programs that readdir() the first entry, call telldir() (which
erroneously returns 0) then seekdir() with telldir()'s result, expecting the
next readdir() to return the second entry.

[3.] Keywords (i.e., modules, networking, kernel):
fs,ext3

[4.] Kernel version (from /proc/version):
Linux version 2.6.9-skas3-v7 (root@Default) (gcc version 3.2.2
20030222 (Red Hat Linux 3.2.2-5)) #1 Mon Nov 8 20:07:10 CST 2004

The skas3-v7 patch comes from
http://www.user-mode-linux.org/~blaisorblade/patches/skas3-2.6/,
I don't think it will break things.  Will you please try to reproduce
the problem
on vanilla 2.6.9?

[6.] A small shell script or example program which triggers the
     problem (if possible)
1. Bunzip2 the attached e2test.bz2 file, and loop-mount it using ext3
onto e.g. /mnt/loop0
2. Compile test_readdir1.c and test_readdir2.c.
3. Run "./test_readdir1 /mnt/loop0/test4", which should run correctly,
giving results like

249e4711: ..
393238f2: 1
4a8ec49c: 6
...

4. Run "./test_readdir1 /mnt/loop0/test3", which ends up in an infinite loop:

00000000: .
...

5. Run "./test_readdir2 /mnt/loop0/test3":

d_ino=2052, d_off=0, d_name=.
d_ino=2, d_off=800177, d_name=..
d_ino=12, d_off=1177726, d_name=2662
...

The d_off=0 looks wrong.

6. Running e2fsck 1.34 on that filesystem does not reveal problems.

[7.] Environment
Mostly redhat 9.  In particular the glibc is the version in redhat 9,
which is glibc-2.3.2-27.9.

[7.1.] Software (add the output of the ver_linux script here)
Linux Default 2.6.9-skas3-v7 #1 Mon Nov 8 20:07:10 CST 2004 i686 i686
i386 GNU/Linux
 
Gnu C                  3.2.2
Gnu make               3.79.1
binutils               2.13.90.0.18
util-linux             2.11y
mount                  2.11y
module-init-tools      2.4.25
e2fsprogs              1.34
reiserfsprogs          2002------------->
reiser4progs           line
quota-tools            3.09.
nfs-utils              1.0.5
Linux C Library        2.3.2
Dynamic linker (ldd)   2.3.2
Linux C++ Library      new..
Linux C++ Library      5..
Procps                 2.0.14
Net-tools              1.60
Kbd                    1.06
Sh-utils               4.5.3
Modules Loaded         cramfs loop tun iptable_nat ip_conntrack
ipt_multiport ipt_REJECT iptable_filter ip_tables snd_mixer_oss
snd_intel8x0 snd_ac97_codec snd_pcm snd_timer snd_page_alloc
snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore i830
intel_agp agpgart usb_storage scsi_mod pcspkr w83627hf eeprom
i2c_sensor i2c_isa i2c_i801 i2c_core binfmt_misc parport_pc lp parport
autofs via_rhine 8139too mii crc32 md5 ipv6 microcode nls_cp936 vfat
fat usbhid ehci_hcd uhci_hcd usbcore ext3 jbd

[7.2.] Processor information (from /proc/cpuinfo):
Intel Pentium 4

[7.3.] Module information (from /proc/modules):
[7.4.] Loaded driver and hardware information (/proc/ioports, /proc/iomem)
[7.5.] PCI information ('lspci -vvv' as root)
[7.6.] SCSI information (from /proc/scsi/scsi)
[7.7.] Other information that might be relevant to the problem
       (please look in /proc and include all information that you
       think to be relevant):
[X.] Other notes, patches, fixes, workarounds:

Workaround: in test_readdir1.c, don't seekdir() if the offset is zero.

Please CC me since I'm not subscribed.

-- 
"Everybody you meet will love you as long as you live." said the
Story Girl.  "There that's the very nicest fortune I can tell you, 
and it will come true whether the others do or not, and now we
must go in."
                      --- L.M.Montgomery, "The Golden Road", Ch.30

[-- Attachment #2: e2test.bz2 --]
[-- Type: application/x-bzip2, Size: 36210 bytes --]

[-- Attachment #3: test_readdir1.c --]
[-- Type: text/x-c, Size: 550 bytes --]

#include <sys/types.h>
#include <errno.h>
#include <stdio.h>
#include <dirent.h>

int main(int argc, char **argv)
{
  DIR *dir;
  struct dirent *ent;
  off_t offset = 0;
  
  if (argc < 2) return 1;
  dir = opendir(argv[1]);
  if (dir == NULL) { perror("opendir"); goto err; }
  while (1) {
    seekdir(dir, offset); /* Would be okay if "if(offset)..." is used */
    ent = readdir(dir);
    if (ent == NULL) break;
    offset = telldir(dir);
    printf("%08llx: %s\n", (unsigned long long) offset, ent->d_name);
  }
  return 0;
 err:
  return 1;
}


[-- Attachment #4: test_readdir2.c --]
[-- Type: text/x-c, Size: 890 bytes --]

#include <stdio.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <linux/types.h>
#include <linux/dirent.h>
#include <linux/unistd.h>

_syscall3(int, getdents, uint, fd, struct dirent *, dirp, uint, count);

#define BUF_SIZE 1024

int main(int argc, char **argv)
{
  int fd;
  char buf[BUF_SIZE];
  int result;
  unsigned nread, offset;
  struct dirent *ent;
  
  if (argc < 2) return 1;
  fd = open(argv[1], 0);
  if (fd == -1) { perror("open"); goto err; }
  result = getdents(fd, (struct dirent *) buf, BUF_SIZE);
  if (result == -1) { perror("getdents"); goto err; }
  nread = result; offset = 0;
  while (offset < nread) {
    ent = (struct dirent *) &buf[offset];
    printf("d_ino=%ld, d_off=%llu, d_name=%s\n", ent->d_ino, (unsigned long long) ent->d_off, ent->d_name);
    offset += ent->d_reclen;
  }
  return 0;

 err:
  return 1;
}


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [POSSIBLE-BUG] telldir() broken on ext3 dir_index'd directories just after the first entry.
  2004-11-09  6:18 PROBLEM: telldir() broken on large ext3 dir_index'd directories because getdents() gives d_off==0 for the first entry r6144
@ 2004-11-16 18:07 ` r6144
  2004-11-16 18:29   ` Ryan Cumming
  0 siblings, 1 reply; 3+ messages in thread
From: r6144 @ 2004-11-16 18:07 UTC (permalink / raw)
  To: linux-kernel; +Cc: sct

[-- Attachment #1: Type: text/plain, Size: 4247 bytes --]

(I think I had better resend this, for my last email about this a week ago got
absolutely no response.  I apologize for any inconvenience.)

[1.] One line summary of the problem:
telldir() broken on large ext3 dir_index'd directories because
getdents() gives d_off==0 for the first entry

[2.] Full description of the problem/report:
Create a large directory in an ext3 filesystem with dir_index enabled,
and running getdents() on it sometimes gives d_off==0 for the first
entry (usually "."), which seems to be wrong since lseek()'ing to that
offset does not make subsequent getdents() calls skip that entry.
This breaks programs that readdir() the first entry, call telldir() (which
erroneously returns 0) then seekdir() with telldir()'s result, expecting the
next readdir() to return the second entry.

[3.] Keywords (i.e., modules, networking, kernel):
fs,ext3

[4.] Kernel version (from /proc/version):
Linux version 2.6.9-skas3-v7 (root@Default) (gcc version 3.2.2
20030222 (Red Hat Linux 3.2.2-5)) #1 Mon Nov 8 20:07:10 CST 2004

The skas3-v7 patch comes from
http://www.user-mode-linux.org/~blaisorblade/patches/skas3-2.6/,
I don't think it will break things.  Will you please try to reproduce
the problem on vanilla 2.6.9?

[6.] A small shell script or example program which triggers the
    problem (if possible)
1. Bunzip2 the attached e2test.bz2 file, and loop-mount it using ext3
onto e.g. /mnt/loop0
2. Compile test_readdir1.c and test_readdir2.c.
3. Run "./test_readdir1 /mnt/loop0/test4", which should run correctly,
giving results like

249e4711: ..
393238f2: 1
4a8ec49c: 6
...

4. Run "./test_readdir1 /mnt/loop0/test3", which ends up in an infinite loop:

00000000: .
...

5. Run "./test_readdir2 /mnt/loop0/test3":

d_ino=2052, d_off=0, d_name=.
d_ino=2, d_off=800177, d_name=..
d_ino=12, d_off=1177726, d_name=2662
...

The d_off=0 looks wrong.

6. Running e2fsck 1.34 on that filesystem does not reveal problems.

[7.] Environment
Mostly redhat 9.  In particular the glibc is the version in redhat 9,
which is glibc-2.3.2-27.9.

[7.1.] Software (add the output of the ver_linux script here)
Linux Default 2.6.9-skas3-v7 #1 Mon Nov 8 20:07:10 CST 2004 i686 i686
i386 GNU/Linux

Gnu C                  3.2.2
Gnu make               3.79.1
binutils               2.13.90.0.18
util-linux             2.11y
mount                  2.11y
module-init-tools      2.4.25
e2fsprogs              1.34
reiserfsprogs          2002------------->
reiser4progs           line
quota-tools            3.09.
nfs-utils              1.0.5
Linux C Library        2.3.2
Dynamic linker (ldd)   2.3.2
Linux C++ Library      new..
Linux C++ Library      5..
Procps                 2.0.14
Net-tools              1.60
Kbd                    1.06
Sh-utils               4.5.3
Modules Loaded         cramfs loop tun iptable_nat ip_conntrack
ipt_multiport ipt_REJECT iptable_filter ip_tables snd_mixer_oss
snd_intel8x0 snd_ac97_codec snd_pcm snd_timer snd_page_alloc
snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore i830
intel_agp agpgart usb_storage scsi_mod pcspkr w83627hf eeprom
i2c_sensor i2c_isa i2c_i801 i2c_core binfmt_misc parport_pc lp parport
autofs via_rhine 8139too mii crc32 md5 ipv6 microcode nls_cp936 vfat
fat usbhid ehci_hcd uhci_hcd usbcore ext3 jbd

[7.2.] Processor information (from /proc/cpuinfo):
Intel Pentium 4

[7.3.] Module information (from /proc/modules):
[7.4.] Loaded driver and hardware information (/proc/ioports, /proc/iomem)
[7.5.] PCI information ('lspci -vvv' as root)
[7.6.] SCSI information (from /proc/scsi/scsi)
[7.7.] Other information that might be relevant to the problem
      (please look in /proc and include all information that you
      think to be relevant):
[X.] Other notes, patches, fixes, workarounds:

This is the bug that causes user-mode-linux to lock up when reading
large hostfs directorys on ext3.

Workaround: in test_readdir1.c, don't seekdir() if the offset is zero.

Please CC me since I'm not subscribed.

--
"Everybody you meet will love you as long as you live." said the
Story Girl.  "There that's the very nicest fortune I can tell you,
and it will come true whether the others do or not, and now we
must go in."
                     --- L.M.Montgomery, "The Golden Road", Ch.30

[-- Attachment #2: e2test.bz2 --]
[-- Type: application/x-bzip2, Size: 36210 bytes --]

[-- Attachment #3: test_readdir1.c --]
[-- Type: text/x-c, Size: 551 bytes --]

#include <sys/types.h>
#include <errno.h>
#include <stdio.h>
#include <dirent.h>

int main(int argc, char **argv)
{
  DIR *dir;
  struct dirent *ent;
  off_t offset = 0;
  
  if (argc < 2) return 1;
  dir = opendir(argv[1]);
  if (dir == NULL) { perror("opendir"); goto err; }
  while (1) {
    seekdir(dir, offset); /* Would be okay if "if(offset)..." is used */
    ent = readdir(dir);
    if (ent == NULL) break;
    offset = telldir(dir);
    printf("%08llx: %s\n", (unsigned long long) offset, ent->d_name);
  }
  return 0;
 err:
  return 1;
}



[-- Attachment #4: test_readdir2.c --]
[-- Type: text/x-c, Size: 891 bytes --]

#include <stdio.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <linux/types.h>
#include <linux/dirent.h>
#include <linux/unistd.h>

_syscall3(int, getdents, uint, fd, struct dirent *, dirp, uint, count);

#define BUF_SIZE 1024

int main(int argc, char **argv)
{
  int fd;
  char buf[BUF_SIZE];
  int result;
  unsigned nread, offset;
  struct dirent *ent;
  
  if (argc < 2) return 1;
  fd = open(argv[1], 0);
  if (fd == -1) { perror("open"); goto err; }
  result = getdents(fd, (struct dirent *) buf, BUF_SIZE);
  if (result == -1) { perror("getdents"); goto err; }
  nread = result; offset = 0;
  while (offset < nread) {
    ent = (struct dirent *) &buf[offset];
    printf("d_ino=%ld, d_off=%llu, d_name=%s\n", ent->d_ino, (unsigned long long) ent->d_off, ent->d_name);
    offset += ent->d_reclen;
  }
  return 0;

 err:
  return 1;
}



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [POSSIBLE-BUG] telldir() broken on ext3 dir_index'd directories just after the first entry.
  2004-11-16 18:07 ` [POSSIBLE-BUG] telldir() broken on ext3 dir_index'd directories just after " r6144
@ 2004-11-16 18:29   ` Ryan Cumming
  0 siblings, 0 replies; 3+ messages in thread
From: Ryan Cumming @ 2004-11-16 18:29 UTC (permalink / raw)
  To: r6144; +Cc: linux-kernel, sct

[-- Attachment #1: Type: text/plain, Size: 266 bytes --]

On Tuesday 16 November 2004 10:07, r6144 wrote:
> (I think I had better resend this, for my last email about this a week ago
> got absolutely no response.  I apologize for any inconvenience.)
ext2-devel at lists.sourceforge.net might be interested too.

-Ryan

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2004-11-16 18:33 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-11-09  6:18 PROBLEM: telldir() broken on large ext3 dir_index'd directories because getdents() gives d_off==0 for the first entry r6144
2004-11-16 18:07 ` [POSSIBLE-BUG] telldir() broken on ext3 dir_index'd directories just after " r6144
2004-11-16 18:29   ` Ryan Cumming

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox