public inbox for linux-kernel@vger.kernel.org
* case-duplicated filenames
@ 2009-08-11  2:05 Tony Mantler
  2009-08-11  2:22 ` David Newall
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Tony Mantler @ 2009-08-11  2:05 UTC (permalink / raw)
  To: linux-kernel

A co-worker reminded me this afternoon that the linux source tree does not
get along well with case-insensitive filesystems. Ever the curious and
bored sort, and in the middle of an sqlite bender, I decided to cook up a
quick script to find exactly what files those are.

The results (in lowercase) are as follows:
linux-source-2.6.30/include/linux/netfilter/xt_mark.h
linux-source-2.6.30/include/linux/netfilter/xt_connmark.h
linux-source-2.6.30/include/linux/netfilter/xt_dscp.h
linux-source-2.6.30/include/linux/netfilter/xt_rateest.h
linux-source-2.6.30/include/linux/netfilter/xt_tcpmss.h
linux-source-2.6.30/include/linux/netfilter_ipv6/ip6t_mark.h
linux-source-2.6.30/include/linux/netfilter_ipv6/ip6t_hl.h
linux-source-2.6.30/include/linux/netfilter_ipv4/ipt_ttl.h
linux-source-2.6.30/include/linux/netfilter_ipv4/ipt_tos.h
linux-source-2.6.30/include/linux/netfilter_ipv4/ipt_ecn.h
linux-source-2.6.30/include/linux/netfilter_ipv4/ipt_connmark.h
linux-source-2.6.30/include/linux/netfilter_ipv4/ipt_dscp.h
linux-source-2.6.30/include/linux/netfilter_ipv4/ipt_tcpmss.h
linux-source-2.6.30/include/linux/netfilter_ipv4/ipt_mark.h
linux-source-2.6.30/net/netfilter/xt_tcpmss.c
linux-source-2.6.30/net/netfilter/xt_connmark.c
linux-source-2.6.30/net/netfilter/xt_mark.c
linux-source-2.6.30/net/netfilter/xt_dscp.c
linux-source-2.6.30/net/netfilter/xt_rateest.c
linux-source-2.6.30/net/netfilter/xt_hl.c
linux-source-2.6.30/net/ipv4/netfilter/ipt_ecn.c
linux-source-2.6.30/documentation/io-mapping.txt

And the script, if you'd like to run it yourself, is as follows (hereby
released GPL2):
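#!/bin/bash

#set -x

# Locate sqlite3 and stop early if it is not installed.
SQLITE3=$(command -v sqlite3) || { echo "sqlite3 not found" >&2; exit 1; }
DBNAME=filenames.db
SQLFILE=filenames.sql

DIRECTORY=linux-source-2.6.30

"$SQLITE3" "$DBNAME" "CREATE TABLE filenames ( id INTEGER PRIMARY KEY, name TEXT );"

# Build a single transaction with one INSERT per lowercased path.
echo "PRAGMA synchronous=OFF;" >"$SQLFILE"
echo "BEGIN TRANSACTION;" >>"$SQLFILE"
# Double any single quote so each path is a valid SQL string literal.
find "$DIRECTORY" | tr '[:upper:]' '[:lower:]' | sed -e "s/'/''/g" -e "s/.*/INSERT INTO filenames (name) VALUES ('&');/" >>"$SQLFILE"
echo "COMMIT;" >>"$SQLFILE"

"$SQLITE3" "$DBNAME" <"$SQLFILE"

# A lowercased name stored under two different ids is a case collision.
"$SQLITE3" "$DBNAME" "SELECT a.name FROM filenames a, filenames b WHERE a.name = b.name AND a.id < b.id;"

rm "$SQLFILE"
rm "$DBNAME"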



* Re: case-duplicated filenames
  2009-08-11  2:05 case-duplicated filenames Tony Mantler
@ 2009-08-11  2:22 ` David Newall
  2009-08-11  2:26   ` Tony Mantler
  2009-08-11  2:35 ` Måns Rullgård
  2009-08-11  3:24 ` Charles Johnston
  2 siblings, 1 reply; 8+ messages in thread
From: David Newall @ 2009-08-11  2:22 UTC (permalink / raw)
  To: Tony Mantler; +Cc: linux-kernel

Tony Mantler wrote:
> A co-worker reminded me this afternoon that the linux source tree does not
> get along well with case-insensitive filesystems. Ever the curious and
> bored sort, and in the middle of an sqlite bender, I decided to cook up a
> quick script to find exactly what files those are.
>   

Such an extravagant process: you're just being silly.  There appears to
be a line missing:

find $DIRECTORY | sed 's/^\(.*\)/INSERT INTO filenames (name) VALUES ("\1");/' >>$SQLFILE



* Re: case-duplicated filenames
  2009-08-11  2:22 ` David Newall
@ 2009-08-11  2:26   ` Tony Mantler
  2009-08-11  9:43     ` Alan Jenkins
  2009-08-11 14:15     ` David Newall
  0 siblings, 2 replies; 8+ messages in thread
From: Tony Mantler @ 2009-08-11  2:26 UTC (permalink / raw)
  To: David Newall; +Cc: linux-kernel

David Newall wrote:
> Tony Mantler wrote:
>> A co-worker reminded me this afternoon that the linux source tree does not
>> get along well with case-insensitive filesystems. Ever the curious and
>> bored sort, and in the middle of an sqlite bender, I decided to cook up a
>> quick script to find exactly what files those are.
>>   
> 
> Such an extravagant process: you're just being silly.

Well I did say I was bored. ;)


> There appears to be a line missing:
> 
> find $DIRECTORY | sed 's/^\(.*\)/INSERT INTO filenames (name) VALUES ("\1");/' >>$SQLFILE

?



* Re: case-duplicated filenames
  2009-08-11  2:05 case-duplicated filenames Tony Mantler
  2009-08-11  2:22 ` David Newall
@ 2009-08-11  2:35 ` Måns Rullgård
  2009-08-11  3:24 ` Charles Johnston
  2 siblings, 0 replies; 8+ messages in thread
From: Måns Rullgård @ 2009-08-11  2:35 UTC (permalink / raw)
  To: linux-kernel

"Tony Mantler" <nicoya@ubb.ca> writes:

> A co-worker reminded me this afternoon that the linux source tree does not
> get along well with case-insensitive filesystems. Ever the curious and
> bored sort, and in the middle of an sqlite bender, I decided to cook up a
> quick script to find exactly what files those are.
>
> The results (in lowercase) are as follows:
> [22 files]
>
> And the script, if you'd like to run it yourself, is as follows (hereby
> released GPL2):
> [SQL]

Everybody knows the shell is the hacker's best friend:

$ git ls-files | tr A-Z a-z | sort | uniq -d
documentation/io-mapping.txt
include/linux/netfilter/xt_connmark.h
include/linux/netfilter/xt_dscp.h
include/linux/netfilter/xt_mark.h
include/linux/netfilter/xt_rateest.h
include/linux/netfilter/xt_tcpmss.h
include/linux/netfilter_ipv4/ipt_connmark.h
include/linux/netfilter_ipv4/ipt_dscp.h
include/linux/netfilter_ipv4/ipt_ecn.h
include/linux/netfilter_ipv4/ipt_mark.h
include/linux/netfilter_ipv4/ipt_tcpmss.h
include/linux/netfilter_ipv4/ipt_tos.h
include/linux/netfilter_ipv4/ipt_ttl.h
include/linux/netfilter_ipv6/ip6t_hl.h
include/linux/netfilter_ipv6/ip6t_mark.h
net/ipv4/netfilter/ipt_ecn.c
net/netfilter/xt_connmark.c
net/netfilter/xt_dscp.c
net/netfilter/xt_hl.c
net/netfilter/xt_mark.c
net/netfilter/xt_rateest.c
net/netfilter/xt_tcpmss.c

That runs in about 0.03 seconds here.

-- 
Måns Rullgård
mans@mansr.com



* Re: case-duplicated filenames
  2009-08-11  2:05 case-duplicated filenames Tony Mantler
  2009-08-11  2:22 ` David Newall
  2009-08-11  2:35 ` Måns Rullgård
@ 2009-08-11  3:24 ` Charles Johnston
  2 siblings, 0 replies; 8+ messages in thread
From: Charles Johnston @ 2009-08-11  3:24 UTC (permalink / raw)
  To: linux-kernel

Tony Mantler wrote:
> A co-worker reminded me this afternoon that the linux source tree does not
> get along well with case-insensitive filesystems. Ever the curious and
> bored sort, and in the middle of an sqlite bender, I decided to cook up a
> quick script to find exactly what files those are.
> 

This one-liner does it too:

find linux-2.6.30.4 |tr '[:upper:]' '[:lower:]' | sort | uniq -d
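
The pipelines in this thread print the lowercased path, so the original spellings of each colliding pair are lost. A minimal sketch that keeps them, assuming a POSIX awk and paths without embedded newlines; the function name case_collisions is hypothetical and does not come from any message here:

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read one path per line on
# stdin and print, in original case, every path whose lowercased form is
# shared with at least one other path.
case_collisions() {
    LC_ALL=C awk '
        {
            key = tolower($0)      # ASCII-only folding under LC_ALL=C
            count[key]++
            # Collect the original spellings for this key, one per line.
            names[key] = (key in names) ? names[key] "\n" $0 : $0
        }
        END {
            for (key in count)
                if (count[key] > 1)
                    print names[key]
        }
    ' | LC_ALL=C sort -f           # -f keeps case variants next to each other
}

# Usage (the directory name is only an example):
#   find linux-2.6.30.4 | case_collisions
```

Folding is limited to ASCII letters here, which is enough for the kernel tree but would miss case pairs in non-ASCII filenames.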


* Re: case-duplicated filenames
  2009-08-11  2:26   ` Tony Mantler
@ 2009-08-11  9:43     ` Alan Jenkins
  2009-08-11 12:38       ` Måns Rullgård
  2009-08-11 14:15     ` David Newall
  1 sibling, 1 reply; 8+ messages in thread
From: Alan Jenkins @ 2009-08-11  9:43 UTC (permalink / raw)
  To: Tony Mantler; +Cc: David Newall, linux-kernel

On 8/11/09, Tony Mantler <nicoya@ubb.ca> wrote:
> David Newall wrote:
>> Tony Mantler wrote:
>>> A co-worker reminded me this afternoon that the linux source tree does
>>> not
>>> get along well with case-insensitive filesystems. Ever the curious and
>>> bored sort, and in the middle of an sqlite bender, I decided to cook up a
>>> quick script to find exactly what files those are.
>>>
>>
>> Such an extravagant process: you're just being silly.
>
> Well I did say I was bored. ;)

Performance aside, it's a one-liner in shell* ;).

find | sort | uniq -ic | grep -v " 1 "

Alan

[*]  Posix compliance not guaranteed.  I don't know how standard uniq -i is.


* Re: case-duplicated filenames
  2009-08-11  9:43     ` Alan Jenkins
@ 2009-08-11 12:38       ` Måns Rullgård
  0 siblings, 0 replies; 8+ messages in thread
From: Måns Rullgård @ 2009-08-11 12:38 UTC (permalink / raw)
  To: linux-kernel

Alan Jenkins <sourcejedi.lkml@googlemail.com> writes:

> On 8/11/09, Tony Mantler <nicoya@ubb.ca> wrote:
>> David Newall wrote:
>>> Tony Mantler wrote:
>>>> A co-worker reminded me this afternoon that the linux source tree does
>>>> not
>>>> get along well with case-insensitive filesystems. Ever the curious and
>>>> bored sort, and in the middle of an sqlite bender, I decided to cook up a
>>>> quick script to find exactly what files those are.
>>>>
>>>
>>> Such an extravagant process: you're just being silly.
>>
>> Well I did say I was bored. ;)
>
> Performance aside, it's a one-liner in shell* ;).
>
> find | sort | uniq -ic | grep -v " 1 "
>
> [*]  Posix compliance not guaranteed.  I don't know how standard uniq -i is.

It's not standard.  You're also assuming case-insensitive sort.  Using
sort -f increases the chances of this doing what you want, but it
still depends on the locale settings.
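
One way to sidestep the locale question is to pin every tool in the pipeline to the C locale and fold case explicitly with tr, so neither uniq -i nor a case-insensitive sort is needed. A rough sketch, assuming only POSIX tr, sort and uniq; the function name lower_dups is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read one path per line on
# stdin and print each lowercased path that occurs more than once.
# LC_ALL=C forces byte-wise ordering and ASCII-only case folding, so the
# result does not change with the caller's locale settings.
lower_dups() {
    LC_ALL=C tr 'A-Z' 'a-z' | LC_ALL=C sort | LC_ALL=C uniq -d
}

# Usage, from the top of the source tree:
#   find . | lower_dups
```

The trade-off is that only ASCII letters are folded, so names differing in the case of a non-ASCII character are not reported.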

-- 
Måns Rullgård
mans@mansr.com



* Re: case-duplicated filenames
  2009-08-11  2:26   ` Tony Mantler
  2009-08-11  9:43     ` Alan Jenkins
@ 2009-08-11 14:15     ` David Newall
  1 sibling, 0 replies; 8+ messages in thread
From: David Newall @ 2009-08-11 14:15 UTC (permalink / raw)
  To: Tony Mantler; +Cc: linux-kernel

Tony Mantler wrote:
> David Newall wrote:
>> There appears to be a line missing:
>>
>> find $DIRECTORY | sed 's/^\(.*\)/INSERT INTO filenames (name) VALUES
>> ("\1");/' >>$SQLFILE
>
> ?


Sorry.  I just now realised what it does.

