public inbox for linux-newbie@vger.kernel.org
* convert windows file names
@ 2005-04-15 15:04 James Miller
  2005-04-15 15:15 ` Flemming Greve Skovengaard
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: James Miller @ 2005-04-15 15:04 UTC
  To: linux-newbie


Among various frustrations recently I've had the gratifying success of 
learning how to use streamripper to augment my music collection. 
Streamripper is a program that writes an audio stream (e.g., from internet 
radio) to your hard drive as an mp3 file. This is about the closest thing 
to the mythical "Rivo" (Tivo for radio) that currently exists, I think, 
and could maybe serve as the basis for a *real* Rivo-type program, should 
someone really decide to develop one.

Despite the success, there are some problems--mainly having to do with 
file names. I've found a nice commercial-free classical (Baroque) station 
and have been happily recording away for the last 24 hrs or so. The 
streamripper program was evidently written for rock or more popular genres 
and tries to detect breaks between songs so as to make discrete files from 
them. For whatever strange reason, it has a problem detecting beginnings 
and endings between movements in classical music (despite the noticeable 
pause) and wants to break between movements about 30 seconds into the next 
movement, rather than at the pause. The cat command seems to fix this, 
though:

cat movement1.mp3 >full-piece.mp3
cat movement2.mp3 >>full-piece.mp3
cat movement3.mp3 >>full-piece.mp3

The breaks at 30 seconds into the following movement are hardly even 
noticeable in the full-piece.mp3 (I don't have the kind of purist 
standards I used to when it comes to audio quality, though).
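
The three commands above can also be collapsed into one, since cat writes all of its arguments out in the order given. A minimal sketch, assuming the movement files are passed in playing order; join_movements is a hypothetical helper name, not something streamripper provides:

```shell
# join_movements OUTPUT INPUT...   (hypothetical helper)
# Concatenates the INPUT files, in the order given, into OUTPUT.
join_movements() {
    out=$1
    shift
    # "$@" hands every file name to cat intact, spaces and ampersands included.
    cat -- "$@" > "$out"
}

# Example:
#   join_movements full-piece.mp3 movement1.mp3 movement2.mp3 movement3.mp3
```

Quoting the names (or passing them through "$@") is what spares the backslash escaping; the names themselves do not have to change for this to work.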

But, on to file names. Unfortunately, the names for the pieces I'm 
recording from this station follow Windows long-file-naming conventions. 
Even worse, the names tend to be quite complex and long. Here are a couple 
of examples:

Anton\ Reicha-\ Albert\ Schweitzer\ Quintett\ -\ Wind\ Quintet\ No.9\ 
in\ D\ major\ Op.91\ No.3-\ Finale-\ Allegretto.mp3

Patrick\ Cohen\ \&\ Mosaiques\ Quartet\ -\ Quintet\ For\ Piano\ \&\ 
Strings\ In\ D\ Major\,\ Op.56£¯5\,\ G411\ -¥².\ Andante\ Come\ 
Prima.mp3

Feeding those names to cat so I can join the movements into a single file 
is going to be a major pain in the wazoo, as they say down at symphony 
hall. What I was hoping to find is a script that would automatically 
convert all the weird characters into more standard Unix file-naming 
characters. But so far I've come up empty-handed. Can anyone point me to 
some utility that might do what I need?

As a last resort, I might try to write my own script. I'm not too hot on 
doing that though, since I'm at an extremely rudimentary level when it 
comes to script writing. If it comes to that, could someone maybe help me 
get started by giving an example for a script that would do the renaming I 
want? I'd like to retain the bulk of the information, though I don't mind 
truncating words at, say 5 letters. I suppose the main thing would be 
replacing all the spaces and/or punctuation with dashes and/or underscores.

Thanks, James


* Re: convert windows file names
  2005-04-15 15:04 convert windows file names James Miller
@ 2005-04-15 15:15 ` Flemming Greve Skovengaard
  2005-04-15 16:37   ` James Miller
  2005-04-15 15:39 ` Peter
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Flemming Greve Skovengaard @ 2005-04-15 15:15 UTC
  To: linux-newbie; +Cc: James Miller

James Miller wrote:
> Among various frustrations recently I've had the gratifying success of 
> learning how to use streamripper to augment my music collection. 
> Streamripper is a program that writes an audio stream (e.g., from 
> internet radio) to your hard drive as an mp3 file. This is about the 
> closest thing to the mythical "Rivo" (Tivo for radio) that currently 
> exists, I think, and could maybe serve as the basis for a *real* 
> Rivo-type program, should someone really decide to develop one.
> 
> Despite the success, there are some problems--mainly having to do with 
> file names. I've found a nice commercial-free classical (Baroque) 
> station and have been happily recording away for the last 24 hrs or so. 
> The streamripper program was evidently written for rock or more popular 
> genres and tries to detect breaks between songs so as to make discrete 
> files from them. For whatever strange reason, it has a problem detecting 
> beginnings and endings between movements in classical music (despite the 
> noticeable pause) and wants to break between movements about 30 seconds 
> into the next movement, rather than at the pause. The cat command seems 
> to fix this, though:
> 
> cat movement1.mp3 >full-piece.mp3
> cat movement2.mp3 >>full-piece.mp3
> cat movement3.mp3 >>full-piece.mp3
> 
> The breaks at 30 seconds into the following movement are hardly even 
> noticeable in the full-piece.mp3 (I don't have the kind of purist 
> standards I used to when it comes to audio quality, though).
> 
> But, on to file names. Unfortunately, the names for the pieces I'm 
> recording from this station follow Windows long-file-naming conventions. 
> Even worse, the names tend to be quite complex and long. Here are a 
> couple of examples:
> 
> Anton\ Reicha-\ Albert\ Schweitzer\ Quintett\ -\ Wind\ Quintet\ No.9\ 
> in\ D\ major\ Op.91\ No.3-\ Finale-\ Allegretto.mp3
> 
> Patrick\ Cohen\ \&\ Mosaiques\ Quartet\ -\ Quintet\ For\ Piano\ \&\ 
> Strings\ In\ D\ Major\,\ Op.56£¯5\,\ G411\ -¥².\ Andante\ Come\ Prima.mp3
> 
> Feeding those names to cat so I can join the movements into a single 
> file is going to be a major pain in the wazoo, as they say down at 
> symphony hall. What I was hoping to find is a script that would 
> automatically convert all the weird characters into more standard Unix 
> file-naming characters. But so far I've come up empty-handed. Can anyone 
> point me to some utility that might do what I need?
> 
> As a last resort, I might try to write my own script. I'm not too hot on 
> doing that though, since I'm at an extremely rudimentary level when it 
> comes to script writing. If it comes to that, could someone maybe help 
> me get started by giving an example for a script that would do the 
> renaming I want? I'd like to retain the bulk of the information, though 
> I don't mind truncating words at, say 5 letters. I suppose the main 
> thing would be replacing all the spaces and/or punctuation with dashes 
> and/or underscores.
> 
> Thanks, James

You can use my little perl script for that.

#!/usr/bin/perl


# remove_invalid - Removes invalid characters from filenames.
# Copyright (C) 2004  Flemming Greve Skovengaard
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.


# File:					remove_invalid
# Version:				0.4.6
# Date (YYYY-MM-DD):	2004-07-24
# Author:				Flemming Greve Skovengaard
# Contact:				dsl58893@vip.cybercity.dk

## Version 0.1.0	
## Date: 2004-04-15
##		Replaces spaces with underscores.
##
## Version 0.2.0
## Date: 2004-05-13
##		Replaces !, @, $, & (, ), {, }, [, ], <, >, ' and ".
##
## Version 0.3.0
## Date 2004-05-14
##		Removes any leading - (minus/dash).
##
## Version 0.4.0
## Date: 2004-05-15
##		Added option 'verbose' and 'help'.
##		Added 'Files renamed: x'.
##
## Version 0.4.1
## Date: 2004-05-15
##		Added option 'version'.
##
## Version 0.4.2
## Date: 2004-05-15
##		Removes ,'s (comma).
##
## Version 0.4.3
## Date: 2004-06-29
##		Uses File::Basename to get basename if --help
##
## Version 0.4.4
## Date: 2004-07-23
##		Simplified substitute procedure.
##
## Version 0.4.5
## Date: 2004-07-23
##		Now removes ':' and ';'.
##
## Version 0.4.6
## Date: 2004-08-03
##		Correctly removes '!' and '$'.

## Removes all invalid characters in filenames in the current directory.

use strict;
use warnings;
use Getopt::Long;
use File::Basename qw/ basename /;

Getopt::Long::Configure("gnu_getopt");

my ($verbose, $help, $version);
my $current_version = "0.4.6";    # REMEMBER TO UPDATE.
my $dir = '.';
my $num_renamed = 0;

GetOptions('v|verbose' => \$verbose,
		   'help' => \$help,
		   'V|version' => \$version,
);

if ($help) {
	print "Version: $current_version\n";
	print "Usage: ", basename($0), " [-v|--verbose]\n";
	exit 0;
}

if ($version) {
	print "File:\t\tremove_invalid.pl\n";
	print "Version:\t$current_version\n";
	print "Written by Flemming Greve Skovengaard.\n";
	exit 0;
}

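sub rename_file {
	my ($old, $new) = @_;

	# Return 0 on success and 1 on failure, so that the caller's
	# $rename_failed flag keeps failed renames out of the final count.
	if (rename $old, $new) {
		return 0;
	}
	warn "Could not rename '$old' to '$new': $!\n";
	return 1;
}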

opendir DH, $dir or die "Cannot opendir '$dir': $!\n";

foreach my $file (sort readdir DH) {
	my $new_name = $file;
	my $rename_failed = 1;
	
	if ($new_name =~ m/(^[-+]|[ (){},'":;<>\!\$\&\@\[\]|])/) {
		$new_name =~ s/^[-+]//;
		$new_name =~ s/ /_/g;
		$new_name =~ s/,/./g;
		$new_name =~ s/\@/_at_/g;
		$new_name =~ s/\&/_and_/g;
		$new_name =~ s/['":;\$\!]//g;
		$new_name =~ s/[({<]/_ld_/g;
		$new_name =~ s/[)}>]/_rd_/g;
		$new_name =~ s/\[/_ld_/g;
		$new_name =~ s/\]/_rd_/g;
		
		# The rename happens either way; --verbose only adds the report.
		print "'$file' => '$new_name'\n" if $verbose;
		$rename_failed = rename_file($file, $new_name);
		++$num_renamed unless $rename_failed;
	}
}

print "Files renamed: $num_renamed\n";
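
For comparison, the substitution chain in the script can be sketched in plain shell with sed. This is only an illustration of the same mapping, not a replacement for the script; sanitize_name is a hypothetical helper name, and it only prints the new name rather than renaming anything:

```shell
# sanitize_name NAME   (hypothetical helper)
# Prints NAME after the same substitutions the Perl script applies, in order.
sanitize_name() {
    printf '%s\n' "$1" | sed \
        -e 's/^[-+]//' \
        -e 's/ /_/g' \
        -e 's/,/./g' \
        -e 's/@/_at_/g' \
        -e 's/&/_and_/g' \
        -e 's/[":;$!]//g' \
        -e "s/'//g" \
        -e 's/[({<[]/_ld_/g' \
        -e 's/[])}>]/_rd_/g'
}

# Example:
#   mv -- "$f" "$(sanitize_name "$f")"
```

As with the script itself, nothing here guards against two different names collapsing to the same result, so checking for an existing target before the mv would be a sensible addition.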

-- 
Flemming Greve Skovengaard           FAITH, n.
a.k.a Greven, TuxPower                   Belief without evidence in what is told
<dsl58893@vip.cybercity.dk>              by one who speaks without knowledge,
4112.38 BogoMIPS                         of things without parallel.

-
To unsubscribe from this list: send the line "unsubscribe linux-newbie" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.linux-learn.org/faqs


* Re: convert windows file names
  2005-04-15 15:04 convert windows file names James Miller
  2005-04-15 15:15 ` Flemming Greve Skovengaard
@ 2005-04-15 15:39 ` Peter
  2005-04-15 16:48 ` Ray Olszewski
  2005-04-15 18:00 ` chuck gelm
  3 siblings, 0 replies; 10+ messages in thread
From: Peter @ 2005-04-15 15:39 UTC
  To: James Miller; +Cc: linux-newbie

On Fri, 2005-04-15 at 10:04 -0500, James Miller wrote:
> Among various frustrations recently I've had the gratifying success of 
> learning how to use streamripper to augment my music collection. 
> Streamripper is a program that writes an audio stream (e.g., from internet 
> radio) to your hard drive as an mp3 file. This is about the closest thing 
> to the mythical "Rivo" (Tivo for radio) that currently exists, I think, 
> and could maybe serve as the basis for a *real* Rivo-type program, should 
> someone really decide to develop one.
> 
> Despite the success, there are some problems--mainly having to do with 
> file names. I've found a nice commercial-free classical (Baroque) station 
> and have been happily recording away for the last 24 hrs or so. The 
> streamripper program was evidently written for rock or more popular genres 
> and tries to detect breaks between songs so as to make discrete files from 
> them. For whatever strange reason, it has a problem detecting beginnings 
> and endings between movements in classical music (despite the noticeable 
> pause) and wants to break between movements about 30 seconds into the next 
> movement, rather than at the pause. The cat command seems to fix this, 
> though:
> 
> cat movement1.mp3 >full-piece.mp3
> cat movement2.mp3 >>full-piece.mp3
> cat movement3.mp3 >>full-piece.mp3
> 
> The breaks at 30 seconds into the following movement are hardly even 
> noticeable in the full-piece.mp3 (I don't have the kind of purist 
> standards I used to when it comes to audio quality, though).
> 
> But, on to file names. Unfortunately, the names for the pieces I'm 
> recording from this station follow Windows long-file-naming conventions. 
> Even worse, the names tend to be quite complex and long. Here are a couple 
> of examples:
> 
> Anton\ Reicha-\ Albert\ Schweitzer\ Quintett\ -\ Wind\ Quintet\ No.9\ 
> in\ D\ major\ Op.91\ No.3-\ Finale-\ Allegretto.mp3
> 
> Patrick\ Cohen\ \&\ Mosaiques\ Quartet\ -\ Quintet\ For\ Piano\ \&\ 
> Strings\ In\ D\ Major\,\ Op.56£¯5\,\ G411\ -¥².\ Andante\ Come\ 
> Prima.mp3
> 
> Feeding those names to cat so I can join the movements into a single file 
> is going to be a major pain in the wazoo, as they say down at symphony 
> hall. What I was hoping to find is a script that would automatically 
> convert all the weird characters into more standard Unix file-naming 
> characters. But so far I've come up empty-handed. Can anyone point me to 
> some utility that might do what I need?
> 
> As a last resort, I might try to write my own script. I'm not too hot on 
> doing that though, since I'm at an extremely rudimentary level when it 
> comes to script writing. If it comes to that, could someone maybe help me 
> get started by giving an example for a script that would do the renaming I 
> want? I'd like to retain the bulk of the information, though I don't mind 
> truncating words at, say 5 letters. I suppose the main thing would be 
> replacing all the spaces and/or punctuation with dashes and/or underscores.
> 
> Thanks, James

I just tried this with some mp3 tracks I have here:

cat Really\ Annoying\ File <hit tab key> Second Ultra\ Irritating\ File
<hit tab key>  > testfile.mpg

Seems to work and saves a lot of typing. Of course if you have a large
number of such files even this could get to be a pain, I guess.

By the way, since you like Streamripper, have a look at Streamtuner:

http://www.nongnu.org/streamtuner/

It integrates with Streamripper for click-the-button recording of
streams. Very nice app - the latest version includes a number of sources
as indices, including Xiph, Shoutcast and Live365 as well as a complete
index of your local files, each of these categories having its own tab.

Peter


* Re: convert windows file names
  2005-04-15 15:15 ` Flemming Greve Skovengaard
@ 2005-04-15 16:37   ` James Miller
  2005-04-15 17:16     ` J.
  2005-04-15 18:17     ` Flemming Greve Skovengaard
  0 siblings, 2 replies; 10+ messages in thread
From: James Miller @ 2005-04-15 16:37 UTC
  To: linux-newbie


I tried your perl script and it works really well, Flemming. Thanks for 
bringing it to my attention. I see it works for all files in a given 
directory--exactly what I need. Now, in place of something like

Patrick\ Cohen\ \&\ Mosaiques\ Quartet\ -\ Quintet\ For\ Piano\ \&\ Strings\ In\ D\ Major\,\ Op.56£¯5\,\ G411\ -¥².\ Andante\ Come\ Prima.mp3

I get

Patrick_Cohen__and__Mosaiques_Quartet_-_Quintet_For_Piano__and__Strings_In_D_Major._Op.56£¯5._G411_-¥²._Andante_Come_Prima.mp3

--a big step in the right direction. But I'm still getting some weird 
characters in there--£¯ and ¥². Are these Unicode or something? Anyway, I 
can't reproduce these at the command line. Is there any way your script 
might be made to catch and replace symbols like these as well (I mean, for 
someone who knows absolutely nothing about Perl, and precious little about 
scripting in general)? I have no idea what information these symbols are 
supposed to be representing. It's probably so inconsequential I don't even 
need it, so replacing it with virtually any other symbol should suffice. 
I'd say I've got at least 20 files with such symbols, and more are on the 
way.

Thanks, James


* Re: convert windows file names
  2005-04-15 15:04 convert windows file names James Miller
  2005-04-15 15:15 ` Flemming Greve Skovengaard
  2005-04-15 15:39 ` Peter
@ 2005-04-15 16:48 ` Ray Olszewski
  2005-04-15 18:49   ` James Miller
  2005-04-15 18:00 ` chuck gelm
  3 siblings, 1 reply; 10+ messages in thread
From: Ray Olszewski @ 2005-04-15 16:48 UTC
  To: linux-newbie

At 10:04 AM 4/15/2005 -0500, James Miller wrote:
>Among various frustrations recently I've had the gratifying success of 
>learning how to use streamripper to augment my music collection. 
>Streamripper is a program that writes an audio stream (e.g., from internet 
>radio) to your hard drive as an mp3 file. This is about the closest thing 
>to the mythical "Rivo" (Tivo for radio) that currently exists, I think, 
>and could maybe serve as the basis for a *real* Rivo-type program, should 
>someone really decide to develop one.

From looking at the streamripper writeup in the Debian packaging system, 
I'd say this is less likely than you think. It says the app is limited to 
MP3 and OGG streams, and these seem to be considerably less popular, with 
streaming sources, than WMA, RM, and Shoutcast.

>Despite the success, there are some problems--mainly having to do with 
>file names. I've found a nice commercial-free classical (Baroque) station 
>and have been happily recording away for the last 24 hrs or so. The 
>streamripper program was evidently written for rock or more popular genres 
>and tries to detect breaks between songs so as to make discrete files from 
>them. For whatever strange reason, it has a problem detecting beginnings 
>and endings between movements in classical music (despite the noticeable 
>pause) and wants to break between movements about 30 seconds into the next 
>movement, rather than at the pause. The cat command seems to fix this, though:
>
>cat movement1.mp3 >full-piece.mp3
>cat movement2.mp3 >>full-piece.mp3
>cat movement3.mp3 >>full-piece.mp3

>The breaks at 30 seconds into the following movement are hardly even 
>noticeable in the full-piece.mp3 (I don't have the kind of purist 
>standards I used to when it comes to audio quality, though).
>
>But, on to file names. Unfortunately, the names for the pieces I'm 
>recording from this station follow Windows long-file-naming conventions. 
>Even worse, the names tend to be quite complex and long. Here are a couple 
>of examples:
>
>Anton\ Reicha-\ Albert\ Schweitzer\ Quintett\ -\ Wind\ Quintet\ No.9\ in\ 
>D\ major\ Op.91\ No.3-\ Finale-\ Allegretto.mp3
>
>Patrick\ Cohen\ \&\ Mosaiques\ Quartet\ -\ Quintet\ For\ Piano\ \&\ 
>Strings\ In\ D\ Major\,\ Op.56£¯5\,\ G411\ -¥².\ Andante\ Come\ Prima.mp3
>
>Feeding those names to cat so I can join the movements into a single file 
>is going to be a major pain in the wazoo, as they say down at symphony hall.

Yeah. This is the sort of setting where point-and-click GUIs really show 
their strength. Windows copes with this sort of gobbledygook just fine, and 
I imagine the heavy-duty Linux GUIs, like KDE and Gnome, do too.

>What I was hoping to find is a script that would automatically convert all 
>the weird characters into more standard Unix file-naming characters. But 
>so far I've come up empty-handed. Can anyone point me to some utility that 
>might do what I need?

OK. A couple of things to try ... one candidate app plus 2 workaround 
possibilities.

1. See if the app "mp3wrap" (that's the Debian-Sid package name) helps in 
any way relevant to your problem.

2. Consider using playlists rather than joining the files. If you use xmms 
for playback, it has a decent GUI for creating playlists, and it has no 
problem with the "weird" characters. (This naming standard is also a 
problem for those of us who rip our own CDs to our hard drives as well, and 
xmms plus playlists has always handled it for me.)

3. TAB completion is your friend. Using your first example, I bet something 
close to "Anton<TAB>" would complete that entry on the command line. From 
the examples you provided, I can't back out where the movement 
differentiators are in the names, but such things are typically at the end 
of the name, so you might have to enter just a couple of additional 
characters to complete.
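
For option 2, a playlist can also be written from the command line. A minimal sketch, assuming the player accepts a plain .m3u file (one file name per line) and that the names are given in playing order; make_playlist is a hypothetical helper name:

```shell
# make_playlist OUTPUT.m3u FILE...   (hypothetical helper)
# Writes the FILE names to OUTPUT.m3u, one per line, in the order given.
make_playlist() {
    out=$1
    shift
    # printf repeats its format once per argument, so each name gets its own line.
    printf '%s\n' "$@" > "$out"
}

# Example:
#   make_playlist reicha-quintet.m3u Anton*Quintet*No.9*.mp3
```

A glob like the one in the example expands in alphabetical order, which may or may not match the order of the movements, so it is worth checking the resulting file before relying on it.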

As to actually changing the names ... the problem is that there is no 
"Unix-friendly" standard for naming music files; the one you are seeing has 
pretty much been adopted universally, at least in the MP3 world (probably 
due to the standardizing influence of the CDDB servers, but that's just a 
guess). I know of no off-the-shelf script to convert, probably because 
there is no consensus target for what to convert *to*.

A less targeted solution *might* be to try the app "mmv". (It's just 
amazing what oddities are lurking around in the Debian packaging system.) I 
haven't used it myself, but from the writeup, it might offer you some help 
... though you'll still have to define your own naming convention, so a 
custom script (see below) may be about as easy.

>As a last resort, I might try to write my own script. I'm not too hot on 
>doing that though, since I'm at an extremely rudimentary level when it 
>comes to script writing. If it comes to that, could someone maybe help me 
>get started by giving an example for a script that would do the renaming I 
>want? I'd like to retain the bulk of the information, though I don't mind 
>truncating words at, say 5 letters. I suppose the main thing would be 
>replacing all the spaces and/or punctuation with dashes and/or underscores.

Or just omitting spaces (that's how I name my video files, and the result 
is pretty readable if you are strict about capitalizing every word). Aside 
from spaces, the things you need to get rid of are, almost always, 
ampersand (&), comma (,), and apostrophe ('). Maybe tilde (~) too. Your 
examples above show a couple of truly weird characters (like the English 
pound and Japanese yen symbols, which I don't even know how to type in 
here), so you may need a bit more capability than this.

This sort of script wouldn't be too tough to write in Perl, just using a 
few tr commands plus the commands to read and mv file names, and probably a 
shell script wouldn't be much tougher. If I have time later, I'll try to 
cobble something together to serve as an example.
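
As a rough idea of what the shell version might look like, here is a minimal sketch of the drop-the-spaces convention using tr and mv. The function names squeeze_name and rename_squeezed are hypothetical, and the character list is just the one named above (space, ampersand, comma, apostrophe, tilde); it does nothing about the pound and yen symbols:

```shell
# squeeze_name NAME   (hypothetical helper)
# Prints NAME with spaces, ampersands, commas, apostrophes and tildes deleted.
squeeze_name() {
    printf '%s\n' "$1" | tr -d " &,'~"
}

# rename_squeezed FILE...   (hypothetical helper)
# Renames each FILE to its squeezed name, skipping unchanged names and clashes.
rename_squeezed() {
    for f in "$@"; do
        new=$(squeeze_name "$f")
        [ "$f" = "$new" ] && continue
        if [ -e "$new" ]; then
            echo "skipping '$f': '$new' already exists" >&2
            continue
        fi
        mv -- "$f" "$new"
    done
}

# Example:
#   rename_squeezed *.mp3
```

The existence check is there because two differently punctuated names can squeeze down to the same result, and mv would otherwise overwrite one file with the other.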

>Thanks, James

Do let us know what solution you end up with ... this is an interesting 
problem.

Closing thought ... take a look at

         http://www.classiccat.net/

for another way to acquire (legitimately, as far as I can tell) free MP3s 
of public domain classical recordings. Or try Google; there seems to be a 
lot of this stuff around.



* Re: convert windows file names
  2005-04-15 16:37   ` James Miller
@ 2005-04-15 17:16     ` J.
  2005-04-15 18:17     ` Flemming Greve Skovengaard
  1 sibling, 0 replies; 10+ messages in thread
From: J. @ 2005-04-15 17:16 UTC
  To: linux-newbie

On Fri, 15 Apr 2005, James Miller wrote:

> I tried your perl script and it works really well, Flemming. Thanks for 
> bringing it to my attention. I see it works for all files in a given 
> directory--exactly what I need. Now, in place of something like
> 
> Patrick\ Cohen\ \&\ Mosaiques\ Quartet\ -\ Quintet\ For\ Piano\ \&\ Strings\ In\ D\ Major\,\ Op.56£¯5\,\ G411\ -¥².\ Andante\ Come\ Prima.mp3
> 
> I get
> 
> Patrick_Cohen__and__Mosaiques_Quartet_-_Quintet_For_Piano__and__Strings_In_D_Major._Op.56£¯5._G411_-¥²._Andante_Come_Prima.mp3
> 
> --a big step in the right direction. But I'm still getting some weird 
> characters in there--£¯ and ¥². Are these unicode or something? 

To strip or translate characters that can't be printed properly or aren't 
in your locale (see `man locale') you can use `tr' [translate] from the 
command-line. For example, to translate every run of non-printable 
characters into a single underscore (newlines are kept):
~: tr -sc '[:print:]\n' '_'
translate a `space' to an underscore
~: tr ' ' '_'
DOS to UNIX newlines:
tr -d '\015' < foo.txt > newfoo.txt

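or in combination with other programs, e.g. to lowercase directory names
(deepest first, touching only the last path component):
find . -depth -type d -print0 | xargs -0 rename 's/([^\/]+)$/\L$1/'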

If you're translating filenames from the command-line in a sequence
of events, make sure you use `read', which takes each name as a whole
line, so names with spaces or non-ASCII bytes stay in one piece. `man ascii'

~: ls -1 | while IFS= read -r FNAME ; do printf '%s\n' "${FNAME}" | tr ' ' '_' ; done

Most GNU/Linux distros come with a standard program called `rename'
~: rename -h
Usage: rename [-v] perlexpr [filenames]

The Bash shell itself has some easy and powerful standard ways on board
to achieve this, named `parameter expansion': 
ls -1 *.pdf | while IFS= read -r file ; do ps2ascii "$file" "${file%%pdf}txt" ; done
or..
for file in *.bmp ; do convert "$file" "${file%.bmp}.png" ; done

From the bash manual page:
# ${param%word} removes from the end of `param' the shortest part
# that matches `word' and returns the remainder as result;
# ${param%%word} removes the longest matching part instead.
${param%word}
${param%%word}

There are more ways to expand parameters, ${param#word}, ${param##word}, 
 ${foo:+bar}... Just look in the manual page..

To give an example, in combination with find :
~: find . -name '*.jpg' -print | while IFS= read -r file ; do echo "${file%/*}/thumb_${file##*/}" ; done
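
The difference between the single and doubled forms is easiest to see side by side. A small sketch, using a made-up path purely for illustration:

```shell
# A made-up path, chosen only to have one slash and two dots.
file="music/Wind Quintet.tar.gz"

echo "${file%.*}"    # shortest match cut from the end:   music/Wind Quintet.tar
echo "${file%%.*}"   # longest match cut from the end:    music/Wind Quintet
echo "${file#*/}"    # shortest match cut from the front: Wind Quintet.tar.gz
echo "${file##*.}"   # longest match cut from the front:  gz
```

These four forms work in any POSIX shell, not only Bash, and the quotes around each expansion keep the space in the name from splitting the result.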

The unix philosophy is, do one thing and do it well.. Search for `Unix
philosophy' at google to learn more about it. Quite interesting, and will
give some historical insights and understanding on the command-line. 

J.

> Anyway, I 
> can't reproduce these at the command line. Is there any way your script 
> might be made to catch and replace symbols like these as well (I mean, for 
> someone who knows absolutely nothing about Perl, and precious little about 
> scripting in general)? I have no idea what information these symbols are 
> supposed to be representing. It's probably so inconsequential I don't even 
> need it, so replacing it with virtually any other symbol should suffice. 
> I'd say I've got at least 20 files with such symbols, and more are on the 
> way.
> 
> Thanks, James

--
Don't worry Ma'am. We're university students, - we know what we're doing.


* Re: convert windows file names
  2005-04-15 15:04 convert windows file names James Miller
                   ` (2 preceding siblings ...)
  2005-04-15 16:48 ` Ray Olszewski
@ 2005-04-15 18:00 ` chuck gelm
  3 siblings, 0 replies; 10+ messages in thread
From: chuck gelm @ 2005-04-15 18:00 UTC
  To: James Miller; +Cc: linux-newbie

James Miller wrote:
<snip>

> But, on to file names. Unfortunately, the names for the pieces I'm 
> recording from this station follow Windows long-file-naming conventions. 
> Even worse, the names tend to be quite complex and long. Here are a 
> couple of examples:
> 
> Anton\ Reicha-\ Albert\ Schweitzer\ Quintett\ -\ Wind\ Quintet\ No.9\ 
> in\ D\ major\ Op.91\ No.3-\ Finale-\ Allegretto.mp3
> 
> Patrick\ Cohen\ \&\ Mosaiques\ Quartet\ -\ Quintet\ For\ Piano\ \&\ 
> Strings\ In\ D\ Major\,\ Op.56£¯5\,\ G411\ -¥².\ Andante\ Come\ Prima.mp3
> 
> Feeding those names to cat so I can join the movements into a single 
> file is going to be a major pain in the wazoo, as they say down at 
> symphony hall. What I was hoping to find is a script that would 
> automatically convert all the weird characters into more standard Unix 
> file-naming characters. But so far I've come up empty-handed. Can anyone 
> point me to some utility that might do what I need?
> 
> As a last resort, I might try to write my own script. I'm not too hot on 
> doing that though, since I'm at an extremely rudimentary level when it 
> comes to script writing. If it comes to that, could someone maybe help 
> me get started by giving an example for a script that would do the 
> renaming I want? I'd like to retain the bulk of the information, though 
> I don't mind truncating words at, say 5 letters. I suppose the main 
> thing would be replacing all the spaces and/or punctuation with dashes 
> and/or underscores.
> 
> Thanks, James

man rename
...
rename " " "_" * # over and over again. ;-)

HTH, Chuck


* Re: convert windows file names
  2005-04-15 16:37   ` James Miller
  2005-04-15 17:16     ` J.
@ 2005-04-15 18:17     ` Flemming Greve Skovengaard
  2005-04-15 19:29       ` James Miller
  1 sibling, 1 reply; 10+ messages in thread
From: Flemming Greve Skovengaard @ 2005-04-15 18:17 UTC
  To: linux-newbie; +Cc: James Miller

James Miller wrote:
> I tried your perl script and it works really well, Flemming. Thanks for 
> bringing it to my attention. I see it works for all files in a given 
> directory--exactly what I need. Now, in place of something like
> 
> Patrick\ Cohen\ \&\ Mosaiques\ Quartet\ -\ Quintet\ For\ Piano\ \&\ 
> Strings\ In\ D\ Major\,\ Op.56£¯5\,\ G411\ -¥².\ Andante\ Come\ Prima.mp3
> 
> I get
> 
> Patrick_Cohen__and__Mosaiques_Quartet_-_Quintet_For_Piano__and__Strings_In_D_Major._Op.56£¯5._G411_-¥²._Andante_Come_Prima.mp3
> 
> --a big step in the right direction. But I'm still getting some weird 
> characters in there--£¯ and ¥². Are these unicode or something? Anyway, 
> I can't reproduce these at the command line. Is there any way your 
> script might be made to catch and replace symbols like these as well (I 
> mean, for someone who knows absolutely nothing about Perl, and precious 
> little about scripting in general)? I have no idea what information 
> these symbols are supposed to be representing. It's probably so 
> inconsequential I don't even need it, so replacing it with virtually any 
> other symbol should suffice. I'd say I've got at least 20 files with 
> such symbols, and more are on the way.
> 
> Thanks, James

This should remove the characters in question.

Replace the if-statement with this one:
	if ($new_name =~ m/(?:^[-+]|[^-\w.])/ ) {

and add this line below the other $new_name... lines:
		$new_name =~ s/[^-\w.]//g;

I have tested this only lightly, but the correction removes anything
that is not a hyphen (-), a word character (A-Za-z0-9_) or a period (.).
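
The same keep-only-safe-characters filter can be sketched in shell with tr, for trying the idea out on a single name. It mirrors only the final s/[^-\w.]//g step, not the earlier substitutions, so spaces are deleted rather than turned into underscores; keep_safe_chars is a hypothetical helper name:

```shell
# keep_safe_chars NAME   (hypothetical helper)
# Deletes every byte that is not an ASCII letter, digit, underscore,
# period or hyphen. LC_ALL=C makes tr treat the name as plain bytes.
keep_safe_chars() {
    printf '%s\n' "$1" | LC_ALL=C tr -cd 'A-Za-z0-9_.\n-'
}

# Example:
#   keep_safe_chars 'Op.56£¯5._G411_-¥²._Andante'
```

Running it on a name that has already been through the script shows what the added line would strip, without renaming anything.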


-- 
Flemming Greve Skovengaard           FAITH, n.
a.k.a Greven, TuxPower                   Belief without evidence in what is told
<dsl58893@vip.cybercity.dk>              by one who speaks without knowledge,
4112.38 BogoMIPS                         of things without parallel.

-
To unsubscribe from this list: send the line "unsubscribe linux-newbie" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.linux-learn.org/faqs

* Re: convert windows file names
  2005-04-15 16:48 ` Ray Olszewski
@ 2005-04-15 18:49   ` James Miller
  0 siblings, 0 replies; 10+ messages in thread
From: James Miller @ 2005-04-15 18:49 UTC (permalink / raw)
  To: Ray Olszewski; +Cc: linux-newbie

Thanks for your input, Ray.

On Fri, 15 Apr 2005, Ray Olszewski wrote:

> 1. See if the app "mp3wrap" (that's the Debian-Sid package name) helps in any 
> way relevant to your problem.

Ok. I'll check that.

> 2. Consider using playlists rather than joining the files. If you use xmms 
> for playback, it has a decent GUI for creating playlists, and it has no 
> problem with the "weird" characters. (This naming standard is also a problem 
> for those of us who rip our own CDs to our hard drives as well, and xmms plus 
> playlists has always handled it for me.)

In this case, I think cat'ing the files together is going to be the right 
solution, though it might create a bit of extra work. I've discovered that 
it joins movements together almost precisely at the point where the break 
was made (ca. 30 seconds into the beginning of a given movement). The 
breaks are almost unnoticeable when I cat the files together into a single 
file. I'm guessing with a playlist, I'd get a pause at the break, meaning 
I would hear the first 30 seconds of a movement at the end of the 
preceding movement, then a pause, then a continuation of the movement that 
got cut off by the software. Isn't that how the playlist scenario would 
work out?
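
The joining step described above can be wrapped in a small shell function.
This is only a sketch: join_movements and the file names in the usage
comment are made up for illustration, and it does nothing cleverer than
the plain cat commands already in use, so any tags stored inside the
individual files end up in the middle of the joined stream.

```shell
#!/bin/sh
# Hypothetical helper: join_movements OUTPUT INPUT...
# Concatenates the input files, in the order given, into OUTPUT.
join_movements() {
    out=$1
    shift
    # Do not clobber a piece that has already been assembled.
    if [ -e "$out" ]; then
        echo "refusing to overwrite $out" >&2
        return 1
    fi
    # "--" keeps file names that start with a hyphen from being read as
    # options; quoting "$@" preserves spaces and odd characters in names.
    cat -- "$@" > "$out"
}

# Usage (hypothetical file names):
#   join_movements full-piece.mp3 movement1.mp3 movement2.mp3 movement3.mp3
```

The movements have to be listed by hand in playing order; a wildcard would
expand in sort order, which need not match the order of the movements.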

> 3. TAB completion is your friend. Using your first example, I bet something 
> close to "Anton<TAB>" would complete that entry on the command line. From the 
> examples you provided, I can't back out where the movement differentiators 
> are in the names, but such things are typically at the end of the name, so 
> you might have to enter just a couple of additional characters to complete.

Tab completion only gets me so far, and is less helpful because of all 
the weird characters that are getting inserted. For example, "cat Pat" 
gets me the following output (I include the tail end of the directory's 
content from an ls command):

Luigi Boccherini - No. 8 in B flat -- Allegro.mp3
Luigi Boccherini - No. 8 in B flat -- Andante affetuoso.mp3
Luigi Boccherini - No. 9 in F -- Adagio assai.mp3
Luigi Boccherini - No. 9 in F -- Andantino.mp3
Luigi Boccherini - No. 9. in F -- Tempo di minuetto amoroso.mp3
Patrick Cohen & Mosaiques Quartet - Quintet For Piano & Strings In D Major, Op.56£¯5, G411 -¥². Andante Come Prima.mp3
Patrick Cohen & Mosaiques Quartet - Quintet For Piano & Strings In D Major, Op.56£¯5, G411 -¥³. Variazioni.mp3
Patrick Cohen & Mosaiques Quartet - Quintet For Piano & Strings In D Major, Op.56£¯5, G411 -¥°. Andante Sostenuto.mp3
Patrick Cohen & Mosaiques Quartet - Quintet For Piano & Strings In D Major, Op.56£¯5, G411 -¥±. Minuetto Allegro.mp3
Patrick Cohen & Mosaiques Quartet - Quintet For Piano & Strings In E Minor, Op.56£¯1, G407 -¥². Minuetto Con Moto.mp3
Patrick Cohen & Mosaiques Quartet - Quintet For Piano & Strings In E Minor, Op.56£¯1, G407 -¥³. Allegretto Final.mp3
Patrick Cohen & Mosaiques Quartet - Quintet For Piano & Strings In E Minor, Op.56£¯1, G407 -¥±. Adagio.mp3
Patrick Cohen & Mosaiques Quartet - Quintet For Piano & Strings In E Minor, Op.56£¯1, G407 -¥°. Allegro Comodo.mp3
Patrick Cohen & Mosaiques Quartet - Quintet For Piano & Strings In F Major, Op.56£¯2, G408 -¥². Poco Adagio.mp3
Patrick Cohen & Mosaiques Quartet - Quintet For Piano & Strings In F Major, Op.56£¯2, G408 -¥³. Allegretto.mp3
Patrick Cohen & Mosaiques Quartet - Quintet For Piano & Strings In F Major, Op.56£¯2, G408 -¥°. Allegretto.mp3
Patrick Cohen & Mosaiques Quartet - Quintet For Piano & Strings In F Major, Op.56£¯2, G408 -¥±. Minuetto Amoroso.mp3
Portland Baroque Orchestra under Monica Huggett, with Richard Savino, baroque guitar - Concerto In A Major, Op. 30 
Allegro Maetoso.mp3
Portland Baroque Orchestra under Monica Huggett, with Richard Savino, baroque guitar - Concerto In A Major, Op. 30 
Polonaise.mp3
Portland Baroque Orchestra under Monica Huggett, with Richard Savino, baroque guitar - Concerto In A Major, Op. 30 
Siciliana.mp3
Portland Baroque Orchestra under Monica Huggett, with Richard Savino, baroque guitar - Sinfonia Allegro E Con 
Imperio.mp3
Portland Baroque Orchestra under Monica Huggett, with Richard Savino, baroque guitar - Sinfonia Allegro.mp3
Portland Baroque Orchestra under Monica Huggett, with Richard Savino, baroque guitar - Sinfonia Grave.mp3
**ME**@mymachine:~/Classical$ cat Patrick\ Cohen\ \&\ Mosaiques\ Quartet\ -\ Quintet\ For\ Piano\ \&\ Strings\ In\

As you can see, I have to complete the key, then the opus number, then the 
Yen symbol--etc etc. It's still pretty involved, even using 
autocompletion. Getting rid of odd characters in file names won't help much with 
that, but it will give me a valid name to use as my full-piece.mp3 file 
name.

> A less targeted solution *might* be to try the app "mmv". (It's just amazing 
> what oddities are lurking around in the Debian packaging system.) I haven't 
> used it myself, but from the writeup, it might offer you some help ... though 
> you'll still have to define your own naming convention, so a custom script 
> (see below) may be about as easy.

I'll take a look at that, too.

> This sort of script wouldn't be too tough to write in Perl, just using a few 
> tr commands plus the commands to read and mv file names, and probably a shell 
> script wouldn't be much tougher. If I have time later, I'll try to cobble 
> something together to serve as an example.

Flemming's has gotten me most of the way. He's just posted a slight 
modification that may get rid of the rest of the problem characters for 
me. I'll post to the list whether it works as advertised. If you have the 
time to try a bash script, that would be interesting to see as well.

> Closing thought ... take a look at
>
>        http://www.classiccat.net/
>
> for another way to acquire (legitimately, as far as I can tell) free MP3s of 
> public domain classical recordings. Or try Google; there seems to be a lot of 
> this stuff around.

Another interesting way the internet is turning staid institutions on 
their ears. I was expecting to see old recordings digitized or something. 
But the little bit of snooping I did there reveals that at least some of 
these (the Beethoven piano sonatas I was interested in) are private 
recordings done by unknown (but wanting-to-be-known) artists. It's not 
going to be Schnabel, but nearly any decent interpretation will suit my 
needs just fine. The recording industry has not only file sharing to worry 
about, but independents with access to a wide audience without corporate
intermediaries.

James

* Re: convert windows file names
  2005-04-15 18:17     ` Flemming Greve Skovengaard
@ 2005-04-15 19:29       ` James Miller
  0 siblings, 0 replies; 10+ messages in thread
From: James Miller @ 2005-04-15 19:29 UTC (permalink / raw)
  To: linux-newbie

Thanks for offering that, Flemming. Modifying the original script per your 
directions does, indeed, seem to get rid of the other extraneous 
characters. I ran it in a test directory, and the results seem to be just 
what I was hoping for. I think I'll go ahead and run it in the real 
directory now, where I have probably a couple hundred files in need of 
renaming. Great work!

Thanks, James

On Fri, 15 Apr 2005, Flemming Greve Skovengaard wrote:

> This should remove the characters in question.
>
> Replace the if-statement with this one:
> 	if ($new_name =~ m/(?:^[-+]|[^-\w.])/ ) {
>
> and add this line below the other $new_name... lines:
> 		$new_name =~ s/[^-\w.]//g;
>
> I have tested this only briefly, but the correction removes anything
> that is not a hyphen (-), a word character (A-Za-z0-9_) or a period (.).
>
>
> -- 
> Flemming Greve Skovengaard           FAITH, n.
> a.k.a Greven, TuxPower                   Belief without evidence in what is told
> <dsl58893@vip.cybercity.dk>              by one who speaks without knowledge,
> 4112.38 BogoMIPS                         of things without parallel.

end of thread, other threads:[~2005-04-15 19:29 UTC | newest]

Thread overview: 10+ messages
2005-04-15 15:04 convert windows file names James Miller
2005-04-15 15:15 ` Flemming Greve Skovengaard
2005-04-15 16:37   ` James Miller
2005-04-15 17:16     ` J.
2005-04-15 18:17     ` Flemming Greve Skovengaard
2005-04-15 19:29       ` James Miller
2005-04-15 15:39 ` Peter
2005-04-15 16:48 ` Ray Olszewski
2005-04-15 18:49   ` James Miller
2005-04-15 18:00 ` chuck gelm
