Quickly remove special characters from file names
September 1st 2010 – Version (2.3 Beta) is available. It adds the following new features:
- A “Dry Run” mode. In this mode the application will only display and log the changes to the file/directory names without actually renaming the files and directories. This gives the user the ability to get an idea about what will be done before taking the plunge.
- Fixes a bug where, if a file or directory name contained only special characters, the renaming would fail and the recursive algorithm would still try to keep going.
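The application itself is a compiled C++ program and its source is not shown here, so the following is only a minimal Python sketch of what a "Dry Run" mode could look like. The names `plan_renames` and `run`, and the exact set of characters treated as special, are assumptions made for illustration.

```python
import os
import re

# Assumption: anything other than letters, digits, underscore, dot and space is "special".
SPECIAL = re.compile(r"[^A-Za-z0-9_. ]")


def plan_renames(directory):
    """Return (old, new) pairs for entries whose names would change."""
    plan = []
    for name in sorted(os.listdir(directory)):
        new_name = SPECIAL.sub("", name)
        # Skip unchanged names and names that would end up empty (only special characters).
        if new_name == name or not new_name.strip(". "):
            continue
        plan.append((name, new_name))
    return plan


def run(directory, dry_run=True):
    """Print every planned rename; touch the disk only when dry_run is False."""
    plan = plan_renames(directory)
    for old, new in plan:
        print(f"{old} -> {new}")
        target = os.path.join(directory, new)
        # Never overwrite an existing entry.
        if not dry_run and not os.path.exists(target):
            os.rename(os.path.join(directory, old), target)
    return plan
```

Calling `run(some_dir)` only reports what would happen; `run(some_dir, dry_run=False)` performs the same plan for real.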
June 24th 2010 – Version (2.2 Beta). It adds the following new features:
- Special characters can now be removed from directory names as well.
- It can be run in recursive mode, which renames all files and/or directories in all the sub-directories.
- All the dots in a file name are removed except the last one, which indicates the file extension.
- The underscore is no longer considered a special character and it is not removed from the file names.
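As a rough illustration of the rules in this list, here is a small Python sketch. It is not the program's actual code (that is compiled C++); the function name `clean_name` and the `keep_spaces` switch are hypothetical, and the part after the last dot is assumed to be a correct extension and is left alone.

```python
import re


def clean_name(name, keep_spaces=True):
    """Strip special characters and every dot except the last one."""
    stem, dot, ext = name.rpartition(".")
    if not dot:
        # No dot at all: the whole name is the stem and there is no extension.
        stem, ext = name, ""
    # Letters, digits and the underscore are kept; spaces only if requested.
    allowed = "A-Za-z0-9_" + (" " if keep_spaces else "")
    stem = re.sub(f"[^{allowed}]", "", stem)
    return stem + dot + ext
```

With this sketch, `clean_name("Co.$.R.dat")` returns `CoR.dat` and `clean_name("my_file (1).txt", keep_spaces=False)` returns `my_file1.txt`.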
A few years ago I wrote a small application to remove all special characters from the file names of all the files in a directory.
Very often I would get a bunch of files that needed to be posted on a website and most of them would contain all kinds of special characters. I got fed up with renaming them manually, file by file, so I wrote this small app.
This is a Windows application written in C++ and it works with Windows 98 and up. Of course Linux does not need anything like that, since you can do this with a quick one-line shell command.
It only works with ASCII file names (sorry if you use any other language than English). I could have just as easily written it for Unicode, but I had no need for that.
So, I decided to share it with anyone who wants to use it:
New v.2.3b: RenameFiles ver.2.3b
Old v.2.2b: RenameFiles ver.2.2b
Old v.1.0: RenameFiles ver.1.0
It is just a simple executable and does not need any installation. Keeping it simple is the key here. It also creates a log file in the same directory that the executable is in. The log file keeps track of the original file names and the new file names, so you can always find out what was done.
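The log format the program writes is not described here, so the sketch below simply assumes one tab-separated `old<TAB>new` line per rename. The function name `rename_with_log` is hypothetical; the idea is only to show how such a log can be kept.

```python
import os


def rename_with_log(directory, renames, log_path):
    """Apply (old, new) renames and append one 'old<TAB>new' line per success."""
    with open(log_path, "a", encoding="utf-8") as log:
        for old, new in renames:
            os.rename(os.path.join(directory, old), os.path.join(directory, new))
            # Written only after the rename succeeded, so the log never lies.
            log.write(f"{old}\t{new}\n")
```

Appending rather than overwriting keeps the history of earlier runs in the same file.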
74 Comments to Quickly remove special characters from file names
What command do you use for removing special characters in Linux? I have a script file I have created that removes and logs anything I need removed from all files in a directory. It is crude but works great; however, I am always discovering more sophisticated methods and am curious to see what you do in Linux.
In Linux (as you probably know) there are a lot of stream and text editing commands like: sed, tr, awk, perl and many others.
So if you wanted to replace all the special characters of a string with “_” (underscore), you could do:
sed 's/[^A-Za-z0-9_.]/_/g'
Or if you wanted to delete all characters that are not alphanumeric you could do:
tr -dc '[:alnum:]'
As I mentioned, there are many other commands for this in Linux.
Now if you wanted to remove all the special characters of all the files in a directory you could run this simple script:
for file in *
do
  new=$(printf '%s' "$file" | sed 's/[^A-Za-z0-9_.]/_/g')
  [ "$file" = "$new" ] || mv -- "$file" "$new"
done
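One thing to watch with a loop like this: two different names can collapse into the same cleaned name, and `mv` overwrites an existing target by default. Here is a small Python sketch (the helper name `find_collisions` is made up for this example) that applies the same substitution as the sed expression and reports such clashes before anything is renamed.

```python
import re
from collections import defaultdict


def find_collisions(names):
    """Map each cleaned name to the originals that would collapse into it."""
    groups = defaultdict(list)
    for name in names:
        # Same rule as the sed expression: anything outside A-Za-z0-9_. becomes "_".
        groups[re.sub(r"[^A-Za-z0-9_.]", "_", name)].append(name)
    return {new: olds for new, olds in groups.items() if len(olds) > 1}
```

For example, `find_collisions(["a b.txt", "a#b.txt", "c.txt"])` reports that both `a b.txt` and `a#b.txt` would become `a_b.txt`.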
Perfect! Much better than I was using
I was using sed with mv statements but identifying each special character unwanted.
Hi, I tried your tool and it looks like exactly what I need. However, is there a way to make it work on folder names too, and in subdirectories?
Basically I need to remove all special characters in my music folder, or my MP3 player crashes when loading the folder. Thanks.
Hi Amin,
I made a quick change to the program so it does a recursive renaming of all the files and directories under the folder that you choose. This will do what you needed. Keep in mind that I did this real quick and I have not had time to fully test it yet.
I will come up with a new version shortly that will be fully tested and will give the user the option to choose whether to do a recursive renaming or not or to touch directories or just files. I will publish it in a new post.
For now here is the modified application for your needs:
Rename Files Recursively
THANK YOU!! This is AWESOME!!! I cannot thank you enough!!
This is a great tool. Is there any chance the selection of characters to be replaced can be tweaked on my end?
I am trying to upload a ton of folders to SharePoint and need to remove all of the “.” and “&” but want to keep the spaces. When I run the tool, it removes all of the “&” but I am still stuck with the “.” from “Co.” in many of the document names.
Thanks
Stevens
@Stevens
Unfortunately you cannot tweak it on your end. It is compiled C++ code.
The only way you can do this is to decompile it, change the code and recompile, but that is the long way to go about it.
One of these coming days/weeks I will add to the program the ability to select whether you want to preserve the spaces or replace them with underscores.
I just have to find a little spare time…
This will solve your problem with the spaces.
About the “.” – Sorry if I am answering your question with a question but why do you want to remove them?
They are not really considered special characters in the file/directory names.
Since you will be doing this for a SharePoint application I assume this is a Windows server. Windows does not care how many dots you have, it just takes whatever is after the last one to determine the file type.
The dots or periods need to be removed in order to upload the documents to Windows SharePoint. I’m guessing it is because of the MS SQL database that runs SharePoint. I have tested it on a document and it fails when the extra dot is in the name. Example:
“SweetWater Brewing Co..docx” – Fails
“SweetWater Brewing Co.docx” – Uploads
Thanks for the quick response!
@Stevens
I modified the program to do what you needed. It will check for any special characters including “.” in the file/directory names disregarding the last “.” and anything after it. For example a file Co.$.R.dat will be renamed to CoR.dat.
I am assuming that the part after the last “.” is correct in your file names.
If you want to preserve the spaces, make sure to uncheck “Clear White Spaces” box.
Please note: I have not had the time to do an extensive testing. Please run against test data before using it on production data. Use at your own risk.
Here is the modified program.
That was perfect!! I was able to easily rename the folders, sub-folders, and file names to import them to SharePoint and MS OfficeLive. Thank you, you Rock!
-Stevens
Thanks for posting this tool . . . it really helped me when I needed it.
I’m sad it does not process Unicode file names, as the special characters I want to remove are Unicode ones! (A Japanese character set that old apps can’t process.)
Hi
I am not a programmer and do not necessarily understand all aspects.
I tried your little program and it solves a big problem I have at present. I have many hundreds of files to upload to a SharePoint site, with characters in their names that are not allowed.
However, after processing, the file names look a real mess, as some have several characters removed. For some reason the file order changes totally. The first part is normally numeric. Is it possible to leave a space rather than just removing the characters?
Thanks
@Peter,
Did you check out the program I posted in my comment above from November the 20th, 2009?
Another person asked me to make some changes because he was using it to upload files to a SharePoint server as well.
What you are noticing about the change of the order of the files is a normal behavior. Your files are normally ordered alphabetically. So when you take special characters out they will be reordered in the directory. Even if you replace the special characters with spaces you will still get the files reordered.
For example, if you have 2 files, one called w2#b.txt and the other called w2$a.txt, Windows will order them with the w2#b.txt file before the w2$a.txt file.
Now, if you remove the special characters, then the w2a.txt file will be before the w2b.txt file. This is the way it should be, unless you somehow change the alphabet.
Likewise, if I were to replace the special characters with blank spaces, you would end up with the same result: the w2 a.txt file would be before the w2 b.txt file.
I hope this is clear enough and makes sense.
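The reordering is easy to reproduce. The sketch below uses Python's plain string sorting, which compares character codes; Windows Explorer uses its own collation rules, so take this as an illustration of the idea rather than an exact model of Explorer. The helper `clean` is made up for the example.

```python
def clean(name, replacement=""):
    """Replace every character that is not an ASCII letter, digit, dot or underscore."""
    return "".join(
        ch if (ch.isascii() and ch.isalnum()) or ch in "._" else replacement
        for ch in name
    )


names = ["w2$a.txt", "w2#b.txt"]

print(sorted(names))                          # ['w2#b.txt', 'w2$a.txt']
print(sorted(clean(n) for n in names))        # ['w2a.txt', 'w2b.txt']
print(sorted(clean(n, " ") for n in names))   # ['w2 a.txt', 'w2 b.txt']
```

In the original names the order is decided by `#` versus `$`; once those are gone (or replaced by the same character), the order is decided by `a` versus `b`.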
Hi
I have tried running your program to change my filenames – but it just keeps giving me an error
--------------ERROR------------------
Current directory: E:\Folder\Music\pop\bo
File Who Do You Love?.mp3 not found, cause = 3
Windows will not let me rename or delete this file.
Any help would be appreciated.
@ Jasper,
If Windows does not let you rename or delete it most likely it is in use by some process.
Use the Windows Process Explorer to find out what process it is:
http://technet.microsoft.com/en-us/sysinternals/bb896653.aspx
Hi,
The file is not in use by any process, I have even tried deleting the file in safe mode.
When I attempt to delete the file I get the error message:-
Cannot delete Who Do You Love?: The filename, directory name, or volume label syntax is incorrect.
If I try to rename the file I get the error message:-
Cannot rename file: Select only one file to rename, or use MS-DOS wildcards (for example, *.txt) to rename a group of files with similar names.
Since I am only selecting one file, the only file in the directory, I am unsure what I can do here, as typing an ‘*’ into the filename is not allowed by Windows.
I have unsuccessfully tried various file deletion programs, such as malwarebytes file Assassin, which tells me the file has been successfully deleted, even though it has not.
@ Jasper,
This discussion is beyond the scope of the program I posted to mass rename files.
What the Windows message suggests is not putting * in the file name, but rather opening a command prompt, changing (cd) into the directory containing the file and then executing:
del *.mp3
Keep in mind that the above command will delete all the mp3 files in the directory.
If you want to be more specific you can do:
del Who*.mp3
This will delete any file that starts with “Who” and ends with “.mp3”.
I have previously tried lots of different ways of renaming or deleting this file using a command prompt – always with the same error message – Cannot delete Who Do You Love?: The filename, directory name, or volume label syntax is incorrect.
It seems the question mark in the filename has thrown a spanner in the works and now I can’t rename or delete the file – even using a program like yours which claims to “Quickly remove special characters from file names”
If this was a question of just one file – I would probably ignore it – but I have another directory with many many more files – all with the same problem – a special character in the filename prevents windows from recognising the file as a file – this in turn stops windows error checking tool checking the disk for errors.
Thanks anyway for your help – I am sure I will find a solution eventually.
Sincerely Jasper
@ Jasper,
The program does what it claims. It quickly removes special characters from file names and directories. It is designed to save you time if you have a lot of files you need to go through.
It does not however fix any issues you may have with your PC. If you cannot rename the file(s) manually (even in safe mode) this program will not be able to do it either.
I do not have any issue with my pc – My problem is that some of my filenames have special characters – and so cannot be renamed or deleted by windows.
I therefore looked for a program that claimed it could rename files that had special characters in them, which seemed like an intelligent thing to do.
I can only imagine that what I see as a ‘?’ in the filename is actually a non ASCII character, something Chinese or whatever – and so it is not recognised by either windows or your program – I therefore am unable to rename, access or delete the file.
Anyway, as I said before – thanks anyway for your help.
Jasper
@ Jasper…
My program only reads ASCII. If I ever have some time left I will add support for Unicode…
In your case there must be something more than just a non-ASCII character, otherwise I would think that the wildcard (*) should have taken care of it.
Have you tried booting into Linux (from a Live CD), then mounting the partition and deleting the files?
Wow. What a timesaving tool! Thanks for sharing. We appreciate it. I’m preparing to move a document library stored on a server over to SharePoint and this will really help!
Hi,
I tried your RenameFiles2.1b.exe program on file names (mp3 files) that use characters from languages like German, Turkish, French or Greek. When the program tries to rename such files, it gives an error, cause = 2. Is there any chance to modify the program to change the non-ASCII Unicode characters to normal characters? Could that also be possible for directory names?
If I understood correctly, the program cannot operate when the path name includes a directory name in Greek, for example. Is that correct?
Thanks
Spyros
@ Spiros,
You are correct. The program only supports ASCII.
In order for the program to support multiple languages, I have to use Unicode instead of ASCII. I cannot promise when I will have the time to do this.
Keep in mind that this will take more time than it might look like at first. I have to identify what is a “special character” and what is not for each language/alphabet.
Thanks for the prompt reply.
Is there any chance of adding an external configuration file to your program, so that every interested user could define the special characters and their replacements? Then the program would be configurable!
@ Spyros,
Having an external config file is not a problem and that is a good idea.
The program will still have to be changed to utilize Unicode characters though.
I cannot promise when I will be able to get to it…
Thanks – a very useful tool!!
Did 95% of the job cleaning up mp3 filenames. Only the non-ASCII characters failed. I’ll do them manually until a program upgrade is available.
Thanks again:)
This is a great tool! Is there a way for it not to remove underscores (_) or other characters that are allowed for SharePoint?
@Eric,
I published a new version of the application- 2.2b. It will work for you. Check the post above.
thank you !
I made an app which at some point created some files in XP with the character “:” in them: files of 0K in size that could not be deleted, but could be copied along with everything else, making more trouble for me.
Your app just vanished those files, not renamed them, and I’m more than happy with that.
Just for reference, those files were created from the JVM. You may think it should be impossible to create a file that cannot be accessed or deleted but can be copied and spread like a plague, yet Windows is something of a Pandora’s box.
thanks yet again
Do you have a program to undo this based on the log file? It worked as intended; however, it renamed some important files that need their original names to function. I would go through and do it manually, but it’s 300+ files I need renamed back.
Thanks.
Nvm i fixed it myself using this mirc script:
alias test2 {
var %i = 1, %oldf
while (%i <= $finddir(j:\,*,0)) {
%f = $finddir(j:\,*,%i)
%oldf = $read(c:\properlog.txt,w,* $+ $mid(%f,4) $+ *)
%oldf2 = $read(c:\properlog.txt,$calc($readn - 2))
if (%oldf2 && $right(%oldf2,1) == \) {
rename %f " $+ %oldf2 $+ "
;echo -a " $+ %oldf2 $+ "
}
inc %i
}
}
@John,
That is the thing with programs like this… you have to be extra careful to make sure you do not rename things that should not be renamed. Extra caution is required especially when using the recursive function. That is why I did the log file! BTW… I have renamed the wrong files myself before… It happens.
Also, your choice of scripting language surprised me. I do not think that mIRC would have been my first choice. But… as long as it works…
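For anyone else who needs to roll back a run, the same idea can be sketched in Python. The program's real log layout is not documented here, so this assumes a simplified log with one tab-separated `old<TAB>new` line per rename, all in a single directory; the function name `undo_from_log` is hypothetical and the parsing would need adapting to the actual log.

```python
import os


def undo_from_log(directory, log_path):
    """Revert renames recorded as 'old<TAB>new' lines, newest first."""
    with open(log_path, encoding="utf-8") as log:
        pairs = [line.rstrip("\n").split("\t", 1) for line in log if "\t" in line]
    restored = []
    # Go backwards so that chained renames are unwound in the right order.
    for old, new in reversed(pairs):
        src = os.path.join(directory, new)
        dst = os.path.join(directory, old)
        # Only restore when the renamed file still exists and nothing is in the way.
        if os.path.exists(src) and not os.path.exists(dst):
            os.rename(src, dst)
            restored.append((new, old))
    return restored
```

The returned list shows which names were actually put back, so anything missing from it needs a manual look.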
Hello,
This tool could save me a lot of time migrating our fileshares onto SharePoint, however it could also cause a lot of issues at the same time!
Could you make it so that it did a read only pass of the directory and output the changes that would be made to the log file instead of actually making the changes?
thanks
Jack
@ Jack,
I have updated the program. It adds a “Dry Run” mode that will do what you requested.
The new version is 2.3b.
Thanks Dimitar, I’m just testing it now
Could it also be made so that the special characters could be specified? SharePoint allows “-” and “_” in a filename
@ Jack,
The short answer is “Yes it could.” But unlike the last change this will take over a couple of hours to do. And it has been very hard for me lately to find that kind of extra time.
It will get implemented in the next versions. I just do not know when…
I thought it was a bit of a long shot – thanks for the update though
hi
I have a problem trying to rename/delete files with invalid characters in their filenames
I have used your program without success.
They are files that I copied from linux to a windows ntfs partition.
For example:
smb-server:server=ubuntu-e0119c22.log
The invalid character is the colon (:).
Do you know how I can delete them from Windows?
Regards
@carlos,
You have hit on a situation where you have a special character that actually has a meaning for the Windows kernel and the NTFS file system. The “:” (colon) character is used as a delimiter between file names and data streams. You can read up on Alternate Data Streams if you want more information on that (I would recommend it because you can do pretty cool stuff with this).
So in your case Windows does not see smb-server:server=ubuntu-e0119c22.log as one file name; rather, it sees “smb-server” as the file name and “server=ubuntu-e0119c22.log” as the data stream.
My program is not smart enough to identify this situation.
The easiest way to handle this is to boot into Linux. Just get an Ubuntu Live CD, put it in your drive and reboot (make sure the BIOS is set to boot from the CD/DVD-ROM first). It will mount your hard drives automatically, then you can just rename the files. If there are too many of these, use the script I posted in one of the comments above.
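Before copying files over from Linux, it can help to check which names contain characters that Windows reserves in file names (`< > : " / \ | ? *` and the control characters). The sketch below does that, and also shows how a name with a colon splits into a file part and a stream part. Both function names are made up for this example.

```python
# Characters that Windows does not allow in file names.
WINDOWS_RESERVED = set('<>:"/\\|?*')


def reserved_chars(name):
    """Return the characters in name that Windows reserves in file names."""
    return [ch for ch in name if ch in WINDOWS_RESERVED or ord(ch) < 32]


def split_stream(name):
    """Split 'file:stream' at the first colon, the file/stream delimiter."""
    filename, _, stream = name.partition(":")
    return filename, stream
```

For `smb-server:server=ubuntu-e0119c22.log`, `reserved_chars` returns `[':']` and `split_stream` returns `('smb-server', 'server=ubuntu-e0119c22.log')`.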
Mate this is a godsend tool. I love its simplicity, all the other tools need too much customisation to do what yours does instantly. thanks again.
@Jasper: how long are your directories/sub-directories? I have hit the same issue deleting files when I have more than 255 characters in total directory length. Just rename one of the folders to something shorter first, then you should be able to delete.
YOU ROCK! Thanks I have just been saved hours of manually renaming files on the server. AWESOME JOB THANXXXXX
Just wanted to let you know that the tool didn’t work for me when renaming files with encoding-related special characters, which the OS is no longer able to delete with its own shell.
dimitar,
First of all, thanks for a great (and simple) batch file rename program that works great.
I was wondering if there was a simple way to replace the special characters with a space?
Thanks again, and I would donate a few bucks if you had a donate link!
@Jeff,
There is no simple way for you to modify the program… it is compiled C++ code. If you need it to replace the special characters with spaces, then I need to put that feature in… and now I have even less time than I did before.
Keep in mind that I did this tool initially for myself and what I needed it for was to get rid of any special characters (including spaces). I am a programmer and a systems administrator and files and directories with special characters (including spaces) could be a cause for a headache.
As soon as I have some time… I will add a feature where you will be able to specify the character to be used in the substitution. Again… might be a while before I get to it!
As far as a donation link… I have never thought about it. Since I had developed it for myself initially, it did not cost me anything to make it available to everyone who had the need for it.
I used this stuff from CodeProject: http://www.codeproject.com/KB/string/FontGlyphSet.aspx
All source is included and I needed to modify it a little to get my text to work as filenames in SharePoint. Your mileage may vary…
It worked like a charm, even in a web service. Now I am stripping non-printable Unicode characters out of the filename. It is sweet.
This is a great tool. It saved me a lot of work. Thank you for making it available to the general public.
great tool BUT
I have some music downloads and some, maybe 5% of the files, have illegal file names and your tool does not seem able to correct these. It sees the file names and corrects them on the dry run, but the real run comes up with error 3, file not found or some such.
Any ideas why?
Elsewhere I read that the problem I have is due to incorrect characters in NTFS files on Windows.
Maybe your tool cannot make this correction because it can’t write back to the NTFS file?
Is this something that is fixable?
@ Denison,
From the error it looks like it could not rename them, not because it could not write but because the source file was not found. That most likely means that these files have characters outside of the ASCII character set.
What I would try is first using wild cards in command line, for example:
ren *file1.mp3 new_file_name.mp3
Here the “*” character will be the substitute of all the special characters.
If that does not work, I suggest booting into Linux from a live CD. That should definitely take care of business.
Could you tell me what characters are removed? Thanks.
@ Will,
If the characters do not fall in any of the following categories, then they are considered special characters:
- digits 0 through 9
- capital letters A through Z
- lower case letters a through z
- underscore
- blank space
Hello,
I tried the application to remove special characters from Czech file names. The preview worked just fine, but the real run did not find the files because of the Czech names.
The beautiful thing is, there were 700 files to rename and every one now shows the message “file not found”.
The application cannot be ended in Task Manager; only a restart will solve this situation.
Sorry, I would not recommend this to my friends.
@ anonymous,
Please read before use! This application is not meant for any other language but English. Here is what it says in the description:
“It only works with ASCII file names (sorry if you use any other language than English). ”
Again, I made this application for my own use only and much later I decided to share it with everyone else who wants to use it. It is not a commercial product and my goal has never been to support everyone (and every language) out there. Sorry!
Thank you. This was extremely helpful and saved a lot of time. Very kind of you to share it and make it easily accessible online.
I tried it with filenames containing characters like Ç and Ã, and it stripped those away just fine, those aren’t ASCII, are they?
@Tiago,
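I know they do not look like it, but they fall within the 8-bit extended ASCII codes rather than the original 7-bit ASCII set, and the program handles those single-byte characters. Take a look at the tables here: http://www.asciitable.com/. They are listed under the extended ASCII codes.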
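A quick way to see where such characters sit is to look at their codes. The sketch below assumes the Windows-1252 code page purely for illustration (the code page on the commenter's machine is unknown), and the helper name `describe` is made up.

```python
def describe(ch, codepage="cp1252"):
    """Return (code point, is 7-bit ASCII, bytes in the given single-byte code page)."""
    return ord(ch), ch.isascii(), ch.encode(codepage)


for ch in "AÇÃ":
    print(ch, describe(ch))
```

Plain `A` has code 65 and is 7-bit ASCII. `Ç` (199) and `Ã` (195) are above 127, so they are not 7-bit ASCII, but each still fits in a single byte in that code page.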
I was wondering if you could estimate when you would be putting out the Unicode version. I only have one file that contains a “?” and nothing I’ve read has seemed to be able to remove it. Your program keeps producing an error, cause = 3, for it. No rush, just trying to recover some old files off a cell phone.
Hi,
I have an mp3 folder with some folders and files in Unicode text format. Is there any way to replace the Unicode file names and folder names with other characters? I am using Linux and it doesn’t want to open any of the folders… Does anybody know how I can proceed?
Hi there,
I just wanted to say thank you for this great little piece of software. I’ve had numerous files containing Japanese characters that cause one of my machines to crash and this will save me a lot of frustration.
Thanks again.
Awesome!! I’m sure this was no big deal for you to write, but this thing just made my life much easier!
Way to go!!
This is a GREAT tool, when do you think you will have time to add the “replace spaces with _” function?
I work in Cygwin, so I often get in trouble with crossing over.
My messed up files show question marks at the end of the file name, but I can see they actually look like this:
$ ls -b
21078.xls\r 21232.vsd\r 24296.ppt\r 24347.doc\r
The program renamefiles errors saying File 21078.xls? not found, cause = 3.
But I was able to fix my files using a shell script with tr -dc '[:print:]', a trick I found in this thread. Thanks!
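The same trick can be sketched in Python for names like the ones in the `ls -b` listing, where a stray carriage return hides at the end of the name. Note that `str.isprintable` follows Unicode rules, while `tr`'s `[:print:]` class depends on the locale, so the two are close but not identical. The function name is made up for this example.

```python
def strip_nonprintable(name):
    """Drop control characters such as a trailing carriage return."""
    return "".join(ch for ch in name if ch.isprintable())
```

For example, `strip_nonprintable("21078.xls\r")` returns `21078.xls`, and ordinary spaces are kept.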
Patrick, that’s simple:
for x in *
do
  mv -- "$x" "$(echo "$x" | sed 's/ /_/g')"
done
Hi Dimitar
I’m from South Africa and a client that I’ve been working with requires all their product documents uploaded into a SharePoint library. There are about 40 500 files, of which around 3000 have special characters in their names. I found that out using SharePrep; unfortunately it is not a free tool. After searching a bit more I found your site. Thanks so much, your program has saved me a huge amount of time and effort. I just wanted to know, could this also be done using PowerShell or the management shell?
Also I have another major problem. I need to upload all these files and specify the required metadata. Would you know how I can bulk upload with metadata?
Thanking you in advance for your help.
@Kercheval,
I have no experience with MS PowerShell or the management shell. I am not really a .NET person.
This is something fairly simple and if MS’s “power” shell cannot do this, then I doubt it could do anything! I just do not know enough about it to provide you with a script.
Unfortunately I cannot give you a better answer for your meta-data problem. You will have to change the meta-data of the files before you upload them, not during. Also, depending on the file formats and what meta-data you need to change, there probably will be different tools for that.
If you just want to preserve file meta-data like creation date, permissions, etc. when uploading them, take a look at the robocopy utility.
Hey! Thanks so much for this tool! I’m looking to rename the files instead of replacing them. Do you think you could share the source so I can make the code change? I’ll email you all the edits I make so you can upload it back to this blog.
@Darus,
The program is already doing this. It renames the files. It does not replace them.
Good day. Do you have a version one can run in cmd? No gui. Want it to rename stuff for me once a day. Would like to schedule it.
PS Awesome little app.
Fine and Fast. How about replace Special Characters with a Space, then replacing Multiple Spaces with 1 Space. Would be an option I would use a lot…
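The option described here (special characters become a space, then runs of spaces collapse to one) can be sketched in a few lines of Python. This is not a feature of the program; the function name `spaced_name` is hypothetical and the part after the last dot is assumed to be the extension and left untouched.

```python
import re


def spaced_name(name):
    """Replace special characters with a space, then squeeze repeated spaces."""
    stem, dot, ext = name.rpartition(".")
    if not dot:
        # No dot at all: treat the whole name as the stem.
        stem, ext = name, ""
    stem = re.sub(r"[^A-Za-z0-9_ ]", " ", stem)   # special character -> space
    stem = re.sub(r" {2,}", " ", stem).strip()    # many spaces -> one, trim the ends
    return stem + dot + ext
```

For example, `spaced_name("AC&DC - (Live).mp3")` returns `AC DC Live.mp3`.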
I am having the same problem as Jasper. Audio files (AIF) with a “?” in the file title. Actually the folder they are in has the same character. Every time I try to rename or do anything with them, I get a syntax error. I was excited for your software, but I am getting the same error as Jasper- 3. All the tools I have found online that claim to remove unwanted characters are unable to do it. I have tried with more than one computer, and it seems pretty clear to me that it is not my machine causing the problem. Anyway thought I would mention it. Thanks for your software- the search continues!
Here’s a start at removing special characters via PowerShell. It could certainly be refined into a cleaner script. It should remove special characters from files and directories (and subdirectories and their files). It will raise an error if a name doesn’t have a special character but does continue on. Change to your working directory before you begin or update the -path for get-childitem.
Like I said, it needs work.
get-childitem -Path . -recurse | ForEach-Object { Rename-Item -path $_.fullname -NewName ([System.Text.RegularExpressions.Regex]::Replace($_.name,"[^0-9a-zA-Z_. ]","")); }
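For comparison, here is a rough Python sketch of the same recursive clean-up. It walks the tree bottom-up so that a directory is renamed only after everything inside it, skips names that would not change or would end up empty, and never overwrites an existing entry. The function name `rename_tree` and the character set are assumptions for illustration; try it on test data first.

```python
import os
import re

# Assumption: letters, digits, underscore, dot and space are kept; the rest is removed.
SPECIAL = re.compile(r"[^0-9A-Za-z_. ]")


def rename_tree(root):
    """Remove special characters from every file and directory name under root."""
    renamed = []
    # topdown=False visits children before their parent directory.
    for parent, dirs, files in os.walk(root, topdown=False):
        for name in files + dirs:
            new_name = SPECIAL.sub("", name)
            if new_name == name or not new_name.strip(". "):
                continue  # nothing to do, or nothing usable would be left
            src = os.path.join(parent, name)
            dst = os.path.join(parent, new_name)
            if os.path.exists(dst):
                continue  # do not overwrite an existing entry
            os.rename(src, dst)
            renamed.append((src, dst))
    return renamed
```

The returned list of `(old path, new path)` pairs can be written to a log so the run can be reviewed afterwards.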