Remove diacritics from file names




















Replace accented variants of u (such as ù, ú, û, and ü) with u.

Kent, I wanted to add a direct link for "the" man page for iconv, but none of the ones I found contained that particular quote.

Would you like to add where you got it from?

Jongware: In my answer I also mentioned the man page of iconv. My current version is iconv GNU libc 2. (Kent)

A side note on this answer: check the character set of the target file when you get the error "iconv: illegal input sequence at position". Interestingly, if you are on a Mac you will have to add the -e flag to the command line.

More info: stackoverflow. The advantage of "sed" is that it is available almost everywhere. I like iconv because it handles all accent variations: cat non-ascii. This is what the tr(1) command is for.
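The kind of ASCII folding discussed above can be approximated in Python with the standard unicodedata module. This is a minimal sketch, not a reimplementation of iconv: the function name strip_diacritics is hypothetical, and it only handles characters that Unicode decomposes into a base letter plus combining marks.

```python
import unicodedata


def strip_diacritics(text):
    """Drop combining marks after compatibility decomposition (NFKD).

    Hypothetical helper for illustration. Letters without a Unicode
    decomposition (for example "ø") pass through unchanged.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    # unicodedata.combining() returns a non-zero class for combining marks.
    return "".join(c for c in decomposed if not unicodedata.combining(c))


print(strip_diacritics("ùúûü"))  # uuuu
```

Unlike sed or tr, this needs no hand-written list of accented characters, but the results will not necessarily match what iconv produces for every input.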

If you don't want it, don't do it, but in both cases you're substituting a Latin near look-alike.

Jabba: , 'utf8' is a "safety net" needed if you are testing input in a terminal, which by default does not use Unicode. It doesn't hurt to be safe, though.

I'll change my function to remove the conversion to unicode: it will fail more clearly if a non-unicode string is passed.

Thanks to you, I have created this function that works wonders. With Py2. A workaround for that was to add except TypeError: pass (Daniel Reis)

You should catch the exception if the new symbol doesn't exist.

This looks elegant in harnessing the semantic descriptions of characters that are available.

Do we really need the unicode function call in there with Python 3, though? I think a tighter regex in place of the find would avoid all the trouble mentioned in the comment above. Memoization would also help performance when this is a critical code path.
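The suggestions in these comments (use the character names, anchor the match with a regex, catch the failed lookup, memoize) could be combined roughly as follows. This is a sketch under assumptions, not the original answer's function: the names strip_accent and remove_accents are hypothetical, and it targets Python 3, where no unicode() call is needed.

```python
import re
import unicodedata
from functools import lru_cache

# Matches the " WITH ..." suffix of names such as
# "LATIN SMALL LETTER E WITH ACUTE".
_WITH_SUFFIX = re.compile(r" WITH .*$")


@lru_cache(maxsize=None)  # memoize: each distinct character is resolved once
def strip_accent(ch):
    try:
        name = unicodedata.name(ch)
    except ValueError:
        return ch  # unnamed character (e.g. a control character)
    base_name = _WITH_SUFFIX.sub("", name)
    if base_name == name:
        return ch  # nothing to strip
    try:
        return unicodedata.lookup(base_name)
    except KeyError:
        return ch  # the shortened name is not a real character


def remove_accents(text):
    return "".join(strip_accent(c) for c in text)


print(remove_accents("Žluťoučký kůň"))  # Zlutoucky kun
```

Because it works from names rather than decompositions, this variant also maps letters such as "ø" (LATIN SMALL LETTER O WITH STROKE) to "o". Whether that is desirable depends on the application.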

In any case, in my experience there is no universal, elegant solution to this problem. Depending on the application, any approach has its pros and cons. Quality-focused tools like unidecode are based on hand-crafted tables. Some resources (tables, algorithms) are provided by Unicode, e.g.

I edited it.

In response to MiniQuark's answer: I was trying to read in a CSV file that was half French (containing accents) and that also had some strings which would eventually become integers and floats.

This uses Python's default encoding, which is "ascii". Since your file is encoded with UTF-8, this would fail. Lines 2 and 3 change Python's default encoding to UTF-8, so then it works, as you found out. I tested it: it works. I'll update my answer to make this clearer.

Born in the former Czechoslovakia, I had many Czech and Slovak MP3 files, movies, and books on my hard drive, and almost all of these files used our diacritics and were accepted. MS Windows and some Linux installs are known to have some issues with accents, but they mostly handle them well. I am giving it out, free of charge, in case someone comes across similar diacritic-related issues and needs to mass rename files and folders.

It works so well!! Thank you!!

It is not special. These are the characters we want to keep, after all, not the ones we want to replace. The effect is to ensure those characters are present just before the part we will actually replace.

The backslash is necessary because when a dot appears in a regular expression it otherwise matches any single character.

Very cool and simple as well! I liked both solutions presented and added them to my library of useful commands. Thank you!

Step 1 Install ranger.
Step 2 Run ranger and navigate to the directory that has the files you wish to rename.
Step 3 Press v to select all files.
Step 4 Type :bulkrename and press Enter.
Step 5 Use a Vim macro to record the changes you wish to make. So for your example, if you started on the very first line of the file you would use these commands: q a (this starts the macro named "a") 0 , f.
Step 7 Type :wq to save and exit, and then :wq again to confirm the changes.
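For a non-interactive alternative to the bulk rename steps above, a mass rename could be scripted along these lines. This is only a sketch under stated assumptions: the names ascii_name and rename_without_diacritics are hypothetical, it processes a single directory without recursing, and it skips any file whose new name already exists rather than overwriting it.

```python
import unicodedata
from pathlib import Path


def ascii_name(name):
    # Decompose, then drop combining marks; other characters are kept as-is.
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def rename_without_diacritics(directory, dry_run=True):
    """Return (old, new) name pairs; rename on disk only if dry_run is False."""
    planned = []
    for path in sorted(Path(directory).iterdir()):
        new_name = ascii_name(path.name)
        if new_name == path.name:
            continue  # nothing to change
        target = path.with_name(new_name)
        if target.exists():
            continue  # avoid clobbering an existing file
        planned.append((path.name, new_name))
        if not dry_run:
            path.rename(target)
    return planned
```

Calling it with the default dry_run=True first shows what would change, much like reviewing the file list before confirming the rename.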
