File renaming

Post by LasseThid »

Is it possible to limit the allowed characters in a filename to just a-z, 0-9 and _?
Enfocus Switch, Enfocus PitStop Server, Enfocus PDF Review, HP SmartStream& Kodak Prinergy with RBA
Offset 72x102, Offset Large Format, Digital Large Format and Digital print.

Post by Terkelsen »

Hmm, this RegEx

[^[a-zA-Z0-9_]]*

is supposed to find everything but a-z, A-Z, 0-9 and _

However, I've tried to use it with the Rename element for Search & Replace, but for some reason it doesn't seem to work :-/

Post by Padawan »

I did some tests and got it to work by using the following as search regular expression:

[^a-zA-Z0-9_]*

and setting Repeat to Consecutive

Can you try this?

Terkelsen, your regex worked for me in online regex testers, which made me believe the problem was specific to the engine used in Switch. I then checked the regular expressions part of the Switch documentation, and that's how I found what was wrong :)

http://www.enfocus.com/manuals/UserGuid ... ions.html
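
For anyone who wants to try the expression outside Switch, here is a minimal Python sketch of the same idea. It is not how Switch implements the Rename tool: clean_filename and its replacement parameter are hypothetical names, keeping the extension with os.path.splitext is an assumption, and Python's re engine may differ from the one in Switch. The sketch uses + instead of * because in Python * also matches empty strings, so a non-empty replacement would be inserted between every character.

```python
import os
import re

# Any run of characters outside a-z, A-Z, 0-9 and _ (the set from the post above).
DISALLOWED = re.compile(r"[^a-zA-Z0-9_]+")


def clean_filename(filename: str, replacement: str = "") -> str:
    """Replace each run of disallowed characters in the name part.

    The extension is kept untouched (an assumption for this sketch).
    """
    stem, ext = os.path.splitext(filename)
    return DISALLOWED.sub(replacement, stem) + ext


# \u00c5 is the letter A with a ring above, written as an escape for clarity.
removed = clean_filename("#\u00c5rsrapport 2018-final&.pdf")
underscored = clean_filename("#\u00c5rsrapport 2018-final&.pdf", "_")
print(removed)
print(underscored)
```

re.sub replaces every non-overlapping match, which appears to be the effect of setting Repeat to Consecutive in the test above; that mapping is a guess, not something the Switch documentation quoted here confirms.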

Post by jan_suhr »

In the "Rename Job" element there is an option called "Reduce character set". There you can set "Allowed character set" to "Portable ASCII", and that will give you what you want.

It will replace characters that are not allowed, such as å, ä and ö, with an underscore or a space, or remove them.
Jan Suhr
Color Consult AB
Sweden

Post by Terkelsen »

Padawan wrote: Thu Oct 25, 2018 5:36 pm
Terkelsen, your regex worked for me in online regex testers, which made me believe the problem was specific to the engine used in Switch. I then checked the regular expressions part of the Switch documentation, and that's how I found what was wrong :)

http://www.enfocus.com/manuals/UserGuid ... ions.html
That's good to know, Padawan. I'm not an expert in RegEx, but I guess it has to do with the message in Switch saying that the RegEx is automatically anchored at both ends?

Post by Padawan »

To be honest, I'm not really sure what the sentence about anchoring in Switch means.

I moved the ^ character in your regex based on this piece of info in the Switch documentation on regular expressions:
The caret negates the character set if it occurs as the first character, that is immediately after the opening square bracket. For example, [abc] matches 'a' or 'b' or 'c', but [^abc] matches anything except 'a' or 'b' or 'c'.
I suggested setting Repeat to Consecutive because my test file had bad characters at both the front and the back of the filename, and that was the only way I could clean it up correctly.
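
The caret rule quoted from the documentation, and the reason one replacement pass is not enough, can be checked with a small Python sketch. This uses Python's re module as a stand-in, so it only illustrates the general rule. Treating count=1 as "replace once" and the default as something like the Consecutive mode is an assumption about how the Switch settings map.

```python
import re

sample = "a^bcd"

# Caret directly after the opening bracket: the set is negated.
negated = re.findall(r"[^abc]", sample)
# Caret anywhere else inside the brackets: it is a literal '^'.
literal = re.findall(r"[a^bc]", sample)
print(negated, literal)

messy = "##name##"
# Replacing only the first run leaves the bad characters at the end in place.
once = re.sub(r"[^a-zA-Z0-9_]+", "", messy, count=1)
# Replacing every run cleans both the front and the back.
every = re.sub(r"[^a-zA-Z0-9_]+", "", messy)
print(once, every)
```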

Post by Terkelsen »

jan_suhr wrote: Thu Oct 25, 2018 6:44 pm In the "Rename Job" element there is an option called "Reduce character set". There you can set "Allowed character set" to "Portable ASCII", and that will give you what you want.

It will replace characters that are not allowed, such as å, ä and ö, with an underscore or a space, or remove them.
Jan Suhr, this will definitely remove or replace characters like æ, ø, å, ö and ä, but it will still allow special characters like &, % and #.
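
The difference between the two approaches can be sketched in Python. This is only an illustration: reduce_to_ascii and allow_list are hypothetical helpers, and treating "Portable ASCII" as "any 7-bit ASCII character" is an assumption, since the exact character set Switch uses is not stated in this thread.

```python
import re


def reduce_to_ascii(name: str, replacement: str = "_") -> str:
    """Replace non-ASCII characters only; ASCII punctuation passes through."""
    return "".join(ch if ch.isascii() else replacement for ch in name)


def allow_list(name: str, replacement: str = "_") -> str:
    """Replace every character that is not a-z, A-Z, 0-9 or _."""
    return re.sub(r"[^a-zA-Z0-9_]", replacement, name)


# \u00e5 and \u00e6 are the letters a-ring and ae, written as escapes.
name = "Bl\u00e5b\u00e6r&50%#1"
ascii_only = reduce_to_ascii(name)
strict = allow_list(name)
print(ascii_only)
print(strict)
```

Under these assumptions the ASCII reduction leaves &, % and # in place, while the allow-list replaces them as well.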

Post by bens »

Padawan wrote: Fri Oct 26, 2018 1:08 pm To be honest, I'm not really sure what the sentence about anchoring in Switch means.
I would guess it means that an expression "x" is equivalent to "^x$". That is, Switch automatically adds the start of string (^) and end of string ($) anchors.
Terkelsen wrote: Thu Oct 25, 2018 2:47 pm Hmm, this RegEx

[^[a-zA-Z0-9_]]*

is supposed to find everything but a-z, A-Z, 0-9 and _

However, I've tried to use it with the Rename element for Search & Replace, but for some reason it doesn't seem to work :-/
The square brackets denote a character set. What happens when you have another square bracket inside that set depends on the engine that's used. Most often, it's seen as a literal character. So for example "[[]]" means "the character set with characters '[' and ']'". There are a few exceptions, like the sequence "[:", which starts a special ("POSIX") character class.
Additionally, the caret (^) has the special meaning "not" only if it's the first character after an opening square bracket that starts a character set.

Putting everything together, "[^[a-zA-Z0-9_]]*" would be translated to "^[^[a-zA-Z0-9_]]*$", meaning: "start of the string, followed by 0 or more characters that are not [, a-z, A-Z, 0-9, _, or ], followed by the end of the string."

But, again, some of this depends on the regex engine being used, so you may get different results in different products. Aren't regular expressions fun?
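
As one concrete data point on the engine dependence, here is how Python's re module treats these patterns. This is a sketch about Python only and says nothing definite about the engine inside Switch. re.fullmatch is used to imitate an expression that is anchored at both ends. In Python the first ']' closes the set, so the nested pattern reads as "one character that is not '[', a letter, a digit or '_', followed by zero or more literal ']'", which differs slightly from the reading given above.

```python
import re
import warnings

# Anchoring: search finds a match anywhere, fullmatch behaves like ^...$.
found_anywhere = re.search(r"[a-z]+", "abc-1") is not None
whole_string = re.fullmatch(r"[a-z]+", "abc-1") is not None

with warnings.catch_warnings():
    # Python 3.7+ emits a FutureWarning ("possible nested set") for the
    # '[' inside the set; it is silenced here to keep the output clean.
    warnings.simplefilter("ignore", FutureWarning)
    nested = re.compile(r"[^[a-zA-Z0-9_]]*")

fixed = re.compile(r"[^a-zA-Z0-9_]*")

results = {
    "nested '#'": nested.fullmatch("#") is not None,
    "nested '#]]'": nested.fullmatch("#]]") is not None,
    "nested '##'": nested.fullmatch("##") is not None,
    "fixed '##'": fixed.fullmatch("##") is not None,
    "fixed 'a#'": fixed.fullmatch("a#") is not None,
}
for label, matched in results.items():
    print(label, matched)
```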

Post by LasseThid »

Terkelsen wrote: Fri Oct 26, 2018 1:26 pm
jan_suhr wrote: Thu Oct 25, 2018 6:44 pm In the "Rename Job" element there is an option called "Reduce character set". There you can set "Allowed character set" to "Portable ASCII", and that will give you what you want.

It will replace characters that are not allowed, such as å, ä and ö, with an underscore or a space, or remove them.
Jan Suhr, this will definitely remove or replace characters like æ, ø, å, ö and ä, but it will still allow special characters like &, % and #.
Correct, that's why I need to limit the characters allowed in the file names even more, especially since I sometimes use "-" as a delimiter and quite a few customers use that character in their file names as well, which causes problems. Hence I'd like to limit the characters to basically letters and numbers.
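
The delimiter problem can be illustrated with a short Python sketch. Everything in it is hypothetical: build_job_name, the order number and the file name are made up for the example and are not taken from any real Switch flow.

```python
import re


def build_job_name(order_id: str, customer_name: str, delimiter: str = "-") -> str:
    """Join an order id and a sanitised customer name with a delimiter.

    Runs of characters outside a-z, A-Z, 0-9 and _ become a single '_',
    so the delimiter can no longer appear inside the customer part.
    """
    safe_name = re.sub(r"[^a-zA-Z0-9_]+", "_", customer_name)
    return f"{order_id}{delimiter}{safe_name}"


customer_name = "brochure-v2-final"

# Without cleaning, splitting on the delimiter gives too many fields.
unsafe_fields = ("10045" + "-" + customer_name).split("-")

# With cleaning, the split gives exactly the two intended fields.
safe_fields = build_job_name("10045", customer_name).split("-")
print(unsafe_fields)
print(safe_fields)
```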