Thread: [5E] sanitize(s) method question
-
January 6th, 2021, 06:18 #1
I copied the sanitize method from the 5E ruleset and added it to my StringUtils package.
While testing cases, I tried passing in "Magic-User", expecting it to be converted to my classID of "magic_user". Instead, the returned string was "magic-user".
Code:
-- Used to convert non-xml names to valid xml names.
-- Replaces invalid characters with "_". In addition, converts string to lower-case.
-- @args s String to be scanned for characters to replace.
-- @returns A string with the indicated characters replaced by "_"
function sanitize(s)
    local sSanitized = StringManager.trim(s:gsub("%s%(.*%)$", ""));
    -- @TODO Review (posted on forums): I added the "-" at start of the character set since "-" was not being replaced by "_"
    sSanitized = sSanitized:gsub("[-.,-():'’/?+–]", "_"):gsub("%s", ""):lower();
    return sSanitized
end
Is it because one of those characters is being misinterpreted as a special character/escape when inside the "[]" construct?
It is "working" now, BUT I am sure what I did is not the solution, AND for all I know I broke something for every OTHER case by doing what I did.
More REGEX-y peeps, please chime in (remember from an earlier post that I am Regex-challenged).
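A note on what is likely happening: inside "[]", Lua reads `x-y` as a range, so the `,-(` in the set is taken as "every character from `,` to `(`". The sketch below runs in plain Lua outside Fantasy Grounds; the variable names are illustrative, and the two non-ASCII characters are left out of the sets to keep the example to the ASCII behaviour.

```lua
-- Character sets from the post, ASCII part only.
local original   = "[.,-():'/?+]"   -- as copied from the ruleset
local workaround = "[-.,-():'/?+]"  -- with "-" added at the start

-- In the original, ",-(" is parsed as a range from "," (byte 44) to "(" (byte 40).
-- That range is empty, so ",", "-" and "(" are never matched.
print(("Magic-User"):gsub(original, "_"))    --> Magic-User   0
print(("Magic-User"):gsub(workaround, "_"))  --> Magic_User   1

-- A leading "-" is literal, so the workaround does catch hyphens, but the
-- empty ",-(" range is still there: "," and "(" pass through unchanged.
print(("a,b(c)"):gsub(workaround, "_"))      --> a,b(c_   1
```

So the workaround fixes the hyphen case, but commas and opening parentheses would still slip through.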
-
January 6th, 2021, 14:00 #2
Probably something to do with magic characters
https://www.lua.org/pil/20.2.html
-
January 6th, 2021, 23:18 #3
Yup Damned - you are right
I put a "%" in front of the first hyphen and it worked:
sSanitized = sSanitized:gsub("[.,%-():'’/?+–]", "_"):gsub("%s", ""):lower();
The list of "magic characters" is:
( ) . % + - * ? [ ^ $
So I set it to:
sSanitized = sSanitized:gsub("[%.,%-%(%):'’/%?%+–]", "_"):gsub("%s", ""):lower();
Maybe Moon or others can test this and replace it in the 5E code. I didn't know what the last "hyphen" in the set was. It DOES look different from the first one, so I didn't "escape" it.
But again - not an expert, just noticed it cos I was testing my copy of the 5E method before adding it to my utils package.
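For anyone who wants to try the escaped set outside Fantasy Grounds, here is a minimal sketch. It assumes plain Lua, so `trim` is a stand-in written for this example (StringManager.trim only exists inside FG), and the two non-ASCII characters are omitted from the set. It is not the ruleset's actual code.

```lua
-- Stand-in for StringManager.trim: strip leading and trailing whitespace.
local function trim(s)
  return (s:gsub("^%s+", ""):gsub("%s+$", ""))
end

-- Hypothetical standalone copy of sanitize() with every magic character
-- in the set escaped.
local function sanitize(s)
  -- Drop a trailing " (...)" suffix. The extra parentheses keep only the
  -- string and discard the match count that gsub also returns.
  local sSanitized = trim((s:gsub("%s%(.*%)$", "")))
  sSanitized = sSanitized:gsub("[%.,%-%(%):'/%?%+]", "_"):gsub("%s", ""):lower()
  return sSanitized
end

print(sanitize("Magic-User"))       --> magic_user
print(sanitize("Fighter, Elf"))     --> fighter_elf
print(sanitize("Thief (Variant)"))  --> thief
```

Escaping every punctuation character with "%" is safe in Lua patterns even when the character is not magic, which makes the set easier to extend later.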
-
January 6th, 2021, 23:25 #4
If you convert the characters to their character codes you will see that the last one is definitely a different character.
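A quick way to check this in plain Lua is to print the byte values. This sketch assumes the last character in the set is an en dash (U+2013) and that strings are UTF-8 encoded; the en dash is written with decimal escapes so the result does not depend on how the file is saved.

```lua
local hyphen = "-"
local enDash = "\226\128\147"  -- U+2013 as UTF-8 bytes (assumption: UTF-8)

-- string.byte with the range 1, -1 returns every byte of the string.
print(string.byte(hyphen, 1, -1))  --> 45
print(string.byte(enDash, 1, -1))  --> 226 128 147
print(#hyphen, #enDash)            --> 1   3
```

The plain hyphen is a single byte, 45, while the other character is a three-byte sequence, so Lua never sees it as the magic "-".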
-
January 7th, 2021, 03:16 #5
Yeah, I can tell visually - just wondering:
1) What is it called, so I can refer to it correctly in future, for example "dash" or "hyphen" vs. "minus"?
2) How do I enter that one on Mac or PC? I assume via each platform's "special chars" mechanism, but I think the answer to #1 will help once I know it.
BUT... you pointed out one very helpful thing too: since it is NOT a "-", it is not one of the "magic characters" that I need to "escape", so it was correct for me not to prefix it with "%".
-
January 16th, 2021, 00:12 #6
1) There are three common "dash" types: minus (the plain keyboard "-", formally called hyphen-minus), en dash, and em dash.
2) I usually copy the character from a web page about encodings in order to make sure I get the right "dash" I really want when coding.
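As an alternative to pasting the character, the dashes can be spelled out as byte escapes. The sketch below assumes UTF-8 strings and plain Lua; `dashesToUnderscore` and the constant names are hypothetical, made up for this example. It also replaces the multi-byte dashes as whole sequences, because Lua patterns work byte by byte and a "[]" set would match each byte of the dash separately.

```lua
local HYPHEN_MINUS = "-"              -- U+002D
local EN_DASH      = "\226\128\147"   -- U+2013 as UTF-8 bytes
local EM_DASH      = "\226\128\148"   -- U+2014 as UTF-8 bytes

-- Hypothetical helper: map all three dash types to "_".
local function dashesToUnderscore(s)
  -- Bytes above 127 are not magic, so the sequences can be used as patterns.
  s = s:gsub(EN_DASH, "_"):gsub(EM_DASH, "_")
  -- The plain hyphen-minus is magic and needs the "%" escape.
  return (s:gsub("%-", "_"))
end

print(dashesToUnderscore("a-b" .. EN_DASH .. "c" .. EM_DASH .. "d"))  --> a_b_c_d

-- For comparison: inside a set, each of the three bytes is replaced on its own.
print(("a" .. EN_DASH .. "b"):gsub("[" .. EN_DASH .. "]", "_"))       --> a___b   3
```

If the ruleset's files use a single-byte encoding instead of UTF-8, the byte values differ and the set-based approach behaves differently, so this is worth checking against the actual files.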
I'll also fix up the escaping for that function in the next revision of the 5E ruleset.
Regards,
JPG
FG Wish List - http://fgapp.idea.informer.com/