UTF-8 encoding for accented characters in FGU?



Oso Buho
December 3rd, 2021, 12:55
Hi everyone,

While working on a ruleset, I found that I can get accented characters from the Spanish alphabet (á, é, í, ó, ú, ñ, etc.) to display by changing the XML encoding from iso-8859-1 to UTF-8, but Lua doesn't "recognize" them.

An example is trying to parse them with a .find.

I'm trying to detect the score in an ability that has the format: "ABILITY_NAME SCORE_VALUE".

So I do a string.find, and the match cuts off as soon as it encounters an accented character.

e.g.: Prestidigitación 19 cuts at Prestidigitaci and stops there.
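The truncation described above is what a byte-oriented matcher produces on UTF-8 text. The following is only a rough sketch in Python, not the original Lua code. The pattern in the post is not shown, so the ASCII-only letter class here is an assumption about what it resembled. The second match shows one possible workaround: anchor on the trailing digits so the bytes of the name never need to be classified as letters.

```python
import re

text = "Prestidigitación 19"
raw = text.encode("utf-8")  # the bytes a byte-oriented matcher would see

# Assumed stand-in for the failing pattern: an ASCII-only letter class applied
# byte by byte. The two bytes of "ó" (0xC3 0xB3) are not ASCII letters, so the
# match ends right before them.
name_only = re.match(rb"[A-Za-z]+", raw).group(0)
print(name_only)

# Possible workaround: take everything up to the final whitespace + digits,
# without asking whether the name bytes are "letters".
m = re.match(rb"^(.*?)\s+(\d+)$", raw)
name = m.group(1).decode("utf-8")
score = int(m.group(2))
print(name, score)
```

The same idea should carry over to a Lua pattern that captures "anything, then whitespace, then digits" instead of relying on a letter class, though that has not been verified inside FGU.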

After a couple of days of research, it seems the issue may be that the locale Fantasy Grounds itself runs under is set to iso-8859-1 (Latin-1) or some other encoding that is not UTF-8.

Is there a chance that, in a future patch, the encoding of the application itself could be set to UTF-8, allowing for most of the characters out there?

lozanoje
December 3rd, 2021, 14:59
We are in 2021, seriously?

Does SmiteWorks have anything to say about supporting locales other than en?

GM_Morgoth
December 3rd, 2021, 16:02
SmiteWorks doesn't care about languages other than English LOL

Oso Buho
December 3rd, 2021, 16:09
Ok ok, just in case.

I would prefer if this doesn't become a Witch Hunt!

I'm honestly asking whether that's a possibility. I do understand that it may be more complex than just changing the os.locale. Hopefully one of the devs can shed some more light on the topic.

apetina
December 3rd, 2021, 17:39
+1 to this suggestion. This is something really needed for non-English languages

elfurna
December 3rd, 2021, 18:03
I think there are a lot of non-English speaking gamers and gamemasters who would love to have material in their native language or even have the possibility that their extensions, themes or creations could be made in that language.

LordEntrails
December 3rd, 2021, 18:16
Be aware that prior to FGC's retirement (which was only a few months ago), support for a more robust character set was not feasible. Now that FGU no longer needs to stay compatible with FGC, changes that make this technically feasible are probably possible. But there are still ~1700 DLC products using the old iso-8859-1 encoding that must be supported. Also, for things like non-English language versions of WotC products (and probably all the other publishers), I'm fairly certain that SmiteWorks is not licensed to convert and distribute them.

Now, let's not hijack Oso Buho's thread, which is a technical discussion regarding Lua character sets. Instead, I would suggest that another thread be started to discuss non-English language support, that a request be placed on the Wish List, and that you all go vote for it there. Showing interest via the Wish List will help SmiteWorks understand the desire from the current user base and give them useful information to prioritize the request against other desirable features and improvements.

DCrumb
December 4th, 2021, 03:22
I'm not sure how to do it, but iso-8859-1 should be able to display accented characters (looking at the wiki page, they are in the Cx to Fx hexadecimal range). Spanish is fully covered, and German is mostly covered (the capital eszett doesn't have a character, but the lower case does). If you are using Windows, you can use the Character Map to paste them, and also see the Alt + number pad combination that will print them. For the given example, Alt-0243 gives the accented o character, ó.
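The code-point claims above can be checked with a short script. This is just an illustrative Python sketch. The helper name fits_latin1 is made up for the example, and the character list is a sample of accented Spanish letters rather than a complete alphabet.

```python
# Sample of accented Spanish letters (lower and upper case).
spanish = "áéíóúñüÁÉÍÓÚÑÜ"

# Each character is a single byte in iso-8859-1; record its byte value.
codes = {ch: ch.encode("iso-8859-1")[0] for ch in spanish}
all_in_range = all(0xC0 <= b <= 0xFF for b in codes.values())
print(all_in_range)
print(codes["ó"], hex(codes["ó"]))

# The same character in UTF-8 takes two bytes.
print("ó".encode("utf-8"))


def fits_latin1(ch: str) -> bool:
    """Hypothetical helper: True if ch can be encoded in iso-8859-1."""
    try:
        ch.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


print(fits_latin1("ß"), fits_latin1("ẞ"))  # lower-case vs capital eszett
```

The byte value 243 for ó matches the Alt-0243 combination mentioned in the post.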

Oso Buho
December 4th, 2021, 13:37
It does display them. The issue comes when they are processed through Lua code and displayed again: the output shows the ASCII code instead of the character.

psicodelix
December 4th, 2021, 15:31
+1 to the suggestion. I've struggled many times trying to parse strings in Lua with accented characters, with poor results.

In my tests it looks as if normal characters occupy only one position in the string, but accented characters occupy two, which makes substring operations practically impossible.
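That observation is consistent with the string being UTF-8 and the string functions counting bytes rather than characters. Here is a small Python sketch of the effect, not FGU or Lua code. The helper byte_offset is hypothetical, written for this example to show one way to translate a character position into a byte position before slicing.

```python
word = "Prestidigitación"
utf8 = word.encode("utf-8")
latin1 = word.encode("iso-8859-1")

char_count = len(word)    # characters as a human counts them
utf8_len = len(utf8)      # bytes in UTF-8: "ó" takes two
latin1_len = len(latin1)  # bytes in iso-8859-1: "ó" takes one
print(char_count, utf8_len, latin1_len)

# Cutting the UTF-8 bytes in the middle of "ó" leaves a dangling lead byte.
broken = utf8[:15]
print(broken)


def byte_offset(s: str, char_index: int) -> int:
    """Hypothetical helper: UTF-8 byte offset of the given character index."""
    return len(s[:char_index].encode("utf-8"))


# Slice by character position, converted to a byte position first.
safe = utf8[:byte_offset(word, 15)].decode("utf-8")
print(safe)
```

Converting positions this way keeps every cut on a character boundary, at the cost of an extra pass over the string.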

I don't know if there is an easy solution from the FGU side, but if there is, it would save those of us developing in other languages a lot of trouble.