
Obfuscation in a data parser (to avoid copyright infringement): A Question



Minty23185Fresh
May 9th, 2018, 06:00
This question is directed to the powers that be at Smiteworks.

I have been working on a parser that will break up the WotC Unearthed Arcana Mystic Class published PDF into the fields of a Library Class definition object.

I am all but done.
However, I use string delimiters to search for and break up the PDF data into the fields for the class object.
For example: the string constant, "Unearthed Arcana: The Mystic Class" is used to verify beginning-of-file.
And "level (3d6), and 17th level (4d6)." is used to ensure end-of-file.

Publishing complete verbatim strings from the PDF in my parser could, rightfully, invite accusations of copyright infringement by WotC.
To avoid opening this can of worms, and to protect both Smiteworks and myself, I will be obfuscating all Mystic Class string constants.

Instead of publishing an extension with the two strings mentioned above, I'll publish these two obfuscated string constants:
bof = "Xqhduwkhg#Dufdqd=#Wkh#P|vwlf#Fodvv"
eof = "ohyho#+6g9,/#dqg#4:wk#ohyho#+7g9,1"

At runtime, the extension will "unobfuscate" this gobbledygook to properly parse the PDF file.
My extension will not be published with any strings remotely resembling the text within the published Mystic Class PDF file.
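
For readers following along, here is a minimal sketch of what such a runtime "unobfuscate" step might look like. It assumes the obfuscation is a fixed shift of each character code by 3, which is consistent with the two constants quoted above but is not confirmed anywhere in the thread. The function names and the whitespace stripping are hypothetical, and Python is used only for illustration (the extension's actual language and algorithm are not shown).

```python
# Assumption: every character code was shifted up by 3 when obfuscating.
# This matches the two constants in the post; the real algorithm is not shown.
SHIFT = 3

def obfuscate(text: str, shift: int = SHIFT) -> str:
    """Shift each character code up by `shift`."""
    return "".join(chr(ord(ch) + shift) for ch in text)

def unobfuscate(text: str, shift: int = SHIFT) -> str:
    """Reverse obfuscate() by shifting each character code back down."""
    return "".join(chr(ord(ch) - shift) for ch in text)

# The two obfuscated constants exactly as quoted in the post.
bof = "Xqhduwkhg#Dufdqd=#Wkh#P|vwlf#Fodvv"
eof = "ohyho#+6g9,/#dqg#4:wk#ohyho#+7g9,1"

def looks_complete(pasted: str) -> bool:
    """Check that the pasted text begins and ends with the expected markers.

    Stripping surrounding whitespace first is an assumption about how the
    copied text arrives, not something stated in the thread.
    """
    body = pasted.strip()
    return body.startswith(unobfuscate(bof)) and body.endswith(unobfuscate(eof))
```

A shift like this is trivially reversible by anyone who reads the code, so it only hides the strings from a casual glance at the source.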

And now the question: Is this methodology acceptable to Smiteworks?

Moon Wizard
May 9th, 2018, 06:05
You'll have to reach out to Doug at [email protected] to figure out what is acceptable.

Regards,
JPG

damned
May 9th, 2018, 07:40
You can write a parser that takes copyrighted material and converts it to another format.
The parser itself does not breach copyright.

There is a DDI 4E parser, a Basic Parser etc already available here on this site.

You cannot include the source text that will be parsed though...

Moon Wizard
May 9th, 2018, 17:57
I mentioned talking to Doug, so that you can explain what you are doing to him.

However, if the source text is included in any way, we won't be able to allow posting on the forum. (i.e. you cannot include source text, even if the source text is obfuscated.) It's not about security; it's about copyright and content ownership.

Regards,
JPG

Minty23185Fresh
May 9th, 2018, 18:51
Moon Wizard wrote: "However, if the source text is included in any way, we won't be able to allow posting on the forum. (i.e. you cannot include source text, even if the source text is obfuscated.)"

Okay. I went down the wrong methodology path then.
I should have been using numbers (e.g. "put the text between character 72 and 143 into field ...").
About a month of time wasted, more or less.

All users who want the Mystic Class are on their own, I'm done with this.

(BTW, no hard feelings toward Smiteworks and Fantasy Grounds. You gotta protect yourselves from da man. And that's why I asked before posting.)

Nickademus
May 10th, 2018, 04:32
And now you understand...

Trenloe
May 10th, 2018, 19:19
I suppose I'm unclear on what your original parser does. If you supply people with the parser, but they have to then supply their own PDF, could that be a way of working it?

Valeros's Basic Rules parser works against the Basic Rules PDFs downloadable from WotC. So, although the end result is copyright-protected material, people are using the base PDF to provide data for personal use only, and the parser does not provide that PDF as part of the parser files.

As you have discovered, you can't distribute any data (in whatever form) that is self-contained and results in people getting copyright-protected material (without them supplying it themselves as the input, for personal use).

Minty23185Fresh
May 10th, 2018, 20:56
Sorry. One thinks that things are clear and concise because one (me) is so close to it... So please let me elaborate.

1) I would publish an extension that users would employ to automatically construct the Mystic Class. It would automate the process that Zacchaeus describes in his YouTube video.
2) A user of my extension would have to obtain (download) the Unearthed Arcana Mystic Class PDF from the WotC website/store.
3) The user would then open the PDF, select the entire contents, and copy (Ctrl-C) the selection into the copy/paste buffer.
4) The user then creates a new campaign, with my extension loaded. He/she would create a new blank class and paste (Ctrl-V) the paste buffer into it.
5) The extension would break the pasted text into headers, bodies, tables, abilities, features and spells, populating the blank class with the pasted text (from the PDF).
6) A table is created in the Fantasy Grounds DB, as well as a Class, and about 220 Spells. The user could, if they wish, export this data to a module for personal use.

And now the caveat!
The extension needs to identify and segment the pasted data from the PDF into headers, bodies, tables, etc.
To do this I used string segments from the PDF. Things like "The Mystic Class ...", "The Celerity Discipline...", "The Mystic Quirks table...". I search the pasted data for these segments to delimit and identify headers, bodies, ...
To prevent someone, anyone, from opening the extension source code and immediately identifying segments of text that exactly match text in the PDF, I obfuscated the segments (examples above). If one were to dissect my source code, one could determine the algorithm I use to obfuscate the segments. But this way I am not distributing strings right out of the PDF. And technically I'm not redistributing anything. Without the download from WotC the extension is completely worthless, unless one "decompiles" my code to obtain partial phrases like "A mystic has psionic powers."
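
The delimiter matching described above could be sketched roughly as follows. This is only an illustration under assumptions: the function name, the field names, and the placeholder markers ("HEADER ONE", "HEADER TWO") are all hypothetical and do not come from the PDF or from the actual extension.

```python
def split_by_markers(text, markers):
    """Split `text` into fields using known marker strings.

    `markers` is a list of (field_name, marker_string) pairs. Each field
    receives the text that follows its marker, up to the next marker that
    was found (or the end of the text). Markers not present are skipped.
    """
    found = []
    for field, marker in markers:
        pos = text.find(marker)
        if pos != -1:
            found.append((pos, field, marker))
    found.sort()  # order the hits by their position in the text

    sections = {}
    for i, (pos, field, marker) in enumerate(found):
        start = pos + len(marker)
        end = found[i + 1][0] if i + 1 < len(found) else len(text)
        sections[field] = text[start:end].strip()
    return sections


# Placeholder text and markers, purely for illustration.
sample = "HEADER ONE alpha text HEADER TWO beta text"
fields = split_by_markers(sample, [("first", "HEADER ONE"), ("second", "HEADER TWO")])
```

With this approach the marker strings have to appear somewhere in the code in recoverable form, which is the sticking point raised earlier in the thread.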

In retrospect I should have used numbers: "the text in the pasted data between positions 765 and 789 is a header for the Mystic Class description" instead of "the text between 'A Mystic is a...' and '... they don't live in hobbit holes.' is a text body describing mystics."
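
A minimal sketch of that numbers-based idea, assuming fixed character offsets into the pasted text. The offsets, field names, and sample text below are made up for illustration; real values would have to be measured against the actual pasted data.

```python
# Hypothetical field ranges as (start, end) character offsets, end exclusive.
FIELD_RANGES = {
    "header": (0, 5),
    "body": (6, 16),
}

def slice_fields(text, ranges, expected_length=None):
    """Cut `text` into named fields using fixed character offsets.

    If `expected_length` is given, refuse to slice text of a different
    length, since every offset would then point at the wrong place.
    """
    if expected_length is not None and len(text) != expected_length:
        raise ValueError(
            f"expected {expected_length} characters, got {len(text)}"
        )
    return {name: text[start:end] for name, (start, end) in ranges.items()}


# Placeholder text, purely for illustration.
sample = "Title Some body."
fields = slice_fields(sample, FIELD_RANGES, expected_length=16)
```

The code then carries no text from the source at all, only positions. The trade-off is fragility: if the pasted text differs by even one character (for example, a different PDF viewer copying whitespace differently), every later offset is wrong, which is why the sketch includes a length check.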

Trenloe
May 10th, 2018, 21:04
Minty23185Fresh wrote: "Sorry. One thinks that things are clear and concise because one (me) is so close to it... So please let me elaborate.
..."
Thanks for that. It now makes total sense. :)

Have you emailed Doug to see what his thoughts are? Or are you over it?

Minty23185Fresh
May 10th, 2018, 21:06
Right now I am considering the numbers approach. But I am also sick and tired of this project and want to move on to something else, especially since I have nothing to gain from it. I have Nickademus's original module, and I did the work the first time, creating the Mystic Class Implementor extension, which I voluntarily pulled from the forums. So I got mine. I don't need this. Nor the headache it is causing me. (Rant done, thanks for listening)

Minty23185Fresh
May 10th, 2018, 21:08
Trenloe wrote: "Thanks for that. It now makes total sense. :)

Have you emailed Doug to see what his thoughts are? Or are you over it?"

I'll email Doug, but I think Moon Wizard already took care of that approach for me.

(And the above rant was prior to me seeing your post.... :o )

Trenloe
May 10th, 2018, 21:13
Minty23185Fresh wrote: "I don't need this. Nor the headache it is causing me. (Rant done, thanks for listening)"
Totally understand, I really do. It's great that you wanted to do something nice for the community. But don't push it so that it's too much of a hassle! Move on, play some games... :)