updated spell library?



Ken L
September 29th, 2015, 08:45
Is there an updated version of the spell library that includes all the spells added since Occult Adventures?

Ken L
September 29th, 2015, 23:38
Has anyone thought to parse the spell database at https://www.pathfindercommunity.net/home/databases/spells and split it out into XML?

Trenloe
September 29th, 2015, 23:56
Yeah, I use that to create the All Paizo Spells Module. The main issue is that the description_formated field (used to make nice, formatted spell descriptions) has a number of XML errors which break the FG module. These have to be fixed by hand each time before the module can be created, so it's not just a case of running the database as-is through a parser; there's quite a bit of pre-formatting work to do. And because new spells aren't always added at the end of the database, it's pretty much a new job each time it's done.
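
To give an idea of the kind of check involved, here is a minimal Python sketch that flags rows whose formatted description is not well-formed XML. It assumes the rows are already loaded as dictionaries; the `name` key and the sample rows are made up for illustration, and only `description_formated` comes from the database itself. Undefined HTML entities such as `&nbsp;` would also be reported as errors by this check.

```python
import xml.etree.ElementTree as ET

def find_broken_fragments(rows, field="description_formated"):
    """Return (name, error) pairs for rows whose markup is not well-formed XML."""
    broken = []
    for row in rows:
        fragment = row.get(field, "")
        try:
            # Wrap in a dummy root so fragments with several top-level
            # elements can still be parsed.
            ET.fromstring("<root>" + fragment + "</root>")
        except ET.ParseError as exc:
            broken.append((row.get("name", "?"), str(exc)))
    return broken

# Hypothetical sample rows, not taken from the real database.
rows = [
    {"name": "ok spell", "description_formated": "<p>Fine.</p>"},
    {"name": "bad spell", "description_formated": "<p>Stray cell</td></p>"},
]

for name, error in find_broken_fragments(rows):
    print(name, "->", error)
```

This only finds the broken entries; fixing them is still the manual part.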

I have on my list of many FG tasks to do an updated All Paizo Spells module at some point... No ETA at present.

Trenloe
September 29th, 2015, 23:59
Also, as Occult Adventures hasn't been released in PRD form yet, spells from that book aren't in this database.

Ken L
September 30th, 2015, 00:03
Sounds like all you need to do is a first pass to strip out bad actors, then parse the CSV/TSV into a data structure that can be output in the desired format. Unless there's something I'm not accounting for. The TSV/CSV data looked quite good; I haven't seen any bad actors aside from the odd HTML entities, which I can strip out easily. I'm not terribly worried about pretty formatting, just plain text that can then be auto-parsed by the built-in FG spell parsers for additional information; so essentially I'd pull out much of the header information (casting time etc.) and leave the description block intact.

I might just whip up some quick Python for this, or hell, even some Bash.
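
A rough sketch of that pipeline in Python, under some loud assumptions: the column names (`name`, `school`, `casting_time`, `description`) and the output element layout are placeholders I made up, not the real database columns or the Fantasy Grounds module schema. It decodes HTML entities, reads the TSV into dictionaries, and writes plain-text fields into an XML tree.

```python
import csv
import html
import io
import xml.etree.ElementTree as ET

def tsv_to_xml(tsv_text, fields=("school", "casting_time", "description")):
    """Convert TSV spell rows into a simple XML string (hypothetical layout)."""
    root = ET.Element("spells")
    # QUOTE_NONE keeps stray double quotes in the data from confusing the reader.
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    for row in reader:
        spell = ET.SubElement(root, "spell", name=html.unescape(row["name"]))
        for field in fields:
            # Decode HTML entities into plain characters; ElementTree
            # re-escapes whatever XML needs when serialising.
            ET.SubElement(spell, field).text = html.unescape(row.get(field) or "")
    return ET.tostring(root, encoding="unicode")

# Made-up sample row for illustration only.
sample = (
    "name\tschool\tcasting_time\tdescription\n"
    "Detect Magic\tdivination\t1 standard action\tYou detect auras &amp; more.\n"
)
print(tsv_to_xml(sample))
```

Because the description goes in as text rather than markup, unclosed tags in the source cannot break the output; the cost is that all formatting is lost.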

Ken L
September 30th, 2015, 00:04
No Occult Adventures is fine by me. The APG, UC, and UM are in it, and those are quite big, whereas Occult Adventures is still fairly niche. It even has a large number of spells from the PF fluff books, such as Magical Marketplace, Inner Sea Gods, and more.

Trenloe
September 30th, 2015, 01:10
Yeah, it's dead easy. Go nuts...

Trenloe
September 30th, 2015, 03:52
Now I remember why updating this module does my head in. The tags in the description_formated field are screwed up on so many entries: lists, tables (this is the big one: just random <th>, <td>, <tbody>, etc. tags that break the XML), breaks, and so on.
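
One blunt way to deal with the stray table tags, sketched in Python: drop every table-related tag and keep only the cell text. This is a simplification I'm assuming for illustration, not what was done for the attached module, and it throws away any table layout that was actually valid.

```python
import re

# Matches opening and closing table-related tags, with or without attributes.
TABLE_TAG = re.compile(
    r"</?(?:table|thead|tbody|tfoot|tr|th|td)\b[^>]*>", re.IGNORECASE
)

def strip_table_tags(fragment):
    """Replace table-related tags with a space and collapse runs of whitespace."""
    without_tags = TABLE_TAG.sub(" ", fragment)
    return re.sub(r"\s+", " ", without_tags).strip()

# Made-up fragment with a stray cell and a stray </tbody>.
print(strip_table_tags("<p>Aura</p><td>Faint</td></tbody>"))
```

Other tags such as <p> or <b> are left alone, so unclosed lists and breaks still need separate handling.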

Attached is an updated spell module. I'm sure there'll be some formatting issues with the description text.

Ken L
September 30th, 2015, 18:11
Haha, I just quickly made one using some time this morning and a bit of my lunch break.

I'll check yours out when I get back, it's probably more complete as I automated much of my population, but it all looks good.

Trenloe
September 30th, 2015, 18:35
Nice work.

I see you used the unformatted description text. It's probably more desirable to use the formatted text; see below for an example comparison. But, as I mentioned above, the main issue with auto-generating from the database is the formatted description: it requires a lot of manual editing because it isn't correct/FG compatible.

For example, for detect magic, the unformatted spell (from your module) looks like:

https://dl.dropboxusercontent.com/u/39085830/Screenshots/Fantasy%20Grounds/PFRPG/Detect%20Magic%20no%20formatting.jpg

Whereas, using the formatted description (and doing manual editing), gives:

https://dl.dropboxusercontent.com/u/39085830/Screenshots/Fantasy%20Grounds/PFRPG/Detect%20Magic%20with%20formatting.jpg

Ken L
September 30th, 2015, 18:47
Yeah, I noticed the huge number of unclosed tags in the formatted text when I ran it through xmllint.

The end goal wasn't to make it pretty as I noted earlier, but to at least have the spells in the system so FG could parse them.

Quick and dirty, but effective if you don't care about aesthetics. Precise logic is all I care for so I whipped it up in about 2 hours + a bit of lunch which was easy. Getting it pretty is something else.

I'll see if I can adapt my natural language parser from another project to fix the unclosed tags. I have some predictive metrics in there, but I need to write an XML rule system for it, which is a bigger job. Once that's done, though, importing future updates would be a piece of cake.
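
For comparison, a much simpler approach than a rule-based parser is a plain tag stack. The Python sketch below is not the natural language parser mentioned above; it is a swapped-in, minimal technique using the standard library `html.parser`. It drops stray closing tags, closes anything left open, and discards attributes for simplicity. The `VOID_TAGS` set and the sample input are my own assumptions.

```python
from html import escape
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img"}  # assumed set of tags that never take a close

class TagBalancer(HTMLParser):
    """Re-emit markup so that every opened tag is closed exactly once."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.stack = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            self.out.append(f"<{tag}/>")
            return
        self.stack.append(tag)
        self.out.append(f"<{tag}>")  # attributes are dropped in this sketch

    def handle_startendtag(self, tag, attrs):
        self.out.append(f"<{tag}/>")

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray closing tag: ignore it
        # Close everything opened since the matching start tag.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append(f"</{open_tag}>")
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

def balance(fragment):
    parser = TagBalancer()
    parser.feed(fragment)
    parser.close()
    while parser.stack:  # close whatever is still open at the end
        parser.out.append(f"</{parser.stack.pop()}>")
    return "".join(parser.out)

print(balance("<p>One<b>two</p></td>three"))
# prints: <p>One<b>two</b></p>three
```

The result is well-formed, but not necessarily what the author of the markup intended: an unclosed <li> followed by another <li> ends up nested rather than as siblings, which is the sort of case a proper rule system would need to handle.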