Regex for parsing out spells?



kuthulu
August 12th, 2016, 19:48
I'm working on a Python application to extract NPC data from a spreadsheet and output it into a txt file that is compatible with Trenloe's Pathfinder Creature Parser V2 (https://www.fantasygrounds.com/forums/showthread.php?20522-Pathfinder-Creature-Parser-V2-Beta-Version). I have all the data set up and formatted, but I'm stuck on the spells. The spells are all listed in one long string, so in order to get it to work with the parser I need to get each spell level on its own line. I wanted to see if anyone had an existing regex to do something like that, or really anything close that I could start working with.

If there is a better area to post this in please let me know. Here is an example:

STAT BLOCK
Arael CR 2
XP 600
half-elf cleric of Iomedae 3
LG Medium humanoid
Init 1; Senses low-light vision; Perception +4
DEFENSE
AC 17, touch 11, flat-footed 16 (+6 armor, +1 Dex)
hp 20 (3d8+3)
Fort +4, Ref +2, Will +5
OFFENSE
Speed 20 ft.
Melee mwk longsword +3 (1d8/19-20)
Ranged light crossbow +3 (1d8/19-20)
Space 5 ft.; Reach 5 ft.
Special Attacks channel positive energy 5/day (2d6, DC 13)
Domain Spell-Like Abilities (CL 3rd) 5/day-battle rage, touch of good
Cleric Spells Prepared (CL 3rd) 2nd-hold person (DC 14), sound burst (already cast), spiritual weaponD 1st-bless (2), protection from evilD, shield of faith (already cast) 0-guidance, light, stabilize, virtue
D Domain Spell; Domains (Good, War)
TACTICS
Before Combat Arael casts guidance and virtue on any unskilled rebels present and protection from evil and shield of faith on anyone he believes is especially vulnerable.
During Combat Arael casts bless if he has allies, uses sound burst in the hopes of stunning multiple opponents, and hold person to disable a dangerous adversary.
Morale Arael surrenders when he reaches 5 hp if he believes his foe will accept a surrender. He is willing to hold off an enemy even at great risk to himself if it gives his allies more time to succeed at a task or escape, but prefers to make a tactical retreat rather than dying needlessly.
STATISTICS
Str 10, Dex 12, Con 13, Int 10, Wis 15, Cha 14
Base Atk +2; CMB +2; CMD +13
Feats Alignment Channel, Pick Alignment, Brew Potion, Skill Focus (Knowledge [local])
Skills Diplomacy +8, Heal +6, Knowledge (history) +4, Knowledge (local) +4, Knowledge (planes) +4, Knowledge (religion) +4, Sense Motive +6
Languages Common, Elven
SQ elf blood, elven immunities
Combat Gear potion of cure light wounds, potion of bull's strength; Other Gear breastplate, masterwork longsword, dagger, light crossbow, 20 bolts, 60 gp


So I need to get the spell levels on individual lines, something like:
Cleric Spells Prepared (CL 3rd)
2nd-hold person (DC 14), sound burst (already cast), spiritual weaponD
1st-bless (2), protection from evilD, shield of faith (already cast)
0-guidance, light, stabilize, virtue

FORMAT:
class Spells Prepared (CL ; concentration +)
2nd--
1st--
0--
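
One way the split could be sketched in Python is shown below. It assumes a level marker is either a bare 0 or a single digit with an ordinal suffix (1st, 2nd, ...), optionally followed by a parenthesised note, and then a dash. The names LEVEL_BREAK_RE and split_spell_levels are made up for this example, and the pattern has only been checked against the Arael line above, so other stat blocks may need adjustments.

```python
import re

# Whitespace that is immediately followed by a spell-level marker such as
# "2nd-", "1st-" or "0-". The lookahead keeps the marker in the output.
# Assumption: levels are "0" or one digit plus st/nd/rd/th, optionally
# followed by a parenthesised note, then a hyphen or an en/em dash.
LEVEL_BREAK_RE = re.compile(
    r"\s+(?=(?:0|[1-9](?:st|nd|rd|th))(?:\s*\([^)]*\))?\s*[-\u2013\u2014])"
)


def split_spell_levels(spell_line):
    """Return the header and each spell level as separate strings."""
    return LEVEL_BREAK_RE.split(spell_line.strip())


if __name__ == "__main__":
    raw = (
        "Cleric Spells Prepared (CL 3rd) 2nd-hold person (DC 14), "
        "sound burst (already cast), spiritual weaponD 1st-bless (2), "
        "protection from evilD, shield of faith (already cast) "
        "0-guidance, light, stabilize, virtue"
    )
    print("\n".join(split_spell_levels(raw)))
```

The "(CL 3rd)" in the header is left alone because the 3rd there is followed by a closing parenthesis rather than a dash, and "(DC 14)" is left alone because 14 carries no ordinal suffix.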

I still have to do this part as well, but I'm working on it now:
Invalid Spell Like Abilities entries
In some older Paizo statblocks certain class abilities are listed as "Domain Spell Like Abilities", "Bloodline Spell Like Abilities" or similar entries. These are not actually spell like abilities, as they are not spells, and so will create issues with parsing of the statblock (usually in the form of an XML error when you try to open the module in FG). Remove these entries and move the relevant data to Special Attacks or SQ.
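
A rough Python sketch of that clean-up step follows. It assumes the stat block is already a list of lines, that only "Domain" and "Bloodline" prefixes need handling, and that appending the text verbatim to the Special Attacks line is acceptable to the parser. That last point is a guess and has not been checked against the parser; move_class_slas and CLASS_SLA_RE are hypothetical names.

```python
import re

# Lines such as "Domain Spell-Like Abilities (CL 3rd) 5/day-battle rage, ...".
# Assumption: only these two prefixes occur; extend the alternation as needed.
CLASS_SLA_RE = re.compile(
    r"^(?:Domain|Bloodline) Spell[- ]Like Abilities\s*(?:\([^)]*\))?\s*(?P<body>.*)$"
)


def move_class_slas(lines):
    """Remove class 'Spell-Like Abilities' lines and append their text to Special Attacks."""
    moved, kept = [], []
    for line in lines:
        match = CLASS_SLA_RE.match(line)
        if match:
            moved.append(match.group("body"))
        else:
            kept.append(line)
    if not moved:
        return kept
    extra = ", ".join(moved)
    for i, line in enumerate(kept):
        if line.startswith("Special Attacks"):
            kept[i] = f"{line}, {extra}"
            break
    else:
        # No Special Attacks line was found, so add one at the end
        # (the position is arbitrary in this sketch).
        kept.append(f"Special Attacks {extra}")
    return kept
```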

Nylanfs
August 12th, 2016, 21:06
Take a look at the PCGen importer.

kuthulu
August 12th, 2016, 23:27
Are you talking about the NPC Character Importer (https://www.fantasygrounds.com/forums/showthread.php?20393-Import-PCGen-characters-to-NPCs) or something from the PCGen application itself?

Btw, I just saw there was a new stable release of PCGen. I use it all the time.

Ken L
August 15th, 2016, 00:09
I've been dabbling in an extension that works similarly to the 'advanced bestiary' import but also imports spells with their full descriptions. Currently I have it working for everything but spells, as Lua is giving me headaches with its limited subset of regex. I'd like to say I'll use Trenloe's parser, but I'm allergic to the amount of WINE gimmicks I need to pull off.

Another headache is the XPath logic for digesting d20pfsrd: there's no uniform layout for spells, and I'm trying a maximum-capture approach similar to my equipment library, so that I can also grab errata and such, which d20pfsrd has.

kuthulu
August 24th, 2016, 21:29
Thx for the reply Ken. I just got back online. The house flooded so we had to evacuate.

I agree about using WINE. It is one of the reasons I am writing my app in Python, since I use Linux 100%. Once things calm down here I'll look into the regex for the spells. The ultimate goal for me is to make an application similar to Trenloe's that runs natively on Linux. I know it is not necessary, but I'm using it as training so I can learn Python as I go. Here is what I have so far. It is pretty basic. I'll return to it once I finish putting my house back together.

https://github.com/BigD3m0n/pyFG_Module_Maker

Ken L
August 24th, 2016, 22:40
Stay safe from the floods! Were I you, I'd focus on the cleanup and keep the games on the back burner.

I'll post my Lua-native version of CreatureGen after I get some bare spells in. It's not difficult, but it's annoying to rewrite logic I've done before, and Lua is quite the stickler, so I'd work on it for an hour and then stop for a week or so. The spell scraper/parser for d20pfsrd is running into issues with inconsistent formats, so I'm using a 'template' approach that matches each style. I'll likely not release the Python scraper scripts (for both items and spells), as they literally hammer d20pfsrd with requests over a short interval of time and I'm slightly worried about that; I didn't exactly write them efficiently. I suppose I could write a wrapper around my fetch function to add a timed delay, but I already wrote a cacher, so I only fetch new material once.
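
For illustration, a wrapper that combines a timed delay with a cache might look roughly like the Python below. This is not the actual scraper code: PoliteFetcher, the two-second default delay, and the in-memory dict cache are all assumptions made for the sketch, and a real cacher would more likely write to disk.

```python
import time
import urllib.request


class PoliteFetcher:
    """Fetch each URL at most once and pause between real network requests."""

    def __init__(self, delay_seconds=2.0, fetch=None,
                 sleep=time.sleep, clock=time.monotonic):
        self.delay_seconds = delay_seconds
        # fetch, sleep and clock are injectable so the class can be
        # exercised without touching the network or actually waiting.
        self._fetch = fetch or self._http_get
        self._sleep = sleep
        self._clock = clock
        self._cache = {}           # url -> response body (in memory only)
        self._last_request = None  # clock reading after the last real fetch

    @staticmethod
    def _http_get(url):
        with urllib.request.urlopen(url) as response:
            return response.read()

    def get(self, url):
        # Cached pages cost no request and no delay.
        if url in self._cache:
            return self._cache[url]
        # Otherwise wait until delay_seconds have passed since the last fetch.
        if self._last_request is not None:
            wait = self.delay_seconds - (self._clock() - self._last_request)
            if wait > 0:
                self._sleep(wait)
        body = self._fetch(url)
        self._last_request = self._clock()
        self._cache[url] = body
        return body
```

With this shape, only cache misses pay the delay, so a second run over already-fetched material sends no requests at all.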

The beauty of having a Lua-native script is that it runs as an extension within FG, so it's platform independent: if FG runs, this runs atop it. Lua's non-POSIX-style regex is the largest pain though, especially for anything parsing intensive, but it's not a tall hurdle, just an annoying one.