Parser for the 5E pdf rules



JamesManhattan
October 4th, 2013, 12:47
Work in progress, but here is a parser that turns .txt files pasted from the 5E rules PDFs into modules usable in the 4E Fantasy Grounds ruleset. It uses a few simple Java programs and Windows batch files. Your virus protection might scream warnings. It works great with this extension: https://www.fantasygrounds.com/forums/showthread.php?19639-Here-is-Extension-for-4E-ruleset-that-gets-you-90-of-way-to-D-amp-D-Next-(5E)

Currently it only handles the Bestiary, Spells, and Class Feature powers.

Example of in-game info: [image attachment 5202]

Instructions for the Bestiary; the others are similar (see below).
Step 1
Download the 5EParser.zip file attached below and unzip all the files into a folder that you create. The name of the folder doesn't matter.

Step 2
You must have Java installed, specifically the Java Runtime Environment (JRE). Most people have this installed already, since so many websites ask you to install it to work properly. (Java should automatically add itself to your PATH; see below if it doesn't.)
https://www.java.com/en/download/index.jsp

Step 3
Open up the Bestiary.pdf you have from the playtest in Adobe Reader.
In the menu go to Edit, and choose “Select All”. Then press CTRL+C to copy all the text.
Open up a new blank .txt file using Notepad.
Press CTRL+V to paste in all the text.
(If your pasted data has a line break after every single word, that’s fine.)
Save the Notepad file as Bestiary.txt and put it in your folder from Step 1.
(If it asks you anything about Unicode being lost, just click OK; we don't need the Unicode.)

Step 4
Run the batch file “AAA Create Bestiary Module.bat”
...wait; hopefully nothing goes wrong. You can read the output in the command window to check for errors.
Look for the output file DnDNextBestiary.mod, which you should copy into your Fantasy Grounds modules folder, usually C:\Users\YourName\AppData\Roaming\Fantasy Grounds II\modules
Now when you open up a 4E Campaign you should be able to add this Bestiary within your Library.

FINISHED!
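
For anyone curious about the middle step of the batch file: the console output posted further down this thread shows it calling "java Transform bestiaryoutput.xml TransformBestiary.xslt db.xml". The source of Transform.java is not shown here, so the following is only a minimal sketch of what an XSLT step can look like with the JDK's built-in javax.xml.transform API. The class name XsltSketch, the sample stylesheet, and the element names are assumptions made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch; NOT the Transform.java shipped in the zip.
final class XsltSketch {

    // Made-up stylesheet: rewrites <monsters><monster><name>..</name></monster></monsters>
    // as <root><npc><name>..</name></npc></root>. The real .xslt files differ.
    static final String SAMPLE_XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/monsters'><root>"
        + "<xsl:for-each select='monster'>"
        + "<npc><name><xsl:value-of select='name'/></name></npc>"
        + "</xsl:for-each>"
        + "</root></xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet text to the XML text and returns the result as a string.
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String parsed = "<monsters><monster><name>Orc</name></monster></monsters>";
        System.out.println(transform(parsed, SAMPLE_XSLT));
    }
}
```

In the real tool the input and output are files rather than strings; StreamSource and StreamResult also accept java.io.File arguments.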




For Spells
Simply copy all the text from the Spells.pdf, save it as Spells.txt and then run “AAA Create Spells Module.bat”
It will create a file named DnDNextSpells.mod

Class Features
For the ClassFeatures there is one additional change: you can’t save the copied data as a .txt file. Open the playtest Classes PDF, copy all the text, then open WordPad and paste the data there. Save that as ClassFeatures.rtf. Rich Text Format (RTF) saves the font size, which is the only thing I could use to determine what type of heading each line was. Then run "AAA Create ClassFeatures Module.bat"
It will create a file named DnDNextClassFeat.mod
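
To illustrate why the font size is useful: RTF stores it in the \fsN control word, where N is the size in half-points (so \fs28 means 14 pt). The sketch below is not the parser's actual code. It assumes a simplified RTF body fragment with no header groups such as the font table, and the class name RtfFontSizeSketch and the sample text are made up. It reports the font size in effect for each paragraph, which a parser could then map to heading levels.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; assumes a simplified RTF body without header groups.
final class RtfFontSizeSketch {

    // Matches the font size control word, e.g. \fs28 (size in half-points).
    private static final Pattern FONT_SIZE = Pattern.compile("\\\\fs(\\d+)");

    // Returns one "<points>pt: <text>" entry per non-empty paragraph.
    static List<String> paragraphsWithSize(String rtfBody) {
        List<String> result = new ArrayList<>();
        int halfPoints = 24; // RTF default of 12 pt until a size is set
        // Split on \par but not on longer control words such as \pard.
        for (String paragraph : rtfBody.split("\\\\par(?![a-z])")) {
            Matcher m = FONT_SIZE.matcher(paragraph);
            while (m.find()) {
                // The size persists into later paragraphs until changed again.
                halfPoints = Integer.parseInt(m.group(1));
            }
            // Strip control words (with optional numeric argument) and braces.
            String text = paragraph
                    .replaceAll("\\\\[a-z]+-?\\d* ?", "")
                    .replaceAll("[{}]", "")
                    .trim();
            if (!text.isEmpty()) {
                result.add((halfPoints / 2.0) + "pt: " + text);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = "\\pard\\b\\fs28 Barbarian\\par\\b0\\fs22 Rage\\par In battle, you fight.\\par";
        for (String line : paragraphsWithSize(sample)) {
            System.out.println(line);
        }
    }
}
```

A plain .txt file carries none of this information, which is why the Class Features step needs the .rtf detour.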


To do the custom Bestiaries
I included an example for White Plume Mountain.

Copy the text from the Mud Sorcerer Bestiary.pdf and save it with a name like BestiaryMudSorcerer.txt
Then, you have to make a few edits within 2 files.
Copy TransformBestiaryWhitePlumeMtn.xslt and name it TransformBestiaryMudSorcerer.xslt
In that xslt file change all instances of DnDNextBestiaryWhitePlumeMtn to DnDNextBestiaryMudSorcerer. You’ll have to do a find and replace.
Make a copy of the batch file “AAA Create Bestiary WhitePlumeMtn Module.bat” and rename it “AAA Create Bestiary MudSorcerer Module.bat”. Within the batch file change all instances of BestiaryWhitePlumeMtn to BestiaryMudSorcerer.
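
If you would rather not do the find and replace by hand, the copy-and-rename steps above can be sketched in Java. This is an illustration, not part of the download; the class name CustomBestiarySketch is made up, and it assumes both source files sit in the current folder. Replacing BestiaryWhitePlumeMtn also covers the longer DnDNextBestiaryWhitePlumeMtn, because the shorter name is contained in it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; not included in 5EParser.zip.
final class CustomBestiarySketch {

    // Replaces every occurrence of the old bestiary name with the new one.
    static String retarget(String content, String oldName, String newName) {
        return content.replace(oldName, newName);
    }

    // Copies source to target, renaming the bestiary inside the file on the way.
    static void copyRetargeted(Path source, Path target, String oldName, String newName) {
        try {
            // ISO-8859-1 maps each byte to exactly one char, so bytes that are
            // not part of the replaced name are written back unchanged.
            String content = new String(Files.readAllBytes(source), StandardCharsets.ISO_8859_1);
            Files.write(target,
                    retarget(content, oldName, newName).getBytes(StandardCharsets.ISO_8859_1));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String oldName = "BestiaryWhitePlumeMtn";
        String newName = "BestiaryMudSorcerer";
        copyRetargeted(Paths.get("TransformBestiaryWhitePlumeMtn.xslt"),
                       Paths.get("TransformBestiaryMudSorcerer.xslt"), oldName, newName);
        copyRetargeted(Paths.get("AAA Create Bestiary WhitePlumeMtn Module.bat"),
                       Paths.get("AAA Create Bestiary MudSorcerer Module.bat"), oldName, newName);
    }
}
```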


Problems?
I compiled my own Java programs on Windows and included the source files, which you can recompile if needed: RegexNextParse.java, Transform.java, and createZip.java. I think the compiled versions will work on most Windows PCs. If they don't, or if you don't have Windows, you’ll have to install the Java Development Kit (JDK) and compile them yourself.

JamesManhattan
October 4th, 2013, 18:01
Woah, I just realized how easy it is to do the entire thing within Java. Edited my original post.

I'm going to work on Feats, and Equipment, and Magic Items, but those don't seem too urgent to me.

DAWPage
October 12th, 2013, 14:07
What format are you expecting the spells to be in? I tried parsing a file with the following content, and it does not get me anything.

Aid
2nd level abjuration
Casting Time: 1 action
Range: 25 feet
Duration: 8 hours
Your prayer calls down a divine blessing of
toughness and resolve. Choose up to three
creatures within range that are not currently
affected by this spell. Each target’s hit point
maximum and current hit points increase by 5
for the duration. This spell has no effect upon
undead or constructs.

Air Walk
4th level transmutation
Casting Time: 1 action
Range: 5 feet
Duration: Concentration, up to 1 hour
Choose a willing creature within range. Until the
spell ends, the target can tread on air as if
walking on solid ground. The target can move
upward or downward at a 45 degree angle at
half its speed. Treat strong winds (twenty miles
per hour or more) as difficult terrain.
If the spell ends while the target is airborne, it
falls if this spell is the only thing keeping it aloft.
At(Higher(Levels: When you cast this spell using
a spell slot of 5th level or higher, you can add
one target for each level above 4th.

JamesManhattan
October 14th, 2013, 21:42
When I copy and paste from the PDFs, it looks like this:

Zone
of
Truth
2nd-*‐level
enchantment
Casting
Time:
1
action
Range:
50
feet
Duration:
10
minutes
Choose
a
point
within
range.
You
create
a
magical
zone
that
guards
against
deception.
Until
the
spell
ends,
any
creature
that
enters
a
15-*‐foot
radius
centered
on
that
point
or
that
starts
its
turn
there
must
make
a
Charisma
saving
throw.
On
a

mattcolville
November 4th, 2013, 23:48
I get 'java' is not recognized as an internal or external command, operable program, or batch file.

Java is up to date on my machine.

Zeus
November 5th, 2013, 08:02
Regarding the formatting of the text when cut n pasted from the PDFs.

I had this problem when I wrote PAR5E for the new 5E ruleset. There are two easy approaches to fixing it: (i) is time-consuming, whilst (ii) is expensive.

i) Grab yourself a copy of TextMate (similar to Notepad++), cut n paste a paragraph or two of text into a TextMate document, select the text in TextMate and select Text->Reformat Paragraph. This will fix the paragraph so that each word is not terminated with a newline.
ii) Grab yourself a copy of Adobe Acrobat Pro. Open the PDF, export to MS Word. Open in Word, export to plain text. Some of the column based text might be mixed up but aside from that I found this method to be the fastest.

mattcolville
November 5th, 2013, 23:12
I fixed it. My Computer didn't know where Java lived.

mattcolville
November 6th, 2013, 01:52
It worked on the Bestiary and White Plume Mountain but when I followed the instructions and tried to do the same thing on the Against the Slave Lords bestiary, the result is a 1k mod file.

Maybe the formatting in the Slave Lords Bestiary PDF is different?

JamesManhattan
November 6th, 2013, 21:50
I tried to make my regex parsing as tight as I could, but incongruities in format can throw it off. Until I write a better parser, here's what to fix:

Once you have pasted the Against the Slave Lords Bestiary from the PDF into Notepad as text, make these corrections before saving it. (It helps to also have the PDF open so you can better understand what you're deleting.)

Search for "Raker". Delete from Raker all the way down to the "(+1)" right before Cifal. You'll be deleting Raker and Rat Master.
Delete all the below.

Raker:
AC
12;
hp
9
(HD
2d8);
Dex
14
(+2)
Rat
Master:
AC
11,
Dex
13
(+1)

Search for "Markessa" and change her Short Sword of Speed+1 into a Short Sword of Quickness+1. The word Speed followed by a number messes things up.
(It's the second Markessa that is found) Her name is in there a bunch of times. She's the one on p.54 of the PDF.


Melee
Attack—Short
Sword
of
Speed
+1:
+8
to
hit

Search for the word "corpses", which is under the Myconid Sovereign entry.
Scroll down and delete everything from "Animated Corpses" through the "stunned" right before "Confidential", inclusive. It's a lot of lines.


Animated
Corpses:
When
animating
spores
animate
a
.
.
.
on
the
target
being
able
to
see.
It
cannot
be
blinded,
charmed,
frightened,
paralyzed,
or
stunned.

JamesManhattan
November 7th, 2013, 15:27
I fixed it. My Computer didn't know where Java lived.

I thought when you install Java it automatically adds itself to the Windows PATH.
Here is how to do it manually, which you'll need to do if you want this to run.
https://www.kingluddite.com/tools/how-do-i-add-java-to-my-windows-path

JamesManhattan
November 7th, 2013, 15:32
Regarding the formatting of the text when cut n pasted from the PDFs.

I had this problem when I wrote PAR5E for the new 5E ruleset. There are two easy approaches to fixing it: (i) is time-consuming, whilst (ii) is expensive.

i) Grab yourself a copy of TextMate (similar to Notepad++), cut n paste a paragraph or two of text into a TextMate document, select the text in TextMate and select Text->Reformat Paragraph. This will fix the paragraph so that each word is not terminated with a newline.
ii) Grab yourself a copy of Adobe Acrobat Pro. Open the PDF, export to MS Word. Open in Word, export to plain text. Some of the column based text might be mixed up but aside from that I found this method to be the fastest.

It is indeed annoying to have all those line breaks. I wrote the java code to just remove them so you don't have to do any of this editing manually. You can just leave the line breaks.
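
The idea is roughly the following. This is a minimal sketch rather than the code from the zip; the class name LineBreakJoiner is made up for illustration. It joins every non-empty line with a single space, so text pasted one word per line becomes running text again.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not the parser's actual code.
final class LineBreakJoiner {

    // Joins all non-empty lines with single spaces.
    static String join(String pasted) {
        List<String> words = new ArrayList<>();
        // \R matches any line break (LF, CRLF, CR, ...); needs Java 8 or later.
        for (String line : pasted.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                words.add(trimmed);
            }
        }
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        String pasted = "Zone\nof\nTruth\nCasting\nTime:\n1\naction\n";
        System.out.println(join(pasted));
    }
}
```

Note that joining everything this way also throws away real paragraph breaks, so the regex step afterwards has to find entry boundaries from keywords such as "Casting Time:" instead of from line structure.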

mattcolville
November 7th, 2013, 18:45
I thought when you install Java it automatically adds itself to the Windows PATH.
Here is how to do it manually, which you'll need to do if you want this to run.
https://www.kingluddite.com/tools/how-do-i-add-java-to-my-windows-path

A reasonable expectation but neither my computer at home, nor my PC at work had a path for JAVA.

mattcolville
November 7th, 2013, 19:37
So close! :D

After making the edits you suggest, the txt file parsed successfully and I got a SlaveLords mod file that was not empty. :D

When I try to open it in Fantasy Grounds, though, I get:

Database Error: A XML parse error occurred processing file DNDNextBestiarySlaveLords:db.xml - Error on line 0: Error reading Element value.
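
One way to narrow down an error like this before loading the module is to check that the generated db.xml is well-formed XML. The sketch below uses the JDK's standard DOM parser; the class name XmlCheckSketch and the sample strings are assumptions for illustration, and it only detects malformed XML, not content that Fantasy Grounds rejects for other reasons.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical diagnostic helper; not part of the parser download.
final class XmlCheckSketch {

    // Returns null when the text is well-formed XML, otherwise a short description.
    static String firstError(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber() + ": " + e.getMessage();
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return e.toString();
        }
    }

    public static void main(String[] args) {
        System.out.println(firstError("<root><name>Orc</name></root>"));
        System.out.println(firstError("<root><name>Orc</root>"));
    }
}
```

To check a real file, the same DocumentBuilder can parse a java.io.File such as db.xml directly.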

JamesManhattan
November 7th, 2013, 21:01
Ah, it's because a lot of the monster names have numbers in them, which I didn't expect. For example: Half-Orc Cleric 4/Rogue (Assassin) 5.

Here, just use this new RegexBestiary.txt; with it the parser can grab monster names that include numbers. https://dl.dropboxusercontent.com/u/454578/RegexBestiary.txt
I'll update the original post with an entire new package including this file.
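
To show the kind of change involved: the actual patterns in RegexBestiary.txt are not reproduced in this thread, so both patterns below are made-up stand-ins, as is the class name MonsterNameSketch. The first accepts only letters and a little punctuation; the second also allows digits and slashes, which is what a name like the one above needs.

```java
import java.util.regex.Pattern;

// Hypothetical patterns; the real ones live in RegexBestiary.txt.
final class MonsterNameSketch {

    // Stand-in "before" pattern: letters, spaces, apostrophes, parentheses, hyphens.
    static final Pattern LETTERS_ONLY = Pattern.compile("[A-Za-z][A-Za-z '()-]*");

    // Stand-in "after" pattern: additionally allows digits and slashes.
    static final Pattern WITH_DIGITS = Pattern.compile("[A-Za-z][A-Za-z0-9 '()/-]*");

    // True when the whole candidate string matches the pattern.
    static boolean isName(Pattern pattern, String candidate) {
        return pattern.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String name = "Half-Orc Cleric 4/Rogue (Assassin) 5";
        System.out.println("letters only: " + isName(LETTERS_ONLY, name));
        System.out.println("with digits:  " + isName(WITH_DIGITS, name));
    }
}
```

The trade-off is that a looser name pattern is more likely to swallow numbers that belong to the stat block, which is probably why the original was kept tight.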

mattcolville
November 7th, 2013, 21:29
Hey! It worked! Awesome, thank you sir.

Shimrath
November 12th, 2013, 21:37
Very nice! I'm looking forward to finding some time to utilize this tool.

Thanks for sharing it with the 5E FG community!

Shimrath
November 16th, 2013, 00:37
So, when you say "wait" in Step 4, about how long do you usually have to wait?

Shimrath
November 16th, 2013, 21:14
I followed the steps above and got the following messages, and after pressing a key, no results:


C:\Users\Shimrath\Desktop\5E Parser>java RegexNextParse Bestiary.txt RegexBestiary.txt bestiaryoutput.xml
'java' is not recognized as an internal or external command,
operable program or batch file.

C:\Users\Shimrath\Desktop\5E Parser>java Transform bestiaryoutput.xml TransformBestiary.xslt db.xml
'java' is not recognized as an internal or external command,
operable program or batch file.

C:\Users\Shimrath\Desktop\5E Parser>echo <?xml version='1.0' encoding='iso-8859-1'?> 1>definition.xml

C:\Users\Shimrath\Desktop\5E Parser>echo <root version='2.2'> 1>>definition.xml

C:\Users\Shimrath\Desktop\5E Parser>echo <name>DnDNextBestiary</name> 1>>definition.xml

C:\Users\Shimrath\Desktop\5E Parser>echo <author>Java Next Parser</author> 1>>definition.xml

C:\Users\Shimrath\Desktop\5E Parser>echo <ruleset>4E</ruleset> 1>>definition.xml

C:\Users\Shimrath\Desktop\5E Parser>echo </root> 1>>definition.xml

C:\Users\Shimrath\Desktop\5E Parser>java createZip DnDNextBestiary.mod db.xml definition.xml
'java' is not recognized as an internal or external command,
operable program or batch file.

C:\Users\Shimrath\Desktop\5E Parser>pause
Press any key to continue . . .
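
The output above shows the last two things the batch file does: it writes a small definition.xml (module name, author, ruleset) and then packs db.xml and definition.xml into the .mod file. Judging by the program name createZip, the .mod appears to be an ordinary zip archive. The sketch below reproduces those two steps with java.util.zip; it is not the shipped createZip.java, and the class name ModZipSketch and the placeholder db.xml content are assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch; NOT the createZip.java shipped in the zip.
final class ModZipSketch {

    // Builds the same lines the batch file echoes into definition.xml.
    static String definitionXml(String name, String author, String ruleset) {
        return "<?xml version='1.0' encoding='iso-8859-1'?>\n"
             + "<root version='2.2'>\n"
             + "<name>" + name + "</name>\n"
             + "<author>" + author + "</author>\n"
             + "<ruleset>" + ruleset + "</ruleset>\n"
             + "</root>\n";
    }

    // Packs the given file names and contents into an in-memory zip archive.
    static byte[] zip(Map<String, String> files) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (Map.Entry<String, String> file : files.entrySet()) {
                zip.putNextEntry(new ZipEntry(file.getKey()));
                // Encoding chosen to match the declaration in definition.xml.
                zip.write(file.getValue().getBytes(StandardCharsets.ISO_8859_1));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("db.xml", "<root/>"); // placeholder; the real db.xml comes from the Transform step
        files.put("definition.xml", definitionXml("DnDNextBestiary", "Java Next Parser", "4E"));
        byte[] mod = zip(files);
        System.out.println("Archive size in bytes: " + mod.length);
    }
}
```

Writing the returned bytes to DnDNextBestiary.mod with java.nio.file.Files.write would give a file in the same shape as the batch output.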

mattcolville
November 17th, 2013, 02:59
Google 'java' is not recognized as an internal or external command, operable program or batch file.

It's easy to fix.

Shimrath
November 17th, 2013, 07:52
Thank you for the advice, i was able to make it work!

. . .only to figure out that it only helps if you're using the 4E ruleset as a base. I've been running my D&D Next playtest using the 3.5 ruleset, and was really hoping i wouldn't have to manually input another monster statblock or spell description.

JamesManhattan
November 19th, 2013, 03:38
As strange as it may sound, D&D Next has much more in common with 4E and its ruleset than it has with 3.5 edition and its ruleset.

Shimrath
November 19th, 2013, 05:31
My knowledge of the differences between the two rulesets is terribly limited, but i totally believe what you're saying. I must have chosen to use the 3.5 ruleset simply based on the fact that myself and most of the players had used it a lot in the past.

But who knows, if you make it easy enough to get all the information i need from all of the Next material, i might be tempted to do a big switch-over!

Acroyear
November 24th, 2013, 22:42
I use 3.5E also to run my DNDNext game. I like the ruleset better than the 4E ruleset because it gives much more control over abilities and spells.

mattcolville
December 8th, 2013, 17:34
The Expedition to the Barrier Peaks Bestiary also produces an empty .mod file. I was trying to grab the assassin vine from it.

JamesManhattan
December 23rd, 2013, 23:45
It works just fine for me. Are you using the latest files I uploaded?

mattcolville
March 25th, 2014, 05:48
I have several modules for the 5E rules parsed using this tool, but when I create a campaign with the new 5E ruleset I don't see them in the Module Activation window. How does a module know what ruleset it belongs to?

mattcolville
March 25th, 2014, 05:56
Ok, I figured it out. Easy. Should have looked before I posted.

Now I get this error when I try and look at the Bestiary;

Runtime Error: desktop: Unable to create window with invalid class (reference_classmonsterlist : Monsters.MonstersByNameAll@DnDNextBestiary)


But I guess Zeus' parser is now the default. I just find this one way easier to use.

Zeus
March 25th, 2014, 10:26
I have several modules for the 5E rules parsed using this tool, but when I create a campaign with the new 5E ruleset I don't see them in the Module Activation window. How does a module know what ruleset it belongs to?

The newly developed 5E ruleset is now available to everyone for public testing.

5E offers several enhancements for D&D Next games over using the 4E and 3.5E rulesets including:

- new reference classes for: Backgrounds, PC Classes (including Class Proficiencies, Features and Abilities, Racial Traits), Encounters, Equipment, Feats, Images (with Pins), NPCs, Parcels, Races, Reference Manual, Skills, Spells, Story, Tables and Traps
- new exploded character sheets with enhanced drag/drop of reference data and better support for multi-class setups
- support for advantage/disadvantage
- Spell/Power attack/damage parsing
- Full Combat Tracker and Partysheet support

Morfedel
April 22nd, 2017, 03:50
I have a 3.5 adventure I've been planning to convert to 5e and make it in Fantasy Grounds. Will this program help me at least get the info into a module format, or is this a non-starter for my plans?

Trenloe
April 22nd, 2017, 04:56
This application only parses specific Wizards of the Coast produced PDFs into FG modules. It is not a freeform PDF parser. You'll have to enter the details into FG manually, or use something like PAR5E to convert marked up data into a 5E module.

LordEntrails
April 22nd, 2017, 06:25
I just convert PDFs using the FG interface. I don't bother with either of the two parsers. Most of the text you can just cut and paste, and the cleanup you have to do (i>l etc) you would still have to do for a parser.