AD&D Core Rules CD Importer for HTML files



celestian
March 6th, 2017, 23:02
So, I've written a few scripts for myself to import data from the AD&D Core Rules CD (2.0). It's turned out pretty well. Since most of it is HTML, it's not been too hard to get it formatted into something that looks pretty good in the "Story" document view.

That said, this is in perl (I run it on my linux box) so most people probably won't be able to use it but hopefully some will. If I could just give the .mod files out I would but right now they don't allow anything but 5e on dmsguild.

To that end, here is the script I used. It requires a few perl modules and some copy/paste experience. If you know perl it's probably going to be an easy process. I'll help out where I can.

https://github.com/CelestianGC/FG--AD-D-Rules-Core-Importer

The corebook-html.pl is the file you want. I'd take the time to look at the readme.

This "should" work for any directory of files you find in your CoreCD\WebHelp\*. If you want the Tome of Magic, run the script on the TM directory (./corebook-html.pl CoreCD/WebHelp/TM tomeofmagic.xml) or DMG for the Dungeon Master's guide files.
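For anyone wanting to run the importer over every book directory in one go, here is a minimal Python sketch that only builds the command lines. It assumes the argument order shown in the post above (source directory first, output file second). The function name `build_import_commands` and the lowercase output naming are made up for illustration and are not part of the importer, so check the README for the arguments your version actually expects.

```python
from pathlib import Path


def build_import_commands(webhelp_root, script="corebook-html.pl"):
    """Return one perl command (as an argv list) per book directory.

    Hypothetical helper: it only builds the commands, it does not run them.
    """
    commands = []
    for book_dir in sorted(Path(webhelp_root).iterdir()):
        # Skip anything that is not a directory holding .htm files.
        # Note: this match is case-sensitive on Linux (.HTM would be missed).
        if book_dir.is_dir() and any(book_dir.glob("*.htm")):
            # Assumed naming scheme: directory name, lowercased, plus .xml
            output_name = book_dir.name.lower() + ".xml"
            commands.append(["perl", script, str(book_dir), output_name])
    return commands
```

Each returned list can be printed for review or handed to `subprocess.run(cmd, check=True)` once the commands look right.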

This quoted part is still there but not the preferred method any longer. See the README.md for full details on creating a ref-manual from the created xml files.


The way I use these files is inserted into a clean campaign. That campaign is used to export and create a .mod for that book. So, I have campaigns for Tome of Magic and DMG and PHB and so on. Then in my REAL campaign I use those mods. The best way I've found so far is to use them from the Library and pop up the window for the book, then you can "search" for the topic. Right now I have no pretty table/index for it.


At some point I'd like to figure out the library stuff and see if I can organize it better but for now this is what I've got. I'm doing a lot of ruleset coding and this so I bounce between the 2.

Eventually I will include the spells script (works a lot like this script) that will create the data needed to have the Spells in your campaign but right now I just don't have the time to smooth it out.

edit:

Here is what the documents look like in FG.
https://i.imgur.com/DHLHkTA.jpg

Bidmaron
March 7th, 2017, 02:23
Awesome work, Celestian.

celestian
March 7th, 2017, 07:07
I've updated the script to now also export skills (from the PHB) and Items (from the DMG). These exports are simply the name and the descriptions (with formatting/tables/etc). You'll find *_skills.xml and *_items.xml when they are found.

With a few tweaks to the item it works out kinda nicely as "templates" using the forge option (well for my ruleset). Vorpal dagger here I go ;)

celestian
April 13th, 2017, 23:20
I've added "corebook-spells.pl" importer to this same github repo. It will allow you to take the data for spells from the html files (I used Player's Handbook and the Tome of Magic html files) and import them into a "Spells" format.

Mostly useful if you use my ruleset tho I don't see why it wouldn't load in others but not sure how useful it would be.

Here is a view from my FG desktop with the various items that I've imported.

https://i.imgur.com/bWno7b8.jpg

Bidmaron
April 17th, 2017, 00:13
Very nice work, Celestian. Now where did I put my Core Rules CD?.....

celestian
April 17th, 2017, 03:46
Good question ;) I actually still have mine installed and it works on my PC. Not bad for something written in 1999.

However, all you need are the .htm files located in WebHelp/*, for what it's worth.

Bidmaron
April 17th, 2017, 23:51
I want to thank you for posting your dropdown code solution too.

celestian
September 27th, 2017, 04:47
Started tinkering with this again now that I've got a clue about how ref manuals work (basic stuff). I've got it to where it will create a $outputfile_client.xml file containing a <reference> block that can be used. Right now it's just sorted alphabetically, with each entry as a subchapter. Not really sure I can do more than that since the html has no real reference I can use to organize it by.
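The alphabetical ordering described above can be illustrated with a small Python sketch. This is not the importer's code (that is perl). The function name and the `subchapter_00001` style keys are hypothetical placeholders, and the real ref-manual element names may differ.

```python
def alphabetical_subchapters(titles):
    """Sort page titles case-insensitively and pair each with a numbered key."""
    ordered = sorted(titles, key=str.casefold)
    # Keys such as "subchapter_00001" are hypothetical placeholders,
    # not the element names the importer actually writes.
    return [(f"subchapter_{i:05d}", title)
            for i, title in enumerate(ordered, start=1)]
```

Sorting with `str.casefold` keeps "ability Scores" ahead of "Combat" regardless of capitalization, which a plain sort would not.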

Now if we could just edit at least the chapter/subchapter order in ref manuals in Fantasy Grounds, the hard work would be done. It should be a lot easier to sort it out manually tho than before.

I've got a work project for the next two days but I'll see about getting a more complete client file (include the library link) sometime soon.

celestian
January 6th, 2018, 06:04
So, I had a big breakthrough tonight with the import tool. I've been working on making it create the "reference" block (the big one in a refmanual that is the pain in the ***) and the best I could do was create a list of pages sorted alphabetically... really not much like the original book. If we could drag/drop the pages around it would have been useful.

Tonight, however, I figured out a way to import the pages from the html files in the order they appear in the actual books (90% at least). Now I can import a book from the Core Rules CD, such as the Complete Thief's Handbook, and have it come out practically complete and in order like the handbook itself.

Several books I tested turned out really good. It's got a good 90% of everything in them. Pictures obviously are not there and a few of the tables get wonky in translation but lots easier to cope with.

Here is what I kicked out in about 30 minutes.

https://i.imgur.com/SmE9ENK.png

Very happy how well it turned out. It's not perfect but at this point it's cake to finish.

The importer update is posted on github.

Bidmaron
January 6th, 2018, 13:17
Really nice work, Celestian. I have not been able to locate my Core Rules CD in a long time, but it really doesn't matter because the only reason I'd ever go back to anything WotC would maybe be SpellJammer.

I'd like to eventually do something equivalent for PDFs (Acrobat Reader has an API that lets you select text, copy it, and so on), but the best you could get there would still be without any formatting whatsoever, and it would still have the infamous PDF carriage return problem (but that is not that hard to fix in code). I have never owned the full version of Acrobat, so I am not sure if that version lets you copy character formatting or just the raw text.
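As a rough illustration of fixing the PDF carriage return problem in code, here is a minimal Python sketch. It assumes the copied text uses blank lines between paragraphs and hard line breaks inside them. The function name `unwrap_pdf_text` is made up for this example and has nothing to do with the Acrobat API.

```python
import re


def unwrap_pdf_text(text):
    """Join hard-wrapped lines into paragraphs; blank lines separate paragraphs."""
    # Normalize Windows and old Mac line endings first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    cleaned = []
    for para in re.split(r"\n\s*\n", text):
        # Rejoin words hyphenated across a line break ("car-\nriage").
        # Caveat: this also joins genuinely hyphenated words split at the hyphen.
        para = re.sub(r"(\w)-\n(\w)", r"\1\2", para)
        # Any remaining single line break becomes one space.
        para = re.sub(r"\s*\n\s*", " ", para).strip()
        if para:
            cleaned.append(para)
    return "\n\n".join(cleaned)
```

Text copied without blank lines between paragraphs would need a different heuristic, for example treating a short line that ends in a period as a paragraph end.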

celestian
January 6th, 2018, 17:17
Yeah, this project is very narrow in its scope, unfortunately. There isn't any other version of the game that had html files that are so easy to parse... easy being relative, more so than PDF! Although thinking about it I could probably do something with the SRD/OGL by sucking down the website pages. Either way it's not a windowed click-and-drag tool. The importer assumes you have a good bit of knowledge of how to mess with perl, at the very least to install it and its modules.

That said a lot of the logic, in theory, could be of use for other versions depending on the source content.

I'm just happy it's worked out... I've got a load of books now for my players to use during my games. Now to decide if I wanna use handbooks or not ;)

Myrdin Potter
January 6th, 2018, 17:42
I hit a wall on the PERL part. One of the functions needed just refused to install. I am using PERL via Windows 10 as I do not have a linux box up and running now and not enough time in the day to resurrect one of my old ones.

celestian
January 6th, 2018, 20:18
I've been fortunate enough to not have to set up perl on a Windows box. If I ever get the time (hah!) maybe I'll give it a spin and work it out with some instructions.

I have Workstation Pro on windows with a VM running Centos which is where I do all my perl ... with CPAN it's way easier. I think Windows comes with Hyper-V? That should work to create a VM also... Tho it might require Windows Pro?

Andraax
January 6th, 2018, 20:22
ActivePerl (https://www.activestate.com/activeperl)

celestian
January 6th, 2018, 23:30
I poked around with this a bit and everything (module wise) installed just fine except HTML::Tidy. Looked around to try and find a fix and I couldn't find one. The author blew it off as "not a high priority" but another person said you needed to install a dependency (which I couldn't find the name of).

I'll loop back at it when I have more time. It did seem pretty simple to use and I did notice some Windows eccentricities in my script I've coded around.

Myrdin Potter
January 6th, 2018, 23:35
That is where I got stopped - Tidy did not install.

celestian
January 7th, 2018, 00:02
Man, I need to let these things drop but they keep pricking my conscience and I have to go back and work on it.

So, I figured out a way to do this.

Start cpan

cpan> install KMX/Alien-Tidyp-v1.4.7.tar.gz
cpan> install HTML::Tidy

Now, the first one should install with no problems. When I ran install HTML::Tidy it complained a test didn't work... so what I did was go into the directory of cpan.

cpan\build\HTML-Tidy-1.60-6nsqtQ (note the 6nsqtQ part will be different, it's not a static value)
within the "cpan\build\HTML-Tidy-1.60-6nsqtQ" directory type "dmake install"

It'll complete and after that the corebook-html.pl file should run.
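Since the suffix on the HTML-Tidy build directory changes from install to install, a small Python sketch like the one below could locate it before running "dmake install" there. It assumes the layout described above (a "build" directory under the cpan directory). The function name `find_tidy_build_dirs` is hypothetical.

```python
from pathlib import Path


def find_tidy_build_dirs(cpan_home):
    """Return HTML-Tidy build directories under <cpan_home>/build, newest first."""
    build_root = Path(cpan_home) / "build"
    # The suffix after the version (e.g. "6nsqtQ") is random, so match by prefix.
    candidates = [p for p in build_root.glob("HTML-Tidy-*") if p.is_dir()]
    return sorted(candidates, key=lambda p: p.stat().st_mtime, reverse=True)
```

If more than one directory comes back, the first entry is the most recently modified one, which is usually the build that just failed its test.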

perl corebook-html.pl PHB PlayersHandbook

"PHB" path assumes you've copied the core rules WebHelp\* directories

Once it finishes it'll create 3 files: PlayersHandbook.xml, PlayersHandbook_skills.xml, and PlayersHandbook.client.xml.

The contents of the first 2 XML files can be placed into the proper section of a db.xml file and it should show in the AD&D Core ruleset. The last one is a client.xml file for a refmanual. It'll need the "<library>" link section (this is something I'm going to add to this process down the road) and the definitions.xml... both of which are fairly static.

Add this to the client.xml:


<?xml version="1.0" encoding="iso-8859-1"?>
<root version="3.3" release="8|CoreRPG:3">

  <library>
    <add2e static="true">
      <categoryname type="string">2e</categoryname>
      <name type="string">AD&amp;D 2e Players Handbook</name>
      <entries>
        <ref_000001>
          <librarylink type="windowreference">
            <class>reference_manual</class>
            <recordname>reference.refmanualindex</recordname>
          </librarylink>
          <name type="string">Players Handbook</name>
        </ref_000001>
      </entries>
    </add2e>
  </library>

  <!-- ADD THE REFERENCE section from the import below here. -->

</root>


definitions.xml


<?xml version="1.0" encoding="iso-8859-1"?>
<root version="3.3" release="8|CoreRPG:3">
  <name>AD&amp;D 2e PHB Reference</name>
  <category>2e</category>
  <author>version 1.0</author>
  <ruleset>AD&amp;D Core</ruleset>
</root>


Drop the client.xml and definitions.xml files into a directory in your FG/data/modules path and load FG. It should show the refmanual at that point.
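Because names like "AD&D" contain an ampersand, which has to be escaped in XML, generating definitions.xml with a library avoids a hand-editing mistake. Below is a minimal Python sketch using the standard `xml.etree.ElementTree` module. The function name `build_definitions` is made up for this example, and the element names simply mirror the definitions.xml shown above.

```python
import xml.etree.ElementTree as ET


def build_definitions(name, category, author, ruleset):
    """Build a definitions.xml document as bytes, escaping special characters."""
    root = ET.Element("root", {"version": "3.3", "release": "8|CoreRPG:3"})
    for tag, value in (("name", name), ("category", category),
                       ("author", author), ("ruleset", ruleset)):
        # ElementTree escapes "&" as "&amp;" when the tree is serialized.
        ET.SubElement(root, tag).text = value
    # A non-UTF-8 encoding makes tostring() emit the XML declaration as well.
    return ET.tostring(root, encoding="iso-8859-1")
```

Writing the result with `open("definitions.xml", "wb").write(...)` yields a file that any XML parser reads back as the original, unescaped names.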

damned
January 7th, 2018, 01:11
Well done celestian.

Myrdin Potter
January 7th, 2018, 02:08
Will take a look at getting the perl ready.

celestian
May 2nd, 2018, 15:54
Updated the README.md in the github repo to better explain the process of getting the imported data (xml file) into FG as a ref-manual.

celestian
October 27th, 2018, 18:53
I've made some tweaks to the importer.

This version improves the keywords used for search fields for the ref-manual versions.

I also configured the story entry exports to add categories and ordering for entries so that you could use those and then use the Author tool to export to ref-manual. That would allow you to edit/tweak the Story entries that don't import correctly w/o having to dig into the ref-manual xml files by hand.

Check the repo for the files.