-
July 11th, 2011, 04:25 #1
- Join Date
- Aug 2008
- Posts
- 614
Tenian's Parser - Version 4.1.1.1+ Beta
Hi all!
If you're in a rush, here's the skinny:
- If you don't have the latest stable version of the Parser installed, install it first!
- Download the .NET 4.0 Redistributable and install it.
- Download the beta release of the Parser and copy it to the application directory where you installed the stable Parser, overwriting the previous Parse4E file. Most likely, this is in "C:\Program Files\4EParser"
Release Notes
If you're not in a rush, here's extra info!
Tenian developed an application that allows you to parse DDI materials. He did this for himself but kindly released it to the public. However, he is currently on a DnD hiatus and has ceased development of the application. He does not want the application's source code to be released since it is his. But he has given me the source so that I may maintain it for bugs and feature requests, as time allows.
There are a few problems with this:
- I don't use the same development tools Tenian does. His are 1) old and 2) expensive (Visual Basic/Studio 2003). Mine are 1) new and 2) free (Visual Basic Express 2010). I don't have the funds to correct this, leading to the following bullet points:
- Visual Studio Express 2010 requires that end-users install the 4.0 version of the .NET framework. I could have sworn there was an option somewhere to change that, but whether it's because it's Visual Basic (I'm much more familiar with C#), because it's an imported project, or because my memory's just bad, I see no option to change the dependency to v2.0, the version Tenian developed on.
- I cannot use the installer that Tenian developed since it is specific to his software. I plan to eventually use Nullsoft's free, open source installer platform, but setting this up to match Tenian's setup is a daunting task. Until I get around to this, my releases will be beta releases only, with no installer, just a binary (.exe) file you will paste over the Parse4E.exe file from Tenian's latest stable 4.0.118 version. Sorry for the inconvenience, but it's better than nothing! Finally, the changelog will be posted here on the forums until I get the installer up and running as well.
- Tenian's software is well over 10,000 lines of code (maybe 20k or 30k, I just scanned the files in about 10 seconds). He coded for himself, not for others. He understood his own code, and I do not. I try to understand the components I modify to the best of my ability. I also lack the experience Tenian has with knowing what type of errors to watch out for. I try to test the software I release, but to be frank, I'm confident that you will not receive the level of quality you expected from Tenian. Recall all the bugs in the early days of the software. I will fix bugs as quickly as they are found, but for that to happen, they must be reported. Preferably, in this thread. My apologies in advance for the bugs!
Download
The latest beta release can be found here:
https://www.eugenez.net/downloads/pa....3/Parse4E.exe
Please report any bugs you find.
Requirements
- .NET Framework 4.0
- An installed copy of version 118. (Copy the above binary over the old binary.)
Documentation
Some people have developed some wiki documentation for the parser here.
Changelog
v.4.1.2.3
- DrZeuss identified and fixed the item filtering issue. Items should work again.
v.4.1.2.1
- Errant string <p class="publishedIn"> removed from scrape
v.4.1.1.1
- Fixed Item scraping from the Compendium
- Fixed Item HTML output
- Lowered inter-item pause significantly, scrapes should now be much faster
Last edited by EugeneZ; August 8th, 2012 at 10:03.
-
July 11th, 2011, 07:05 #2
- Join Date
- Jul 2011
- Posts
- 12
Wanna be the first to thank you for giving a great effort to seeing this application work. I have a scrape going right now. If there is any feedback we can give you that would help, please let us know, and again thank you for your time and effort.
---===Edit===---
Ok, finished up that Scrape and attempted to Parse it. I got a few errors that I could correct using Notepad++, like a few missing </p> tags and such.
The main issue I'm running into now is that when I try to parse everything I got from the Player's Handbook, it fails when it gets to Powers. I run into something that looks like this:
7/11/2011 2:45:01 AM : ERROR:System.Xml.XmlException: '<' is an unexpected token. The expected token is '>'. Line 1521, position 342.
<keywords type="string">Arcane, Force, Implement</keywords>
<action type="string">Standard Action</action>
<range type="string">Ranged 20</range>
<source type="string">Wizard Attack 1</source>
<description type="formattedtext"><table><tr><td><b>Target:</b>One creature or object</td></tr></table><table><tr><td><b>Attack:</b>Intelligence vs. Reflex</td></tr></table><table><tr><td><b>Hit:</b>2d8 + Intelligence modifier force damage. Make a secondary attack.</td></tr></table><table><tr><td><b>Secondary Target:</b>Each enemy adjacent to the primary target</td></tr></table><table><tr><td><b>Secondary Attack:</b>Intelligence vs. Reflex</td></tr></table><table><tr><td><b>Hit:</b>1d10 + Intelligence modifier force damage. <p class="publishedIn"></p></td></tr></table></description>
*** <shortdescription type="string">Target: One creature or object; Attack: Intelligence vs. Reflex; Hit: 2d8 + Intelligence modifier force damage. Make a secondary attack.; Secondary Target: Each enemy adjacent to the primary target; Attack: Intelligence vs. Reflex; Hit: 1d10 + Intelligence modifier force damage. <p class="publishedIn"></p</shortdescription>
<class type="string">Wizard</class>
<powertype type="string">Attack</powertype>
<level type="number">1</level>
<tier type="string">Heroic</tier>
<type type="string">Power</type>
The offending text is in Bold and Underlined. I can't seem to edit that particular tag because it doesn't show up in Notepad++, or I don't know a way to make it show up. Currently using Win7(x64bit), and using the 4.1.1.11 version of 4e Parser. If there is a way to configure Notepad++ to show the text the Parser is referring to I think I can fix this up.
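On finding the text the error refers to: the message gives a line and position, so one option outside the editor is to run the file through any XML parser and print the text around the reported spot. Below is a minimal Python sketch, assuming the parsed output is available as a string. The function name `show_xml_error` and the shortened sample are made up for illustration and are not part of the Parser, and Python's reported position may not match the .NET one exactly.

```python
import xml.etree.ElementTree as ET


def show_xml_error(text, width=40):
    """Return (line, column, snippet) for the first well-formedness error, or None."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        line, col = err.position  # line is 1-based, column is 0-based
        bad_line = text.splitlines()[line - 1]
        start = max(0, col - width)
        # Slice a window around the reported column so the bad markup is visible
        return line, col, bad_line[start:col + width]
    return None


# Shortened, hypothetical sample with the same truncated "</p" as the log above
sample = (
    "<power>\n"
    '<shortdescription type="string">Hit: 1d10 damage. '
    '<p class="publishedIn"></p</shortdescription>\n'
    "</power>"
)

result = show_xml_error(sample)
print(result)
```

The snippet in the returned tuple is the part to search for in the editor, since the truncated close tag shows up as plain text there.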
Thanks again for your time and effort, and the help anyone else provides.
---===Edit #2===---
I was running into this same error using 4e Parser 4.0.118.
Last edited by Mooses8D; July 11th, 2011 at 09:05.
-
July 11th, 2011, 14:43 #3
- Join Date
- May 2011
- Location
- Sydney
- Posts
- 16
Thanks EugeneZ!
I am parsing the PHB right now, and will let you know how it goes (so far so good).
-
July 11th, 2011, 22:04 #4
- Join Date
- Jun 2011
- Posts
- 6
@EugeneZ: thanks for the update! Alas, I still have the same problem (see attached screenshot). No scraping for me.
I've reinstalled the parser and applied your binary to it and also tried running it explicitly as Administrator. No success.
If you want I can try to trace the sent data during a scrape attempt with a network sniffer. Thinking of it, I'll probably do it anyway since I'm messing around with the Compendium API myself currently (trying to get an AS3 lib out of it for an AIR application).
And last but not least: don't apologize for any bugs. You're doing a great job here. I know myself how difficult it is to work with code that you haven't written yourself (and that probably wasn't written with readability in mind).
-
July 11th, 2011, 22:15 #5
Originally Posted by Mooses8D
When you fixed up the missing </p> tags, did you add new </p> tags in or remove the offending opening tag - usually <p class="publishedIn"> ?
I usually remove the offending tag.
The error you're seeing looks like you've been adding the close tag </p> and it somehow got cut off here, so that only </p was added.
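For anyone who would rather script the cleanup than edit by hand, here is a rough Python sketch. It is a variation on the manual fix, not the same thing: it removes the empty publishedIn paragraph together with its close tag, whether that close tag is complete (</p>) or cut off (</p). It assumes the paragraph is empty, as in the log above; `strip_published_in` is a made-up helper name, and other shapes of the problem may need a different pattern.

```python
import re

# Matches the empty "publishedIn" paragraph: the opening tag, optional
# whitespace, and an optional close tag that may be truncated to "</p".
PUBLISHED_IN = re.compile(r'<p class="publishedIn">\s*(?:</p>?)?')


def strip_published_in(xml_text):
    """Return xml_text with every empty publishedIn paragraph removed."""
    return PUBLISHED_IN.sub("", xml_text)


# Hypothetical, shortened line modelled on the shortdescription in the log
line = 'Hit: 1d10 damage. <p class="publishedIn"></p</shortdescription>'
print(strip_published_in(line))
```

Work on a copy of the scraped file, and re-run the parse afterwards to confirm that nothing else is unbalanced.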
-
July 11th, 2011, 22:29 #6
- Join Date
- Jul 2011
- Posts
- 12
Originally Posted by arotter
Can you have plus signs (+) in an email address? Are you entering it in correctly? I used the Compendium just now to scrape the PHB, and I finally got it working. Maybe check the email address field again.
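Some background on why a plus sign could matter: in form-encoded request data a bare + is decoded as a space, so an address containing + only survives if it is sent as %2B. Whether the Parser encodes the field this way is unknown, so this is only a guess at a possible cause. A small Python sketch with a made-up address and a hypothetical field name `email`:

```python
from urllib.parse import parse_qs, urlencode

email = "user+dnd@example.com"  # made-up address for illustration

# Body built by plain string concatenation: the "+" is sent as-is
raw_body = "email=" + email

# Body built with proper form encoding: "+" becomes "%2B"
encoded_body = urlencode({"email": email})

# What a server decoding the form data would see in each case
seen_raw = parse_qs(raw_body)["email"][0]
seen_encoded = parse_qs(encoded_body)["email"][0]

print(repr(seen_raw))
print(repr(seen_encoded))
```

If a login works in the browser but not in a tool with the same credentials, a difference like this in how the request is encoded is one thing a network trace could confirm or rule out.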
---===Edit===---
Also, maybe log into your WoTC D&D Insider using a browser and then try running the scrape? I have it set to remember me being logged in, and that seems to also be a working combination.
---===Edit #2===---
Also, that Account Validation error popped up once or twice while I was using the Scrape but it was during heavy internet traffic so the connection might have dropped out. I found that the information it had grabbed was indeed there, just not complete. Maybe try running it during non-peak hours.
Last edited by Mooses8D; July 11th, 2011 at 22:37.
-
July 11th, 2011, 22:32 #7
- Join Date
- Jul 2011
- Posts
- 12
Originally Posted by Trenloe
Was just about to post that I had found a work around.
But to answer your question, yeah I was closing all the <p class="publishedIn"> lines by adding </p> to close them up, but what I ended up doing to fix it was adding an additional > (</p>>) at the end so that it would continue onto the <shortdescription> line.
It appears to have worked and I now have the Player's Handbook in my library. Thank you for the quick response.
-
July 11th, 2011, 22:56 #8
- Join Date
- Jun 2011
- Posts
- 6
Originally Posted by Mooses8D
I've copied and pasted the email address from a text editor both into the Compendium's login screen and into the parser (and did the same for my password). The result was that the Compendium logged me in while the parser told me that my account couldn't be validated.
Thanks for the suggestions, though. Any help or suggestion is appreciated.
-
July 11th, 2011, 23:43 #9
- Join Date
- Jul 2011
- Posts
- 12
@arotter
Maybe it's a Firewall issue?
-
July 12th, 2011, 00:45 #10
- Join Date
- May 2011
- Location
- Sydney
- Posts
- 16
Scraping the PHB worked. However, when I try to parse the results I get an error:
11/07/2011 11:45:40 PM : ERROR:System.Xml.XmlException: The 'p' start tag on line 586 position 212 does not match the end tag of 'description'. Line 587, position 7.
<cost type="number">25</cost>
<type type="string">Light</type>
<prof type="string">Leather</prof>
<description type="formattedtext">
<p>Leather armor is sturdier than cloth armor. It protects vital areas with multiple layers of boiled-leather plates, while covering the limbs with supple leather that provides a small amount of protection.</p><p><p class="publishedIn"></p>
*** </description>
</leatherarmor>
<hidearmor>
<name type="string">Hide Armor</name>
<ac type="number">3</ac>
<min_enhance type="number">0</min_enhance>
So is this the Compendium spitting back malformed data, or something in the parser?
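Whichever side produced it, the fragment is malformed on its own: the <p> just before <p class="publishedIn"> is never closed, which matches the error message. A short Python sketch, using an abbreviated copy of the description above, checks this and shows that deleting only the opening publishedIn tag leaves a balanced <p></p> pair. `is_well_formed` is a made-up helper, and this says nothing about where the stray tag originates.

```python
import xml.etree.ElementTree as ET

OPENING_TAG = '<p class="publishedIn">'


def is_well_formed(fragment):
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Abbreviated copy of the description element from the log above
broken = (
    '<description type="formattedtext">\n'
    "<p>Leather armor is sturdier than cloth armor.</p>"
    '<p><p class="publishedIn"></p>\n'
    "</description>"
)

# Remove only the opening tag; its </p> then closes the preceding <p>
repaired = broken.replace(OPENING_TAG, "")

print(is_well_formed(broken), is_well_formed(repaired))
```

The changelog entry for v.4.1.2.1 in the first post mentions the same errant <p class="publishedIn"> string being removed from the scrape.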