View Full Version : PDF Converter



Bidmaron
January 13th, 2017, 04:33
I tried to search to see if this had been discussed before, but I could not locate a thread.

Does anyone have a recommendation for a free or fairly low-cost PDF-to-Word converter? I'd like to convert some of my PDFs to Word files and use Visual Basic to convert the tables into an FG2 table module suitable for direct use in FG2.

dulux-oz
January 13th, 2017, 04:35
Won't LibreOffice do it - load up the PDF, then save it as a Word doc?

darrenan
January 13th, 2017, 04:51
Doesn't Word have the ability to directly open PDF files?

JohnD
January 13th, 2017, 05:32
Doesn't Word have the ability to directly open PDF files?

If you have one of the newer versions, yes.

Bidmaron
January 13th, 2017, 12:34
Must not be in 2007. Guess it may be time for a 365 subscription.

dulux-oz
January 13th, 2017, 12:43
Must not be in 2007. Guess it may be time for a 365 subscription.

Nah, go Open Source - I did about 5 years ago and I haven't looked back (but then again I'm sooo tight with a $ it's not funny).

Bidmaron
January 13th, 2017, 13:29
The problem, dulux, is that I need to do some scripting to automate, and I am very fluent in Visual Basic. But I suppose I could use LibreOffice to convert and Word to script.

Mirloc
January 13th, 2017, 13:33
Agreed, Dulux-Oz: Open Office, Polaris Office and LibreOffice are all great products.

Now, if you are looking to open and edit, there are a number of alternatives to Adobe Acrobat that have just as good, if not better, feature sets. Prices range from free (FOSS) to the couple hundred dollars for a single seat of Acrobat Pro.

Now - and this is the real crux of the issue:
1 - If your original PDF was made by exporting a Word document (for example), any of the products (particularly LibreOffice) will allow you to manipulate the document directly and intuitively.
2 - If, however, you scanned a document, there is a fairly good chance the document is a picture of the text, not actual text. These kinds of documents cannot be edited easily, and even the cleanest picture won't OCR with 100% accuracy.

dulux-oz
January 13th, 2017, 13:52
2 - If, however, you scanned a document, there is a fairly good chance the document is a picture of the text, not actual text. These kinds of documents cannot be edited easily, and even the cleanest picture won't OCR with 100% accuracy.

True, and that's a real issue.

It's also something that an experienced operative like Bidmaron would know - and so I (possibly incorrectly) assumed that this was already taken into account.

dulux-oz
January 13th, 2017, 13:53
The problem, dulux, is that I need to do some scripting to automate, and I am very fluent in Visual Basic. But I suppose I could use LibreOffice to convert and Word to script.

You can script via VB in LibreOffice - one of the reasons I went that way when I got tired of forking out to MS.

Bidmaron
January 13th, 2017, 14:07
Oh, cool Dulux. I will definitely check that out then if it does Visual Basic.
I don't believe any of the PDFs I have are images. I have some software to OCR images if I have to (but I haven't used it in years, so it would probably need updating).

Myrdin Potter
January 13th, 2017, 15:45
Most of the mainstream PDF software has built-in OCR. Even scanned text can get recognized as text and then cut and pasted.

Most of my experience in pulling PDFs into Word and such results in a jumbled mess. I think many commercial PDFs are converted from desktop publishing programs, and things like tables just do not convert well.

I can cut and paste sections into Word and then manipulate them better than grabbing a whole document, unless the document is just normal paragraphs without fancy layouts.
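
For anyone taking the cut-and-paste route with tables, here is a minimal sketch of splitting a pasted table back into rows and cells. It assumes the columns came through separated by tabs or by runs of two or more spaces, which will not hold for every PDF. The function name `parse_pasted_table` and the sample data are made up for illustration.

```python
import re

def parse_pasted_table(text):
    """Split pasted table text into a list of rows, each a list of cell strings.

    Assumption: columns are separated by tabs or by two or more spaces.
    Single spaces are treated as part of the cell text.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between rows
        cells = re.split(r"\t+|\s{2,}", line)
        rows.append(cells)
    return rows

# Hypothetical sample, as it might look after pasting from a PDF
sample = "d6  Encounter\n1   Goblin patrol\n2   Wandering merchant\n"
table = parse_pasted_table(sample)
```

Each row then comes out as a plain list, so it can be checked by eye before being turned into whatever markup the target needs.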

Talyn
January 13th, 2017, 21:15
I'm super-lazy so I just fire my PDFs through https://www.extractpdf.com/ (unless they exceed the 14MB limit) and it spits out a .txt file so I can copy/paste into Notepad++ then do the markup from there. It also gives a .zip with every image in the PDF and a .zip with the fonts.

The .txt needs a little cleaning up but that's also the case if I do a text export directly from Acrobat itself. It's not too bad and once it's cleaned up, it's ready to go if you're doing Notepad++ like I do or if you want to insert the custom markup for either PAR5E or Ikael's content importer extension.
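
The clean-up step described above can be partly automated. The following is only a rough sketch of the kind of tidying involved, not what extractpdf.com or Acrobat actually do: it rejoins words hyphenated across line breaks, unwraps hard-wrapped lines within a paragraph, and collapses repeated spaces. The function name `clean_extracted_text` is hypothetical.

```python
import re

def clean_extracted_text(raw):
    """Tidy plain text exported from a PDF (illustrative helper only)."""
    # Normalize Windows and old Mac line endings to "\n"
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Rejoin words split across a line break: "con-\nverter" -> "converter".
    # Caveat: this also joins genuine hyphenated compounds that happen to
    # wrap at the hyphen, so the result still needs a quick read-through.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Treat one or more blank lines as a paragraph break
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = []
    for para in paragraphs:
        # Unwrap hard line breaks inside the paragraph
        joined = " ".join(line.strip() for line in para.split("\n"))
        # Collapse runs of spaces and tabs
        joined = re.sub(r"[ \t]+", " ", joined).strip()
        if joined:
            cleaned.append(joined)
    return "\n\n".join(cleaned)

raw = "The con-\nverter  spits out\ntext.\n\n\nSecond   paragraph.\n"
result = clean_extracted_text(raw)
```

Note that unwrapping lines like this will also flatten tables, so it is better run on the prose sections only.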

ddavison
January 13th, 2017, 21:24
A lot also depends on whether or not the PDF is locked.

Bidmaron
January 13th, 2017, 21:41
What happens if it is locked? I presumed most were these days?

Talyn
January 13th, 2017, 21:43
If the PDF is locked you can't edit or export text/images.

Bidmaron
January 13th, 2017, 23:45
So in that case you would have to print it to an image file, scan and OCR it, I guess.

Talyn
January 13th, 2017, 23:46
Or just unlock it, which is a trivial matter but possibly a grey area if that particular discussion violates the forum's ToS, so we'll just leave it at that. :)

damned
January 14th, 2017, 00:46
OneNote does excellent OCR.

Bidmaron
January 14th, 2017, 00:49
I had no idea it did, but I don't think it has a scripting language I want to learn.

Zhern
January 14th, 2017, 19:21
OneNote is my go-to tool for all my campaign planning, prep, etc. I had no idea it did OCR too. OneNote doesn't support VBScript, though.

Bidmaron
January 14th, 2017, 20:12
It is not free either, right?

Zhern
January 14th, 2017, 22:46
Microsoft actually did make it free as of Windows 10/Office 2016.

damned
January 14th, 2017, 22:56
Free, and it works on iDevices, Android, Windows, etc.

damned
January 19th, 2020, 08:06
Hi,
Maybe you can try this: I have used this tool to convert my PDF. It's free and efficient, so I think you can give it a try.
Hope it can help you.

Bye Bye.