Thanks, everybody. I am here looking for some help.
I have a bunch of PDF files, 100,000 of them. They contain both text and images. The images are basically tables, like those in Excel. I want to convert the files to pure HTML. The resulting HTML file should include the original text, which I am able to extract, as well as the tables that appear in the images. I have tried several PDF parsers, but they are not powerful enough to handle the tables.
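To make the target concrete, here is a rough sketch of the output I am after. It assumes the table cells have already been recovered from the image somehow, which is exactly the part I cannot do. The function names `table_to_html` and `page_to_html` are just placeholders I made up for illustration, and the input shapes (a list of paragraph strings, and each table as a list of rows) are an assumption.

```python
from html import escape


def table_to_html(rows):
    """Render one table, given as a list of rows of cell values, as HTML."""
    lines = ["<table>"]
    for row in rows:
        # Escape each cell so characters like < and & do not break the markup.
        cells = "".join(f"<td>{escape(str(cell))}</td>" for cell in row)
        lines.append(f"  <tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


def page_to_html(paragraphs, tables):
    """Combine extracted text paragraphs and recovered tables into one page."""
    body = [f"<p>{escape(p)}</p>" for p in paragraphs]
    body += [table_to_html(t) for t in tables]
    return "<html>\n<body>\n" + "\n".join(body) + "\n</body>\n</html>"


if __name__ == "__main__":
    # Hypothetical data standing in for one PDF page.
    text = ["Quarterly summary"]
    table = [["Item", "Qty"], ["Widget", "3"]]
    print(page_to_html(text, [table]))
```

This simple version puts all the text first and the tables after it. Keeping each table at its original position in the page would need position information from the parser.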
Can anybody here help me?
Thanks in advance!