How to write a PDF file text extractor? [closed]

I use pdftotext on my local machine (macOS) in a script that compares changes in PDF files. I would like to run the script on my shared hosting account, for online access. pdftotext is not available on that server, and the hosting company tells me I cannot install it without upgrading to a private server tier ($$$), which I don’t want to do.

So I am considering writing a tool to extract the text from PDFs that relies only on the software typically available on a Linux server, e.g., Perl, various bash builtins, etc.

While I have written such a script for non-binary files such as RTF, I have no experience doing this for a binary format. Can someone point me in the right direction, tell me what cues to look for, and explain how to deal with the binary data to begin with?

Note that this is a special use case: the files are all created by the same app (Pages on macOS). Also note that I am only interested in text; I do not need OCR.
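
To make the question concrete, here is a minimal sketch of the kind of approach I imagine, written in Python purely for illustration. It assumes Python 3 is available on the host (not confirmed; only Perl and bash are mentioned above), that page content streams are Flate-compressed, and that text is drawn as literal strings with the `Tj`/`TJ` operators. PDFs with subsetted fonts often emit glyph IDs as hex strings that must be mapped through the font's ToUnicode CMap, which this sketch does not attempt. All function names are hypothetical.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords of each object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)

# A PDF literal string "( ... )". Assumption: nested unescaped parentheses,
# which the format allows, are not handled here.
STRING = rb"\((?:\\.|[^\\()])*\)"
LITERAL_RE = re.compile(rb"\(((?:\\.|[^\\()])*)\)", re.DOTALL)

# "(text) Tj" shows one string; "[(te) -20 (xt)] TJ" shows an array of
# strings interleaved with kerning adjustments.
TEXT_OP_RE = re.compile(
    rb"(" + STRING + rb")\s*Tj|\[((?:[^()\]]|" + STRING + rb")*)\]\s*TJ",
    re.DOTALL,
)

ESCAPE_RE = re.compile(rb"\\([0-7]{1,3}|.)", re.DOTALL)
SIMPLE_ESCAPES = {b"n": b"\n", b"r": b"\r", b"t": b"\t",
                  b"b": b"\b", b"f": b"\f", b"\n": b""}


def unescape(raw):
    """Resolve backslash escapes inside a PDF literal string."""
    def repl(match):
        token = match.group(1)
        if token[:1] in b"01234567":           # octal character code
            return bytes([int(token, 8) & 0xFF])
        return SIMPLE_ESCAPES.get(token, token)
    return ESCAPE_RE.sub(repl, raw)


def inflate_streams(pdf_bytes):
    """Yield the inflated bytes of every stream that zlib can decode."""
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj tolerates the trailing newline before "endstream"
            yield zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate data (e.g. an uncompressed or image stream)


def extract_text(pdf_bytes):
    """Return one string per Tj/TJ operator, in file order."""
    chunks = []
    for content in inflate_streams(pdf_bytes):
        for match in TEXT_OP_RE.finditer(content):
            if match.group(1) is not None:
                raw = unescape(match.group(1)[1:-1])
            else:
                parts = LITERAL_RE.findall(match.group(2))
                raw = b"".join(unescape(part) for part in parts)
            # Assumption: a single-byte encoding; the real mapping depends
            # on the font's encoding.
            chunks.append(raw.decode("latin-1"))
    return chunks
```

Usage would be along the lines of `extract_text(open("file.pdf", "rb").read())`. File order is not guaranteed to match reading order, so the result may still need sorting by the text positioning operators.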

Any pointers would be much appreciated.

Neither my Google searching nor any of the recommended similar topics answers this.