I just use the PDF defaults from whatever browser I'm using at the time. Nothing special involved, just the defaults.
I do use 'pdftotext' for more fine-grained searching when I need it - but for the most part I find that a simple "ls -l | grep <search>" suffices, since my filenames preserve the page-title text too.
I did the same search for this thread and had no issues whatsoever with this command:
> EDIT: Seeing the command line you're using: the search you do is over the files' names, correct? The text content of the PDF (the original web page) is not indexed, right? Just to make sure I understand correctly.
pdftotext gets the actual text from the PDF. I don't do this, but I'm sure that you could automate the process of generating a text file for each PDF in a directory with pdftotext and then ripgrep the text files when it's time to search the contents. That would be doable with a makefile or a couple of shell scripts.
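The automation described above can be sketched as a small shell script. This is a minimal sketch, not anything posted in the thread: it assumes poppler's pdftotext and ripgrep (rg) are installed, and the `pdftxt` cache directory name is purely illustrative.

```shell
#!/bin/sh
# Sketch: extract every PDF under a source directory into a cache of
# .txt files, re-extracting only when the PDF is newer than its text.
# Assumes poppler-utils (pdftotext); directory names are illustrative.
SRC="${1:-.}"        # where the PDFs live
CACHE="${2:-pdftxt}" # where the extracted text goes

mkdir -p "$CACHE"
find "$SRC" -name '*.pdf' | while IFS= read -r pdf; do
    txt="$CACHE/$(basename "$pdf" .pdf).txt"
    # Skip PDFs whose cached text is already up to date.
    if [ ! -f "$txt" ] || [ "$pdf" -nt "$txt" ]; then
        pdftotext "$pdf" "$txt"
    fi
done

# When it's time to search the contents:
#   rg -i someSearchTerm pdftxt/
```

Run periodically (or as a makefile rule), this keeps the text cache current, and the search itself becomes an ordinary ripgrep over plain text.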
Yeah, my computer is fast enough that I can just do "find . -name '*.pdf' -exec pdftotext {} - \; | grep -i someSearchTerm" and come back later. (Note the "-" argument: it makes pdftotext write to stdout rather than to a .txt file beside each PDF; without it, nothing reaches the grep.) Bonus points that the output stays in my Terminal for reference later in the day as needed.
Is there a reason you don't use mdfind instead (built-in Spotlight search from the terminal)?
That way you can search pdf files directly from the terminal without converting to text first, and the directory is already indexed.
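For reference, the mdfind route looks like this. A macOS-only sketch (the directory and search term are placeholders), guarded so it merely reports when Spotlight's CLI is unavailable:

```shell
# Sketch: query Spotlight's existing index directly, no conversion step.
# macOS-only, so guard on the tool being present.
if command -v mdfind >/dev/null 2>&1; then
    # Full-content search (includes PDF text), scoped to one directory:
    mdfind -onlyin "$HOME/Documents" "someSearchTerm"
    status="searched"
else
    status="mdfind unavailable (not macOS)"
fi
echo "$status"
```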
1: Force of habit, since I use grep and silversearcher a lot elsewhere; but 2: I hate the Spotlight indexer (which mdfind relies on) scattering its index files all over my disks, so I've turned it off and forgotten about it.
Results (from the `ls -l | grep` search above):
".. nothing beats print-to-PDF. Its just awesome."
"fancier laid out pages), I Print-to-PDF again after enabling Reader"