January 2, 2016

A PDF workflow which kills no kittens

I have really spent decades trying to find a decent workflow for storing, annotating and referencing research documents, mostly but not always PDFs. I want to be able to access and annotate the same documents in my normal filesystem and sync them across devices, just like I sync all my other stuff nowadays. I want to use standard PDF annotations which are searchable using normal desktop search and are most likely to still be readable in 10 years’ time.

Zotero is a fantastic open-source tool with lots of bells and whistles, but it has the weirdest way of saving PDFs inside meaninglessly named folders, which makes it very hard to do this. I really tried, with Python scripts and hardlinks and everything, and eventually gave up.

Docear is a truly amazing open-source tool but it is just too complex and distracting for me, and it also stores files in weird places.

Mendeley has a much better way of storing PDFs and is nice and simple, but it is closed-source and owned by Elsevier, and if you are a pro-open-access researcher you will share my shudder at the very mention of that name. Worse, Mendeley doesn’t do real PDF annotations: it saves them in its own database, so it a) doesn’t do what I want (see above) and b) deliberately locks you in to a proprietary workflow.

I am not alone: a lot of people care about this; look at this discussion.

Finally, I realised that the solution is remarkably simple. I just use Mendeley to manage, store and cite my documents but use ordinary PDF readers for annotations. So there is no lock-in. And as far as I can see it would work on any platform, including Mac and Linux. In detail:

I have one watched folder set in Mendeley. So I can e.g. save a PDF straight to this folder from my browser, and Mendeley will import it silently. I like silently.
In Mendeley options / file organiser, I check “organise my files” into a single folder in my Dropbox, but you could use any file syncing software. I don’t sort files into subfolders because that makes them harder to find, but I do set “rename document files” to “author year title” so that it is easy to find the actual files in the file system.
Also in options, I enable BibTeX syncing into one file. This means I can use this BibTeX file for referencing documents from within word processing.
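To give a feel for what the “author year title” renaming buys you, here is a minimal Python sketch of that kind of naming scheme. It is not how Mendeley does it internally: the function name `organised_filename`, the ` - ` separator and the character clean-up rules are all my own assumptions, purely for illustration.

```python
import re


def organised_filename(author: str, year: int, title: str, ext: str = ".pdf") -> str:
    """Build an 'Author - Year - Title.pdf' style name (hypothetical format)."""
    stem = f"{author} - {year} - {title}"
    # Replace characters that Windows does not allow in file names.
    stem = re.sub(r'[\\/:*?"<>|]', "_", stem)
    # Collapse runs of whitespace so the name stays tidy.
    stem = re.sub(r"\s+", " ", stem).strip()
    return stem + ext


print(organised_filename("Powell", 2016, "A PDF workflow: no kittens"))
```

With names like this, all the files sit in a single folder and you can find any of them by typing an author or a year into the file manager.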

On the desktop, the important thing is to never use the stupid internal Mendeley viewer. Just right-click and choose the external option to open your file in your reader of choice. I am the sort of person who is bothered by all this right-clicking because I do it dozens of times every day, so I wrote an AutoHotkey script so I can use a single key instead of faffing about with the trackpad. Opening docs like this is also more consistent in the case of non-PDF files, which Mendeley cannot open at all.

I use PDF-XChange reader for annotations because this software has a setting to always copy highlighted text into the associated note. This is very important, because it makes it much easier for any automated process to actually find your highlights. Otherwise, your highlights are just information about yellow marks at certain points on certain pages. They do not contain the actual text. If you do this and later decide to use Zotero or Docear, they will find your highlights as searchable text. At the moment I am on Windows, and Windows Explorer can even find these highlights from a normal desktop search.
The fact that Windows desktop search includes highlights made in this way is fantastic. I can also use special strings like XK in my annotations, for example when I think of something I need to research further while writing a note or reading an article. Then I can save the desktop search for this string and have instant access to tasks and ideas I thought of while reading.
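The marker-string trick does not depend on Windows search. Here is a rough Python sketch of the same idea, assuming you have already pulled the annotation notes out of your PDFs with some PDF library (that extraction step is not shown). The function name `find_marked_notes` and the shape of the input dictionary are my assumptions; only the XK marker comes from my actual workflow.

```python
from typing import Dict, List, Tuple


def find_marked_notes(
    notes_by_file: Dict[str, List[str]], marker: str = "XK"
) -> List[Tuple[str, str]]:
    """Return (filename, note) pairs for every note containing the marker."""
    hits = []
    # Sort by file name so the to-do list comes out in a stable order.
    for filename in sorted(notes_by_file):
        for note in notes_by_file[filename]:
            if marker in note:
                hits.append((filename, note.strip()))
    return hits


# Hypothetical notes, as they might look after extraction from two PDFs.
example = {
    "b.pdf": ["nice point", " XK check the sample size "],
    "a.pdf": ["XK read the cited paper"],
}
for filename, note in find_marked_notes(example):
    print(filename, "->", note)
```

The point is that because the highlighted text and the marker live inside standard PDF annotations, any tool that can read them can build this kind of task list.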

Now, in Dropbox on my Android phone I can easily find and annotate my files. I don’t bother with the Mendeley Android app or any other such stuff. All I want to do on my Android is read and annotate. If you also want your highlights made on Android to be searchable, you have to use Foxit reader, which is AFAIK the only one which does it right. This way, I can read and annotate on my phone (which I have been doing daily since deciding on the workflow I describe here) and open the same doc on my computer to continue annotating without worrying about sync.

When it comes to referencing, I just use the Mendeley plug-in for Microsoft Word. However, as you can set Mendeley to continuously export an up-to-date BibTeX file, you can use other tools such as the excellent Docear plug-in for Word, which uses BibTeX files.
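Because the exported BibTeX file is plain text, other tools (or your own scripts) can read it too. As a hedged illustration, here is a small Python sketch that lists the citation keys in BibTeX text. It uses a simple regular expression rather than a real BibTeX parser, so treat it as a rough approximation; the function name `citation_keys` and the sample entries are made up for the example.

```python
import re

# Matches the start of an entry such as "@article{Powell2016," and captures
# the entry type and the citation key. A sketch, not a full BibTeX parser.
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
# BibTeX blocks that use the same @name{...} syntax but are not references.
NON_ENTRIES = {"comment", "string", "preamble"}


def citation_keys(bibtex_text: str) -> list:
    """Return the citation keys found in BibTeX text, in file order."""
    keys = []
    for entry_type, key in ENTRY_RE.findall(bibtex_text):
        if entry_type.lower() not in NON_ENTRIES:
            keys.append(key)
    return keys


# Made-up sample of what an exported file might contain.
sample = """
@article{Powell2016,
  title = {A PDF workflow},
  year = {2016}
}
@book{Smith2010,
  title = {Another title},
  year = {2010}
}
"""
print(citation_keys(sample))
```

To run it on a real export you would read the file's text first, for example with `pathlib.Path(...).read_text()`, pointing at wherever you told Mendeley to write the file.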

The main thing I like about this workflow is that it is so simple it hardly counts as a workflow at all. And I have to admit that Mendeley is very restful and simple to use.

The only, though very irritating, drawback to this workflow is that while all your other tools can find your PDF highlights and annotations, Mendeley cannot. You can see them in the internal Mendeley viewer, but they are not searchable. It would be really simple for Mendeley to implement this, but they are never going to do it in spite of over 3000 votes for this feature. They say it is something to do with privacy, ha ha, but obviously they just want to maintain their lock-in. Anyway, don’t worry about it: you can still search for your highlights and annotations using your desktop search tool and open the files it finds directly, without worrying about Mendeley at all and without killing a single kitten.






This blog by Steve Powell is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, syndicated on r-bloggers and powered by Blot.