
Citation management

How a 21st century researcher proves their ideas are important enough to present through 20th century infrastructure in the manner of a 19th century gentleman

The genealogy of truth is important and there are many important ideas about how we could track it, especially with advances in technology. However, this page is not about the propagation of evidence that the practice of citation was developed to support, but rather about the pragmatics of the charade of evidential support that actually-existing academic publishing requires.

In particular, how can I get my journal-ready citations in the 1890 format required by my journals with the bare minimum of dicking around? Moving citation workflows forward into the 1940s can be accomplished by someone else with time.

Citation traumas to avoid repeating

For my own part, I use Zotero. This option is open source, powerful and hackable. It could be more user-friendly; but then the competitors set the bar so low that this is hardly a criticism. Slightly more user-friendly but less hackable is Mendeley, a closed-source reference manager that also is not awful.

Since I have no patience for things that cannot be automated, Zotero is an easy choice for me; you may wish to try both.

There are other options such as Papers (meh) and Sente (ugh!) and (sigh) Endnote. I won’t link or refer to those further here, for the reason that I’ve already lost too much data that way, and I don’t intend to lose more. I can’t in good conscience advise anyone else to waste their precious time.

Still, if you’re keen: For any closed-source software my advice is the same — Try and see how well you can get data in and out, en masse, because that’s what you’ll have to do if the company goes bankrupt or gets bought by Google and shut down, or by Yahoo and accidentally set on fire, or by Facebook and you are only allowed to use it if you click on ads promoting sports shoes for 18 minutes out of every hour.

All the alternatives apart from Mendeley and Zotero have failed the test of preserving my precious data when I exported it to a different software package, so using those other packages is putting my work in the uncaring hands of an unaccountable third party. For example, to actually extract my data from Sente I had to burn a whole working week turning their malformed markup into valid XML (which is a specialty that I don’t care about and no-one should ever need to care about), and I still couldn’t work out how to parse some of it. Then many other things went wrong.
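Before committing to a package, it is worth checking mechanically whether its export is even well-formed. The following is a minimal sketch using only the Python standard library; the function name and the sample strings are made up for illustration, and real exports will fail in more creative ways than this one catches.

```python
import xml.etree.ElementTree as ET


def xml_export_problem(xml_text):
    """Return None if xml_text is well-formed XML, else a short error description."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # err.position is a (line, column) tuple pointing at the first failure.
        line, column = err.position
        return f"line {line}, column {column}: {err}"
    return None


# Hypothetical exports: one well-formed, one with a mismatched closing tag.
good = "<library><item><title>Foo</title></item></library>"
bad = "<library><item><title>Foo</item></library>"

print(xml_export_problem(good))
print(xml_export_problem(bad))
```

Well-formedness is a low bar; an export can parse cleanly and still have lost half your fields, so compare record counts too.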

If you mostly care about LaTeX (a decision constrained by the lowest common denominator amongst your collaborators) you might be able to survive on BibDesk or JabRef or just editing a plain BibLaTeX file, but I for one could not bear to give up the browser integration of Zotero, which has saved innumerable hours of painstaking, pointless typing.

See also text editors, academic writing workflow.

TODO: complain about the entire structure of citations in the electronic age (keep it short though, because everyone is tired of complaining about it, and at least it’s better than the general howling void of unsourced internet media.)

TODO: apologise for accidentally complaining at length despite my stated aim of keeping it short.

Zotero

My weapon of choice.

The main insight from my outdated previous citation management page:

Zotero has an API with which you can both read and write data. So do Connotea and Bibdesk. But Zotero’s API is the cleanest, and has an active community around it, and I don’t feel that I am locking my data away forever if I rely upon it. (Of course, you can always try to migrate data around from anything to anything with BibTeX, but if you have URLs in there, or consort with foreigners who dare to have diacritics in their names, this usually leads to trouble.) I can use the API to make changes that I couldn’t make manually, without worrying about that parsing nonsense. Since all citation software is, basically, awful, it is even more important that whichever application you choose is one that you can get your data out of when you find a less awful option. Moreover, some other apps, such as Mendeley, already use the Zotero API, so you know that it’s not going to be a community of one playing with it[…]
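To give a flavour of reading data through the API, here is a minimal sketch against the Zotero Web API (version 3) using only the Python standard library. The user ID and key are placeholders, the function names are mine, and it only reads; for anything serious a dedicated client library is likely a better idea.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.zotero.org"


def items_request(user_id, api_key, limit=5):
    """Build (url, headers) for listing items in a personal Zotero library."""
    query = urllib.parse.urlencode({"format": "json", "limit": limit})
    url = f"{API_ROOT}/users/{user_id}/items?{query}"
    headers = {"Zotero-API-Version": "3", "Zotero-API-Key": api_key}
    return url, headers


def fetch_titles(user_id, api_key, limit=5):
    """Fetch items over the network and return their titles."""
    url, headers = items_request(user_id, api_key, limit)
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        items = json.load(response)
    # Each item is a JSON object whose bibliographic fields live under "data".
    return [item["data"].get("title", "") for item in items]


# Placeholder credentials; building the request does not touch the network.
url, headers = items_request("123456", "hypothetical-key")
print(url)
```

Calling `fetch_titles` with a real user ID and API key is what actually hits the server; the split keeps the URL-building part easy to inspect on its own.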

Useful hacks

  • The Better BibTeX plugin for Zotero, a.k.a. BBT

  • For exporting references from Zotero to my plain text blog, I wrote a CSL file which renders my citations and bibliographies as Markdown and reStructuredText, which is good enough for the internet. See below for more on that, or on doing it better.

  • You can write a custom exporter for Zotero without too much pain, but if you have BBT you probably have all the ones you need. Nonetheless, here are overview docs, detail docs, and all the code. Moreover, here is a simple example which deals with the sensitive citekey issue (albeit with an outdated version of the citekey system). Here is a soothing walk through the process: <https://www.mediawiki.org/wiki/Citoid/Creating_Zotero_translators>.

Zotero for your tablet/e-reader

to be evaluated, the currently active clients seem to be

Plaintext citing for your website/blog

tl;dr. There are many over-engineered solutions. The style file one is the simplest I’ve found, and it should additionally work for anything which supports CSL. This includes Mendeley at the least, and possibly Papers. Others?

Anyway, if you want a more fragile but fancier solution, there are other options below.

Render using a CSL style file

Slightly weird, but robust and simple.

Screencap of Zotero
  • A nice CSL editor online

  • Here is my ReST style file, restructuredtext.csl, which renders citations as plain text with ReST markup, including anchor links.

    It’s ugly, because it has to battle with a grumpy rich-text XML infrastructure to render plain text, but it gets the job done without any coding, and is robust against software changes.

  • Here is my Markdown style file, markdown.csl, which, likewise, renders citations as plain text with Markdown markup, including anchor links.

pandoc-based

Web-native solutions

For general blogs there are a few suggestions below.

For Jupyter etc., I filed the suggestions under scientific workbooks.

These can avoid the need to create bibfiles, rendering bibliographies directly rather than going through bibtex, which is a good thing for websites.

zotero citation

For Atom+Zotero+Markdown, try zotero citation. References look like

[\(Heyns, 2014\)](#@heyns2014)
[\(Heyns, 2014\)](?@heyns2014,heyns2015)

The bibliography is rendered in either pandoc-YAML or plaintext format at

[#bibliography]: #
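If you ever need to get the citekeys back out of a document written this way (say, to check them against your library), the link targets are regular enough to parse. This is a rough sketch that assumes the two link shapes shown above are the only ones in use; the regular expression and function name are my own, not part of the plugin.

```python
import re

# Matches the link target, e.g. "(#@heyns2014)" or "(?@heyns2014,heyns2015)".
LINK_TARGET = re.compile(r"\]\([#?]@([^)\s]+)\)")


def citekeys_in(markdown_text):
    """Return citekeys from zotero-citation style links, in order, without repeats."""
    keys = []
    for match in LINK_TARGET.finditer(markdown_text):
        for key in match.group(1).split(","):
            if key and key not in keys:
                keys.append(key)
    return keys


sample = r"""
[\(Heyns, 2014\)](#@heyns2014)
[\(Heyns, 2014\)](?@heyns2014,heyns2015)
"""
print(citekeys_in(sample))
```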

zotxt

For Zotero plus any plaintext format, try zotxt.

You can choose how to wrap the citekeys; by default they look like BibTeX:

\cite{heyns_foo_2014,heyns_bar_2015}

They are rendered in the output by an in-built pandoc filter, which is installed separately:

The preferred pandoc-citeproc format seems to be something with an @ sign and/or occasional square brackets:

Blah blah [see @heyns_foo_2014, pp. 33-35; also @heyns_bar_2015, ch. 1].
But @heyns_baz_2016 says different things again.
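If you have text full of the BibTeX-style `\cite{...}` wrapping and want the pandoc form instead, the conversion is mechanical. Here is a minimal sketch; it handles only plain `\cite{key1,key2}` with no optional arguments, and the function name is made up for illustration.

```python
import re

CITE_COMMAND = re.compile(r"\\cite\{([^}]*)\}")


def cite_to_pandoc(text):
    r"""Rewrite \cite{a,b} as the pandoc form [@a; @b]."""

    def replace(match):
        keys = [key.strip() for key in match.group(1).split(",") if key.strip()]
        return "[" + "; ".join("@" + key for key in keys) + "]"

    return CITE_COMMAND.sub(replace, text)


print(cite_to_pandoc(r"Blah blah \cite{heyns_foo_2014,heyns_bar_2015}."))
```

Page ranges and prefixes such as "see" have no equivalent in the bare `\cite{}` form, so those still need adding by hand.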

Play around:

# Using the CSL transform
pandoc -F pandoc-citeproc --csl="APA" --bibliography=bibliography.bib -o document.pdf document.md
# or using biblatex and the traditionalist workflow (no pandoc-citeproc filter here; biblatex resolves the citations when the .tex file is compiled).
pandoc --biblatex --bibliography=bibliography.bib -o document.tex document.md

You will need to set Preferences > Export > Default format to Better BibTeX citation key quick copy, and set Better BibTeX to copy pandoc-style citations.

See the pandoc manual and the pandoc-citeproc manual.

Misc

Oh, BibTeX

None of that faffing about is useful if you are working with academics, who don’t regard words on the internet as a real thing. Your words must be behind a paywall where no-one can read them to count as significant. Moreover, they must have been rendered harder to analyse by running them through LaTeX, and obfuscating them into a PDF, which probably also entails using BibTeX to do the citation stuff.

If you are starting from Zotero, you can use Better BibTeX to make this less painful.

To fix annoying problems in the BibTeX files themselves algorithmically, I use bibtool.
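For a sense of what "algorithmically" means here, the following stand-in (not bibtool itself, just a standard-library sketch) finds duplicated citekeys in a BibTeX file. The regular expression is a simplification that assumes each entry starts on its own line as `@type{key,`, and the sample entries are invented.

```python
import re
from collections import Counter

# An entry starts with "@type{key,"; @comment/@string/@preamble are not entries.
ENTRY_START = re.compile(r"^\s*@(\w+)\s*\{\s*([^,\s]+)\s*,", re.MULTILINE)
NOT_ENTRIES = {"comment", "string", "preamble"}


def duplicate_keys(bibtex_text):
    """Return a sorted list of citekeys that appear more than once."""
    keys = [
        key
        for kind, key in ENTRY_START.findall(bibtex_text)
        if kind.lower() not in NOT_ENTRIES
    ]
    counts = Counter(keys)
    return sorted(key for key, n in counts.items() if n > 1)


sample = """
@article{heyns_foo_2014,
  title = {Foo},
}
@book{heyns_bar_2015,
  title = {Bar},
}
@article{heyns_foo_2014,
  title = {Foo again},
}
"""
print(duplicate_keys(sample))
```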

To mention: Jabref, bibdesk, biblatex vs bibtex vs biber, character set sadness.