PDFs

Efficiently using a thousand dollar computer to simulate a one cent piece of paper

Command line tips

Reduce size of bloated PDF:

gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.5 -dPDFSETTINGS=/ebook \
    -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf
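
If you would rather drive this from Python than from the shell, the same call can be assembled as an argument list. This is a minimal sketch, assuming `gs` is on your PATH; the helper names `shrink_command` and `shrink_pdf` are made up for illustration, and the preset list is the usual set of `-dPDFSETTINGS` values.

```python
import subprocess

# The usual Ghostscript -dPDFSETTINGS presets.
PRESETS = ("screen", "ebook", "printer", "prepress", "default")


def shrink_command(src, dst, preset="ebook"):
    """Build the gs argument list for the shrink recipe (hypothetical helper)."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    return [
        "gs", "-sDEVICE=pdfwrite", "-dCompatibilityLevel=1.5",
        f"-dPDFSETTINGS=/{preset}",
        "-dNOPAUSE", "-dQUIET", "-dBATCH",
        f"-sOutputFile={dst}", src,
    ]


def shrink_pdf(src, dst, preset="ebook"):
    """Run Ghostscript; raises CalledProcessError if gs exits non-zero."""
    subprocess.run(shrink_command(src, dst, preset), check=True)
```

Passing a list rather than a shell string means file names with spaces need no quoting.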

Or, wrapped up into a nice little script, ShrinkPDF (90 is the output resolution in dpi here):

./shrinkpdf.sh in.pdf out.pdf 90

This works to concatenate PDFs:

gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite \
    -dPDFSETTINGS=/prepress -sOutputFile=output.pdf input*.pdf
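
One catch with `input*.pdf`: the shell expands the glob in lexicographic order, so `input10.pdf` lands before `input2.pdf`. A rough sketch of building the same command with the inputs in natural (numeric) order; `natural_key` and `concat_command` are hypothetical names, not part of any library.

```python
import glob
import re


def natural_key(name):
    """Sort key that compares runs of digits as numbers, so input2 < input10."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def concat_command(pattern, dst):
    """Build the gs concatenation call with inputs in natural order."""
    inputs = sorted(glob.glob(pattern), key=natural_key)
    return [
        "gs", "-dBATCH", "-dNOPAUSE", "-q", "-sDEVICE=pdfwrite",
        "-dPDFSETTINGS=/prepress", f"-sOutputFile={dst}", *inputs,
    ]
```

Hand the resulting list to `subprocess.run(..., check=True)` to do the actual merge.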

EPS to PDF conversion:

ps2pdf14 -dEPSCrop Logo.eps

Quick and dirty RGB-CMYK using recent ghostscript

Programmatic editing and generation

pdfrw:

pdfrw is a Python library and utility that reads and writes PDF files:

* Operations include subsetting, merging, rotating, modifying metadata, etc. [...]
* Can be used either standalone, or in conjunction with reportlab to reuse existing PDFs in new ones

Here is a gentle HOWTO. You can use it to put matplotlib plots in reportlab PDFs

svglib is a pure-Python library that converts SVG to PDF, with a command line utility for the same, svg2pdf. You can use it to add SVGs to PDFs.

reportlab is far more famous and even includes a modicum of typesetting. It doesn’t edit PDFs so much, but it generates them pretty well. Its integration with other things is often a little weak, if you thought that dropping in LaTeX equations or HTML snippets would be simple. OTOH it includes its own chart generation and so on. Use it if this is a natural way to make two columns for you:

from reportlab.platypus import BaseDocTemplate, Frame, Paragraph, PageTemplate
from reportlab.lib.styles import getSampleStyleSheet
import random

words = "lorem ipsum dolor sit amet consetetur sadipscing elitr sed diam nonumy eirmod tempor invidunt ut labore et".split()

styles = getSampleStyleSheet()
elements = []

# showBoundary=1 draws the frame outlines, handy while debugging the layout
doc = BaseDocTemplate('basedoc.pdf', showBoundary=1)

# Two columns, each half the text width less 6 points, leaving a 12 point gutter
frame1 = Frame(doc.leftMargin, doc.bottomMargin, doc.width/2 - 6, doc.height, id='col1')
frame2 = Frame(doc.leftMargin + doc.width/2 + 6, doc.bottomMargin, doc.width/2 - 6, doc.height, id='col2')

# One long paragraph of random filler that flows through the columns
elements.append(Paragraph(" ".join(random.choice(words) for _ in range(1000)), styles['Normal']))
doc.addPageTemplates([PageTemplate(id='TwoCol', frames=[frame1, frame2])])

# Start the construction of the PDF
doc.build(elements)
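
The frame arithmetic above generalises to any number of columns. A small sketch, with no reportlab dependency, that computes the same boxes; `column_boxes` is a hypothetical helper, and the default 12 point gutter is simply what the two `6`s above add up to.

```python
def column_boxes(left, bottom, width, height, n_cols=2, gutter=12):
    """Return (x, y, w, h) for n_cols equal columns separated by `gutter` points."""
    col_width = (width - (n_cols - 1) * gutter) / n_cols
    return [
        (left + i * (col_width + gutter), bottom, col_width, height)
        for i in range(n_cols)
    ]
```

Each tuple can be unpacked straight into a frame, e.g. `Frame(x, y, w, h, id='col%d' % i)`, with `doc.leftMargin`, `doc.bottomMargin`, `doc.width` and `doc.height` as the inputs.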

pypdf2 is another Python PDF library.

scribus is a reasonable open-source desktop publishing tool. If your content cannot be laid out automatically it is a good choice, e.g. for posters. It includes a Python API, albeit a reputedly quirky one, which is AFAICT Python 2.

For all that, it’s the cleanest way I have yet seen of generating PDFs, so might be a goer for you.

crop marks

There are a few options.

None makes it clear which of TrimBox, BleedBox, CropBox or ArtBox is what you truly want. This might clarify it slightly but I vagued out.

You can add crop marks to a PDF document with different PDF tools, e.g. pdftk:

  1. Export the first page with crop marks to a PDF file (your_cropmark.pdf)

  2. Join it with your PDF document (your_document.pdf) in the command line:

pdftk your_document.pdf multistamp your_cropmark.pdf output result.pdf

NOTE: you can also set PDF cropping values with GhostScript for printing:

  1. Create a plain text file (pdfmark.txt) with the right cropping values (e.g. this is a 5 mm crop of A4):
[/CropBox [14.17 14.17 581.1 827.72] /PAGES pdfmark

Alternatively, use the command line

gs -c "[/CropBox  [14.17 14.17 581.1 827.72] /PAGES pdfmark" \
  2. Convert your_document.pdf using the previous file (pdfmark.txt):
gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=result.pdf $OPTIONS -c .setpdfwrite -f your_document.pdf pdfmark.txt
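
The CropBox numbers are in PDF points (72 per inch), so they come from converting the page size and margin from millimetres. A minimal sketch of that arithmetic; `crop_box_pdfmark` is a hypothetical helper, and it assumes the same margin on all four edges.

```python
MM_TO_PT = 72 / 25.4  # PDF units are points: 72 per inch, 25.4 mm per inch


def crop_box_pdfmark(width_mm, height_mm, margin_mm):
    """Return a /CropBox pdfmark line trimming margin_mm from every edge."""
    margin = margin_mm * MM_TO_PT
    box = [
        margin,                         # lower-left x
        margin,                         # lower-left y
        width_mm * MM_TO_PT - margin,   # upper-right x
        height_mm * MM_TO_PT - margin,  # upper-right y
    ]
    values = " ".join(str(round(v, 2)) for v in box)
    return f"[/CropBox [{values}] /PAGES pdfmark"


# A4 is 210 mm x 297 mm; this reproduces the 5 mm crop used above
print(crop_box_pdfmark(210, 297, 5))
```

Writing the returned line to pdfmark.txt gives the file used in the conversion step.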

Color conversion

Nightmares. Colour management is generally complicated. ghostscript colour management specifically is complicated, with many moving parts and rapid changes, e.g. the -dUseCIEColor option was removed in ghostscript 9.

CMYK

NOTE II: optional color conversion of RGB PDF with GhostScript:

PDF to TIFF example.

gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sColorConversionStrategy=CMYK -sColorConversionStrategyForImages=CMYK -sOutputFile=result_cmyk.pdf -dProcessColorModel=/DeviceCMYK -dCompatibilityLevel=1.5 your_document.pdf

grayscale

gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sColorConversionStrategy=Gray -sOutputFile=result_gray.pdf -dProcessColorModel=/DeviceGray -dCompatibilityLevel=1.4 your_document.pdf
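
The CMYK and grayscale recipes differ only in the conversion strategy, the matching process colour model and the compatibility level, so they fold into one function. A sketch only: `convert_command` is a hypothetical name, the `-sColorConversionStrategyForImages` flag from the CMYK recipe is left out for brevity, and whether the result is colour-accurate is a separate question.

```python
# Each -sColorConversionStrategy value paired with its -dProcessColorModel.
PROCESS_MODELS = {"CMYK": "DeviceCMYK", "Gray": "DeviceGray"}


def convert_command(src, dst, strategy="CMYK", compat="1.4"):
    """Build a gs call that rewrites src into the given colour space."""
    if strategy not in PROCESS_MODELS:
        raise ValueError(f"unsupported strategy: {strategy!r}")
    return [
        "gs", "-q", "-dNOPAUSE", "-dBATCH", "-sDEVICE=pdfwrite",
        f"-sColorConversionStrategy={strategy}",
        f"-dProcessColorModel=/{PROCESS_MODELS[strategy]}",
        f"-dCompatibilityLevel={compat}",
        f"-sOutputFile={dst}", src,
    ]
```

For example, `convert_command("your_document.pdf", "result_gray.pdf", strategy="Gray")` gives the grayscale call as a list ready for `subprocess.run`.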