The Living Thing / Notebooks :

Media metadata management, transcoding and editing

OK, I make music and DJ, and I would like to bulk edit and search my media using my own criteria, especially when it comes to dealing with the crappy media metadata that other artists give me with their tracks.

See also machine listening, playing music.



Technical details of converting AV formats from whatever you have, to whatever you need to use.

Check with your local jurisdiction’s intellectual property laws before doing any of these.

See also remix, innovation.

rip web videos

Remember kids, for fair use only!

youtube-dl is an incredible script that (despite the name) downloads not just youtube videos but whole playlists of videos from many many websites, setting up transcoding etc for offline use.
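A minimal sketch of the playlist case, assuming youtube-dl and ffmpeg are both installed (audio extraction relies on the latter). The function name and the output filename template are illustrative choices, not anything youtube-dl requires.

```shell
# Hypothetical helper: save every entry of a playlist as an MP3.
# Assumes youtube-dl and ffmpeg are on the PATH.
fetch_playlist_audio() {
  # $1 is the playlist URL, supplied by the caller.
  youtube-dl --yes-playlist --extract-audio --audio-format mp3 \
    --output '%(playlist_index)s-%(title)s.%(ext)s' "$1"
}
```

Call it as fetch_playlist_audio 'PLAYLIST_URL', with a real address in place of the placeholder.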

rip VCDs

Rip VCDs, because simply copying the files doesn’t work (see also ripping VCD to various formats). There are two choices. The first is MEncoder, which is ubiquitous but ugly.

$ mplayer vcd://  # tells you how many tracks. rip desired ones:
$ for i in 2 3 4 5 6; do
>   mencoder vcd://$i -oac lavc -ovc lavc -o track_$i.avi ;
>   done

Depending on where you want to play it, the following non-re-encoding step might be more hi-fi:

$ mplayer vcd://2 -dumpstream -dumpfile filename.mpg  # No re-encoding

On the other hand, you might want to play this on a Mac, which won’t work with either of the above steps without specialist software, so you’ll need to re-encode. See ffmpeg, below, for that, since I couldn’t make it work with MEncoder.

Alternatively, use a specialist VCD ripper, such as vcdxrip in the vcdimager system.


ffmpeg (TBD) is amazing for video, or extracting audio from video.

If you wish to salvage pure audio for your fair-use sampling by getting rid of the empty video track:

ffmpeg -i mangled_file.m4a -acodec copy -vn plain_audio_file.m4a
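Before stripping anything it can help to see which streams the file actually contains. A small sketch using ffprobe, which ships with ffmpeg; the function name is hypothetical.

```shell
# Hypothetical helper: print one line per stream in a media file,
# giving its index, codec name and type (audio, video, ...).
list_streams() {
  ffprobe -v error \
    -show_entries stream=index,codec_name,codec_type \
    -of csv=p=0 "$1"
}
```

If list_streams mangled_file.m4a reports a video stream, the -vn command above is the one to reach for.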

Or to stitch photos to video for making animated GIFs:

ffmpeg -framerate 5 -start_number 1234 -i IMG_%04d.JPG -c:v libx264 -pix_fmt yuv420p -vf scale=1920:-2:flags=lanczos -crf 20 -preset slow -c:a copy ../something.m4v

The documentation is not so much abstruse as dependent on knowledge of video formats, which is one of the most boring domains of human endeavour imaginable.

There is a wiki for each major format, e.g. AAC audio and H.264 video.

There is a dummies’ guide here.

Using these I have stitched together a workflow (substitute your own output name for filename_$i.mp4):

$ for i in 2 3 4 5 6 7 8 9 10 11 12 13; do
>   ffmpeg -i filename_$i.avi \
>     -c:v libx264 -preset slower -crf 22 \
>     -c:a libfdk_aac -vbr 5 \
>     filename_$i.mp4
>   done

Speaking of concatenation, here’s how to concatenate photos into a low-framerate movie from IMG_5815 and subsequent sequential shots:
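A sketch of that, reusing the options from the photo-stitching command earlier in this section. It assumes the shots are named IMG_5815.JPG, IMG_5816.JPG and so on with no gaps in the numbering; the function name, the default of 2 frames per second and the output filename are placeholders.

```shell
# Hypothetical helper: turn a numbered run of photos into a slow movie.
# $1 = number of the first photo, $2 = output file, $3 = frames per second.
photos_to_movie() {
  first="$1"; out="$2"; fps="${3:-2}"   # fps defaults to 2 if not given
  ffmpeg -framerate "$fps" -start_number "$first" -i IMG_%04d.JPG \
    -c:v libx264 -pix_fmt yuv420p -vf scale=1920:-2:flags=lanczos \
    -crf 20 -preset slow "$out"
}
```

For the shots mentioned above that would be photos_to_movie 5815 slideshow.m4v.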


  • XLD, X Lossless Decoder, is an amazing free app for transcoding arbitrary audio between formats
  • Need offline versions of youtube videos or youtube video soundtracks? Firefox extension Media extractor gets these. There are SO MANY times you need this, such as giving lectures in Indonesia with supporting videos where you don’t have 1 hour to cache EACH VIDEO if YOU ARE LUCKY. Grrrrrrr. Also, AFAICT it’s legal in Indonesia as long as you don’t show penises in said video, but I am no lawyer.
  • Automated transcoding of some format playable in iTunes which you can’t sample in your audio software:
    • Macsome Audiobook/itunes converter (Sporadically updated, has trouble with recent iTunes)
    • tuneskit have a converter that seems to focus mostly on DRM removal rather than general media conversion. Whilst it works with recent iTunes, it is not fantastically flexible in its configuration options; if you have a particular encoding quality or sampling rate that you wish to convert to, this will not help you, and you will end up using XLD.
    • noteburner will also give it a burl.


Reduce size of bloated PDF:

gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.5 -dPDFSETTINGS=/ebook \
  -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf

or, wrapped up into a nice little script, ShrinkPDF (infile, outfile, dpi):

./ in.pdf out.pdf 90
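Ghostscript occasionally produces a larger file than it was given, for instance when the input is already well compressed. A hedged sketch of a wrapper that keeps the original in that case; the function name is hypothetical and the settings are those of the gs command above.

```shell
# Hypothetical wrapper: shrink src into dst, falling back to a plain
# copy when Ghostscript's output is not smaller than the original.
shrink_if_smaller() {
  src="$1"; dst="$2"
  gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.5 -dPDFSETTINGS=/ebook \
    -dNOPAUSE -dQUIET -dBATCH -sOutputFile="$dst" "$src" || return 1
  # Byte counts of both files; the arithmetic expansion trims any
  # padding that some wc implementations print.
  old=$(($(wc -c < "$src")))
  new=$(($(wc -c < "$dst")))
  if [ "$old" -le "$new" ]; then
    cp "$src" "$dst"
  fi
}
```

Usage would be along the lines of shrink_if_smaller input.pdf output.pdf.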

This works to concatenate PDFs:

gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite \
  -dPDFSETTINGS=/prepress -sOutputFile=output.pdf input*.pdf

editing/annotating metadata


exiftool and exiv2 seem to be popular media manipulation libraries. pyexiv2 is a Python binding.

Erase all (or most) of the explicit metadata from an image:

exiftool -all= filename.jpg


ExifTool is not guaranteed to remove metadata completely from a file when attempting to delete all metadata. For JPEG images, all APP segments (except Adobe APP14, which is not removed by default) and trailers are removed which effectively removes all metadata, but for other formats the results are less complete:

  • JPEG - APP segments (except Adobe APP14) and trailers are removed.
  • TIFF - XMP, IPTC and the ExifIFD are removed, but some EXIF may remain in IFD0.
  • PNG - Only iTXt, tEXt and zTXt chunks (including XMP) are removed.
  • PDF - The original metadata is never actually removed.
  • PS - Only some PostScript and XMP may be deleted.
  • MOV/MP4 - Only XMP is deleted.
  • RAW formats - It is not recommended to remove all metadata from RAW images because this will likely remove some proprietary information that is necessary for proper rendering of the image.
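
Given those caveats, it seems prudent to check what is left after stripping. A minimal sketch; the function name is hypothetical. Note that by default exiftool keeps the untouched file alongside, as a copy with _original appended to its name.

```shell
# Hypothetical helper: strip metadata, then list whatever tags remain.
# -all= deletes every tag exiftool is able to remove from this format;
# -a shows duplicate tags, -G1 prefixes each tag with its group,
# -s prints short tag names.
strip_and_report() {
  exiftool -all= "$1" && exiftool -a -G1 -s "$1"
}
```

Run it as strip_and_report filename.jpg. Entries describing the file itself, such as its name and size, should still show up in the listing, since they come from the file system rather than from embedded metadata.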