Python's audio analysis toolkit is impressive; see machine listening. However, its synthesis is mediocre. Which is not to say terrible. Although I might, later. But in a pinch, you can get audible feedback from that audio analysis.
DIY, bareback style.
Tedious, but do-able. The best example I know is paulstretch, which creates a lovely phase vocoder from raw FFT, reasonably compactly. For real-time things this is not easy, nor are many basic audio techniques easy even offline.
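To give a flavour of the DIY approach, here is a minimal sketch in the spirit of paulstretch, assuming numpy and a mono signal already loaded as a float array. It is not paulstretch's actual code: the `stretch` function, the Hann window and the 50% overlap are my own illustrative choices, and it skips the amplitude correction a careful implementation would want. The core trick, as I understand it, is to keep each frame's FFT magnitudes, randomise the phases, and overlap-add the frames at a wider spacing than they were read.

```python
import numpy as np


def stretch(samples, factor, window_size=4096, seed=0):
    """Stretch a mono signal by `factor` using phase-randomised FFT frames."""
    rng = np.random.default_rng(seed)
    window = np.hanning(window_size)
    hop_out = window_size // 2      # frames are written with 50% overlap
    hop_in = hop_out / factor       # ...but read from the input more slowly
    # Pad so that the last analysis frame is always full length.
    padded = np.concatenate([samples, np.zeros(window_size)])
    n_frames = int(len(samples) / hop_in) + 1
    out = np.zeros(n_frames * hop_out + window_size)
    for i in range(n_frames):
        start = int(i * hop_in)
        frame = padded[start:start + window_size] * window
        magnitude = np.abs(np.fft.rfft(frame))
        # Keep the magnitudes, throw away the phases.
        phase = rng.uniform(0, 2 * np.pi, len(magnitude))
        smeared = np.fft.irfft(magnitude * np.exp(1j * phase), n=window_size)
        # Overlap-add; no amplitude normalisation in this sketch.
        out[i * hop_out:i * hop_out + window_size] += smeared * window
    return out
```

Calling `stretch(signal, 8.0)` on a few seconds of audio should give a smeared, roughly eight-times-longer version. Note that this is strictly offline: the whole input has to be in memory before anything comes out.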
Not quite trivial.
Here is a comparison of options. Summary:
- resampy is fast and works on python3, and is OK
- NNresample optimises builtin scipy resampling for quality in audio
- scikit-samplerate is hifi but infrequently maintained, and YMMV with the external C code dependencies.
- if you are loading a file, often the file loader can do the sample rate conversion for you
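For reference, the plain scipy baseline that these packages compete with looks something like the following sketch. It assumes scipy and numpy are installed and that both sample rates are integers. The `resample` helper is my own wrapper, not the API of any package listed above.

```python
from math import gcd

import numpy as np
from scipy.signal import resample_poly


def resample(samples, sr_in, sr_out):
    """Polyphase resampling between two integer sample rates."""
    g = gcd(sr_in, sr_out)
    # Upsample by sr_out/g, low-pass filter, then downsample by sr_in/g.
    return resample_poly(samples, sr_out // g, sr_in // g)


sr_in, sr_out = 44100, 16000
t = np.arange(sr_in) / sr_in                # one second of audio
tone = np.sin(2 * np.pi * 440.0 * t)
resampled = resample(tone, sr_in, sr_out)   # one second at the new rate
```

This uses scipy's default anti-aliasing filter, which is presumably the part the audio-specific packages try to improve on.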
If you want to read MP3, audioread is simple and easy, but it breaks in opaque ways when you use it concurrently and has crappy error handling (everything is …)
This is the
If you don't care about MP3 then SoundFile does the job, but it is hard to compile.
If you want to load everything and do it really fast, and can live with a tricky time trapping the cause of errors, you can invoke ffmpeg from python, which is very fast and does various FX processing for free. This is what I now do.
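A minimal sketch of the ffmpeg route, using only the standard library, might look like this. It assumes an `ffmpeg` binary is on your PATH. The helper names `ffmpeg_decode_command` and `load_audio` are made up for illustration, and the error handling is about as crude as advertised, since all you get back from ffmpeg is free text on stderr.

```python
import array
import subprocess
import sys


def ffmpeg_decode_command(path, sample_rate=44100, channels=1):
    """Build an ffmpeg command that writes raw float32 PCM to stdout."""
    return [
        "ffmpeg", "-nostdin", "-v", "error",
        "-i", str(path),
        "-f", "f32le",            # raw 32-bit little-endian floats, no header
        "-ac", str(channels),     # mix to this many channels
        "-ar", str(sample_rate),  # resample while decoding
        "pipe:1",                 # write to stdout
    ]


def load_audio(path, sample_rate=44100, channels=1):
    """Decode any file ffmpeg understands into an array of floats."""
    proc = subprocess.run(
        ffmpeg_decode_command(path, sample_rate, channels),
        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
    )
    if proc.returncode != 0:
        # ffmpeg reports problems as free text on stderr; surface it.
        raise RuntimeError(proc.stderr.decode(errors="replace"))
    samples = array.array("f")
    samples.frombytes(proc.stdout)
    if sys.byteorder == "big":
        samples.byteswap()        # the stream is little-endian
    return samples                # interleaved if channels > 1
```

Sample rate conversion and channel mixing come along for free via `-ar` and `-ac`, which is part of the appeal.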
Notable non-abandoned projects include pydub, amen and pippi.
pippi is a little idiosyncratic but seems to do synthesis quick-n-easy with reasonable optimisation using cython. The documentation and packaging are a mess, though. I think it might even do realtime? Heavily developed.
    brew install libsndfile
    pip install pippi
Amen has strong machine-listening tools, integrating
but its effects are weak sauce: just cutting and pasting audio around in time with nice crossfades.
It does cute re-edits, but nothing else.
The idea of keeping the analysis metadata attached to the samples is nice though.
    brew install libsndfile1 libav-tools
    pip install amen
pydub has lots of audio DSP, effects and editing procedures, but only basic audio analysis. Also, weirdly, it aims to be pure python (i.e. no numpy), which makes some things embarrassingly slow and means there is a lot of re-implementing numpy. So it runs everywhere, but not great anywhere. (Maybe this would be fast with a JIT-compiled python?)
More specialised: pyworld is a python wrapper for a speech-specific analysis-resynthesis method, WORLD, by Masanori Morise. This will really only work on things that are very much like solo human voice.
Fancy realtime audio libraries
pyo (github) is a python audio processing framework by Olivier Bélanger. Supports python 2.7 and 3.5-3.6. It wants to run a wxPython GUI, which is its own kind of inconvenience in turn, as it conscripts you into a toolkit war. Nonetheless it can do some neat stuff, and the wxPython GUI is pretty good, so if you don't mind a mildly opinionated library, this is a nice thing to work with. It's a one-man shop, which indicates very impressive productivity on the part of its creator. This guy has more or less reimplemented the supercollider scsynth infrastructure.
It claims to be
a Python module written in C to help DSP script creation. Pyo contains classes for a wide variety of audio signal processing. With pyo, the user will be able to include signal processing chains directly in Python scripts or projects, and to manipulate them in real time through the interpreter. Tools in the pyo module offer primitives, like mathematical operations on audio signals, basic signal processing (filters, delays, synthesis generators, etc.), but also complex algorithms to create sound granulation and other creative audio manipulations. pyo supports the OSC protocol (Open Sound Control) to ease communications between softwares, and the MIDI protocol for generating sound events and controlling process parameters. pyo allows the creation of sophisticated signal processing chains with all the benefits of a mature, and widely used, general programming language.
Here is an example
    >>> from pyo import *
    >>> s = Server().boot()
    >>> s.start()
    >>> wav = SquareTable()
    >>> env = CosTable([(0,0), (100,1), (500,.3), (8191,0)])
    >>> met = Metro(.125, 12).play()
    >>> amp = TrigEnv(met, table=env, dur=1, mul=.1)
    >>> pit = TrigXnoiseMidi(met, dist='loopseg', x1=20, scale=1, mrange=(48,84))
    >>> out = Osc(table=wav, freq=pit, mul=amp).out()
See also cecilia, a gui for pyo.
If you don't want the default distribution, which comes as weird OS-specific installer packages that want to invade your system python installation, you can build from source instead:
    brew install liblo libsndfile portaudio portmidi --universal
    git clone https://github.com/belangeo/pyo.git
    cd pyo
    python setup.py install --use-coreaudio --use-double
Note that you might still have to deal with some wxPython weirdness on OSX.
csound supports python -- specifically embedding of and within python.
FoxDot runs actual supercollider scripts from python. (As opposed to pyo, which implements a synthesis server that looks like supercollider.) It comes with an IDE, which is a waste of time IMO, but also many other worthwhile tricks, including a nice scheduler (see the docs). Rather than claiming to be a universal solution to audio, it's a righteous hack that does some startling things very well and some other things not at all. A good start, creatively speaking.
GStreamer (https://gstreamer.freedesktop.org/documentation/frequently-asked-questions/general.html) is a generic multimedia pipeline library that seems to pop up in lots of neat projects. It happens to have extensive python support. See Brett Virren's tutorial.
audiolazy, by Danilo de Jesus da Silva Bellini, looks great for technical audio analysis and synthesis, although a bit clunky for, you know, synths. Intermittently updated.
Prioritizing code expressiveness, clarity and simplicity, without precluding the lazy evaluation, and aiming to be used together with Numpy, Scipy and Matplotlib as well as default Python structures like lists and generators, AudioLazy is a package written in pure Python proposing digital audio signal processing (DSP), featuring:
A Stream class for finite and endless signals representation with elementwise operators (auto-broadcast with non-iterables) in a common Python iterable container accepting heterogeneous data;
Strongly sample-based representation (Stream class) with easy conversion to block representation using the Stream.blocks(size, hop) method;
Sample-based interactive processing with ControlStream;
Streamix mixer for iterables given their starting time deltas;
Multi-thread audio I/O integration with PyAudio;
Linear filtering with Z-transform filters directly as equations (e.g. filt = 1 / (1 - .3 * z ** -1)), including linear time variant filters (i.e., the a in a * z ** k can be a Stream instance), cascade filters (behaves as a list of filters), resonators, etc.. Each LinearFilter instance is compiled just in time when called;
Zeros and poles plots and frequency response plotting integration with MatPlotLib;
Linear Predictive Coding (LPC) directly to ZFilter instances, from which you can find PARCOR coeffs and LSFs;
Both sample-based (e.g., zero-cross rate, envelope, moving average, clipping, unwrapping) and block-based (e.g., window functions, DFT, autocorrelation, lag matrix) analysis and processing tools;
A simple synthesizer (Table lookup, Karplus-Strong) with processing tools (Linear ADSR envelope, fade in/out, fixed duration line stream) and basic wave data generation (sinusoid, white noise, impulse);
Biological auditory periphery modeling (ERB and gammatone filter models);
Multiple implementation organization as StrategyDict instances: callable dictionaries that allows the same name to have several different implementations (e.g. erb, gammatone, lowpass, resonator, lpc, window);
Converters among MIDI pitch numbers, strings like “F#4” and frequencies;
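To unpack the filter-as-equation idea from the feature list: the example filter 1 / (1 - .3 * z ** -1) is a one-pole recursive filter, i.e. the difference equation y[n] = x[n] + 0.3 y[n-1]. The following is a plain-python sketch of that same filter as a lazy generator. It does not use audiolazy at all, and the `one_pole` and `impulse` names are mine, chosen just for illustration.

```python
from itertools import islice


def one_pole(stream, a=0.3):
    """Lazily apply H(z) = 1 / (1 - a * z**-1) to an iterable of samples."""
    previous = 0.0
    for x in stream:
        previous = x + a * previous   # y[n] = x[n] + a * y[n-1]
        yield previous


def impulse():
    """An endless unit impulse: 1, 0, 0, 0, ..."""
    yield 1.0
    while True:
        yield 0.0


# The input is endless, so only take what we need from the output.
response = list(islice(one_pole(impulse()), 4))
```

The impulse response decays geometrically (roughly 1, 0.3, 0.09, 0.027), as you would expect from a single pole at 0.3. Because everything is a generator, nothing is computed until samples are pulled, which is the lazy evaluation the package description is talking about.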
Surprisingly, GNURadio supports extensive optimized DSP for python using a high-performance compiled real time dataflow graph.
- See LiveOSC, under scripting Live, for outsourcing your sound to Ableton Live
- You can control supercollider from python like this
- Render audio using midi2audio, a minimalist wrapper for fluidsynth, which renders MIDI using SoundFonts.
Real time audio
PyAudio provides Python bindings for PortAudio, the cross-platform audio I/O library. With PyAudio, you can easily use Python to play and record audio on a variety of platforms. PyAudio is inspired by:
pyPortAudio/fastaudio: Python bindings for PortAudio v18 API.
tkSnack: cross-platform sound toolkit for Tcl/Tk and Python.
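As a rough sketch of what playing audio looks like, the following generates a sine tone with the standard library and pushes it to the default output device via PyAudio's blocking API. The `sine_chunk` and `play` helpers are hypothetical names of my own. I import pyaudio inside `play` so that the tone generator still works on a machine without PortAudio, which is an assumption about your setup rather than a recommendation.

```python
import array
import math


def sine_chunk(freq, duration, sample_rate=44100, amplitude=0.2):
    """Return float32 samples of a sine tone as raw bytes."""
    n = int(duration * sample_rate)
    samples = array.array(
        "f",
        (amplitude * math.sin(2 * math.pi * freq * i / sample_rate)
         for i in range(n)),
    )
    return samples.tobytes()


def play(data, sample_rate=44100):
    """Blocking playback of mono float32 bytes on the default output."""
    import pyaudio  # imported here so sine_chunk works without it

    pa = pyaudio.PyAudio()
    stream = pa.open(format=pyaudio.paFloat32, channels=1,
                     rate=sample_rate, output=True)
    try:
        stream.write(data)      # returns once the data has been queued
    finally:
        stream.stop_stream()
        stream.close()
        pa.terminate()
```

With a working audio device, `play(sine_chunk(440.0, 1.0))` should beep for a second. For anything more responsive than this you would want PyAudio's callback mode rather than blocking writes.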
Real time MIDI
Mido, iirc. TBC.
- Mori16: Masanori Morise (2016) D4C, a Band-aperiodicity Estimator for High-quality Speech Synthesis. Speech Commun., 84(C), 57–65.
- GlLT09: John C. Glover, Victor Lazzarini, Joseph Timoney (2009) Simpl: A Python library for sinusoidal modelling. In DAFx 09 proceedings of the 12th International Conference on Digital Audio Effects, Politecnico di Milano, Como Campus, Sept. 1-4, Como, Italy (pp. 1–4). Dept. of Electronic Engineering, Queen Mary Univ. of London.
- MoYO16: Masanori Morise, Fumiya Yokomori, Kenji Ozawa (2016) WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications. IEICE Transactions on Information and Systems, E99.D(7), 1877–1884.