Table of contents:

What is sound?
What's the difference between digital and analogue?
Scales, keys and stuff
Delay
What is a filter, then?
What are dynamics and compression?
Standard effects
Creating a sonic environment
DSP effects
Part Two - Sound Tutorial by Tom Molesworth

Introduction

Welcome to the second part of the Dreamweaver audio tutorial. We're going to take a quick tour through some of the things you can do to sounds, using an assortment of effects and techniques. If any of the terms in this section are unfamiliar, you might want to read the first part. First off, we'll explain a few phrases that you'll find in the rest of this document.

Common concepts and terminology

Oscillator

An oscillator produces a signal that varies with time. Common examples are a sine wave generator, a pulse (square wave) generator, and a ramp generator (sawtooth wave). In algorithmic terms, an oscillator is a periodic function with a domain of 0..1 (the phase) and a range of -1 to 1.
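
As a minimal sketch of that convention in C (the helper names here are my own, and M_PI comes from POSIX math.h rather than strict ISO C):

    /* Oscillator sketch: phase in [0,1), output in [-1,1].
       M_PI is provided by POSIX math.h. */
    #include <math.h>

    double osc_sine(double phase)     { return sin(2.0 * M_PI * phase); }
    double osc_square(double phase)   { return phase < 0.5 ? 1.0 : -1.0; }
    double osc_sawtooth(double phase) { return 2.0 * phase - 1.0; }

    /* Advance the phase by one sample and wrap it back into [0,1),
       which is what keeps the function periodic. */
    double advance_phase(double phase, double freq, double samplerate)
    {
        phase += freq / samplerate;
        return phase - floor(phase);
    }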

LFO

LFO stands for Low Frequency Oscillator. It's often used to control the pitch of a sound, or the cutoff on a filter - it can vary the value by a defined amount, using an envelope or waveform. "Low frequency" here means on the order of a few Hz - usually anywhere from 1/100 Hz up to 20 Hz. Most synthesisers include at least one LFO, which can usually be configured to control most parameters on the synth.

Volume and dynamics

First, let's define volume. Volume is the energy of a sound, and is usually measured in decibels (dB). A decibel is in fact a relative measure, so to simplify things we usually compare against the maximum possible volume - which in practice means that most sounds will have a negative decibel value. A value of -6dB is half the possible level, -12dB is a quarter, -18dB is an eighth, and so on.
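
A short sketch of the conversion in C - 20 times the base-10 log of the amplitude ratio is the standard formula for signal levels; the function names are my own:

    /* Convert a linear amplitude (1.0 = full scale) to decibels and
       back. An amplitude of 0.5 gives roughly -6dB, 0.25 gives -12dB. */
    #include <math.h>

    double amplitude_to_db(double amplitude)
    {
        return 20.0 * log10(amplitude);
    }

    double db_to_amplitude(double db)
    {
        return pow(10.0, db / 20.0);
    }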

Volume has a very strong effect on the perception of a sound. A sound which increases in volume with time is perceived as growing or coming closer, for example. The variation in level within a sound is called its "dynamics" - which literally indicates how "active" the sound is.

It is essential to pay attention to the dynamics in a track in order to control the flow and style of the sound. The same basic pattern can be played in an almost infinite variety of ways simply by altering the dynamics, and each variation will have a different "feel". It is instructive to listen to the way the volume changes in a selection of tracks - psy-trance as well as other styles such as classical. As usual, ideas can be taken from many styles and incorporated into your own.

Effects

Often the most entertaining part of writing a track - and sometimes the hardest part to get right - is the effects. There are a lot of different effects available - some of the more common ones are listed here. Combinations of these effects provide a vast territory of sound to explore, and it's very easy to get lost...

Phasing

A phaser shifts the phases of certain frequencies in a sound. This is, in effect, a notch filter - see the filter section for more details.

Chorus

A chorus is very similar to a phaser in sound. A chorus takes one sound and overlays a copy which is phase shifted by some degree - delayed in time by a small amount. If the phase shift varies with time (with an envelope or LFO), frequencies across the spectrum oscillate in amplitude. It simulates the effect of several input "voices" in unison, all detuned by a slight amount - as in a choir, where many people sing the same song but each at a very slightly different pitch. No voice will be at exactly the same pitch as another - human voices are analogue, so the chances of two voices being perfectly in tune are almost non-existent. The chorus effect can give fullness and depth to an input sound.

The effect is achieved by mixing the original sound with a copy which passes through a delay unit. The output of this unit is delayed by a very short time - usually on the order of a millisecond or so. The delay time is often varied by an LFO. Several "voices" may be available - multiple copies of the original sound each displaced in time by a short amount.
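
A minimal single-voice chorus sketch in C, assuming a mono float buffer; the sample rate, delay times and LFO rate are illustrative values, not recommendations:

    /* Single-voice chorus: mix the input with a copy read from a short
       delay line whose delay time is swept by a sine LFO. */
    #include <math.h>

    #define SR        44100   /* sample rate in Hz */
    #define MAX_DELAY 512     /* enough for ~11ms at 44.1kHz */

    void chorus(const float *in, float *out, int n)
    {
        float  delay_line[MAX_DELAY] = {0};
        int    write = 0;
        double lfo_phase = 0.0;
        const double lfo_freq = 0.5;   /* Hz - slow sweep */
        const double base_ms  = 2.0;   /* centre delay time */
        const double depth_ms = 1.0;   /* sweep depth */

        for (int i = 0; i < n; i++) {
            /* current delay in samples, swept by the LFO */
            double ms    = base_ms + depth_ms * sin(2.0 * M_PI * lfo_phase);
            int    delay = (int)(ms * SR / 1000.0);
            int    read  = (write - delay + MAX_DELAY) % MAX_DELAY;

            delay_line[write] = in[i];
            out[i] = 0.5f * (in[i] + delay_line[read]);

            write = (write + 1) % MAX_DELAY;
            lfo_phase += lfo_freq / SR;
            if (lfo_phase >= 1.0) lfo_phase -= 1.0;
        }
    }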

Flanger

A flanger is closely related to the chorus: the input is mixed with a copy from a delay line swept by an LFO, but the delay times are shorter - typically under 10 ms - and some of the output is usually fed back into the delay line. The result is a series of notches sweeping across the spectrum, giving the familiar "jet plane" effect.

Delay

In its simplest form, a delay provides an output which is an identical, time-shifted copy of the input. The original input can then be mixed in with the delayed sound, doubling every sound to give effects like single echoes and double kicks. Most delay units provide a feedback option - this mixes the output back in with the original and delayed sound, resulting in anything from a repeating, fading echo (delay times between 100 and 1500 ms) to a metallic buzzing sound (shorter delay times, down to only a few samples).
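
As a rough sketch in C (mono float buffers again, with the delay time, feedback and mix amounts passed in by the caller):

    /* Feedback delay: a delayed copy is mixed with the input, and the
       delayed signal is fed back into the line. Keep feedback below
       1.0 or the echo grows without limit. */
    #include <stdlib.h>

    void feedback_delay(const float *in, float *out, int n,
                        int delay_samples, float feedback, float mix)
    {
        float *line = calloc(delay_samples, sizeof(float));
        int pos = 0;

        for (int i = 0; i < n; i++) {
            float delayed = line[pos];
            /* feed the input plus a scaled copy of the delayed signal
               back into the line - this is what makes the echo repeat */
            line[pos] = in[i] + feedback * delayed;
            out[i] = in[i] + mix * delayed;
            pos = (pos + 1) % delay_samples;
        }
        free(line);
    }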

Filter

We covered this very briefly in part one. Since filters are so heavily used in most psy-trance creation, we'll examine filters in a bit more depth here.

What do they do? A filter attenuates or amplifies certain frequencies. In analogue electronics, a filter is a circuit whose impedance is dependent on frequency - "impedance" being how much the source signal is reduced (it isn't quite that simple, but it's close enough without going into complex numbers and electronics theory!). Digital filters perform the equivalent function - the amplitude of the output signal varies with the frequency of the input signal. Filters are a subtractive tool - they can only remove parts of a sound - but when coupled with an amplifier or gain unit, as many are, they can boost frequencies as well.
So, what's a filter used for? You can use it to bring out parts of a sound which are drowned out by other frequencies, or to help shape a raw sound. Anything from a bassline to spatial effects can benefit from filters. Amplifying selected frequencies by a large amount forms the basis of many classic electronic synth sounds - the TB-303, for example.

Most filter effects provide four modes of operation - lowpass, bandpass, notch (or bandstop, or band reject), and highpass.

Lowpass

Low pass filters permit low frequency sounds to go through, but attenuate higher frequencies. The frequency at which the filter begins to attenuate sounds is called the "cutoff frequency". Low pass filters are commonly used on basslines or to simulate reflection of a sound off a soft surface.

Highpass

A highpass filter allows only the top end of the spectrum to pass. It effectively cuts out the lower frequencies of a sound. Like the lowpass filter, this one has a cutoff point at which frequencies begin to be attenuated.

Bandpass

A bandpass filter allows part of the frequency spectrum through, but blocks the rest. It acts like a sliding window on the frequency spectrum of the sound. The bandpass filter has a cutoff frequency and a bandwidth, and is usually centred around the cutoff frequency.

Notch

A notch filter, also called a band stop or band reject filter, functions in the opposite manner to a bandpass filter. It attenuates part of the frequency spectrum, and allows the rest of the sound to pass through. It has a cutoff frequency and a bandwidth, and is usually centred around the cutoff frequency.
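
All four modes can come from a single structure. Here's a minimal sketch in C of a textbook Chamberlin state-variable filter - the names and parameter ranges are illustrative, and this form is only stable for cutoffs up to roughly a sixth of the sample rate:

    /* Chamberlin state-variable filter: one structure gives lowpass,
       highpass, bandpass and notch outputs at once. */
    #include <math.h>

    typedef struct {
        double low, band;   /* filter state - start both at 0 */
        double f;           /* frequency coefficient */
        double q;           /* damping: 1/resonance */
    } svf_t;

    void svf_set(svf_t *s, double cutoff, double resonance,
                 double samplerate)
    {
        s->f = 2.0 * sin(M_PI * cutoff / samplerate);
        s->q = 1.0 / resonance;
    }

    void svf_tick(svf_t *s, double in, double *lp, double *hp,
                  double *bp, double *notch)
    {
        s->low += s->f * s->band;
        double high = in - s->low - s->q * s->band;
        s->band += s->f * high;

        *lp = s->low;
        *hp = high;
        *bp = s->band;
        *notch = s->low + high;
    }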

Resonance

The resonance of a filter controls how strongly frequencies around the cutoff point are emphasised. Low resonance gives a smooth, gentle response; high resonance boosts a narrow band at the cutoff, giving the squelchy sound familiar from acid basslines. At extreme settings many filters will self-oscillate, producing a tone at the cutoff frequency.

Poles and zeroes

There's another label commonly attached to a filter - the number of "poles" or "zeroes" it has. Full details on this are best found in a dedicated book on filter design or DSP, but you can think of poles as the number of stages the sound is filtered through between input and output - each pole steepens the rolloff by 6dB per octave. A 2-pole filter is the most common, the easiest to implement in hardware or software, and also has the gentlest effect on the sound. The more poles a filter has, the sharper the drop beyond the cutoff frequency: -12dB per octave for a 2-pole filter, -24dB per octave for a 4-pole, -36dB per octave for a 6-pole.

Poles and zeroes actually determine the coefficients of the filter equation, and some filter design tools or plugins allow you to place them yourself - with practice, these can be extremely flexible and useful tools.

If you're interested in the electronics or software behind filter design, try some of these:

Active filter design in Javascript (some other good links from that homepage too)

Assorted filter circuit calculators

More electronics and audio links

Other filter types

A formant filter is another common variation - it removes all frequencies in a vocal sample apart from the formants. These "formants" are the frequencies in the voice which don't change as pitch changes - if you sing a phrase, then sing the same phrase an octave higher with the same duration (so that the tempo stays the same but the pitch goes up one octave), you might notice that some parts of the voice change pitch and others stay the same. A voice typically has around four or five significant formant frequencies.

Formant filters are usually multiple bandpass filters in parallel, sometimes with an all-pass filter strapped over the output.

A "graphic" EQ has a slider for each of a fixed set of frequency bands. A "parametric" EQ is equivalent to a filter with multiple bands which are controlled "parametrically" - by numbers. A "paragraphic" EQ combines the two: a slider for each frequency band, plus a parametric section.

A "multipeak" filter is simply a filter with more than one peak in the frequency curve - multiple resonant bands in the filter response. A filter needs to have two poles for every "peak" in a multipeak filter.

Echo

An echo effect can be as simple as a delay in combination with the original sound - you hear the sound once, then a moment later you hear the reflection. However, the sound usually changes somewhat as it is reflected back to you, and this is often simulated with filters, additional smaller delays to spread the sound, and even Doppler effects if the scenery is moving.

Reverb

Reverb can simulate the subtle echo effect of an enclosed space, or provide the feel of an outdoor location. When used carefully, it can enhance the presence and feel of a track. Too much reverb has the effect of washing out the sound, blurring details and confusing the stereo placement. Too little reverb can sound sterile and synthetic.

An assortment of reverb effects on different sounds as a track progresses is one way to make it feel like a track is moving, or that the environment is changing. This can be overdone, which leads to a confused image of the environment - it's hard, but not impossible, to build a coherent picture if sounds don't obey some of the natural laws.

Common parameters, and what they mean

Dry out - Amount of original signal to mix into the output.
Wet out - Amount of effect to mix into the output; sometimes called just "amount".
Type - One of the reverb types listed above.
Room size - How long it takes before the sound is echoed back to the listener; lower values for smaller spaces. You can estimate this time by dividing the approximate average distance from a speaker to a wall by the speed of sound in air - about 340 metres per second.
Reverb time - How long a sound hangs around before it dies away.
Diffusion - How much to spread the sounds out as they are reflected. Low values simulate harder surfaces; higher values make the sound "softer" and more dispersed across the stereo field.
Low cutoff - Filters the sound to remove frequencies below this value before passing it through to the reverb effect.
High cutoff - Highest frequency permitted through the filter; see the low cutoff parameter.
Damping - Cutoff point above which high frequencies are attenuated. This applies a shallow lowpass or averaging filter to the reverb output.

DSP

DSP opens up a whole new selection of sound manipulation techniques and tools. DSP stands for "Digital Signal Processing", and is used to describe a programmable unit which takes a digitised input sound (or generates the sound itself) and manipulates it in some way. This "unit" can be just a formula, a software module, or perhaps a program for a dedicated DSP microchip. The concepts remain similar no matter which DSP "platform" you use.

Signals and audio

Signal analysis describes several attributes of an audio signal (a "sound"). Two important terms here are "energy" and "power". The "energy" of a signal is measured in decibels, and is calculated by squaring each sample, taking the average ("mean") of those squares, then taking the square root... this somewhat lengthy term is shortened to "root mean square" (RMS). It will typically sit between about -16dB and -12dB for music, and around -10dB for speech. The energy of a signal is a good indication of the "strength", or perceived volume, of a sound.
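
A short sketch of the calculation in C, assuming float samples normalised to +/-1.0 so that 0dB corresponds to an RMS value of 1.0:

    /* RMS energy: square each sample, take the mean, take the square
       root, then convert to decibels. */
    #include <math.h>

    double rms_db(const float *samples, int n)
    {
        double sum = 0.0;
        for (int i = 0; i < n; i++)
            sum += (double)samples[i] * samples[i];
        double rms = sqrt(sum / n);
        return 20.0 * log10(rms);
    }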

The power of a signal is its average value - for sound, this should be zero, so that the speaker cone spends the same amount of time on either side of its rest position. As with the energy of a signal, this is measured in decibels. This value is the "DC component" of a spectrum - see the frequency domain section for more information on this. A track with too much bass can sometimes have low frequency components that shift the DC value of the wave - these can be fixed using a highpass (or "bass cut") filter with no resonance and a very low cutoff - about 20-60 Hz should do it.
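
One standard way to do this in software is a one-pole "DC blocker" highpass; a minimal sketch in C, where the coefficient value is an illustrative assumption:

    /* DC-blocking filter: y[n] = x[n] - x[n-1] + R*y[n-1].
       With R near 1.0 the cutoff sits at a few tens of Hz. */
    void dc_block(const float *in, float *out, int n)
    {
        const float R = 0.995f;   /* closer to 1.0 = lower cutoff */
        float x1 = 0.0f, y1 = 0.0f;

        for (int i = 0; i < n; i++) {
            float y = in[i] - x1 + R * y1;
            x1 = in[i];
            y1 = y;
            out[i] = y;
        }
    }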

Frequency spectrum

To start with, let's go back to frequency spectrums and digitised sound again.

Time domain

The "time domain" is the usual way of looking at things - see how they change in time. If you've seen a sampled waveform on a display, like this:

then you've seen a time-domain representation of a sound.

Frequency domain

The "frequency domain" is reached by analysing a time-domain waveform and finding the amplitudes of each frequency component in that waveform. Remember that all waveforms can be made by adding many sinewaves together? In this case, we mix together every possible sinewave frequency that can be represented in this time-domain waveform (theoretically infinite, but if you're dealing with digitised data there's only a finite amount).
[Figure: a sine wave and its frequency spectrum]

A sine wave. There is a single frequency component at the frequency of the waveform - in this case there is a single cycle, so the frequency is 1 Hz. This frequency is at maximum "strength", or "volume"; all other frequencies are at zero strength. This frequency is called the "fundamental" frequency. A sinewave, as noted before, can be a very soft sound, like a flute.

[Figure: a square wave and its frequency spectrum]

A square wave. Again, there is a fundamental frequency, but in addition there are harmonics which extend up the frequency spectrum - these are odd harmonics, which means they sit at every odd multiple of the fundamental frequency: the first, third, fifth, and so on. This has a much sharper, harsher sound than a pure sinewave.

[Figure: a sawtooth wave and its frequency spectrum]

A sawtooth wave. This time there are both odd and even harmonics, falling away steadily as you move up the spectrum - a bright, buzzy sound.

[Figure: a triangle wave and its frequency spectrum]

A triangle wave. Similar to a square wave in that only the odd harmonics are present, but they fall away much more quickly, giving a softer sound.

[Figure: random noise and its frequency spectrum]

Random noise - as you'd expect, the energy is spread randomly across the whole frequency spectrum.

Phase domain

Okay, this one isn't very common in trance production, but it's worth mentioning for completeness. As well as an amplitude, each frequency component also has a phase value, which is commonly folded into the frequency domain by representing each component as a 2D (complex) number. Now, the human hearing system is not very well equipped to deal with phase information - which is great news for digital audio designers and effects producers, since it means we can do a lot to the phase information without audibly altering a sound.

FFT

The Fast Fourier Transform is an efficient algorithm for taking a block of audio data and determining its frequency spectrum. It is the standard way of converting audio data from the time domain into the frequency domain.

An FFT will give you the real and imaginary parts of a signal, but what we really want is the magnitude and angle. The magnitude - the distance from zero to an FFT point - is the amplitude of a frequency; the angle is the phase information, from 0 to 360 degrees.
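
In C, the conversion for a single FFT bin looks something like this, using the standard square-root and atan2 identities:

    /* Convert one FFT bin from rectangular (real, imaginary) form to
       polar (magnitude, phase) form. atan2 returns radians; multiply
       by 180/pi for degrees. */
    #include <math.h>

    void bin_to_polar(double re, double im, double *mag, double *phase_deg)
    {
        *mag       = sqrt(re * re + im * im);
        *phase_deg = atan2(im, re) * 180.0 / M_PI;
    }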

Windowing

We're heading into dangerous signal processing territory here, but you might find references to "windowing" in your favourite audio program. It's a step which is often automatic but can sometimes be tweaked for optimal effect - like anything, the more parameters you have, the longer it will take to get anything useful out of it...!

Windowing is required because a Fourier series assumes a continuous, cyclic waveform - remember that, although we use a finite-length sample, the Fourier series operates on a waveform that repeats forever. A waveform is cyclic - as the phase goes from 0 to 1 (or 0 to 360 degrees) the waveform changes, but a phase of 1.5 is equivalent to 0.5, 2.5, 5.5, -0.5 and so on. When you take an FFT of a small part of a sample, the start and end of that block usually don't join up cleanly - and treating it as a loop gives strange results in the FFT. Windowing smooths the edges of the block so that, if it were looped continuously, it wouldn't sound quite so bad - and when you take an FFT of a windowed sample, spurious frequencies and phase information are reduced to manageable levels.

There's a load of different types of window - Hamming, Hanning, Blackman-Harris, triangle, sinc, etc etc etc. Square or rectangle is the fastest. Triangle comes a close second. Hamming etc are better quality but slow things down significantly. A sinc window is farcically slow but gives theoretically perfect results. Unless this is a final mix, you're best with a triangle window.
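
As an example, here's a minimal sketch in C of the Hann ("Hanning") window, applied in place to a block before the FFT:

    /* Hann window: taper the block smoothly to zero at both ends so
       that it loops cleanly when the FFT treats it as cyclic. */
    #include <math.h>

    void hann_window(float *block, int n)
    {
        for (int i = 0; i < n; i++)
            block[i] *= 0.5f * (1.0f - cosf(2.0f * (float)M_PI * i / (n - 1)));
    }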

Spectrum Analysis

The FFT is used to convert sample data into a frequency spectrum. This is used in a number of applications for detecting patterns in signal data - SETI scans for frequencies in radio telescope data, for example. A common feature of many softsynths and audio plugins nowadays is a "spectrum analyser" - something that displays the frequency spectrum of a sound as it plays. There are variations on this idea - the spectrum can be displayed in many ways, just as with the raw sample data itself.

Spectrum - linear or logarithmic plot of frequency spectrum along x axis and strength on y axis, updated for each sample block (animated in time).

Voiceprint - plot of signal strength (colour, logarithmic) against frequency (y axis) by time (x axis), often focussed on the human vocal range with formant highlighting. A "voiceprint" is so-called because it is possible to recognise vocal features peculiar to an individual - you can tell who's talking by the frequency spectrum of their voice.

Aliasing

A quick detour here to explain aliasing. This was mentioned earlier as something which Must Be Avoided - so we're going to explain why.

Aliasing is the effect of having a sampling rate set too low for the signal being recorded. Remember that the sampling rate must be at least twice the highest frequency in the sound? If you try to sample something above the Nyquist frequency - a 28kHz tone with a 44.1kHz samplerate, say - the higher frequency "reflects" back into the audible spectrum: that 28kHz tone appears at 44.1 - 28 = 16.1 kHz.

Sonic environments

Stereo placement techniques

You can make a sound appear to come from the left or right by making it louder on the left or right speaker - this is what is normally called "panning".
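
A common refinement is "constant-power" panning, which uses a sine/cosine law so the overall loudness stays steady as the sound moves across the field. A minimal sketch in C - the 0-to-1 pan position convention is my own assumption:

    /* Constant-power pan: position runs from 0.0 (hard left) to
       1.0 (hard right). */
    #include <math.h>

    void pan(float in, double position, float *left, float *right)
    {
        double angle = position * M_PI / 2.0;
        *left  = in * (float)cos(angle);
        *right = in * (float)sin(angle);
    }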

Channel delay

This isn't the only way to move a sound around, though. Since we have two ears, a sound can reach one ear slightly later than it reaches the other - when this happens, the brain interprets the sound as coming from the direction in which it was heard first. As you adjust the delay between left and right, you can get 3D sound effects - the sound can seem to come from in front, behind, or sometimes even above or below. This effect can be heightened by introducing changes to the sound itself, and even by varying the sound between the left and right speakers.

A lowpass filter on the sound can make it sound like it's coming from behind. Use a low-pass filter at about 7kHz or so, to simulate the effect of the back of the ear filtering the sound. Then try delaying the sound about 6-9 ms on one channel, or invert one of the channels, to give a "surround" effect. Be careful with inverted sounds - they have a strange effect when wearing headphones, much like twisting one's brain inside out.

Doppler shift

When a sound source moves away from you, each successive wavefront takes a little longer to reach your ears, so the waves arrive stretched out - and the longer the cycle, the lower the pitch: a waveform that cycles at 8 Hz is lower in pitch than one at 16 Hz. As something moves away, its perceived frequency drops - the acoustic equivalent of the "red shift" seen in the light of receding stars. As it moves towards the listener, the pitch appears to increase - "blue shift". This can be heard with cars, motorbikes, passing emergency vehicles - anything that makes a sound while moving relative to the listener will show the "Doppler" effect.

Easy to simulate even if you don't have a Doppler plugin. Use a pitch bend envelope, as in SoundForge or similar - start about a semitone or two down, slide up slowly, accelerating as it gets closer to zero, then when it reaches zero (which means that it's at the same point as the listener), it will start to recede again, and the frequency will begin to drop. Slide the pitch down to just below the pitch you started at, and that should give you a basic Doppler effect. Make this pitch effect coincide with careful stereo placement, and you too can make motorbikes come out of people's ears...
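
If you want to calculate the shift exactly, the standard formula for a moving source and stationary listener is f' = f * c / (c + v); as a one-function sketch in C:

    /* Doppler shift for a moving source: velocity_away is the source's
       speed away from the listener in m/s (negative while approaching),
       and c is the speed of sound in air, about 340 m/s. */
    double doppler_freq(double freq, double velocity_away)
    {
        const double c = 340.0;
        return freq * c / (c + velocity_away);
    }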

Sonic transitions in the time domain

One of the most important, effective, entertaining and exasperating parts of the music-writing process is the transitions: from one part of a track to the next, between sounds, in melodies, in the rhythmical structure, transitions are all over the place in music. Movement - change - is what gives a track "life". Here's a few techniques for getting from one sound to the next...

Mixing

The simplest way of joining two sounds is to mix them so that one comes in as the other starts to fade. This works well when the original sounds fade in and out, or when the sounds are similar enough to be placed end to end. Any sample editor worth using will be able to mix sounds together.

Crossfading

Instead of mixing the two sounds as they are, you can reduce the volume of one as you increase the volume of the other - this way, one sound will fade into the other, which should be familiar to DJs who have used the crossfader on a mixer.
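
A sketch of a single-sample crossfade in C - the square-root form gives an "equal power" fade, which avoids the dip in level half-way through that a plain linear fade can cause:

    /* Equal-power crossfade: t runs from 0.0 (all sound a) to
       1.0 (all sound b). */
    #include <math.h>

    float crossfade(float a, float b, float t)
    {
        return a * sqrtf(1.0f - t) + b * sqrtf(t);
    }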

Sonic transitions in the frequency domain

Moving from sound to sound is also about the variation in texture, tone, quality and shape. This is where the frequency domain comes in - many effects are best characterised by their impact on the frequencies of a sound.

Frequency domain crossfading

A very simple form of spectral morphing. Crossfade one frequency spectrum into another. This often works when applied to two different sounds which are similar in spectrum but vary in time relative to each other.
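
A minimal sketch in C, interpolating bin by bin between two magnitude spectra before resynthesis with an inverse FFT (not shown) - the bin layout is an assumption, and any FFT library's magnitude output would do:

    /* Frequency-domain crossfade: blend two magnitude spectra.
       t runs from 0.0 (spectrum a) to 1.0 (spectrum b). */
    void spectrum_crossfade(const float *mag_a, const float *mag_b,
                            float *mag_out, int bins, float t)
    {
        for (int i = 0; i < bins; i++)
            mag_out[i] = (1.0f - t) * mag_a[i] + t * mag_b[i];
    }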

Full spectral morphing

Spectral morphing could be described as the process of blending one sound into another, by applying a function in the frequency domain to the frequencies of the two sounds. For this tutorial, "spectral morphing" refers specifically to the task of blending from one spectral waveform to another - preferably by matching characteristics of one sound to the other and calculating intermediary points between the two waves.

Going from a 440 Hz sine wave to a 1760 Hz sinewave would result in a rising sine wave from 440 Hz to 1760 Hz. This is not always a straightforward task, although as usual there's a few algorithms which can cope. This process is similar to "tweening" in animation.

Unfortunately, examples are difficult to provide at this point due to lack of available tools - Kyma is known to offer spectral morphing techniques, for example, but in terms of pure software I can't think of much offhand. I've included some C source code which could be used as a starting point for implementing spectral morphing.

Granular synthesis

Granular synthesis is to digital sound what decks are to analogue. It is a technique which involves breaking the original sound down into small pieces, called "grains", and then playing these back at different pitches, speeds, volumes or with other effects. This is a truly versatile tool which can provide timestretching effects, extra presence and a fatter sound, atmospheric effects, psyched-out sound effects - or can simply bring a sampled beat into time with the rest of a track.
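
A very stripped-down sketch in C of one classic application - timestretching by overlapping enveloped grains. The grain size and overlap are illustrative assumptions:

    /* Granular timestretch: read grains from the input at one spacing,
       write them to the output at another, with a triangular envelope
       on each grain so the overlaps sum smoothly. */
    #include <math.h>
    #include <string.h>

    #define GRAIN 1024   /* grain length in samples (~23ms at 44.1kHz) */

    void timestretch(const float *in, int in_len,
                     float *out, int out_len, double stretch)
    {
        int hop_out = GRAIN / 2;                 /* grain spacing in output */
        int hop_in  = (int)(hop_out / stretch);  /* grain spacing in input  */

        memset(out, 0, out_len * sizeof(float));

        for (int g = 0; ; g++) {
            int src = g * hop_in;    /* where this grain is read from */
            int dst = g * hop_out;   /* where this grain is written to */
            if (src + GRAIN > in_len || dst + GRAIN > out_len)
                break;
            for (int i = 0; i < GRAIN; i++) {
                /* triangular envelope: fade each grain in and out */
                float env = 1.0f - fabsf(2.0f * i / GRAIN - 1.0f);
                out[dst + i] += env * in[src + i];
            }
        }
    }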

Moving on...

Now we know how to shape and place a sound, let's take a step back - part three will explore common synthesis methods and examine how to actually make a sound in the first place.


Copyright Tom Molesworth, 2001.
Converted from html by Plamen Todorov.
