Sound Design Fundamentals: Creating Sounds from Scratch
Sound design sits at the intersection of physics, perception, and creative decision-making — the practice of building audio from its raw components rather than reaching for a preset or a sample. This page covers the core mechanics of synthesis, the causal logic that connects parameter choices to sonic outcomes, and the conceptual boundaries that distinguish sound design from adjacent disciplines such as sampling and mixing. Whether the goal is a bass patch for an electronic track or a cinematic texture for film, the same foundational principles apply.
- Definition and scope
- Core mechanics or structure
- Causal relationships or drivers
- Classification boundaries
- Tradeoffs and tensions
- Common misconceptions
- Checklist or steps (non-advisory)
- Reference table or matrix
Definition and scope
A synthesizer generates sound by producing and shaping electrical signals that model acoustic phenomena — or, increasingly, phenomena that have no acoustic counterpart at all. Sound design is the deliberate construction of a sonic result through control of those parameters: waveform, amplitude envelope, filter cutoff, modulation routing, and so on. The term applies equally to hardware synthesizers, software instruments, and digital signal processing environments.
The scope is broader than most producers initially assume. Sound design encompasses subtractive synthesis (carving harmonics away from a rich source), additive synthesis (stacking sine waves to build timbre), FM synthesis (using one oscillator to modulate the frequency of another), wavetable synthesis (scanning through stored single-cycle waveforms), and physical modeling (mathematically simulating the behavior of acoustic objects). Each method has a different relationship to the surrounding plugin ecosystem and to the hardware it runs on or replicates.
Core mechanics or structure
Every synthesis architecture, regardless of type, involves at least three functional stages: generation, shaping, and modulation.
Generation is the oscillator — the source of the periodic (or noise-based) signal. A sawtooth wave contains every harmonic in the harmonic series, with the nth harmonic's amplitude falling off as 1/n, making it harmonically dense. A square wave contains only the odd harmonics. A sine wave contains only the fundamental. These are not aesthetic preferences; they are mathematical facts about the Fourier composition of those waveforms, routinely documented in acoustics textbooks, including those published by the Acoustical Society of America.
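These harmonic recipes can be verified directly by additive summation of sine partials. The sketch below is illustrative (the function names and the 16-partial default are assumptions, not taken from any particular synth or library):

```python
import math

def partial_amplitudes(wave, n_harmonics):
    """Relative amplitude of harmonics n = 1..N for the ideal waveforms:
    saw    -> every harmonic at 1/n
    square -> odd harmonics only, at 1/n
    sine   -> the fundamental alone
    """
    amps = []
    for n in range(1, n_harmonics + 1):
        if wave == "saw":
            amps.append(1.0 / n)
        elif wave == "square":
            amps.append(1.0 / n if n % 2 == 1 else 0.0)
        else:  # sine
            amps.append(1.0 if n == 1 else 0.0)
    return amps

def render_additive(wave, freq, sr, n_samples, n_harmonics=16):
    """Build the waveform by summing sine partials (its Fourier series)."""
    amps = partial_amplitudes(wave, n_harmonics)
    return [sum(a * math.sin(2.0 * math.pi * (k + 1) * freq * i / sr)
                for k, a in enumerate(amps))
            for i in range(n_samples)]
```

Summing only 16 partials also illustrates why digital oscillators are band-limited: truncating the series keeps harmonics below the Nyquist frequency and avoids aliasing.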
Shaping involves filters and amplitude envelopes. A low-pass filter (LPF) attenuates frequencies above a cutoff point; a high-pass filter (HPF) attenuates below it. The resonance parameter boosts the frequencies immediately around the cutoff, creating the characteristic "wah" quality associated with filter sweeps. Amplitude envelopes — typically described as ADSR (Attack, Decay, Sustain, Release) — control how loudness behaves over time. A 0 ms attack produces a hard transient; a 200 ms attack produces a soft fade-in.
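The ADSR contour described above can be written as a simple piecewise function of time. This is a minimal linear-segment sketch; hardware and software envelopes often use exponential curves instead, and the `gate_time` parameter name is hypothetical:

```python
def adsr(t, attack=0.01, decay=0.1, sustain=0.6, release=0.2, gate_time=0.5):
    """Envelope level (0..1) at time t seconds, using linear segments.

    attack/decay/release are durations, sustain is a level, and
    gate_time marks note-off (assumes gate_time >= attack + decay).
    """
    if t < 0.0:
        return 0.0
    if t < attack:                                  # rise 0 -> 1
        return t / attack
    if t < attack + decay:                          # fall 1 -> sustain
        return 1.0 - (1.0 - sustain) * (t - attack) / decay
    if t < gate_time:                               # hold while the key is down
        return sustain
    return max(0.0, sustain * (1.0 - (t - gate_time) / release))
```

Setting `attack=0.0` would need a guard against division by zero — a reminder that a "0 ms attack" in a real synth is usually a one-sample step rather than a literal zero.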
Modulation is where complexity compounds. An LFO (Low Frequency Oscillator) running at 4 Hz applied to filter cutoff produces a rhythmic filter tremolo. An envelope applied to oscillator pitch produces pitch slides. Modulation routing transforms a static patch into a dynamic, time-varying sound.
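The 4 Hz cutoff modulation described above reduces to offsetting a base cutoff by a sine LFO. A minimal sketch (function and parameter names are illustrative):

```python
import math

def lfo_cutoff(base_hz, depth_hz, rate_hz, t):
    """Filter cutoff at time t when a sine LFO modulates it.

    At rate_hz = 4 the cutoff sweeps around base_hz four times per
    second, producing the rhythmic filter tremolo described above.
    """
    return base_hz + depth_hz * math.sin(2.0 * math.pi * rate_hz * t)
```

The same function applied to pitch instead of cutoff yields vibrato; modulation routing is the destination assignment, not the LFO itself.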
FM synthesis, documented extensively in John Chowning's 1973 paper "The Synthesis of Complex Audio Spectra by Means of Frequency Modulation," published in the Journal of the Audio Engineering Society (Chowning's research at Stanford later gave rise to the Center for Computer Research in Music and Acoustics, CCRMA), adds a distinct layer: the ratio between carrier and modulator frequencies determines whether sidebands land harmonically or inharmonically, with inharmonic ratios producing bell-like or metallic timbres.
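Chowning's technique reduces to a compact formula: the modulator's output is added to the carrier's phase, and the modulation index scales how much energy spreads into the sidebands. A minimal single-sample sketch (default values are illustrative):

```python
import math

def fm_sample(t, carrier_hz=440.0, ratio=1.0, index=2.0):
    """One sample of two-operator FM: the modulator drives the carrier's phase.

    Sidebands appear at carrier_hz +/- k * modulator_hz; integer ratios keep
    them on the harmonic series, non-integer ratios (e.g. 1.414) do not.
    """
    modulator_hz = ratio * carrier_hz
    return math.sin(2.0 * math.pi * carrier_hz * t
                    + index * math.sin(2.0 * math.pi * modulator_hz * t))
```

With `index=0` the formula collapses to a plain sine carrier; raising the index redistributes energy from the carrier into progressively more sidebands, which is why small index changes can transform the timbre so drastically.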
Causal relationships or drivers
The connection between parameter value and perceptual result follows predictable patterns, though those patterns interact nonlinearly.
Raising filter cutoff frequency increases perceived brightness, because more high-frequency harmonics pass through. But at high resonance values, the same cutoff increase also raises perceived volume due to the resonant peak, which means a filter sweep can feel like a dynamic swell rather than just a tonal shift. These two effects run simultaneously, which is why mix engineers treat filter automation as a dynamic tool as much as a timbral one — a topic covered in greater depth in the EQ in music production guide.
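The cutoff/resonance interaction can be made concrete with a Chamberlin state-variable filter, a standard minimal digital two-pole design chosen here for brevity (the `resonance`-to-damping mapping is an illustrative assumption, not a fixed convention):

```python
import math

def svf_lowpass(samples, sr, cutoff_hz, resonance=0.5):
    """Chamberlin state-variable low-pass filter.

    cutoff_hz sets brightness; raising resonance reduces damping, which
    boosts level around the cutoff as well as changing tone -- the dual
    perceptual effect described above.
    """
    f = 2.0 * math.sin(math.pi * cutoff_hz / sr)  # frequency coefficient
    q = 1.0 / max(resonance, 0.5)                 # higher resonance -> less damping
    low = band = 0.0
    out = []
    for x in samples:
        high = x - low - q * band
        band += f * high
        low += f * band
        out.append(low)
    return out
```

Feeding the filter a constant (DC) input shows the low-pass behavior directly: the output settles at the input level, while high-frequency content would be attenuated.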
Oscillator detuning — running two oscillators a few cents apart — produces beating at a rate equal to the frequency difference between them. At 5 cents of detuning on a 440 Hz note, the beat frequency is approximately 1.27 Hz, producing a slow, wide chorus effect. Increasing detuning to 15 cents raises the beat rate to roughly 3.8 Hz, thickening the sound until it smears into perceived width rather than discrete beating.
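The beat-rate arithmetic follows directly from the definition of a cent as 1/1200 of an octave (the function name is illustrative):

```python
def beat_rate_hz(base_hz, cents):
    """Beat rate between an oscillator at base_hz and one detuned by `cents`.

    A cent is 1/1200 of an octave, so the detuned oscillator runs at
    base_hz * 2**(cents / 1200); beating occurs at the frequency difference.
    """
    return base_hz * (2.0 ** (cents / 1200.0) - 1.0)
```

Note the rate scales with the base frequency: the same 5-cent detune beats roughly twice as fast an octave up, which is why fixed-cents detuning sounds progressively busier in higher registers.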
Attack time on an amplitude envelope shapes transient presence, which directly affects how a sound sits in a dense mix. Percussion elements built with sub-10 ms attacks cut through because transients trigger the ear's onset detection; long-attack pads occupy spectral space without competing for transient attention. This is not a mixing trick — it is a structural characteristic baked in during sound design, before any signal ever reaches a compressor.
Classification boundaries
Sound design is not the same as sampling in music production, though the two overlap when a synthesizer uses recorded samples as its oscillator source (as in a rompler or sample-based synth). The distinction is functional: sampling retrieves and manipulates existing audio; synthesis generates new audio from mathematical or modeled sources.
Sound design is also distinct from audio editing fundamentals. Editing operates on recorded material — cutting, trimming, comping takes. Sound design operates upstream, before recording, or in parallel on virtual instruments.
Within synthesis, the classification boundaries between subtractive, additive, FM, wavetable, and physical modeling are not always clean. Hybrid architectures (e.g., Native Instruments' Massive X, which combines wavetable oscillators with phase-modulation techniques) blur these lines. In practice, the distinguishing question is: what is the primary source of timbral complexity? If it comes from filtering a harmonically rich waveform, that is subtractive. If it comes from modulator-to-carrier interaction, that is FM. If it comes from scanning a table of stored waveforms, that is wavetable.
Tradeoffs and tensions
The central tension in sound design is between control and complexity. Additive synthesis offers theoretically complete control — any timbre can be built by summing the right sine waves — but requires specifying each partial individually, which becomes impractical for rich sounds requiring hundreds of partials. Subtractive synthesis is vastly more efficient but surrenders precise control over upper partials.
FM synthesis is computationally inexpensive and produces rich, evolving timbres, but the relationship between parameters and perceptual results is notoriously counterintuitive. A single unit change in the modulation index of a 2-operator FM patch can shift a soft metallic shimmer to an aggressive, dissonant clang — a relationship that even experienced sound designers describe as requiring patient empirical exploration rather than direct intentional mapping.
Physical modeling resolves some of this by grounding parameter names in familiar physical concepts (string tension, body resonance, bow pressure), but at the cost of computational overhead and constrained timbral range — a physical model of a violin stays convincingly in violin territory; it does not produce a useful bass patch without severe compromise.
Music production trends in the US add a market-level tension: most commercial synth presets are designed to be immediately useful, which creates a production culture where custom sound design is less common than preset selection. This shifts the discipline toward macro-level tweaking rather than ground-up patch construction — a legitimate workflow, but a different skill set.
Common misconceptions
"More oscillators always means a better sound." More oscillators add harmonic density and width but also introduce phasing artifacts and low-end build-up. A single-oscillator patch with careful filter and envelope programming can outperform a 6-oscillator patch that has not been designed with intention.
"FM synthesis is only for electric piano and bell sounds." FM produces those timbres readily because its sidebands can be placed inharmonically, which suits bell-like and metallic spectra, but FM can generate bass, pads, percussion, and noise textures with equal facility. The association with electric pianos comes from the Yamaha DX7 (released in 1983) dominating commercial music for roughly a decade — a historical artifact, not a technical constraint.
"Wavetable synthesis is just sample playback." Wavetable synthesis uses single-cycle waveforms as its source — each cycle is typically 2048 samples long — and derives its timbral range from scanning through tables of different waveforms under modulation. This is architecturally distinct from sample playback, which reproduces recorded audio at length.
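The architectural distinction can be made concrete: a wavetable oscillator loops a short single-cycle frame, interpolating both within the frame and, under modulation, between frames. A minimal sketch (the 2048-sample frame follows the text; function names are illustrative):

```python
import math

def make_frame(n=2048):
    """One single-cycle sine frame (2048 samples, per the text)."""
    return [math.sin(2.0 * math.pi * i / n) for i in range(n)]

def read_frame(frame, phase):
    """Read a frame at fractional phase in [0, 1) with linear interpolation."""
    pos = phase * len(frame)
    i = int(pos) % len(frame)
    frac = pos - int(pos)
    j = (i + 1) % len(frame)
    return frame[i] * (1.0 - frac) + frame[j] * frac

def scan(frame_a, frame_b, phase, position):
    """Crossfade between two frames; modulating `position` over time is
    the 'scanning' that distinguishes wavetable playback from sampling."""
    return ((1.0 - position) * read_frame(frame_a, phase)
            + position * read_frame(frame_b, phase))
```

A sampler, by contrast, advances linearly through minutes of recorded audio; the wavetable oscillator never leaves its one-cycle loop except to morph toward a different frame.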
"Analog synthesis is inherently warmer than digital." Analog circuits introduce thermal noise and component variation that produce subtle harmonic characteristics. Digital synthesis can model these artifacts with high fidelity. The perceived warmth difference narrows significantly at high sample rates (96 kHz or above) and with careful gain staging. The home studio setup guide covers how interface quality affects the capture of both analog and digital instruments.
Checklist or steps (non-advisory)
The following sequence describes the logical stages of building a synthesizer patch from scratch:
- Oscillator selection — Choose waveform type (sine, saw, square, triangle, noise, or wavetable) based on the harmonic content required. Saw for harmonically dense sources; sine for sub-bass fundamentals.
- Tuning and detuning — Set base pitch. Engage a second oscillator detuned ±1–15 cents for chorus width, or at an exact interval (fifth, octave) for harmonic stacking.
- Filter type and cutoff — Select filter topology (LPF, HPF, or band-pass) and set cutoff to define the initial brightness. An LPF with cutoff around 2 kHz is a common starting point for mid-range pads.
- Filter envelope — Apply an envelope to cutoff frequency to create timbral movement over time. High positive depth with fast decay produces a pluck attack.
- Amplitude envelope — Set ADSR to define the loudness contour. Fast attack for percussive; slow attack for pads; long release for reverberant decay.
- Modulation routing — Assign an LFO or secondary envelope to at least one destination (pitch, filter cutoff, amplitude) to introduce time-based variation.
- Effects (dry chain) — Add internal effects (chorus, distortion, unison) at the patch level. External effects such as reverb and delay are typically applied at the mix stage.
- Gain staging — Check output level of the patch against nominal mix level. Clipping at the synth output stage introduces harmonic distortion that propagates downstream.
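The checklist stages above can be condensed into a minimal subtractive "pluck": saw oscillator, envelope-driven one-pole low-pass, percussive amplitude envelope, and headroom-conscious output gain. All parameter values are illustrative, not prescriptive:

```python
import math

def render_pluck(freq=110.0, sr=44100, seconds=0.5):
    """Render a subtractive pluck following the checklist stages."""
    n = int(sr * seconds)
    out = []
    lp = 0.0  # one-pole low-pass filter state
    for i in range(n):
        t = i / sr
        # 1-2. oscillator: naive sawtooth in [-1, 1)
        phase = (t * freq) % 1.0
        osc = 2.0 * phase - 1.0
        # 3-4. filter + filter envelope: cutoff decays 4 kHz -> 200 Hz
        cutoff = 200.0 + 3800.0 * math.exp(-t * 12.0)
        a = 1.0 - math.exp(-2.0 * math.pi * cutoff / sr)
        lp += a * (osc - lp)
        # 5. amplitude envelope: instant attack, exponential decay
        amp = math.exp(-t * 6.0)
        # 8. gain staging: leave headroom below full scale
        out.append(0.8 * lp * amp)
    return out
```

Stages 6 and 7 (modulation routing and patch-level effects) are omitted here for brevity; an LFO or chorus stage would slot in between the envelope and the output gain.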
Reference table or matrix
| Synthesis Type | Primary Timbral Source | CPU Load (relative) | Parameter Intuitiveness | Typical Use Cases |
|---|---|---|---|---|
| Subtractive | Filtered oscillator harmonics | Low | High | Basses, leads, pads, classic synth tones |
| Additive | Stacked sine partials | High | Low | Complex, evolving timbres; spectral editing |
| FM (Frequency Modulation) | Carrier/modulator interaction | Low | Low | Electric piano, bells, metallic percussion, aggressive leads |
| Wavetable | Single-cycle waveform scanning | Low–Medium | Medium | Modern leads, motion pads, morphing textures |
| Physical Modeling | Mathematical acoustic simulation | Medium–High | High | Realistic acoustic instrument emulation |
| Sample-based (Rompler) | Recorded multi-samples | Medium | High | Realistic orchestral, acoustic, and vintage instruments |
The music production terminology glossary provides expanded definitions for terms like modulation index, operator, resonance, and ADSR that appear throughout synthesis documentation. For the broader context of how sound design fits into a production workflow, the music production process stages overview situates synthesis within the full arc from composition to mastering.