Key Detector

Find the musical key, BPM and Camelot code of any song. Real-time pitch tuner included. Krumhansl-Schmuckler algorithm, browser-based.


About the Key & Pitch Detector

This Key & Pitch Detector analyses an audio file and reports the most likely musical key (e.g. C major, A minor, F# minor), tonic note, mode (major or minor), tempo in BPM, Camelot wheel notation for DJ mixing, and a list of harmonically compatible keys. A separate real-time mode listens to your microphone and shows the current note and its cents deviation from the nearest 12-tone equal-tempered pitch, useful as a fast browser tuner. Everything runs locally in your browser using the Web Audio API and the Meyda feature-extraction library — no uploads, no server, no logging.

Key detection is a Music Information Retrieval (MIR) task with a long academic history. The algorithm here uses a chromagram — a 12-element vector summarising how much energy is present in each pitch class (C, C#, D, ..., B) over the entire track — and matches it against the 24 Krumhansl-Schmuckler key profiles (12 major + 12 minor) using Pearson correlation. The Krumhansl-Schmuckler profiles were derived in the 1980s from human-perception experiments and remain a strong baseline; modern MIR systems (Mauch & Dixon 2010, Korzeniowski & Widmer 2017) layer convolutional neural nets on top, but for the kind of clearly tonal material most users analyse here, the classical approach is fast, transparent, and accurate.

Beyond the key itself, the tool reports the Camelot wheel position (1A, 1B, 2A, ... 12A, 12B), a notation popularised by Mixed In Key software and used by professional DJs to find harmonically compatible tracks: songs with the same number mix easily, and ±1 number on the wheel is also musically natural. Compatible keys (relative major/minor, dominant, subdominant) and alternative keys (the next-most-likely candidates from the matching score) are shown so you can sanity-check the result. BPM is detected with an autocorrelation-based onset method.
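The Camelot mapping follows mechanically from the circle of fifths: advancing one Camelot number is a move of a perfect fifth. A minimal sketch, assuming pitch classes numbered C = 0 through B = 11 (the `camelot` and `compatible` helpers are illustrative names, not the tool's API):

```javascript
// Map (tonic pitch class, mode) to a Camelot wheel code.
// Pitch classes: C=0, C#=1, ... B=11.
function camelot(pitchClass, mode) {
  // Walk the circle of fifths: each +7 semitones advances one Camelot number.
  const fifths = (pitchClass * 7) % 12;
  const offset = mode === 'major' ? 7 : 4;   // aligns C major -> 8B, A minor -> 8A
  const number = ((fifths + offset) % 12) + 1;
  const letter = mode === 'major' ? 'B' : 'A';
  return `${number}${letter}`;
}

// Compatible codes: the relative key (same number, other letter) plus the
// +/-1 neighbours on the wheel.
function compatible(code) {
  const n = parseInt(code, 10);
  const letter = code.endsWith('A') ? 'A' : 'B';
  const other = letter === 'A' ? 'B' : 'A';
  const down = ((n + 10) % 12) + 1;   // n-1 with wrap-around
  const up = (n % 12) + 1;            // n+1 with wrap-around
  return [`${n}${other}`, `${down}${letter}`, `${up}${letter}`];
}

console.log(camelot(0, 'major'));   // C major -> "8B"
console.log(camelot(9, 'minor'));   // A minor -> "8A"
console.log(compatible('8B'));      // -> ["8A", "7B", "9B"]
```

Using pitch-class arithmetic rather than a lookup table makes the ±1 compatibility rule obvious: neighbours on the wheel differ by exactly one step around the circle of fifths.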

The real-time tuner uses the YIN pitch-detection algorithm (de Cheveigné & Kawahara, 2002), the de-facto standard for monophonic pitch tracking in software tuners and pitch-correction plug-ins. YIN resists the octave errors that strong harmonics cause and runs in real time at 44.1 kHz on commodity hardware. We display the detected pitch class, the deviation from the nearest equal-tempered note in cents, and a visual meter centred on zero. Practising musicians can use this for quick instrument tuning; vocalists can use it to check their pitch accuracy on long sustained notes.

Useful applications include: DJs preparing harmonic mixes (Camelot wheel matches), producers looking for sample-pack songs that fit a project, songwriters identifying the key of a melody they captured by ear, music students transcribing or analysing repertoire, choir and band leaders checking that they're singing or playing in the correct key, and musicians in Vietnamese, Latin and French popular styles that often use modal or modally flavoured material (Bolero, V-pop ballads, flamenco, salsa, chanson), where confirming the tonal centre quickly saves rehearsal time. Privacy is by design: the audio file is decoded locally, analysis runs in JavaScript, and nothing is uploaded.

How the detection works

Step 1 is decoding. The Web Audio API decodes your file (MP3, WAV, FLAC, OGG, M4A, OPUS, video containers — anything the browser supports) to a 32-bit float PCM buffer at the file's native sample rate. We sum stereo to mono for the analysis, because key is a global property that doesn't benefit from per-channel processing.
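The stereo-to-mono sum in this step is a plain per-sample average of the channels. A sketch, assuming each channel is an equal-length Float32Array as returned by AudioBuffer.getChannelData() (`toMono` is a hypothetical helper name):

```javascript
// Average the channels of a decoded signal into one mono Float32Array.
function toMono(channels) {
  const length = channels[0].length;
  const mono = new Float32Array(length);
  for (const channel of channels) {
    for (let i = 0; i < length; i++) mono[i] += channel[i] / channels.length;
  }
  return mono;
}

// Example: a hard-left and hard-right signal average to the midpoint.
const left  = new Float32Array([1.0, 0.5, 0.0]);
const right = new Float32Array([0.0, 0.5, 1.0]);
console.log(toMono([left, right]));  // -> [0.5, 0.5, 0.5]
```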

Step 2 is the chromagram. We compute a Short-Time Fourier Transform (typically fftSize 4096, hop size 1024 at 44.1 kHz, giving ~93 ms windows with ~75 % overlap). Each magnitude spectrum is mapped to 12 pitch classes by summing energy across all octaves of each note (the spectral peaks at A2 = 110 Hz, A3 = 220 Hz, A4 = 440 Hz and A5 = 880 Hz all contribute to the 'A' bin). We weight by perceptual importance — middle-register pitches (where most melodies sit, ~200–2000 Hz) get more weight than very low or very high partials. The chromagrams from every frame are averaged across the entire track, giving a single 12-element vector summarising the song's tonal profile.
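The octave-folding at the heart of this step can be sketched as follows. This simplified version omits the perceptual weighting and frame averaging described above, and `chromaFrame` is an illustrative name rather than the tool's internal function:

```javascript
// Fold one magnitude spectrum into a 12-bin chroma vector by mapping each
// FFT bin's centre frequency to its nearest pitch class.
function chromaFrame(magnitudes, sampleRate, fftSize) {
  const chroma = new Array(12).fill(0);
  for (let bin = 1; bin < magnitudes.length; bin++) {
    const freq = (bin * sampleRate) / fftSize;
    if (freq < 27.5 || freq > 4200) continue;   // keep roughly A0 .. C8
    // Nearest MIDI note; its pitch class (mod 12) selects the chroma bin.
    const midi = Math.round(69 + 12 * Math.log2(freq / 440));
    chroma[((midi % 12) + 12) % 12] += magnitudes[bin];
  }
  return chroma;
}

// A lone spectral peak near 440 Hz lands in the 'A' bin (pitch class 9).
const mags = new Float32Array(2049);
mags[41] = 1.0;   // bin 41 ~ 441 Hz at 44.1 kHz with fftSize 4096
const chroma = chromaFrame(mags, 44100, 4096);
console.log(chroma.indexOf(Math.max(...chroma)));  // -> 9
```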

Step 3 is key matching using the Krumhansl-Schmuckler algorithm (Krumhansl & Schmuckler, 1990). Each of 24 candidate keys (12 major + 12 minor) has a stored 12-element profile that represents how often each scale degree statistically appears in tonal music in that key. The major profile peaks at the tonic, dominant (5th) and mediant (3rd); the minor profile peaks at the tonic, dominant and minor third. We compute the Pearson correlation between the song's chromagram and each candidate key profile. The key with the highest correlation wins. Confidence is the maximum correlation score expressed as a percentage; values above ~80 % indicate clearly tonal music, 60–80 % moderate confidence, below 60 % atonal or modulating material where the result is unreliable.
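Step 3 can be sketched end to end using the published Krumhansl (1990) probe-tone profile values; `pearson` and `bestKey` are illustrative helper names, not the tool's actual code:

```javascript
// Krumhansl-Schmuckler matching: Pearson-correlate the track chromagram
// against each of the 24 rotated key profiles and keep the best.
const MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88];
const MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17];
const NOTES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'];

function pearson(x, y) {
  const n = x.length;
  const mx = x.reduce((a, b) => a + b, 0) / n;
  const my = y.reduce((a, b) => a + b, 0) / n;
  let num = 0, dx = 0, dy = 0;
  for (let i = 0; i < n; i++) {
    num += (x[i] - mx) * (y[i] - my);
    dx += (x[i] - mx) ** 2;
    dy += (y[i] - my) ** 2;
  }
  return num / Math.sqrt(dx * dy);
}

function bestKey(chroma) {
  let best = { key: '', score: -Infinity };
  for (let tonic = 0; tonic < 12; tonic++) {
    for (const [mode, profile] of [['major', MAJOR], ['minor', MINOR]]) {
      // Rotate the chromagram so the candidate tonic sits at index 0.
      const rotated = chroma.map((_, i) => chroma[(i + tonic) % 12]);
      const score = pearson(rotated, profile);
      if (score > best.score) best = { key: `${NOTES[tonic]} ${mode}`, score };
    }
  }
  return best;
}

// A chromagram dominated by C, E and G (a C major triad) matches C major.
const triad = [5, 0, 0.5, 0, 4, 1, 0, 4.5, 0, 0.5, 0, 0.5];
console.log(bestKey(triad).key);  // -> "C major"
```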

Step 4 is BPM detection. We compute an onset envelope by tracking spectral flux — the sum of positive differences between consecutive magnitude spectra — and run autocorrelation on the envelope to find the strongest periodicity in the typical tempo range (60–200 BPM). The peak of the autocorrelation gives the beat period; we convert to BPM and constrain to the same range. This is the simple Tempogram approach (Grosche & Müller, 2009); it works well for songs with a clear pulse and worse for rubato, free-time, or polyrhythmic material.
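The autocorrelation search can be sketched as below, run here on a synthetic onset envelope at an assumed frame rate of 100 frames per second (`detectBPM` is an illustrative helper; the spectral-flux stage is assumed to have produced the envelope already):

```javascript
// Tempo from an onset envelope by autocorrelation: find the lag with the
// strongest self-similarity inside the 60-200 BPM range.
function detectBPM(envelope, frameRate) {
  // Lag bounds for 200 BPM (shortest period) and 60 BPM (longest period).
  const minLag = Math.floor((frameRate * 60) / 200);
  const maxLag = Math.ceil((frameRate * 60) / 60);
  let bestLag = minLag, bestScore = -Infinity;
  for (let lag = minLag; lag <= maxLag; lag++) {
    let score = 0;
    for (let i = 0; i + lag < envelope.length; i++) {
      score += envelope[i] * envelope[i + lag];
    }
    if (score > bestScore) { bestScore = score; bestLag = lag; }
  }
  return (60 * frameRate) / bestLag;
}

// Synthetic envelope: one onset every 50 frames at 100 frames/s is a beat
// every 0.5 s, i.e. 120 BPM.
const frameRate = 100;
const envelope = new Float32Array(500);
for (let i = 0; i < envelope.length; i += 50) envelope[i] = 1;
console.log(detectBPM(envelope, frameRate));  // -> 120
```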

Step 5 is real-time monophonic pitch detection. When the user starts the tuner mode, we open a 44.1 kHz microphone stream via getUserMedia and a MediaStreamAudioSourceNode, and feed 2048-sample buffers to the YIN algorithm (de Cheveigné & Kawahara, 2002). YIN computes a difference function, then the cumulative-mean normalised difference, finds the first dip below a threshold (typically 0.1), and parabolic-interpolates the period to sub-sample accuracy. The period is converted to frequency, frequency to MIDI note number, and the cents deviation is 1200 × log2(detected_freq / nearest_note_freq). A small visual meter shows the deviation in real time.
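The final note-and-cents conversion is a few lines of arithmetic around that formula (`describePitch` is an illustrative helper, using the A4 = 440 Hz reference):

```javascript
// Convert a detected frequency to the nearest equal-tempered note and the
// deviation in cents, as the tuner display does.
const NOTES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'];

function describePitch(freq, a4 = 440) {
  const midiExact = 69 + 12 * Math.log2(freq / a4);   // MIDI 69 = A4
  const midi = Math.round(midiExact);
  const nearestFreq = a4 * Math.pow(2, (midi - 69) / 12);
  const cents = 1200 * Math.log2(freq / nearestFreq);
  return {
    note: NOTES[((midi % 12) + 12) % 12],
    octave: Math.floor(midi / 12) - 1,
    cents,
  };
}

console.log(describePitch(440));  // { note: 'A', octave: 4, cents: 0 }
console.log(describePitch(446));  // A4, roughly +23 cents sharp
```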

Accuracy and when results are reliable

The Krumhansl-Schmuckler approach reaches roughly 75–85 % key-detection accuracy on the standard MIREX (Music Information Retrieval Evaluation eXchange) test sets of western tonal music, which include classical, rock, pop and jazz tracks selected to have a clear single key. Modern deep-learning systems push this above 90 %. In practice, on clearly tonal pop and rock recordings, the algorithm here is reliable enough that DJs and producers use chromagram-based tools (Mixed In Key, Beatport's key tags) for live work. Confidence below ~70 % means treat the result as a hypothesis, not a fact.

Where it fails or struggles: atonal music (free jazz, much classical 20th-century repertoire, ambient drone, noise music), heavily modulating tracks (a song that changes key three or four times — the algorithm averages and may report something halfway between), modal music that lacks the strong dominant-tonic pull of common-practice tonality (some folk, modal jazz, pentatonic K-pop, certain Vietnamese and Indonesian traditional music), heavy autotune or melodic vocal effects (which warp the chromagram), very short clips (< 30 s — not enough material to estimate an average chromagram), and music where the bass and harmony don't agree (e.g. a synth-bass riff in C while the chord progression says G).

The relative-major/relative-minor confusion is a classic failure mode: C major and A minor share the same notes but have different tonics. If the song spends much time on the tonic of the relative key, the Krumhansl-Schmuckler vectors look similar and the detector may pick the wrong one. The 'alternative keys' list always includes the relative; if your ear says minor but the tool says major, swap to the relative minor and re-listen. Similarly, dominant-tonic confusion (G major in a key that's actually C major) appears occasionally; the alternative-keys list helps spot it.

  • Best on western tonal music (pop, rock, classical, jazz with clear functional harmony); 75–85 % MIREX-style accuracy.
  • Atonal, free-jazz, ambient drone, and modal music score noticeably lower; treat any result on these with skepticism.
  • Heavy autotune, vocoder, or pitch-corrected synth leads warp the chromagram and can flip the detected key.
  • Modulating tracks (multiple keys) are reported as a single best-fit key — the average — which may not be musically meaningful.
  • Very short clips (under 30 seconds) lack enough chromagram averaging to be reliable.
  • Relative-major / relative-minor confusion is common; always check the alternative-keys list before committing to the result.
  • BPM detection works best on songs with a clear, steady pulse; rubato, free-time, and polyrhythmic music score poorly.
  • Monophonic real-time tuner expects one note at a time; chords or polyphonic instruments confuse YIN.
  • Cents readings depend on A4 = 440 Hz reference; historical performance practice (e.g. baroque A=415 Hz) needs separate handling.

Glossary

Chromagram (pitch class profile)
A 12-element vector summarising how much energy is present in each pitch class (C, C#, D, D#, E, F, F#, G, G#, A, A#, B) of an audio signal, summed across octaves. Foundation of most chord and key detection algorithms.
Tonic
The note on which a key is centred — the 'home' tone the music tends to return to. C is the tonic of C major; A is the tonic of A minor.
Mode
The pattern of intervals that defines a scale. Major mode has the pattern W-W-H-W-W-W-H (whole and half steps); natural minor is W-H-W-W-H-W-W. There are five other classical 'church modes' (Dorian, Phrygian, Lydian, Mixolydian, Locrian) and many world-music modes.
Circle of fifths
A way of arranging the 12 keys around a circle so that adjacent positions differ by a fifth, making harmonically related keys (dominant, subdominant) visually adjacent. Camelot wheel is a DJ-friendly relabelling.
Camelot wheel
A DJ-oriented notation popularised by Mixed In Key. Each key is labelled with a number (1–12) and a letter (A = minor, B = major). Tracks with the same number, or ±1, are harmonically compatible. C major = 8B, A minor = 8A; these two share the same number because they share the same notes.
Harmonic minor / melodic minor
Variants of the natural minor scale: harmonic minor raises the 7th degree to provide a stronger dominant–tonic pull; melodic minor raises both the 6th and 7th when ascending and reverts when descending.
Modulation
A change of key within a piece. A song that starts in C major and modulates to G major spends some time on the C profile and some on the G profile, which can confuse a chromagram-averaged key detector.
Krumhansl-Schmuckler key profile
The 12-element template per key, derived from Carol Krumhansl's 1980s perceptual experiments, that quantifies how much each scale degree typically appears in major and minor tonal music. Pearson-correlated against a song's chromagram to estimate the most likely key.
YIN pitch detection
Monophonic fundamental-frequency estimation algorithm (de Cheveigné & Kawahara, 2002). Computes a normalised difference function on a sliding window and finds the first periodic dip. Standard for software tuners and pitch correction.
Cents
A logarithmic unit of pitch interval. 100 cents = one semitone; 1200 cents = one octave. ±5 cents is generally inaudible and considered in tune; ±20 cents is noticeably out of tune.

Frequently Asked Questions

How does the AI detect the musical key?

It computes a chromagram (12 pitch-class energies summed over the whole track) and matches it against 24 Krumhansl-Schmuckler key profiles (12 major + 12 minor) using Pearson correlation. The best-matching profile wins. A confidence score is reported — values above ~80 % are reliable; below 60 % the result is uncertain. The whole pipeline runs in your browser via the Web Audio API and Meyda.

What audio formats are supported?

Anything the browser can decode: MP3, WAV, OGG, AAC, M4A, FLAC, OPUS, plus video containers (MP4, MKV, MOV, WebM) from which the audio is auto-extracted. The Web Audio API handles decoding entirely on your device.

What is musical key, exactly?

A key is a tonal centre plus a mode. C major means the music gravitates to the note C and uses the major-scale pattern of intervals (C, D, E, F, G, A, B). A minor uses the same notes but treats A as home. Knowing the key lets musicians transpose, harmonise, improvise solos that fit, and explains why some chords sound 'right' together.
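The interval patterns behind major and minor can be expanded into note lists mechanically; a small illustrative sketch (not part of the tool):

```javascript
// Build a scale from a tonic pitch class and an interval pattern
// (W = whole step = 2 semitones, H = half step = 1 semitone).
const NOTES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'];

function scale(tonic, pattern) {
  const steps = pattern.split('-').map(s => (s === 'W' ? 2 : 1));
  const result = [NOTES[tonic]];
  let pc = tonic;
  for (const step of steps.slice(0, -1)) {   // last step returns to the octave
    pc = (pc + step) % 12;
    result.push(NOTES[pc]);
  }
  return result;
}

console.log(scale(0, 'W-W-H-W-W-W-H').join(' '));  // C D E F G A B
console.log(scale(9, 'W-H-W-W-H-W-W').join(' '));  // A B C D E F G
```

The two outputs contain the same seven notes, which is the relative-major/relative-minor ambiguity in a nutshell.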

What's the Camelot wheel and how do I use it?

Camelot is a DJ-friendly numbering of the 24 keys around a circle. Each key has a number (1–12) and letter (A for minor, B for major). The rule: tracks sharing a Camelot code mix perfectly; ±1 around the wheel (a move of a perfect fifth) mixes well; jumping +7 positions shifts the key up one semitone, the so-called 'energy boost' mix. C major = 8B, G major = 9B, F major = 7B, A minor = 8A. Mixed In Key, RekordBox and Beatport all label tracks this way.

How accurate is the key detection?

On clearly tonal western music (pop, rock, classical, jazz with functional harmony), the Krumhansl-Schmuckler approach reaches 75–85 % accuracy on the standard MIREX test sets — competitive with commercial tools that use chromagrams. Atonal, modal, very short, or heavily modulating tracks score lower; the confidence percentage tells you how much to trust the answer.

Why does the tool give a different key than what I think it is?

Common reasons: (1) relative-major / relative-minor confusion — C major and A minor share the same notes, look at the alternative keys list; (2) the song actually modulates, and what you hear is the second key; (3) heavy autotune or pitch effects warp the chromagram; (4) the recording is in a non-standard tuning (e.g. 432 Hz or baroque 415 Hz instead of A=440 Hz). Inspect the alternative-keys list — often the right answer is second on it.

What does the cents deviation mean?

It tells you how far the detected pitch is from the nearest equal-tempered semitone. 0 cents = perfectly in tune. +5 cents = slightly sharp (5 % of a semitone above the nearest note). −20 cents = noticeably flat. Most listeners can detect deviations larger than about ±10 cents on sustained tones; ±5 cents is generally considered 'in tune'.

Can I use this as a real-time guitar tuner?

Yes. The real-time mode opens your microphone via getUserMedia, runs YIN pitch detection on the live signal, and displays the detected note and cents deviation at 30+ updates per second. It works on any monophonic instrument: guitar, bass, violin, flute, voice. It does not work on chords or polyphonic instruments — YIN expects a single fundamental at a time.

What is BPM detection based on?

We compute spectral flux (the rate of change of the magnitude spectrum, which spikes at note onsets) over the whole track and run autocorrelation on the resulting onset envelope. The peak of the autocorrelation in the 60–200 BPM range gives the dominant tempo. This is the standard Tempogram approach. It is accurate on songs with a clear pulse, less so on rubato, jazz with floating tempo, or free-time material.

Does it work on Bolero, V-pop, salsa, samba, chanson, or other regional genres?

Yes for any tonal material — most popular Vietnamese, Latin, French, and Brazilian music sits comfortably in major or minor keys and detects reliably. Modal genres (Cape Verdean morna, certain Andalusian flamenco palos, modal jazz) may produce a 'best fit' major or minor that approximates the mode but isn't strictly correct. Confidence below 70 % is your signal that the genre or mix is fighting the algorithm.

Is my audio safe and private?

Yes. Decoding, chromagram computation, key matching, and BPM detection all happen in your browser via the Web Audio API and Meyda. Your audio file is never uploaded. Real-time tuning uses your microphone locally and never streams audio anywhere. We don't store, log, or share anything you analyse.

What if the key detection is uncertain (low confidence)?

The tool reports a confidence percentage. If it's below ~70 %, the result should be treated as a guess. Try (1) analysing a different section of the song that's harmonically clearer, (2) using a longer audio segment, (3) checking the alternative-keys list, or (4) listening to the song against a known reference key in your DAW or tuning app.

References & academic sources

  1. Krumhansl, C. L. (1990). Cognitive Foundations of Musical Pitch. Oxford University Press. (Source of the Krumhansl-Schmuckler key-finding algorithm.)
  2. de Cheveigné, A., & Kawahara, H. (2002). YIN, a fundamental frequency estimator for speech and music. Journal of the Acoustical Society of America, 111(4).
  3. Mauch, M., & Dixon, S. (2010). Approximate Note Transcription for the Improved Identification of Difficult Chords. ISMIR Proceedings. (NNLS chroma.)
  4. Temperley, D. (2007). Music and Probability. MIT Press. (Temperley extensions of Krumhansl-Schmuckler.)
  5. MIREX organising committee. (2024). Music Information Retrieval Evaluation eXchange (MIREX) — Key Detection task. ISMIR / IMIRSEL, University of Illinois.
  6. Korzeniowski, F., & Widmer, G. (2017). End-to-End Musical Key Estimation Using a Convolutional Neural Network. EUSIPCO.

Reviewed by WuTools Audio Engineering Team