COPPER

A PS Audio Publication

Issue 48 • Free Online Magazine

Issue 48 QUIBBLES AND BITS

Brainz The Size of a Planet, Part 2

In the last edition of Copper, I introduced MusicBrainz, a crowd-sourced, free-to-use database of metadata for recorded music. From a skeptical beginning, I have come to appreciate that MusicBrainz is actually a first-class resource, seriously well thought-out – one which has accomplished far more than I would have thought possible, and one which I enthusiastically endorse going forward.

MusicBrainz starts from the premise that the metadata typically associated with ripped and/or downloaded music is inadequate, and puts in place a framework to improve upon that. This is so much easier said than done. If you are going to improve upon something, you have to have a clear view of what is wrong with it, and in what specific ways it has to be improved. In doing so, it is critical that the data structures you put in place can be applied to the widest possible spectrum of music styles and formats, and also that they are compatible (to the largest extent practicable) with the norms which have hitherto become accepted as standard practice. Both of these requirements involve challenges, and those challenges exist both as fundamental issues regarding how the database is structured, and as problems regarding how the data will be used, viewed, and understood in the real world.

MusicBrainz is what is called a ‘relational database’. This type of database comprises lists of similar entities, together with tables of relationships that describe how items on one list relate to items on another list. For example, one list can be a list of people, and another can be a list of musical compositions. A Person can be related to a Work via a composer relationship. Typically a relationship is a two-way affair so that the Person is ‘composer of’ the Work, and the Work was ‘composed by’ the Person. Therefore the first things to understand about MusicBrainz are the primary lists. There are actually 15 of those lists, but only four of them form the vast bulk of the critical relationships in the database. These are Artists, Releases, Recordings, and Works, so let’s just focus on those.
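
To make the 'two-way affair' concrete, here is a minimal sketch in Python. It is not how MusicBrainz itself is implemented; the class names, the `REVERSE_PHRASE` table, and the `describe` helper are all assumptions made up for illustration. The point is simply that one stored row can be read from either end.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    kind: str  # which list the entity lives on, e.g. "Artist" or "Work"
    name: str


# Hypothetical phrase pairs: forward reading -> reverse reading.
REVERSE_PHRASE = {"composer of": "composed by"}


class RelationshipTable:
    """A toy table of relationships between entities on different lists."""

    def __init__(self):
        self._rows = []  # each row is (source, forward phrase, target)

    def add(self, source, phrase, target):
        self._rows.append((source, phrase, target))

    def describe(self, entity):
        """Return every relationship touching `entity`, read from its side."""
        out = []
        for source, phrase, target in self._rows:
            if source == entity:
                out.append((phrase, target.name))
            if target == entity:
                out.append((REVERSE_PHRASE[phrase], source.name))
        return out


beethoven = Entity("Artist", "Ludwig van Beethoven")
ninth = Entity("Work", "Symphony No. 9")

table = RelationshipTable()
table.add(beethoven, "composer of", ninth)

print(table.describe(beethoven))  # [('composer of', 'Symphony No. 9')]
print(table.describe(ninth))      # [('composed by', 'Ludwig van Beethoven')]
```

Only one row is stored, yet both entities can report the relationship in their own words.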

Artists include both people and ensembles. The Beatles constitute an Artist, as do John Lennon and Paul McCartney. A relational database also allows for relationships within a list, so that John Lennon has a ‘member of’ relationship with The Beatles, as indeed does Paul McCartney, and The Beatles have corresponding ‘has member’ relationships with both Lennon and McCartney. Orchestras, choirs, conductors, and producers all end up as part of the Artists list, as do composers, photographers, lyricists, and arrangers. Mostly, though, Artists have important relationships with entities on other lists. So, for example, the only way we know if an Artist is a composer is if that Artist has a ‘composer of’ relationship with an entity in the list of Works. This is very helpful in the big picture because, as we know, individuals can wear many different hats over the course of a career. Leonard Bernstein’s recorded oeuvre includes appearances as conductor, composer, and concert pianist. And ‘ambient music performer’ Brian Eno (much beloved of the NY Times crossword) appears in MusicBrainz as guitarist, keyboard player, percussionist, composer, lyricist, arranger, producer, engineer, vocalist, illustrator, chorus master… that list just goes on and on and on.
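
The idea that an Artist's 'hats' are not stored on the Artist but fall out of the relationships can be sketched as follows. The relationship phrases, the `ROLE_FOR_PHRASE` mapping, and the placeholder targets ("Work A", "Recording B", and so on) are invented for illustration and are not real MusicBrainz entries.

```python
# Each row: (artist, relationship phrase, target entity).
# Targets such as "Work A" are placeholders, not real database entries.
RELATIONSHIPS = [
    ("John Lennon", "member of", "The Beatles"),
    ("Paul McCartney", "member of", "The Beatles"),
    ("Leonard Bernstein", "composer of", "Work A"),
    ("Leonard Bernstein", "conductor on", "Recording B"),
    ("Leonard Bernstein", "pianist on", "Recording C"),
]

# Hypothetical mapping from a relationship phrase to the role it implies.
ROLE_FOR_PHRASE = {
    "composer of": "composer",
    "conductor on": "conductor",
    "pianist on": "pianist",
}


def roles_of(artist):
    """Derive an artist's roles purely from the relationships they appear in."""
    return sorted(
        {
            ROLE_FOR_PHRASE[phrase]
            for name, phrase, _target in RELATIONSHIPS
            if name == artist and phrase in ROLE_FOR_PHRASE
        }
    )


def members_of(group):
    """Follow 'member of' rows, a relationship within the Artists list itself."""
    return sorted(
        name
        for name, phrase, target in RELATIONSHIPS
        if phrase == "member of" and target == group
    )


print(roles_of("Leonard Bernstein"))  # ['composer', 'conductor', 'pianist']
print(members_of("The Beatles"))      # ['John Lennon', 'Paul McCartney']
```

Nothing in the Artists list says 'composer'; the role exists only because a matching relationship does.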

MusicBrainz is a highly structured and formalized environment, and the relationships that individual entities can have within and among each other are carefully controlled. Strict hierarchies are maintained. For example:

  • Works are individual pieces of music, and have ‘recording of’ relationships to individual Recordings
  • Recordings are specific recorded performances and have ‘performance of’ relationships to individual Works
  • Tracks are structural components of a Release (which is MusicBrainz-speak for an album). Individual Tracks contain individual Recordings;
    Releases contain one or more Tracks
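
The hierarchy in the list above can be sketched with a few Python dataclasses. This is a simplified model under my own assumptions (the field names and the `works_on` helper are hypothetical, and the titles are placeholders); the real schema carries far more detail.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Work:
    title: str


@dataclass
class Recording:
    title: str
    performance_of: Optional[Work] = None  # the link may simply be missing


@dataclass
class Track:
    position: int
    recording: Recording  # each Track contains exactly one Recording


@dataclass
class Release:
    title: str
    tracks: List[Track] = field(default_factory=list)


def works_on(release):
    """List the Work titles reachable from a Release via its Tracks."""
    return [
        track.recording.performance_of.title
        for track in release.tracks
        if track.recording.performance_of is not None
    ]


song = Work("Song A")
studio = Recording("Song A", performance_of=song)
live = Recording("Song A (live)", performance_of=song)
unlinked = Recording("Song B")  # nobody has created a Work for this one

album = Release("Album", [Track(1, studio), Track(2, unlinked)])
compilation = Release("Compilation", [Track(1, studio), Track(2, live)])

print(works_on(album))        # ['Song A']
print(works_on(compilation))  # ['Song A', 'Song A']
```

Note that the same studio Recording sits on both Releases, while the Track objects that hold it belong to one Release each.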

This may be complicated-sounding – and in fact it gets even more complicated than this – but believe me, the complication is the necessary evil to be accommodated if the system is to apply smoothly and consistently across all the possibilities encountered in the world of recorded music.

Works can be broken down into multiple parts, each of which is in itself a Work, and has a ‘part of’ relationship with the parent Work. This is most commonly seen in classical music, where, for example, Beethoven’s 9th Symphony has four movements. In this case the symphony itself is a Work, and each of the movements exists in MusicBrainz as a separate ‘part of’ Work. These ‘part of’ relationships can be nested as deep as you need. Works typically have a ‘composed by’ relationship with someone in the Artists list, and will often also have lyricist, arranger, or even ‘revised by’ relationships. Interestingly, one of the legacy aspects that MusicBrainz has decided to live with instead of imposing its own view is that it includes composer, lyricist, librettist, and writer as separate relationships, which can be viewed as conveying a certain ambiguity.
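
Nesting ‘part of’ relationships to any depth is naturally expressed as a tree. The sketch below is one way to picture it; the `parts` field, the `walk` function, and the enclosing "Hypothetical symphony cycle" are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Work:
    title: str
    parts: List["Work"] = field(default_factory=list)  # child 'part of' Works


def walk(work, depth=0):
    """Yield (depth, title) for a Work and, recursively, all of its parts."""
    yield depth, work.title
    for part in work.parts:
        yield from walk(part, depth + 1)


ninth = Work(
    "Symphony No. 9",
    [Work("Movement 1"), Work("Movement 2"), Work("Movement 3"), Work("Movement 4")],
)
# A made-up parent, just to show that nesting can go deeper than one level.
cycle = Work("Hypothetical symphony cycle", [ninth])

for depth, title in walk(cycle):
    print("  " * depth + title)
```

Because every part is itself a Work, the same function handles a single movement, a whole symphony, or anything larger.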

Releases are an important entity in MusicBrainz because music is typically released onto the market in self-identified collections, such as albums. Therefore albums, EPs, singles, and downloadable releases comprising just one item, all constitute Releases. But for most music, when we are talking about Releases, we are talking about albums. MusicBrainz allows for a lot of information to be stored in respect of albums, including release date, record label, catalog number, cover art, and a whole lot more.

Recordings in MusicBrainz are what we normally call tracks. A Release will comprise a number of Recordings, which are just the tracks on the album. Each of those Recordings will have its own set of attributes, including track number, duration, performers associated with the track, and so forth. One of the key things about MusicBrainz – which causes a lot of trouble – is the relationship between Recordings and Works, and this illustrates nicely one of the requirements I laid out at the start of this column, that the MusicBrainz database should be compatible with the widest possible spectrum of music styles and formats. In the classical music world any given piece of music may have many different recordings of it that have been released by various performers, or even by the same performers at different times. So it follows that a Recording and a Work cannot be the same thing. A Recording has to be a recording of a Work (read that sentence again in order to understand why I have been so anal with my use of capitalization and italics). In existing digital audio metadata no such distinction is made – so a track has both performers and composers, and there is no place at all for the Work, unless it is somehow (i.e. informally) captured in the track’s Title. In MusicBrainz it is only the Recording which can have performers, and the Work which can have composers – you cannot associate composers with Recordings, nor performers with Works. If you think about it, it makes perfect sense.
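
Here is a rough sketch of that separation, and of how the conventional flat tags might be derived from it. The field names and the `flat_tags` helper are hypothetical, and the performer names are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Work:
    title: str
    composers: List[str] = field(default_factory=list)  # composers live here only


@dataclass
class Recording:
    title: str
    performers: List[str] = field(default_factory=list)  # performers live here only
    work: Optional[Work] = None


def flat_tags(recording):
    """Flatten a Recording into the composer/performer tags audio files expect."""
    return {
        "title": recording.title,
        "performers": list(recording.performers),
        # Composers are reached through the Work; no Work means no composers.
        "composers": list(recording.work.composers) if recording.work else [],
    }


ninth = Work("Symphony No. 9", ["Ludwig van Beethoven"])
first = Recording("Symphony No. 9", ["Orchestra A"], ninth)
second = Recording("Symphony No. 9", ["Orchestra B"], ninth)
orphan = Recording("Song B", ["Band C"])  # no Work has been linked

print(flat_tags(first)["composers"])   # ['Ludwig van Beethoven']
print(flat_tags(second)["composers"])  # ['Ludwig van Beethoven']
print(flat_tags(orphan)["composers"])  # []
```

Two Recordings share one Work, so the composer is entered once; the price is that a Recording with no linked Work has no composer at all.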

The second aspect of MusicBrainz that I want to cover in this column is the crowd-sourcing aspect. Crowd-sourcing means that – like Wikipedia as a well-known example – anybody can sign in and enter data. (As an experiment, some years ago, I made a minor edit to the official Wikipedia page for the state of North Carolina. Not an overtly controversial edit, but one with mild socio-political overtones, replacing a text which had slightly less mild socio-political overtones. I was interested to see how long it would last – and who would change it (and why). But, no, it is still there!)

MusicBrainz then has a community of Editors who pore over newly crowd-sourced data and edit out any errors or any data that do not conform to the MusicBrainz ‘style guidelines’. At least that’s the theory. In practice, based on what I am seeing, there are serious limits on how much of the submitted material the Editors can actually review, and as a result huge swathes of the database are not in strict compliance with the style guidelines. But this is not surprising when you consider that over 15,000 new Releases (i.e. albums) are added to the MusicBrainz database every month, a rate which is actually slowly accelerating. (Every time I read that number I find it so incredible I have to go back and check it again in case I made a mistake!)

Adding new data to MusicBrainz can be a tedious process, partly because the available tools are not particularly user-friendly, but also because every time you want to add a relationship to an entity which does not already exist in MusicBrainz, you must first create that entity from scratch, a task which gets old very quickly. With modern (pop/rock etc.) music this often means creating a new Work for each track on the album, which is doubly tedious because you have to check first to see if the Work already exists (the process for doing that isn’t as simple or convenient as you might wish, for all sorts of irritating reasons). And there are always those occasions when you know that a track you are entering is a cover of another track which is already in MusicBrainz, but you find that nobody has bothered to create the Work for it. Because of issues like these, a great many Releases in MusicBrainz have neither performers nor Works associated with their Recordings, which means it is left to editors to step in and fill in the blanks. But there aren’t remotely enough editors to be able to keep up.

The complexity, and thoroughness, of the MusicBrainz database is at the same time its greatest strength and greatest weakness. Strength, because it allows the most comprehensive metadata relationships to be unambiguously assembled. Weakness, because if nobody is motivated to enter the data in the first place, then the strengths are quite irrelevant. By far MusicBrainz’s biggest problem is the paucity with which key data relationships have been entered by the community. With modern music in particular, it is surprising how few Releases in MusicBrainz actually have Works associated with their individual tracks. This means, among other things, that the composer relationships for such tracks are empty.

There are a couple of other important databases which are associated with MusicBrainz, and which form key parts of the MusicBrainz ecosystem. AcoustID is a bit like Shazam. AcoustID publishes an algorithm with which an ‘acoustic fingerprint’ of any given track can be calculated and the resultant fingerprint stored in the AcoustID database. The stored fingerprint can later be used to identify the track: somebody can take an unknown piece of music, calculate its acoustic fingerprint, submit that to AcoustID, and, if a match is found, find the matching Recording in the MusicBrainz database. Of course, this requires that if a new album is entered into the MusicBrainz database, you need to be able to generate acoustic fingerprints for each track, register them with AcoustID, and then register the match with MusicBrainz. This is done using a free app called MusicBrainz Picard, and, naturally, can only be done if you actually have the music to hand. The other associated database is CoverArtArchive, which is used to store cover art and other imagery associated with MusicBrainz data (because images themselves are not handled by the MusicBrainz database).
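
The register-then-identify flow can be pictured with a toy model. To be clear about the assumptions: this does not call AcoustID, MusicBrainz, or Picard, and `toy_fingerprint` is a plain hash standing in for a real acoustic fingerprint (which would still match after re-encoding, whereas a hash will not). Only the two-step lookup is the point.

```python
import hashlib

# Two in-memory dictionaries play the roles of the two separate databases.
fingerprint_index = {}  # fingerprint -> recording ID (stand-in for AcoustID)
recordings = {}         # recording ID -> metadata (stand-in for MusicBrainz)


def toy_fingerprint(samples: bytes) -> str:
    """Stand-in only: a real acoustic fingerprint is not a cryptographic hash."""
    return hashlib.sha256(samples).hexdigest()[:16]


def register(recording_id, title, samples):
    """Enter a Recording, then register its fingerprint against its ID."""
    recordings[recording_id] = {"title": title}
    fingerprint_index[toy_fingerprint(samples)] = recording_id


def identify(samples):
    """Fingerprint unknown audio, look up the ID, then fetch the metadata."""
    recording_id = fingerprint_index.get(toy_fingerprint(samples))
    if recording_id is None:
        return None  # nobody has registered this audio yet
    return recordings[recording_id]


register("rec-1", "Track One", b"audio-one")

print(identify(b"audio-one"))  # {'title': 'Track One'}
print(identify(b"unknown"))    # None
```

An unknown track comes back empty until somebody who has the music to hand registers its fingerprint.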

So that is a basic introduction to MusicBrainz, and believe me, there is more than ten times as much that I could have written if I had the space, and if I thought you had the patience to read it. So in the next and final MusicBrainz column, I’ll deal mostly with how you can use MusicBrainz to power a state-of-the-art music server.


