Cataloging a Classical Music Collection

Written by Ted Shafran

Let’s begin by setting the stage. I have a very large collection of digital music, representing the equivalent of well over 4,400 albums, most of it classical. A number of years ago, I went through my entire collection of CDs and ripped them to FLAC files. I’ve done the same thing with my more limited collection of SACDs and DVD-As, ripping them at the highest possible resolution. I even digitized my entire collection of vinyl.

Why did I do this? Three reasons, really. First of all, I moved from a relatively large home to a considerably smaller condo and there simply wasn’t enough room for my entire collection. The second reason is that I also own a lake house (we call it a cottage here in Canada) about three hours north of Toronto, where I live, and I wanted to have access to my entire music collection in both locations. And finally, as I get older, it’s just a lot easier to choose what to play from the comfort of the couch rather than getting up, bending over, and rifling through a collection of discs.

Don’t get me wrong: I haven’t thrown out my vinyl and CDs, but a lot of them are in boxes in a storage unit. And I still play LPs and CDs fairly regularly. But most of my recent purchases have been digital downloads, at the highest possible resolution. However, relying on a largely digital collection has left me with some conundrums. Herewith, a couple of confessions:

First: On more than one occasion I have purchased an online download, only to discover that it was already in my library. Oops. There goes another 20 bucks down the toilet.

Second confession: Over the years, I have made a number of attempts to catalog my collection, in part to avoid such embarrassments, but also so I know what I actually own. I’m quite sure I have dozens, potentially hundreds of albums that have not been played in years, largely because I forgot about them. And regardless, as someone once said, men love making lists. But unfortunately, none of my attempts to create a catalog have succeeded so far.

What I really want is something that will go out, scan my digital files, and generate a database that can be queried, or generate reports that will show – for example – all of my chamber music recordings, or how many complete Beethoven symphony cycles I own (it’s 30, by the way. And yes, I know – I have a problem).

By training, I’m a software engineer, and I did make some early attempts to build my own software to catalog my collection. But, frankly, I haven’t done a lot of coding for years and I’m getting a little long in the tooth. Everything I tried to build was clumsy and not at all intuitive or, for that matter, effective.

So where did that leave me?

Enter Google. A (not-so) quick search identified a large number of programs for cataloging digital music. I tried out quite a number of them. You can see my results below:

Classical Music Collector – Not updated since Oct. 2011.
Classicat – The last revision was 2011, and it’s no longer available.
Muso – I couldn’t get it to install.
Reprtoir – I didn’t test it because the price is based on the number of tracks and it would have cost hundreds of dollars a month for a collection the size of mine.
MediaMonkey – This is a commercial product (although there is a dumbed-down free version) that did the best job on my library, but it has no reporting capabilities and it doesn’t pick up all of the relevant information.
Musicnizer – It did a reasonable job of scanning files but lacks classical-related tag information.
My Music Collection – Same as above, and lacking in reporting.
MusiChi Suite – Showed a lot of promise, but the full version is no longer available and the free version is a VERY old version.
MusicBee – Free, but again not suited to classical music.
Clementine – Hasn’t been updated in six years.
Songbird – Hasn’t been updated since 2013.

And yes, I know all about Roon and JRiver. They both have beautiful user interfaces and they do a pretty good job of finding the metadata for a lot of recordings. But they’re not encyclopedic and, in any case, I wasn’t really looking for a streaming interface; my goal was to create a catalog. (For the record, I stream using Volumio on an Allo USBridge Signature connected by USB to my DAC.)

Ultimately, I gave up in frustration. I simply couldn’t find any software that did everything I wanted. Even a combination of tools just wouldn’t do the trick. Moreover, in the process of testing all of the above software, I made an important discovery:

The metadata on most classical tracks is largely incomplete, inconsistent, inaccurate or poorly organized. And ultimately, even the best software cataloging tool is going to do a poor job without accurate information. There’s that old expression in the computer industry: garbage in, garbage out (also known as GIGO).

Take, for example, this album:

[Image: album cover]

The metadata looks like this:

[Image: screenshot of the album’s tags]

So, what’s wrong with that?

Well, pretty much everything. The name of the opera is missing, as is the composer. And the information about the conductor and the orchestra is lumped in with the soloists. The only information we have to work from is the Artist field which contains very little detail about who does what. And, unfortunately, this is typical of many CD rips, as well as a lot of downloads.

Another depressing discovery was that the quality and accuracy of the metadata varies from one record label to another, and from one download source to another!

As it turns out, any music catalog application is fundamentally useless, unless you’re willing to do the work of fixing all your metadata.

And so, I dug in; after all, what choice did I have? Using a free program called Mp3tag (which, notwithstanding its name, supports pretty much every audio format including DSD), I started by spending a few hours every day, over the course of several weeks, cleaning up and re-tagging music files. One of the other nice features of Mp3tag is that – once you’ve corrected the metadata – you can export all or part of your music library as a CSV file which can then be opened and sorted in Microsoft Excel (or any other compatible program). You can download Mp3tag here:

http://www.mp3tag.de/en/index.html

I started by ensuring that I was using a consistent system for assigning genres to all of my tracks. That was the easy part because I already had my music (mostly) organized in folders by genre.

But it wasn’t long before I realized that I was facing a nearly impossible task. I had successfully worked my way through about 20 percent of my collection, one album at a time. But this was after weeks of effort, and it became obvious that I could be looking at months of tedious, repetitive work before I finished manually re-tagging every album in my collection. I just didn’t see that as a realistic goal.

And then I had a sudden realization. Most of the data that I needed was already there, in the names of the music folders themselves. For example, here’s a typical folder name from my library:

Mozart Symphony #38 – Klemperer, Philharmonia

If I could simply capture that information, it would probably give me everything I needed to build a catalog. The only challenge is that I would need to use a consistent naming scheme for my music folders which – fortunately – I do, for the most part. So I turned back to my programming background. And the question I asked myself was: how can I automate this process as much as possible?
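To illustrate the idea (this is not the script library described later, just a minimal sketch), the following shell function splits a folder name of the form "Composer Work - Conductor, Ensemble" into four quoted CSV fields. It rests on several assumptions: the composer is a single word at the start of the name, a spaced hyphen or spaced en dash separates the work from the performers, and the first comma separates the conductor from the ensemble. The function name parse_folder is invented for this example.

```sh
#!/bin/sh
# Hypothetical sketch: turn "Composer Work - Conductor, Ensemble" into one CSV row.
# Assumes a one-word composer name and no double quotes inside the folder name.
# Variables are global here for brevity.

parse_folder() {
    # Treat a spaced en dash the same as a spaced hyphen.
    name=$(printf '%s' "$1" | sed 's/ – / - /')

    # Split at the first " - ": left side is composer + work, right side is performers.
    case $name in
        *" - "*) left=${name%% - *}; performers=${name#* - } ;;
        *)       left=$name;         performers= ;;
    esac

    # The first word is taken as the composer; the rest is the work.
    composer=${left%% *}
    case $left in
        *" "*) work=${left#* } ;;
        *)     work= ;;
    esac

    # Everything before the first comma is the conductor; the rest is the ensemble.
    conductor=${performers%%,*}
    case $performers in
        *,*) ensemble=${performers#*,}; ensemble=${ensemble# } ;;
        *)   ensemble= ;;
    esac

    printf '"%s","%s","%s","%s"\n' "$composer" "$work" "$conductor" "$ensemble"
}

parse_folder "Mozart Symphony #38 – Klemperer, Philharmonia"
```

For the sample folder above, this prints "Mozart","Symphony #38","Klemperer","Philharmonia". A composer with a two-word name, or a folder that departs from the naming scheme, would need extra rules, which is exactly why a consistent convention matters.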

Through a process of trial and error, I tried a number of approaches, but the one that worked the best was to create a series of scripts to do the “grunt work.” I’m going to share the fruit of my efforts (below), but here are some caveats you need to be aware of:

  1. The scripts I created are written as Linux/UNIX shell scripts. I wish I were an expert in Microsoft PowerShell – but I’m not. If you don’t have access to a Linux environment, don’t panic. There’s a free program called Cygwin that will allow you to run Linux scripts and other utilities on any Windows PC. And, since macOS is based on UNIX, these scripts also work on a Mac (using the Bash shell). You can download Cygwin here:

http://www.cygwin.com/

  2. The tools are designed to work in a very specific way, with specific naming conventions for your music folders. If you want to use a different convention, you’ll need to know a bit of programming yourself. I realize this is a big ask. But the tools come with some documentation, so that may help.
  3. As with all free software, these tools are provided on a “use at your own risk” basis.
  4. Be aware that Windows sometimes does funky things with upper- and lower-case names in files, and even funkier things with special characters like the umlaut. One of the reasons I ended up using Linux is that it handles these much better.

In the simplest terms, here’s what you need to do:

  1. Download the scripts. You will find them here:

FTP: music.connectability.com (use a free FTP client like FileZilla).
The user name is: anonymous
There is no password

You can download a ZIP file or a Linux archive (tgz) file containing the entire script library. Alternatively, you can download the files individually.

  2. Install them on your Linux computer, or in Cygwin, or on your Mac in a Bash shell. You’ll find detailed instructions in the README and HOWTO files.
  3. Make sure the computer you are using has full access to your digital music library.
  4. Read and follow the instructions in the README and HOWTO files. If you get stuck, you’ll find instructions for contacting me in the README. I won’t guarantee a quick response, but I’ll do my best to help.

Once you have run the utility (it’s called scan-music), you will end up with a CSV file which can then be opened in Microsoft Excel or any other compatible program (for example, LibreOffice, which is free). If you’re really clever, you can probably import the results into a database like Microsoft Access or any one of several free databases. Once you do that you can run searches and sorts to slice and dice your music catalog in whatever way works best for you.
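Even without a database, a few standard shell tools can answer simple questions about a CSV catalog. The sketch below assumes a hypothetical layout: a header row, every field wrapped in double quotes, and the columns Composer, Work, Conductor, Ensemble in that order. The actual columns produced by scan-music may differ, so check the README before adapting it. Apart from the Mozart example, the sample rows are placeholders, and the function names count_by_composer and find_works are invented for illustration.

```sh
#!/bin/sh
# Hypothetical catalog layout: "Composer","Work","Conductor","Ensemble"
# The rows below are placeholders for illustration, not real catalog output.
catalog=$(mktemp)
trap 'rm -f "$catalog"' EXIT    # remove the temporary file when the script exits

cat > "$catalog" <<'EOF'
"Composer","Work","Conductor","Ensemble"
"Mozart","Symphony #38","Klemperer","Philharmonia"
"Beethoven","Symphony #5","Conductor A","Orchestra A"
"Beethoven","String Quartet #14","","Quartet B"
"Beethoven","Symphony #9","Conductor C","Orchestra C"
EOF

# Count entries per composer: skip the header, keep column 1, tally, sort by count.
# Splitting on the double quote puts the first CSV field in cut's field 2.
count_by_composer() {
    tail -n +2 "$1" | cut -d'"' -f2 | sort | uniq -c | sort -rn
}

# Print the works by composer $2 whose title contains $3 (case-insensitive).
find_works() {
    grep -i "^\"$2\"," "$1" | cut -d'"' -f4 | grep -i "$3"
}

count_by_composer "$catalog"
find_works "$catalog" "Beethoven" "symphony"
```

Splitting on the quote character instead of the comma keeps a field such as "Klemperer, Philharmonia" intact, but it only works when every field is quoted and none contains a double quote. For anything more elaborate, a spreadsheet or a real database is the better tool.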

And here’s an example of what the results might look like (I apologize for the small type):

[Image: sample of the resulting spreadsheet]

I wish you good success in your cataloging. Hopefully, I’ve done a lot of the heavy lifting for you.

Header image: portrait of Charles Gounod by Théobald Chartran, published in Vanity Fair, February 1, 1879. 
