Part 1A: Cataloging My Music

Well, that was fun! I just finished capturing all my vinyl recordings using the dictate functionality in Microsoft Word; what could be better on a cloudy, windy day in the Motor City? This is merely the first step in capturing all the music I have accumulated over the last 25 to 30 years; unfortunately, some old albums from my youth are long gone 🙁

I’m following this template for turning the recordings into a dataset I can use for analysis and visualization:

  • Step 1: Capture the recordings using the dictate functionality in MS Word
  • Step 2: Copy and paste the data into MS Excel for initial editing, then save it as a .csv file
  • Step 3: Use Exploratory for data wrangling – capitalizing names and titles, correcting misspellings, and adding more information (sub-genres, release dates?, etc.)
  • Step 4: Visualize the data using Flourish and other tools

Along the way I’ll blog on this site about what I’m finding, starting right now. The first thing to note is the unintentionally hilarious voice recognition fails, especially with foreign (i.e., non-English) artist names and titles. Having said that, I’m estimating a solid 90% success rate when the names are not too challenging.

Here’s a view of the workload for today – more than 130 albums:

[Image: 130+ albums to capture using voice recognition]

I’ll start with some examples where the voice rec did a perfect job of capturing my dictation. Note that I didn’t expect it to properly capitalize items; that’s an easy fix I can implement in Exploratory. Here are 3 good examples:

  • Jimmy Smith, hobo flats, verve, jazz, vinyl
  • Sonny Rollins, on impulse, impulse, jazz, vinyl
  • Traffic, the low spark of high heeled boys, island records, rock, vinyl

The format here is the artist (individual or group), the album title, the record label, the genre (simplified for now), and the format (vinyl or CD). In each of these cases, voice rec worked great, and even capitalized the artist names. The only work for me will be to capitalize the album name and the record labels.
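The same five-field layout could also be handled in a few lines of Python. This is only a sketch, not the Excel and Exploratory workflow described in this post; the field names and the helper functions `parse_line`, `capitalize_words`, and `clean_record` are made up for illustration. It splits each dictated line on commas, capitalizes the artist, title, and label, and writes the result out as CSV text.

```python
import csv
import io

# Hypothetical column names for the five comma-separated fields.
FIELDS = ["artist", "title", "label", "genre", "format"]


def parse_line(line):
    """Split one dictated line into the five expected fields."""
    parts = [part.strip() for part in line.split(",")]
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}: {line!r}")
    return dict(zip(FIELDS, parts))


def capitalize_words(text):
    """Uppercase the first letter of each word and leave the rest untouched."""
    return " ".join(word[:1].upper() + word[1:] for word in text.split(" "))


def clean_record(record):
    """Return a copy of the record with artist, title, and label capitalized."""
    cleaned = dict(record)
    for key in ("artist", "title", "label"):
        cleaned[key] = capitalize_words(cleaned[key])
    return cleaned


dictated = [
    "Jimmy Smith, hobo flats, verve, jazz, vinyl",
    "Sonny Rollins, on impulse, impulse, jazz, vinyl",
    "Traffic, the low spark of high heeled boys, island records, rock, vinyl",
]

records = [clean_record(parse_line(line)) for line in dictated]

# Write to an in-memory buffer; swap in open("albums.csv", "w", newline="")
# to produce a real file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(records)
print(buffer.getvalue())
```

One caveat: this capitalizes every word, so small words like "of" come out as "Of", and a title that itself contains a comma would trip the five-field check and need fixing by hand.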

Now let’s look at 3 examples where voice rec almost got us there:

  • yusef Latif, Angel eyes, arista, jazz, vinyl
  • milked Jackson, and the Thelonious Monk quintet, blue note, jazz, vinyl
  • sunny fortune, awakening, A&M records, jazz, vinyl

A bit of trouble with names here – we should have Yusef Lateef, Milt Jackson, and Sonny Fortune; close but no cigar.
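Near misses like these are good candidates for fuzzy matching. Here is a rough Python sketch using `difflib.get_close_matches` from the standard library; it assumes a hand-maintained list of correct artist names (the `KNOWN_ARTISTS` list and the `suggest_artist` helper are hypothetical), and the 0.6 cutoff is simply the library default rather than a tuned value.

```python
import difflib

# Hypothetical reference list; in practice this would be built up by hand.
KNOWN_ARTISTS = ["Yusef Lateef", "Milt Jackson", "Sonny Fortune"]


def suggest_artist(dictated_name, known=KNOWN_ARTISTS, cutoff=0.6):
    """Return the closest known artist name, or None if nothing is similar enough."""
    # Compare in lowercase so capitalization differences don't count as errors.
    by_lowercase = {name.lower(): name for name in known}
    matches = difflib.get_close_matches(
        dictated_name.lower(), list(by_lowercase), n=1, cutoff=cutoff
    )
    return by_lowercase[matches[0]] if matches else None


for heard in ["yusef Latif", "milked Jackson", "sunny fortune"]:
    print(heard, "->", suggest_artist(heard))
```

A suggestion like this is best treated as a prompt for manual review rather than an automatic fix, since a badly mangled name could land on the wrong artist or on no match at all.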

Then we get to the unintentionally humorous interpretations, especially with non-English names:

  • Karl Bohm and the Konzertvereinigung Wiener Staatsopernchor becomes the hilarious Carl bomb concert varina gun Finder stats upper and core 🙂 The voice rec was clearly trying to approximate the name with recognizable English words.
  • Another attempt at Yusef Lateef missed the mark – he became usef life chief 🙂
  • Karl Munchinger evolved to Carl moon chinger 🙂
  • Kiril Kondrashin became Curol congressmen 🙂
  • My absolute favorite is the translation of the admittedly challenging Symphonie-Orchester Des Bayerischen Rundfunks to Symphony or caster they buy a Russian round funks 🙂 🙂 🙂

Warts and all, this approach saves me many hours that I can devote to the analysis and visualization portions of this exercise. Next, I need to repeat this with my CD collection (another day), which has perhaps more than 250 titles versus the 130+ LPs.

That’s it for now – thanks for reading!