Removing duplicate tracks from iTunes with Ruby and RBOSA

When I put a new hard drive in my computer, I decided to reinstall the operating system and install applications and data from scratch. Unfortunately, I had a small mishap and accidentally imported two copies of my iTunes library. Removing the duplicates by hand would have been possible, but tedious. Mercifully, I stumbled onto RBOSA (RubyOSA), so I was able to let the computer do it.

RBOSA is basically AppleScript for people who never got around to learning AppleScript. Its interface to applications like iTunes is very simple, so it didn’t take much work to get a duplicate finder up and running.

The strategy I used was to look at the songs in the main library and put all duplicates into a new playlist. (The method I used for finding the “main library” looks kind of suspect, but it worked. Use caution if you try this at home.) Once the duplicates were in the playlist, I was able to check them over to make sure that they really were dups and then delete them.

Now, if you’re playing the home game and you know the secret trick for finding and deleting large groups of duplicates (around 8,500 tracks in this case) without busting out the programming: please tell me. I’m pretty sure that I’ll need to do this again at some point, and I’m all about doing things the easy way.

The script follows. I used Ruby 1.8.6 and RubyOSA 0.3.0.1 (installed via gem).

require 'rubygems'
require 'rbosa'

itunes = OSA.app 'iTunes'

dups = itunes.make OSA::ITunes::Playlist
dups.name = 'Duplicate Tracks'

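# Two tracks count as duplicates when these five fields all match.
# eql? and hash are overridden so that tracks can be used as Hash keys.
class OSA::ITunes::Track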
  def eql?(o)
    artist == o.artist &&
      album == o.album &&
      track_number == o.track_number &&
      name == o.name &&
      time == o.time
  end
  def hash
    to_s.hash
  end
  def to_s
    "#{artist}/#{album}/#{track_number}/#{name}/#{time}"
  end
end

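# Group tracks that compare equal. sources[0].playlists[0] is assumed to be
# the main library; check that on your own machine before trusting it.
seen = Hash.new
itunes.sources[0].playlists[0].tracks.each do |track|
  seen[track] ||= Array.new
  seen[track] << track
end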

seen.values.each do |tracks|
  if 1 < tracks.length
    # Keep the copy with the highest bit rate (first after sorting) and put
    # the rest in the 'Duplicate Tracks' playlist for review.
    tracks = tracks.sort { |a,b| b.bit_rate <=> a.bit_rate }
    rest = tracks[1..-1]
    rest.each { |t| t.duplicate dups }
  end
end
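
For anyone who wants to try the grouping logic without pointing a script at a real iTunes library, here is a minimal sketch on plain Ruby objects. The Track struct and the extra_copies method are hypothetical names made up for illustration and are not part of RubyOSA. The sketch also assumes a newer Ruby than 1.8.6, since it relies on group_by and flat_map.

```ruby
# Hypothetical stand-in for an iTunes track. The fields mirror the properties
# the script above reads, but this Struct is not part of RubyOSA.
Track = Struct.new(:artist, :album, :track_number, :name, :time, :bit_rate)

# Group tracks by the same five fields the script compares, then return every
# copy except the one with the highest bit rate in each group.
def extra_copies(tracks)
  groups = tracks.group_by do |t|
    [t.artist, t.album, t.track_number, t.name, t.time]
  end
  groups.values.flat_map do |copies|
    # Highest bit rate first; everything after the first is a duplicate.
    copies.sort_by { |t| -t.bit_rate }.drop(1)
  end
end

# Made-up sample data: two copies of one recording plus a variant recording
# with a different running time.
low     = Track.new('Artist', 'Album', 1, 'Song', '2:45', 128)
high    = Track.new('Artist', 'Album', 1, 'Song', '2:45', 192)
variant = Track.new('Artist', 'Album', 1, 'Song', '3:10', 128)

duplicates = extra_copies([low, high, variant])
puts duplicates.length
```

Because the running time is part of the key, the variant recording is left alone and only the lower bit rate copy is flagged.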

5 Responses to “Removing duplicate tracks from iTunes with Ruby and RBOSA”

  1. clynne Says:

    I would have sworn that I had seen a “remove duplicates” function in iTunes. … aha, OK, I think what I saw was the View option “View Duplicates.”

  2. cp Says:

    Yeah, “View Duplicates” is definitely there. If there were only a few dozen, that would definitely get the job done.

  3. clynne Says:

    Oh, and “View Duplicates” is sort of useless, as it appears to be going off only title and artist. I have two versions of, for example, George Clinton’s “Atomic Dog.” One clocks in at 2:45, the other at 3:xx. Clearly they’re variant recordings, but iTunes considers them both duplicates.

  4. cp Says:

    Yeah, that always bugged me. Worse is that neither iTunes’ “View Duplicates” nor the one that I wrote does anything smart like checking for track or album name similarities with short edit distances or matching soundex codes or whatever. I’ve had a few cases where both Stephanie and I imported the same album and they ended up with slightly different names — different punctuation or capitalization, etc — so they don’t show up as “duplicates.”

    Hmm. None of this would be hard to do. Perhaps I should improve upon this script tonight.
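
The looser matching described in this comment could start out something like the sketch below. It is only an illustration under stated assumptions: normalize and edit_distance are hypothetical helper names, not part of the script above or of RubyOSA, and the normalization only understands ASCII letters and digits. It assumes a newer Ruby than 1.8.6.

```ruby
# Hypothetical helper: reduce a name to lowercase ASCII letters and digits so
# that differences in capitalization, punctuation, and spacing are ignored.
def normalize(text)
  text.to_s.downcase.gsub(/[^a-z0-9]+/, '')
end

# Hypothetical helper: Levenshtein edit distance, computed row by row with
# the usual dynamic-programming recurrence.
def edit_distance(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      # Deletion, insertion, or substitution, whichever is cheapest.
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Names that differ only in punctuation and capitalization normalize alike.
same = normalize('Greatest Hits, Vol. 1') == normalize('greatest hits vol 1')
puts same
```

Swapping the normalized strings into the comparison would catch the punctuation and capitalization cases; a small edit_distance threshold could catch near misses, at the cost of some false positives that would need checking by hand.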

  5. clynne Says:

    “I’ve had a few cases where both Stephanie and I imported the same album and they ended up with slightly different names — different punctuation or capitalization, etc — so they don’t show up as ‘duplicates.’”

    Totally lame.
