Why the iTunes XML format is WRONG.

Monday, 30 November 2009 11:26 by MartinKirk

In this post, I’d like to go through the many ways Apple has fucked up the iTunes Library XML format.

Let’s first have a look at a sample:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Major Version</key><integer>1</integer>
    <key>Minor Version</key><integer>1</integer>
    <key>Application Version</key><string>9.0.2</string>
    <key>Features</key><integer>5</integer>
    <key>Show Content Ratings</key><true/>
    <key>Music Folder</key><string>file://localhost/J:/Music/iTunes/iTunes%20Music/</string>
    <key>Library Persistent ID</key><string>D3E174FD6A92E637</string>
    <key>Tracks</key>
    <dict>
        <key>4804</key>
        <dict>
            <key>Track ID</key><integer>4804</integer>
            <key>Name</key><string>Amen</string>
            <key>Artist</key><string>Astral Projection</string>
            <key>Album</key><string>Amen </string>
            <key>Kind</key><string>MPEG audio file</string>
            <key>Size</key><integer>8873315</integer>
            <key>Total Time</key><integer>362448</integer>
            <key>Track Number</key><integer>1</integer>
            <key>Year</key><integer>2002</integer>
            <key>Date Modified</key><date>2008-12-12T11:29:57Z</date>
            <key>Date Added</key><date>2008-12-12T11:57:48Z</date>
            <key>Bit Rate</key><integer>192</integer>
            <key>Sample Rate</key><integer>44100</integer>
            <key>Comments</key><string>psilocybin^2002</string>
            <key>Play Count</key><integer>8</integer>
            <key>Play Date</key><integer>3339144676</integer>
            <key>Play Date UTC</key><date>2009-10-23T11:11:16Z</date>
            <key>Rating</key><integer>80</integer>
            <key>Album Rating</key><integer>80</integer>
            <key>Album Rating Computed</key><true/>
            <key>Artwork Count</key><integer>1</integer>
            <key>Persistent ID</key><string>075B96F461F27CCD</string>
            <key>Track Type</key><string>File</string>
            <key>Location</key><string>file://localhost/J:/Music/iTunes/iTunes%20Music/Astral%20Projection/Amen/01%20Amen.mp3</string>
            <key>File Folder Count</key><integer>4</integer>
            <key>Library Folder Count</key><integer>1</integer>
        </dict>
        <key>4806</key>

The first thing we notice is that the whole library is double-wrapped: <plist><dict>…</dict></plist>. That seems odd when there is only one node inside the plist element.

Digging further, we notice that key-value pairs are expressed as two sibling elements following each other. This is well-formed XML, but it is fragile: nothing ties a value to its key except document order. It works, but only as long as the order is respected, and it may cause errors with any other XML reader/writer that doesn’t preserve the order of elements. The main problem here is the extreme amount of metadata being generated.
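To make the order dependence concrete, here is a minimal Python sketch using only the standard library. The helper name `pair_children` and the trimmed `SAMPLE` snippet are mine, for illustration only; this is not how iTunes itself reads the file.

```python
import xml.etree.ElementTree as ET

# A trimmed-down <dict> taken from the sample above.
SAMPLE = """<dict>
    <key>Track ID</key><integer>4804</integer>
    <key>Name</key><string>Amen</string>
    <key>Artist</key><string>Astral Projection</string>
</dict>"""


def pair_children(dict_elem):
    """Pair each <key> with the element that follows it, purely by position."""
    children = list(dict_elem)
    pairs = {}
    for key_elem, value_elem in zip(children[0::2], children[1::2]):
        if key_elem.tag != "key":
            raise ValueError(f"expected <key>, got <{key_elem.tag}>")
        pairs[key_elem.text] = value_elem.text
    return pairs


root = ET.fromstring(SAMPLE)
pairs = pair_children(root)

# The same children in reverse order: still well-formed XML,
# but the positional pairing no longer makes sense.
shuffled = ET.Element("dict")
shuffled.extend(list(root)[::-1])
try:
    pair_children(shuffled)
    reorder_detected = False
except ValueError:
    reorder_detected = True
```

The pairing only survives because the reader walks the children strictly in document order; any tool that reorders siblings breaks it.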

Thirdly, we notice that the data is organized in dictionaries. Apple might translate these into objects of some kind other than dictionaries, but I doubt it because of the naming scheme. Dictionaries are very fast at looking up a single object per key, but I wouldn’t use them for data that is mostly browsed (for-each’ing). I suspect that each <dict> is translated into an object, which may be the root of the performance issues seen in iTunes.
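For comparison, this is roughly what a generic plist reader does with the format. The sketch uses Python's standard `plistlib`; the shortened `SAMPLE` is my own cut-down of the listing above, and nothing here says anything about what iTunes does internally.

```python
import plistlib

# Cut-down version of the library file shown above (bytes, as plistlib expects).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Tracks</key>
    <dict>
        <key>4804</key>
        <dict>
            <key>Track ID</key><integer>4804</integer>
            <key>Name</key><string>Amen</string>
            <key>Play Date UTC</key><date>2009-10-23T11:11:16Z</date>
        </dict>
    </dict>
</dict>
</plist>"""

# Every <dict> becomes a Python dict, every <integer> an int, every <date> a datetime.
library = plistlib.loads(SAMPLE)

# "Browsing" the library means iterating over the values of the Tracks dictionary.
names = [track["Name"] for track in library["Tracks"].values()]
```

Note that the track dictionary is keyed by the track ID as a string ("4804"), while the Track ID inside it is an integer, so the same value is stored twice in two different types.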

Fourthly, we see that the files are located using a file://localhost scheme. Why? I suspect the reason is to make it easier for network-connected listeners to gain access to the file, requiring only that localhost be replaced with the host name. Still, I don’t understand why each file is located using a full path when the music is stored in a subfolder. Surely a relative path would be better, since it would make it easy to copy the whole iTunes folder.
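Since the Music Folder is already stored at the top of the file, the relative path can be derived from what is there. A small sketch, assuming Python's `urllib.parse`; the function name `relative_location` is hypothetical:

```python
from urllib.parse import urlsplit, unquote

MUSIC_FOLDER = "file://localhost/J:/Music/iTunes/iTunes%20Music/"
LOCATION = (
    "file://localhost/J:/Music/iTunes/iTunes%20Music/"
    "Astral%20Projection/Amen/01%20Amen.mp3"
)


def relative_location(location, music_folder):
    """Return the track path relative to the music folder, or None if it lies outside."""
    loc_path = urlsplit(location).path
    base_path = urlsplit(music_folder).path
    if not loc_path.startswith(base_path):
        return None
    # Strip the shared prefix and decode %20 and friends.
    return unquote(loc_path[len(base_path):])


host = urlsplit(LOCATION).netloc
rel = relative_location(LOCATION, MUSIC_FOLDER)
print(host, rel)
```

Tracks stored outside the music folder come back as None here, so a real format would still need a way to express absolute paths for those.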

Fifthly, this structure creates EXTREMELY HUGE files! I counted 260,000 lines in my library! o_O

 

Let’s optimize the XML to 2009 best practice:

Rules:

  1. Elements that occur only once within a node should become attributes of the parent
  2. Each branch should be as thick as possible (lose the metadata)

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0"
       MajorVersion="1" MinorVersion="1" ApplicationVersion="9.0.2"
       Features="5" ShowContentRatings="true" MusicFolder="J:/Music/iTunes/iTunes%20Music/"
       LibraryPersistentID="D3E174FD6A92E637">
  <tracks>
    <song TrackID="4804" Name="Amen" Artist="Astral Projection"
          Kind="MP3" Size="8873315" TotalTime="362448" TrackNumber="1"
          Year="2002" DateModified="2008-12-12T11:29:57Z" DateAdded="2008-12-12T11:57:48Z"
          BitRate="192" SampleRate="44100" PlayCount="8" PlayDate="3339144676"
          PlayDateUTC="2009-10-23T11:11:16Z" Rating="80" AlbumRating="80" AlbumRatingComputed="true"
          ArtworkCount="1" PersistentID="075B96F461F27CCD" TrackType="File">
      <Path type="Rel">Astral%20Projection/Amen/01%20Amen.mp3</Path>
      <Comments>psilocybin^2002</Comments>
    </song>
  </tracks>
  <Playlists>
    <Playlist Name="Astral Projection">
      <song>4804</song>
    </Playlist>
  </Playlists>
</plist>

Path and Comments are kept as elements rather than attributes because of their content. The rest is converted to attributes.
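The conversion itself is mechanical. Here is a rough Python sketch of it, assuming a track has already been read into a plain dict; the function name `track_to_song` and the `relative_path` argument are made up for this example.

```python
import xml.etree.ElementTree as ET

# A few fields from the sample track, as a plist reader might hand them over.
TRACK = {
    "Track ID": 4804,
    "Name": "Amen",
    "Artist": "Astral Projection",
    "Album Rating Computed": True,
    "Comments": "psilocybin^2002",
    "Location": "file://localhost/J:/Music/iTunes/iTunes%20Music/Astral%20Projection/Amen/01%20Amen.mp3",
}


def track_to_song(track, relative_path):
    """Build a <song> element: Path and Comments as children, the rest as attributes."""
    song = ET.Element("song")
    for key, value in track.items():
        if key == "Location":
            continue  # replaced by the relative <Path> element below
        name = key.replace(" ", "")  # "Track ID" -> "TrackID"
        if key == "Comments":
            ET.SubElement(song, name).text = str(value)
        elif isinstance(value, bool):
            song.set(name, "true" if value else "false")
        else:
            song.set(name, str(value))
    path = ET.SubElement(song, "Path", {"type": "Rel"})
    path.text = relative_path
    return song


song = track_to_song(TRACK, "Astral%20Projection/Amen/01%20Amen.mp3")
print(ET.tostring(song, encoding="unicode"))
```

The boolean check has to come before the generic branch, because otherwise `True` would be written as "True" rather than the lowercase "true" used in the listing.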

The length of the file is reduced to approx. 36% (260,000 -> 93,000 lines), if not less. It shrinks even further if the song element isn’t formatted with newline characters: losing them results in a library of 10,000-20,000 lines, or 4 lines per song.

The amount of data is reduced to approx. 44% (14,178 KB -> 6,233 KB), which would make loading and saving faster!
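The two percentages follow directly from the numbers quoted for my library, as this quick Python check shows (the variable names are just for this example):

```python
# Figures quoted above for the original and the restructured library.
lines_before, lines_after = 260_000, 93_000
kb_before, kb_after = 14_178, 6_233

line_ratio = lines_after / lines_before
size_ratio = kb_after / kb_before

print(f"lines: {line_ratio:.1%} of original")
print(f"size:  {size_ratio:.1%} of original")
```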

The library would also be much safer and easier to back up, copy, or move if needed, because the file locations are relative (or real absolutes).
