Music in the cloud, a new kind of cloud service, has become very popular over the last couple of years.
It makes sense: it frees valuable storage space on smartphones, it makes sharing music with family and friends easier and, most importantly, it’s much cheaper than regular cloud storage. In some cases, as with Google Music, it’s free.
Theoretically, when you pay for this kind of service, you’re buying extremely cheap cloud storage space. In the case of Google, you’re even getting it for free. Let’s see:
Service | Songs | Maximum File Size | Storage Space | Price |
---|---|---|---|---|
Apple | 25'000 | 200 MB | 4.7 TB | $25 |
Google | 20'000 | 300 MB | 5.7 TB | Free |
Amazon | 250'000 | 100 MB | 19 TB | $25 |
Of course, this makes sense for service providers because in most cases, people use these services to store music in the form of MP3 files. If a file is already contained in the service provider’s catalog, there’s no need to even upload it from the client’s computer. And one single copy of this MP3 can serve millions of happy customers. In the end, only a small fraction of all that theoretical space is actually occupied.
What I would like to show you here is how we can teach our music-in-the-cloud services a new trick: accepting not only MP3 files but any kind of file.
I know what you’re probably thinking: “Let’s take any file and change the extension to MP3”
Genius, right?… wrong… In fact, these services are a little smarter than that. They all expect you to upload music. So we need to do some work on our files before we can upload them to the cloud. Here’s how it really works:
An MP3 file has a very specific structure. It is composed of several frames, each of them preceded by a header. The header contains information about the MPEG version, the bit rate, the sampling frequency and some other meaningful information. Metadata is optionally added to the file by means of ID3 tags.
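To make the header concrete, here is a minimal Python 3 sketch that decodes one 4-byte frame header. The bytes `ff fb 92 64` are the ones the encoder script in this post writes in front of every frame; the function name `parse_header` and the two lookup tables are hypothetical helpers for illustration, and they only cover MPEG-1 Layer III.

```python
# Lookup tables for MPEG-1 Layer III only (index taken from the header bits).
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]
CHANNEL_MODES = ["stereo", "joint stereo", "dual channel", "mono"]


def parse_header(header):
    """Decode a 4-byte MPEG-1 Layer III frame header into a dict."""
    if len(header) != 4:
        raise ValueError("a frame header is exactly 4 bytes")
    word = int.from_bytes(header, "big")

    # Bits 31..21: frame sync, eleven bits all set to 1.
    if word >> 21 != 0x7FF:
        raise ValueError("missing frame sync")
    # Bits 20..19: version (11 = MPEG-1), bits 18..17: layer (01 = Layer III).
    if (word >> 19) & 0x3 != 0b11 or (word >> 17) & 0x3 != 0b01:
        raise ValueError("this sketch only handles MPEG-1 Layer III")

    bitrate = BITRATES_KBPS[(word >> 12) & 0xF]      # bits 15..12
    sample_rate = SAMPLE_RATES_HZ[(word >> 10) & 0x3]  # bits 11..10
    if bitrate is None or sample_rate is None:
        raise ValueError("free-format or reserved value, not handled here")
    padding = (word >> 9) & 0x1                      # bit 9

    # Layer III frame length in bytes, header included.
    frame_length = 144 * bitrate * 1000 // sample_rate + padding

    return {
        "bitrate_kbps": bitrate,
        "sample_rate_hz": sample_rate,
        "padding": padding,
        "channel_mode": CHANNEL_MODES[(word >> 6) & 0x3],  # bits 7..6
        "frame_length": frame_length,
    }


if __name__ == "__main__":
    print(parse_header(b"\xff\xfb\x92\x64"))
```

With these bytes the header describes a 128 kbps, 44.1 kHz frame with the padding bit set, which gives a frame length of 418 bytes: 4 bytes of header plus 414 bytes of payload.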
I wrote a Python script that wraps any file in an MP3 disguise. It does this by cutting the data into chunks of a very specific size, adding the necessary headers, putting the frames together and, finally, adding an ID3 tag to store information about the original binary file. If the file is too large, the script cuts it into pieces and then wraps each piece as an MP3 file. Each piece is marked using the track information contained in the ID3 tag.
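The framing step on its own can be sketched in a few lines. This is a simplified in-memory illustration in Python 3, not the script itself: it leaves out the ID3 tag and the splitting of large files, the names `wrap` and `unwrap` are hypothetical, and it always pads the final block, which is an assumption of this sketch.

```python
HEADER = b"\xff\xfb\x92\x64"    # the four header bytes used for every frame
PAYLOAD = 414                   # data bytes carried by each frame
FRAME = len(HEADER) + PAYLOAD   # 418 bytes per frame


def wrap(data):
    """Cut data into 414-byte blocks and prefix each one with HEADER."""
    frames = []
    for start in range(0, len(data), PAYLOAD):
        block = data[start:start + PAYLOAD]
        # Pad a short final block with 0xff so every frame is 418 bytes.
        block += b"\xff" * (PAYLOAD - len(block))
        frames.append(HEADER + block)
    return b"".join(frames)


def unwrap(framed, original_size):
    """Strip the header from every frame, then cut off the padding."""
    payload = b"".join(
        framed[i + len(HEADER):i + FRAME]
        for i in range(0, len(framed), FRAME)
    )
    # The original size has to be stored somewhere (the script keeps it
    # in the tag) because the padding cannot be told apart from data.
    return payload[:original_size]


if __name__ == "__main__":
    original = bytes(range(256)) * 5
    framed = wrap(original)
    print(len(original), len(framed))
    print(unwrap(framed, len(original)) == original)
```

The original size is passed to `unwrap` explicitly because padding bytes look like ordinary data once the file is framed.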
By default, the script marks the files with the artist name “Fake MP3 Encoder” and the album name “My Data in the Music Cloud”. Both values can be changed directly from the command line. Album artwork is also added to the file to make it easier to identify the album containing your wrapped files in iTunes, Google Music or Amazon Cloud Player.
So this is my wrapper:
```python
#!/usr/bin/env python
"""
Encoder.py

Creates a fake MP3 by encapsulating an arbitrary binary file using MP3 file structure

Created by Abraham Rubinstein on 2012-12-23.
Copyright (c) 2012 Abraham Rubinstein. All rights reserved.
"""

# file ops
import sys, getopt, os, argparse, io

# mp3 tagging
from mutagen.mp3 import MP3
from mutagen.easyid3 import EasyID3
import mutagen.id3, math
from mutagen.id3 import ID3, APIC # for album art

MAX_FILE_SIZE = 195000000
DEFAULT_ARTIST = 'Fake MP3 Encoder'
DEFAULT_ALBUM = 'My Data in the Music Cloud'

class Startup:
    def __init__(self):
        parser = argparse.ArgumentParser()
        parser.add_argument('infile', help="File to encapsulate", action="store")
        parser.add_argument("-o", "--outfile", dest="output_filename", help="MP3 output file", action="store")
        parser.add_argument("-a", "--artist", dest="artist", help="Specify artist metadata", action="store")
        parser.add_argument("-l", "--album", dest="album", help="Specify album metadata", action="store")
        args = parser.parse_args()

        if not args.artist:
            self.artist = DEFAULT_ARTIST
        else:
            self.artist = args.artist

        if not args.album:
            self.album = DEFAULT_ALBUM
        else:
            self.album = args.album

        self.fullpath=args.infile

        if os.path.isdir(self.fullpath):
            print '\nERROR: "' + self.fullpath + '" is not a file but a directory.\n'
            print 'You can zip the directory and then encode it with this tool\n'
            sys.exit()

        if not args.output_filename:
            # No output name given. Use original infile name plus mp3 extension
            self.filePlusExt=os.path.basename(self.fullpath)
            self.filename = os.path.splitext(self.filePlusExt)[0]+'.mp3'
        else:
            # Output name given. Add mp3 extension if not present
            if not "mp3" in args.output_filename.lower():
                self.filename = args.output_filename+".mp3"
            else:
                self.filename = args.output_filename

        try:
            self.filesize = os.path.getsize(self.fullpath)
        except OSError:
            print '\nERROR: File "'+self.fullpath+'" does not seem to exist.\n'
            sys.exit()

class Encoder:
    def __init__(self, infile, artist, album):
        self.header = '\xff\xfb\x92d'
        self.numBytesPadding = 0
        self.filesize = 0
        self.infile = infile
        self.artist = artist
        self.album = album

    def process_chunk(self, wf, outfile):
        try:
            with wf:
                ofile = open(outfile,'wb')
                # Determine file size and read 1st block of data
                self.filesize = len(wf.read())
                wf.seek(0)
                bloc = wf.read(414)
                # Iterate over whole file. Read by blocks of 414 bytes and
                # compose a frame by adding the header at each iteration
                while bloc:
                    # compose MP3 frame
                    frame = self.header+bloc
                    ofile.write(frame)
                    bloc = wf.read(414)
                    # Are we reading the last block? Add padding to
                    # complete the 418 bytes for the frame
                    if bloc and len(bloc) < 414:
                        self.numBytesPadding = 414-len(bloc)
                        bloc = bloc+'\xff'*(414-len(bloc))
                        padding = True
                ofile.close()
                wf.close()
        except IOError:
            print '\nERROR: File "'+self.infile+'" does not seem to exist.\n'
            sys.exit()

    def tag_mp3(self, outfile, i=None):
        """Tag MP3 file with infile name"""
        audiofile = MP3(outfile, ID3=EasyID3)
        audiofile.add_tags(ID3=EasyID3)
        audiofile['artist'] = self.artist
        audiofile['album'] = self.album
        # The original size is stored in the composer field for the decoder
        audiofile['composer'] = str(self.filesize)
        audiofile['title'] = os.path.basename(self.infile)
        if i:
            audiofile['tracknumber'] = str(i)
        audiofile.save()

        # Add cloud album art
        audiofile = MP3(outfile, ID3=ID3)
        audiofile.tags.add(
            APIC(
                encoding=3, # 3 is for utf-8
                mime='image/png', # image/jpeg or image/png
                type=3, # 3 is for the cover image
                desc=u'Cover',
                data=open('cloud_folder.png', 'rb').read()
            )
        )
        audiofile.save()

def main():
    args = Startup()
    wf = open(args.fullpath, "rb")
    fakeMP3 = Encoder(args.fullpath, args.artist, args.album)

    if args.filesize > MAX_FILE_SIZE :
        num_of_chunks = int(math.ceil(float(args.filesize)/float(MAX_FILE_SIZE)))
        print '\nWARNING: File size too big. The file will be fragmented into '+str(num_of_chunks)+ ' MP3 files.\n'
        print 'The MP3 files will be numbered. However, some cloud services change the file names. The track metadata will indicate the order of the fragments'
        chunk_size = int(round(args.filesize/num_of_chunks))+1
        for i in range(1,num_of_chunks+1):
            print 'Processing file '+str(i)+'/'+str(num_of_chunks)
            onlyname, fileExtension = os.path.splitext(args.fullpath)
            chunk_name = onlyname+str(i)+'.mp3'
            chunk_data = io.BytesIO(wf.read(chunk_size))
            fakeMP3.process_chunk(chunk_data, chunk_name)
            fakeMP3.tag_mp3(chunk_name, i)
    else:
        fakeMP3.process_chunk(wf,args.filename)
        fakeMP3.tag_mp3(args.filename)

    print '\nSuccess! You may now import your MP3 file(s) into your cloud service.\n'

if __name__ == "__main__":
    main()
```
Of course, a second Python script removes the disguise and yields the original data. Here’s what it looks like:
```python
#!/usr/bin/env python
"""
decoder.py

decodes a fake MP3 by extracting a data file encapsulated into a fake MP3 file structure

Created by Abraham Rubinstein on 2012-12-23.
Copyright (c) 2012 Abraham Rubinstein. All rights reserved.
"""

# file ops
import sys, getopt, os, os.path, argparse, io

# mp3 tagging
from mutagen.mp3 import MP3
from mutagen.easyid3 import EasyID3
import mutagen.id3, re

class Decoder:
    def __init__(self):
        self.header = '\xff\xfb\x92d'

    def getFileNameFromTag(self, MP3FileName):
        # The original file name is stored in the title field
        audiofile = MP3(MP3FileName, ID3=EasyID3)
        return audiofile['title'][0]

    def getFileSizeFromTag(self, MP3FileName):
        # The original size is stored in the composer field
        audiofile = MP3(MP3FileName, ID3=EasyID3)
        return audiofile['composer'][0]

    def removeIdTag(self, file):
        # Skip everything that precedes the first frame header
        s = file.read()
        return s[s.find(self.header):]

    def processFile(self, buffer, filesize):
        i = 0
        bloc = 1
        newf = io.BytesIO()
        # Walk the buffer frame by frame and drop each 4-byte header
        while bloc:
            frame = buffer[i:i+418]
            i = i+418
            bloc = frame[4:]
            newf.write(bloc)
        # Cut off the padding using the size stored in the tag
        newf.seek(int(filesize))
        newf.truncate()
        newf.seek(0)
        return newf.read()

def natural_sort(l):
    convert = lambda text: int(text) if text.isdigit() else text.lower()
    alphanum_key = lambda key: [ convert(c) for c in re.split('([0-9]+)', key) ]
    return sorted(l, key = alphanum_key)

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('infile', help="File to decapsulate", action="store", nargs='+')
    parser.add_argument("-o", "--outfile", dest="output_filename", help="Override original file name", action="store")
    parser.add_argument("-d", "--delete", dest="delete_mp3", help="Delete MP3 file(s)", action="store_true")
    args = parser.parse_args()

    infiles = natural_sort(args.infile)
    outfile = args.output_filename
    decoder = Decoder()

    for file in infiles:
        if not os.path.exists(file):
            print 'ERROR: File "'+ file +'" does not seem to exist.'
            sys.exit()

    if not outfile:
        outfile = decoder.getFileNameFromTag(infiles[0])

    if os.path.exists(outfile):
        print 'The file that you are processing will decapsulate to "'+outfile+'"'
        print 'That file already exists.'
        ans = raw_input("Would you like to (a)bort, (o)verwrite or (r)ename file? (o) ")
        if ans == 'a':
            print "Aborting..."
            sys.exit()
        elif ans == 'r':
            outfile = raw_input("Please enter a new name for the file: ")
        else:
            print 'Proceeding... "'+outfile+'" will be overwritten'
            pass

    bigfile = open(outfile, 'wb')
    for file in infiles:
        print 'Processing '+file
        filesize=decoder.getFileSizeFromTag(file)
        f = open(file,'rb')
        buffer = f.read()
        buffer = buffer[buffer.find('\xff\xfb\x92d'):]
        result = decoder.processFile(buffer, filesize)
        bigfile.write(result)
    bigfile.close()

    if args.delete_mp3:
        for file in infiles:
            os.remove(file)

if __name__ == "__main__":
    main()
```
If you want to try this, you will also need to download this image and save it to the same directory where you put both scripts. Do not change the file name.
So let’s give it a try. First, we need to install the Python package responsible for the ID tagging services of the script: mutagen.
```
arubinst$ sudo easy_install mutagen
```
OK, now we’re ready to go. For this test, I will upload a 790 MB Ubuntu installer ISO file. The encoder script gives me all these options:
```
arubinst$ ./encoder.py -h
usage: encoder.py [-h] [-o OUTPUT_FILENAME] [-a ARTIST] [-l ALBUM] infile

positional arguments:
  infile                File to encapsulate

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT_FILENAME, --outfile OUTPUT_FILENAME
                        MP3 output file
  -a ARTIST, --artist ARTIST
                        Specify artist metadata
  -l ALBUM, --album ALBUM
                        Specify album metadata
```
But I will not use any options and rather go with the default values:
```
arubinst$ ./encoder.py ubuntu-12.10-desktop-i386.iso

WARNING: File size too big. The file will be fragmented into 5 MP3 files.

The MP3 files will be numbered. However, some cloud services change the file names. The track metadata will indicate the order of the fragments
Processing file 1/5
Processing file 2/5
Processing file 3/5
Processing file 4/5
Processing file 5/5

Success! You may now import your MP3 file(s) into your cloud service.
```
I now find 5 new MP3 files in my working directory:
You will notice that the total size of all the MP3 files together is a bit bigger than the size of the original file. This is to be expected, since the encapsulation (wrapping) process adds some overhead. There’s also some file padding involved.
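The framing part of that overhead can be estimated with a little arithmetic. The following Python 3 sketch reuses the constants from the encoder script (195000000-byte limit, 414 data bytes per 418-byte frame) and mirrors its splitting rule; the helper names `chunk_sizes` and `framed_size` are hypothetical, and the estimate deliberately ignores the ID3 tag and the embedded artwork, so the real files come out slightly larger.

```python
import math

MAX_FILE_SIZE = 195000000   # per-file limit used by the encoder script
PAYLOAD = 414               # data bytes per frame
FRAME = 418                 # header + data bytes per frame


def chunk_sizes(filesize):
    """Return the number of data bytes that go into each MP3 fragment."""
    if filesize <= MAX_FILE_SIZE:
        return [filesize]
    num_chunks = math.ceil(filesize / MAX_FILE_SIZE)
    # Same rule as the script: integer division, plus one byte.
    size = filesize // num_chunks + 1
    sizes = []
    remaining = filesize
    while remaining > 0:
        take = min(size, remaining)
        sizes.append(take)
        remaining -= take
    return sizes


def framed_size(nbytes):
    """Size of nbytes once cut into frames, the last frame fully padded."""
    frames = (nbytes + PAYLOAD - 1) // PAYLOAD
    return frames * FRAME


if __name__ == "__main__":
    iso = 789884928   # size of the ISO used in this test
    sizes = chunk_sizes(iso)
    total = sum(framed_size(s) for s in sizes)
    print("fragments:", len(sizes))
    print("framed bytes:", total)
    print("overhead: %.2f%%" % (100.0 * (total - iso) / iso))
```

For the ISO used here this gives five fragments and a framing overhead of just under 1%, since every 414 bytes of data cost 4 extra header bytes.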
I’m now going to upload the MP3 files to the Apple iTunes Match service, Google Music Player and Amazon Cloud Player.
Let’s start with Amazon. Amazon uses a little piece of software called the Amazon Music Importer. After pointing the importer to the specific directory containing my 5 fake MP3 files, the upload process begins:
After a long wait, my files are uploaded and displayed online as music in the Amazon Cloud Player. I delete the originals from my hard drive. This is what the online album looks like:
and here’s the list of “songs”:
Downloading is much faster… after a little while, I have all the files once again. I will now proceed to remove the wrappers and stitch them all together again, all with one single command:
```
arubinst$ ./decoder.py *.mp3
Processing ubuntu-12.10-desktop-i3861.mp3
Processing ubuntu-12.10-desktop-i3862.mp3
Processing ubuntu-12.10-desktop-i3863.mp3
Processing ubuntu-12.10-desktop-i3864.mp3
Processing ubuntu-12.10-desktop-i3865.mp3
```
and my ISO file is back:
```
arubinst$ ls -l *.iso
-rw-r--r-- 1 arubinst staff 789884928 8 jan 21:29 ubuntu-12.10-desktop-i386.iso
```
I now import the files to iTunes and manually select an iCloud update:
The only way to interact with Apple’s service is through iTunes. This is the upload/download hub for your music. I wait a little while (less than I had to wait with Amazon) and finally find my files uploaded to the cloud. I can now delete the local versions. This is what iTunes looks like when you have deleted your local files. The cloud buttons on the right of each song allow you to download local copies again. That’s what I’m going to do now.
iTunes likes to rename files by putting the track numbers in front of the name. This is no problem for the decoding script:
```
arubinst$ ./decoder.py *.mp3
Processing 01 ubuntu-12.10-desktop-i386.iso.mp3
Processing 02 ubuntu-12.10-desktop-i386.iso.mp3
Processing 03 ubuntu-12.10-desktop-i386.iso.mp3
Processing 04 ubuntu-12.10-desktop-i386.iso.mp3
Processing 05 ubuntu-12.10-desktop-i386.iso.mp3
```
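Renamed files are handled because the decoder orders its inputs with a natural sort on the file names, so the numbers inside a name are compared as numbers rather than character by character. Here is a small Python 3 sketch of the same idea; `natural_key` is a hypothetical name and the sample file names are made up for the illustration.

```python
import re


def natural_key(name):
    """Split a name into text and integer parts for numeric-aware sorting."""
    return [
        int(part) if part.isdigit() else part.lower()
        for part in re.split(r"([0-9]+)", name)
    ]


if __name__ == "__main__":
    names = ["data10.mp3", "data2.mp3", "data1.mp3"]
    # Plain string sorting compares character by character.
    print(sorted(names))
    # Natural sorting compares the embedded numbers as integers.
    print(sorted(names, key=natural_key))
```

Plain sorting would put `data10.mp3` before `data2.mp3`, which would stitch the fragments together in the wrong order as soon as there are ten or more of them.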
Once again, I recover my ISO file:
```
arubinst$ ls -l *.iso
-rw-r--r-- 1 arubinst staff 789884928 8 jan 22:18 ubuntu-12.10-desktop-i386.iso
```
I’m going to run one last test with the only one of the three services that is completely free for up to 20’000 songs, Google Music (Amazon will let you upload 245 songs for free).
Google Music uses a small application called Music Manager, responsible for the uploading (and downloading) of music. I let Music Manager upload my fake album:
My songs are now uploaded to Google Music and this is how they look:
Downloading from Google is a breeze: the files download much faster than with the other two services. It is a little tricky, though. You may download your files as many times as you want, but you have to use the Music Manager application, and the downside is that you have to download the whole library every time. Downloading a specific file may be done directly from the Google Music website, but you can download each song only twice.
Google also likes to rename files. They use a naming scheme very similar to the one used by iTunes. I will now use my decoder tool:
```
arubinst$ ./decoder.py *.mp3
Processing 01 - ubuntu-12.10-desktop-i386.iso.mp3
Processing 02 - ubuntu-12.10-desktop-i386.iso.mp3
Processing 03 - ubuntu-12.10-desktop-i386.iso.mp3
Processing 04 - ubuntu-12.10-desktop-i386.iso.mp3
Processing 05 - ubuntu-12.10-desktop-i386.iso.mp3
```
and here’s my ISO file again:
```
arubinst$ ls -l *.iso
-rw-r--r-- 1 arubinst staff 789884928 8 jan 22:49 ubuntu-12.10-desktop-i386.iso
```
So this is it. It’s still a work in progress. There are some “known issues” with the code and who knows how many “unknown issues”. It’s not very robust. It doesn’t check a lot of stuff and it doesn’t do a lot of error trapping. As a matter of fact, because the code started as a proof of concept, it’s not very elegant. But at least it does what it does.
Non-ASCII file names are not supported at this time, by the way… sorry…
Of course, you may have already guessed that this is more of a scientific curiosity for me than anything else. Keep in mind that these services are not optimized for this kind of use. They’re often slower than the real thing.
The interface, of course, is not exactly a pleasure to use. I mean, if you need to constantly upload and download many files, using this thing will be a pain. You can’t match the simplicity of dedicated cloud storage services such as the Amazon Cloud Drive, Google Drive, Dropbox or even the somewhat limited iCloud storage for applications.
As a matter of fact, I don’t even use it myself 😉
Congrats Abraham !
You are already published elsewhere 😉 :
http://www.macg.co/news/voir/258446/hebergez-4-7-to-de-donnees-dans-le-nuage-grace-a-itunes-match
See u !
Thanks Marc! 🙂
hello,
I find your idea great,
I copied your code as shown
but I receive the following error message. Do you have an idea?
bash: ./encoder.py: Permission denied
thanks for help
Hi there,
Is your file executable? You may try entering the command like this:
> python encoder.py
Or you may render your python file executable with “chmod +x encoder.py”
Let me know if this helps!
Yes, thank you, it was indeed the permissions!!
you are the best !!!
Great!
Pretty cool !
Thanks!
Hi!
Nice post, I actually had a pretty similar idea, posted the same day as you ,-) Instead of MP3, I embed data in PNG files.
Since Google Picasa provides unlimited storage for images below 2048×2048 in size, you could potentially store any amount of data you want (even though 5.7TB on Google Music is pretty close to unlimited already ,-)).
You could use googlecl (http://code.google.com/p/googlecl/) to automatically upload/download images (I don’t show how to do it in my post, I don’t want to make it too easy for people to use). Is there such an interface for Google Play music? (can’t test it, it’s not available in Singapore (yet))
My blog post here: http://drinkcat.blogspot.sg/2013/01/picasa-as-random-data-storage.html
Cheers!
Hi!
Nice! Google Picasa was my next project 🙂
They do have an interface for Google Play that uploads and downloads music. But it forces you to download the whole library at once.
Google doesn’t check your IP for the Google Music service. If you have a Google account associated to a geographical address in a country where Google Music is available, it will work worldwide. At least it does in my case!
Greetings from Switzerland!