MP3 files have been in wide use for a long time because they provide an efficient way of storing and transmitting audio that can be rendered at almost CD quality. While there has been a lot of work on using images as a steganography medium, steganography in MP3 files has been less explored. This project describes techniques for inserting covert data into the encoded main data portion of an MP3 file, which is the part of the file where the audio data is actually stored. It also describes the steganalysis involved in recovering the hidden information.
Merriam-Webster defines steganography as the practice of concealing a message, image, or file within another message, image, or file. Steganography succeeds when no one other than the intended recipient notices anything different about the file; only the intended recipient knows that the hidden message exists and how to retrieve it. The science of steganography is very old, dating back to ancient Greece and China. Steganography can be implemented in any kind of medium, but today, in the digital world, it is usually associated with hiding data in digital images, videos, or other overt files. To date, there has been little work done on hiding covert information in an MPEG-1 Audio Layer III (MP3) file. The purpose of this paper is to highlight the process of inserting covert data into the encoded main data portion of an MP3 file.
MP3 files have been around for more than 20 years, which makes them a suitable steganography medium: people are accustomed to them and therefore do not suspect hidden data inside them. The main data portion of the frame is where the audio information is stored. The difficulty is that the audio data is transformed, compressed, and Huffman encoded before it is stored. The compression combines lossy and lossless steps, while the transformation is done using the Modified Discrete Cosine Transform (MDCT). The main data within the frame consists of two chunks of data called granules. In a stereo frame, each granule is further divided into two channels (left and right), while a mono frame has a single channel per granule. Each channel contains scale factors, which help with decoding the channel later, followed by the MDCT coefficients encoded using Huffman coding, which carry the actual compressed audio data. Information about the Huffman coding tables is found in the side information of each frame. For reasons such as performance or quality, the audio information is divided into three regions: region0, followed by region1 and region2. Each of those regions is compressed individually using the Huffman tables defined in the MP3 standard.
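To make the frame layout concrete, the following minimal Python sketch locates the main data area of a single frame from its 4-byte header. It assumes an MPEG-1 Layer III frame, for which the side information occupies 17 bytes in a mono frame and 32 bytes in a stereo frame. The function name and the dictionary keys are illustrative only and are not part of SecretMp3.

```python
# Bitrates (kbit/s) and sample rates (Hz) for MPEG-1 Layer III only.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]
HEADER_BYTES = 4


def parse_mpeg1_layer3_header(header: bytes) -> dict:
    """Return the frame layout implied by a 4-byte MPEG-1 Layer III header."""
    if len(header) < HEADER_BYTES:
        raise ValueError("need at least 4 header bytes")
    word = int.from_bytes(header[:HEADER_BYTES], "big")
    if word >> 21 != 0x7FF:
        raise ValueError("missing 11-bit frame sync")
    if (word >> 19) & 0x3 != 0b11 or (word >> 17) & 0x3 != 0b01:
        raise ValueError("not an MPEG-1 Layer III frame")
    has_crc = ((word >> 16) & 0x1) == 0  # protection bit 0: a 2-byte CRC follows
    bitrate = BITRATES_KBPS[(word >> 12) & 0xF]
    sample_rate = SAMPLE_RATES_HZ[(word >> 10) & 0x3]
    if bitrate is None or sample_rate is None:
        raise ValueError("free-format or reserved bitrate/sample rate")
    padding = (word >> 9) & 0x1
    mono = ((word >> 6) & 0x3) == 0b11
    frame_len = 144 * bitrate * 1000 // sample_rate + padding
    side_info_len = 17 if mono else 32
    # Offset of the main data area inside this frame. Because of the bit
    # reservoir, the audio bits of a frame may logically begin in an
    # earlier frame; this offset only describes the physical layout.
    main_data_offset = HEADER_BYTES + (2 if has_crc else 0) + side_info_len
    return {
        "bitrate_kbps": bitrate,
        "sample_rate_hz": sample_rate,
        "mono": mono,
        "frame_len": frame_len,
        "side_info_len": side_info_len,
        "main_data_offset": main_data_offset,
        "main_data_len": frame_len - main_data_offset,
    }
```

For a 128 kbit/s, 44.1 kHz stereo frame without CRC (header bytes FF FB 90 00), this yields a 417-byte frame whose main data area starts at byte 36.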
The authors have analyzed several possible approaches to embed covert information inside the main data portion of an MP3 file. One of the biggest challenges in hiding covert information inside the main data portion of an MP3 file is the introduction of audio artifacts. By injecting covert data inside the main data portion of the MP3, even if the amount of the injected data is relatively small compared to the main data size, we are still changing the main data structure. Any change in the main data structure will be noticeable during the decoding process.
The following sections contain the solutions analyzed for hiding covert information inside the main data portion of an MP3 file.
Masking occurs when the perception of one sound is affected by the presence of another sound: the human ear hears only the dominant sound, while the masked sound is not perceived. One way of hiding covert data in an MP3 file would be to encode the covert information as audio data, making sure that the real sound (the audio of the clean MP3 file) masks the added signal containing the covert message. Using masking as the steganography technique, one could store a large amount of covert information without affecting the playability of the MP3 file.
Another “During-Encoding” technique analyzed was Least Significant Bit (LSB) substitution. This technique would hide the covert information in the least significant bits of the MDCT coefficients during the MP3 encoding process. Because bits are substituted rather than added, the file size would remain unchanged between the clean and the dirty MP3 file. Another advantage of LSB substitution is that it wouldn’t introduce sound artifacts. “During-Encoding” steganography techniques would require a custom encoder as well as a custom decoder. Due to the complexity of developing a custom encoder and decoder, this technique wasn’t analyzed further.
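The substitution step itself can be illustrated with a short Python sketch. It is not an MP3 encoder: it assumes the quantized MDCT coefficients are already available as a plain list of integers, and the names embed_lsb and extract_lsb are hypothetical. In a real implementation this step would have to live inside a custom encoder, as noted above.

```python
def embed_lsb(coefficients, payload: bytes):
    """Return a copy of `coefficients` whose least significant bits carry
    `payload`, one payload bit per coefficient, most significant bit first."""
    bits = [(byte >> shift) & 1
            for byte in payload
            for shift in range(7, -1, -1)]
    if len(bits) > len(coefficients):
        raise ValueError("payload does not fit in the available coefficients")
    stego = list(coefficients)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego


def extract_lsb(coefficients, n_bytes: int) -> bytes:
    """Read `n_bytes` back from the least significant bits."""
    out = bytearray()
    for i in range(n_bytes):
        value = 0
        for coeff in coefficients[8 * i: 8 * i + 8]:
            value = (value << 1) | (coeff & 1)
        out.append(value)
    return bytes(out)
```

Each coefficient changes by at most one quantization step, and the number of coefficients stays the same.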
Main Data
Overwriting the main data portion of an MP3 file is considered a “Post-Encoding” technique. This technique was tested using the beta version of the SecretMp3 tool. With this technique, the covert information is hidden inside the main data portion of the frame, replacing the existing audio information. SecretMp3 beta was a very flexible tool that allowed the user to set several hiding parameters: the percentage of the frame to be overwritten with covert data, the first frame to be modified, the number of unchanged MP3 frames between overwritten frames, and whether to overwrite the beginning or the end of the frame. The authors conducted numerous tests with different hiding parameters. One of the most important testing outcomes was that the configuration didn’t actually matter: a listener who was actively listening could hear the sound artifacts of a “bad” frame.
Even when the number of modified bytes was kept small (5-10% of the frame), disruptive noises were introduced.
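The four hiding parameters described above can be sketched as follows. This is an illustration of the parameter set only, not the SecretMp3 implementation: the function name, the argument names, and the representation of each frame's main data area as a bytearray are all assumptions.

```python
def overwrite_main_data(frames, payload: bytes, fraction=0.10,
                        first_frame=0, gap=0, at_end=False) -> int:
    """Overwrite part of each selected frame's main data with payload bytes.

    frames      -- list of bytearray, one per frame's main data area
    fraction    -- share of each frame that may be overwritten (0.0 to 1.0)
    first_frame -- index of the first frame to modify
    gap         -- number of untouched frames between modified frames
    at_end      -- overwrite the end of the frame instead of the beginning
    Returns the number of payload bytes written; modifies frames in place.
    """
    written = 0
    index = first_frame
    while written < len(payload) and index < len(frames):
        body = frames[index]
        count = min(int(len(body) * fraction), len(payload) - written)
        chunk = payload[written:written + count]
        if at_end:
            body[len(body) - count:] = chunk
        else:
            body[:count] = chunk
        written += count
        index += gap + 1
    return written
```

The replaced bytes are no longer valid Huffman-coded audio, which is consistent with the artifacts observed in testing regardless of the parameter values chosen.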
During their research, the authors identified MP3 files that were encoded as stereo but actually used only one channel for audio data, while the other channel was always empty. All of the files with these attributes had been encoded using the LAME 3.98.2 encoder. The empty channel allowed the SecretMp3 tool to store a larger amount of covert data inside the MP3 file.
Using Empty Frames
MP3 files have empty frames, usually at the beginning or at the end of the file. Empty frames are frames that don’t contain valid audio data; MP3 players therefore ignore them and do not play them. The main reason empty frames are found at the beginning of a file is that MP3 players use the first few frames to synchronize.
An MP3 file usually starts with two to three empty frames, but the number can be larger. During research and testing, the authors saw up to 8-10 empty frames at the beginning of a file. Modifying the empty frames doesn’t introduce sound artifacts, as long as each frame’s header and side information remain unchanged. The biggest concern with hiding covert data in empty frames is that not every MP3 file has them; this technique is therefore very carrier dependent, both in how much data can be stored and in whether data can be stored at all.
The solution to the problems of using only existing frames is to add extra custom frames (empty frames in our case). Such frames are injected at the beginning of the MP3 file, before the first frame, and at the end of the file, after the last existing frame, to avoid interfering with the actual audio data and with the bit reservoir feature.
Injecting custom frames in the middle of the file causes disruption because of the MP3 bit reservoir feature.
A steganography technique is not effective if it noticeably increases the size of the carrier. Therefore, the authors concluded that keeping the number of added custom frames fixed and relatively small produces the best results. The current implementation of the SecretMp3 tool adds three frames at the beginning of the file and three frames at the end of the file. The next section gives more details about the tool, the techniques used, the process of inserting and retrieving covert information, and the testing results.
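The frame injection described above can be sketched in Python as follows. The sketch rests on several assumptions that are not taken from SecretMp3: the carrier is a 128 kbit/s, 44.1 kHz stereo MPEG-1 Layer III stream without CRC, the input contains only MP3 frames (no ID3 tags), and a frame whose side information is all zeros declares no main data bits and is therefore treated as empty by the decoder. All names are hypothetical.

```python
# Assumed carrier format: MPEG-1 Layer III, 128 kbit/s, 44.1 kHz, stereo, no CRC.
HEADER = bytes([0xFF, 0xFB, 0x90, 0x00])
FRAME_LEN = 417          # 144 * 128000 // 44100, no padding byte
SIDE_INFO_LEN = 32       # stereo side information size for MPEG-1
CAPACITY = FRAME_LEN - len(HEADER) - SIDE_INFO_LEN  # payload bytes per frame


def build_carrier_frame(chunk: bytes) -> bytes:
    """Build one empty frame whose main data area holds `chunk`."""
    if len(chunk) > CAPACITY:
        raise ValueError("chunk larger than the frame's main data area")
    # Zeroed side information: no main data bits are declared for the decoder.
    return HEADER + bytes(SIDE_INFO_LEN) + chunk.ljust(CAPACITY, b"\x00")


def inject_frames(audio_frames: bytes, payload: bytes, per_side: int = 3) -> bytes:
    """Add `per_side` carrier frames before and after the existing frames."""
    total = 2 * per_side
    if len(payload) > total * CAPACITY:
        raise ValueError("payload exceeds the capacity of the added frames")
    chunks = [payload[i * CAPACITY:(i + 1) * CAPACITY] for i in range(total)]
    frames = [build_carrier_frame(chunk) for chunk in chunks]
    return b"".join(frames[:per_side]) + audio_frames + b"".join(frames[per_side:])
```

Under these assumptions, three frames on each side add 6 x 417 = 2502 bytes to the carrier and provide 6 x 381 = 2286 bytes of covert capacity. A real tool would need to match the injected frame headers to the carrier's actual bitrate, sample rate, and channel mode.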
The goal of SecretMp3 is to insert covert information into the main data portion of an MP3 file. A number of different approaches were explored during the course of this project. The SecretMp3 tool inserts the covert data into the main data portion of the existing and added empty frames of an MP3 file.
Inserting Covert Data
After exploring several options, overwriting the main data was the only option that could be completed within the time constraints of the project. To accomplish this task, we used an open source library (NAudio) to overwrite the main data with the covert data. In addition to adding covert data to the existing empty frames of an MP3 file, the SecretMp3 tool adds six empty frames: three at the beginning and three at the end of the MP3 file. SecretMp3 is able to overwrite the entire main data section of these empty frames, since the MP3 player only uses their header and side information to synchronize the audio. Besides the empty frames, a master frame is added to keep track of the location of the inserted covert data.
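The source does not specify how the master frame records the location of the covert data. Purely as an illustration, the sketch below assumes a hypothetical record consisting of a 2-byte entry count followed by (frame index, payload length) pairs, and shows how such a record could drive retrieval. None of these names or field sizes come from SecretMp3.

```python
import struct

# Hypothetical master record layout (an assumption, not SecretMp3's format):
# a 2-byte entry count, then one (frame index, payload length) pair per frame.
COUNT = struct.Struct(">H")
ENTRY = struct.Struct(">IH")


def pack_master_record(entries) -> bytes:
    """Serialize a list of (frame_index, payload_length) pairs."""
    return COUNT.pack(len(entries)) + b"".join(
        ENTRY.pack(index, length) for index, length in entries)


def unpack_master_record(blob: bytes):
    """Parse the record back into a list of (frame_index, payload_length)."""
    (count,) = COUNT.unpack_from(blob, 0)
    return [ENTRY.unpack_from(blob, COUNT.size + k * ENTRY.size)
            for k in range(count)]


def recover_payload(frame_bodies, blob: bytes) -> bytes:
    """Concatenate the covert bytes listed in the master record.

    frame_bodies -- sequence of bytes-like objects, one main data area per frame
    """
    return b"".join(bytes(frame_bodies[index][:length])
                    for index, length in unpack_master_record(blob))
```

A record of this kind lets the retrieval side read only the frames that actually carry covert data and stop at the exact payload length in the last frame.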
Several techniques were discussed for hiding covert data in an MP3 file. Using the SecretMp3 tool, the authors were able to successfully insert covert data into the main data portion of an MP3 file. To determine the best setup for the SecretMp3 tool, we tested it with different MP3 files and covert text files. Finally, it was determined that the existing and added empty frames at the beginning and the end of the MP3 file allowed the SecretMp3 tool to overwrite the entire main data portion of these frames with covert data while preserving the audio quality.