In an earlier post on decoding AAC I used faad, and mentioned that to encode into the AAC format you can use encoders such as faac or fdk-aac. Those, however, are all software encoders, and their CPU cost is noticeably higher than that of hardware encoding.
The advantage of hardware encoding is that it uses functionality built into the hardware chip to complete the encoding task quickly and with low power consumption.
The iOS platform provides hardware-encoding capability as well; an app only needs to call the corresponding SDK interface.
That interface is AudioConverter.
This article shows how to call AudioConverter on iOS to hardware-encode AAC.
As the name suggests, AudioConverter is a format converter; here I use it to convert PCM data into AAC data.
AudioConverter performs the conversion in memory and does not need to write files. The ExtAudioFile interface, by contrast, operates on files, and internally it uses AudioConverter to convert formats; in other words, in some scenarios you can use ExtAudioFile instead.
How do you use AudioConverter? As with most system interfaces, you need to read the corresponding header file and its documentation comments to understand how it is meant to be called.
Here I demonstrate how to convert PCM data into AAC data.
After the sample code I give only brief explanations; please read the code carefully and adapt it to your own development scenario.
The example below converts PCM to AAC (for instance, to save recorded audio as AAC).
```objc
typedef struct
{
    void *source;       // PCM input buffer
    UInt32 sourceSize;  // remaining bytes of PCM input
    UInt32 channelCount;
    AudioStreamPacketDescription *packetDescriptions;
} FillComplexInputParam;

// Input callback: hands the source data (the PCM) to the converter.
OSStatus audioConverterComplexInputDataProc(AudioConverterRef inAudioConverter,
                                            UInt32 *ioNumberDataPackets,
                                            AudioBufferList *ioData,
                                            AudioStreamPacketDescription **outDataPacketDescription,
                                            void *inUserData)
{
    FillComplexInputParam *param = (FillComplexInputParam *)inUserData;
    if (param->sourceSize <= 0) {
        // No input left: report zero packets and return an error so the converter stops.
        *ioNumberDataPackets = 0;
        return -1;
    }
    ioData->mBuffers[0].mData = param->source;
    ioData->mBuffers[0].mNumberChannels = param->channelCount;
    ioData->mBuffers[0].mDataByteSize = param->sourceSize;
    // For interleaved 16-bit PCM, one packet is one frame: 2 bytes per channel.
    *ioNumberDataPackets = param->sourceSize / (2 * param->channelCount);
    param->sourceSize = 0;
    param->source = NULL;
    return noErr;
}
```
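A subtle point in the callback is the packet count: for interleaved 16-bit PCM, one packet equals one frame (2 bytes per channel), so the value reported through `ioNumberDataPackets` should be the byte count divided by the frame size. A minimal, platform-independent sketch of that arithmetic (the function name is mine, not part of any SDK):

```c
#include <stdint.h>

// Frames (= PCM packets) contained in a buffer of interleaved 16-bit samples.
// bytes: size of the buffer in bytes; channels: number of interleaved channels.
static uint32_t pcm_packet_count(uint32_t bytes, uint32_t channels)
{
    uint32_t bytes_per_frame = 2u * channels; /* 16-bit = 2 bytes per sample */
    return bytes / bytes_per_frame;
}
```

For example, 4096 bytes of 16-bit stereo PCM contain 1024 frames, which is exactly one AAC frame's worth of input (an AAC frame encodes 1024 samples per channel).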
```objc
typedef struct _tagConvertContext {
    AudioConverterRef converter;
    int samplerate;
    int channels;
} ConvertContext;

// init
// Creates the converter with AudioConverterNewSpecific and sets properties such as the bit rate.
void* convert_init(int sample_rate, int channel_count)
{
    // Source format: interleaved signed 16-bit PCM.
    AudioStreamBasicDescription sourceDes;
    memset(&sourceDes, 0, sizeof(sourceDes));
    sourceDes.mSampleRate = sample_rate;
    sourceDes.mFormatID = kAudioFormatLinearPCM;
    sourceDes.mFormatFlags = kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsSignedInteger;
    sourceDes.mChannelsPerFrame = channel_count;
    sourceDes.mBitsPerChannel = 16;
    sourceDes.mBytesPerFrame = sourceDes.mBitsPerChannel / 8 * sourceDes.mChannelsPerFrame;
    sourceDes.mBytesPerPacket = sourceDes.mBytesPerFrame;
    sourceDes.mFramesPerPacket = 1;
    sourceDes.mReserved = 0;

    // Target format: AAC. Let the system fill in the remaining fields.
    AudioStreamBasicDescription targetDes;
    memset(&targetDes, 0, sizeof(targetDes));
    targetDes.mFormatID = kAudioFormatMPEG4AAC;
    targetDes.mSampleRate = sample_rate;
    targetDes.mChannelsPerFrame = channel_count;
    UInt32 size = sizeof(targetDes);
    AudioFormatGetProperty(kAudioFormatProperty_FormatInfo, 0, NULL, &size, &targetDes);

    // Enumerate the available AAC encoders. This sample picks Apple's software codec;
    // use kAppleHardwareAudioCodecManufacturer to request the hardware codec instead.
    AudioClassDescription audioClassDes;
    memset(&audioClassDes, 0, sizeof(AudioClassDescription));
    AudioFormatGetPropertyInfo(kAudioFormatProperty_Encoders, sizeof(targetDes.mFormatID), &targetDes.mFormatID, &size);
    int encoderCount = size / sizeof(AudioClassDescription);
    AudioClassDescription descriptions[encoderCount];
    AudioFormatGetProperty(kAudioFormatProperty_Encoders, sizeof(targetDes.mFormatID), &targetDes.mFormatID, &size, descriptions);
    for (int pos = 0; pos < encoderCount; pos++) {
        if (targetDes.mFormatID == descriptions[pos].mSubType
            && descriptions[pos].mManufacturer == kAppleSoftwareAudioCodecManufacturer) {
            memcpy(&audioClassDes, &descriptions[pos], sizeof(AudioClassDescription));
            break;
        }
    }

    ConvertContext *convertContex = malloc(sizeof(ConvertContext));
    OSStatus ret = AudioConverterNewSpecific(&sourceDes, &targetDes, 1, &audioClassDes, &convertContex->converter);
    if (ret == noErr) {
        convertContex->samplerate = sample_rate;  // needed later for the ADTS header
        convertContex->channels = channel_count;
        AudioConverterRef converter = convertContex->converter;
        UInt32 tmp = kAudioConverterQuality_High;
        AudioConverterSetProperty(converter, kAudioConverterCodecQuality, sizeof(tmp), &tmp);
        UInt32 bitRate = 96000;
        UInt32 bitRateSize = sizeof(bitRate);
        ret = AudioConverterSetProperty(converter, kAudioConverterEncodeBitRate, bitRateSize, &bitRate);
    }
    else {
        free(convertContex);
        convertContex = NULL;
    }
    return convertContex;
}
```
```objc
// converting
void convert(void* convertContext, void* srcdata, int srclen, void** outdata, int* outlen)
{
    ConvertContext *convertCxt = (ConvertContext *)convertContext;
    if (convertCxt && convertCxt->converter) {
        UInt32 theOuputBufSize = srclen;
        UInt32 packetSize = 1;
        void *outBuffer = malloc(theOuputBufSize);
        memset(outBuffer, 0, theOuputBufSize);
        AudioStreamPacketDescription *outputPacketDescriptions =
            (AudioStreamPacketDescription *)malloc(sizeof(AudioStreamPacketDescription) * packetSize);

        FillComplexInputParam userParam;
        userParam.source = srcdata;
        userParam.sourceSize = srclen;
        userParam.channelCount = convertCxt->channels;
        userParam.packetDescriptions = NULL;

        AudioBufferList outputBuffers;
        outputBuffers.mNumberBuffers = 1;
        outputBuffers.mBuffers[0].mNumberChannels = convertCxt->channels;
        outputBuffers.mBuffers[0].mData = outBuffer;
        outputBuffers.mBuffers[0].mDataByteSize = theOuputBufSize;

        OSStatus ret = AudioConverterFillComplexBuffer(convertCxt->converter,
                                                       audioConverterComplexInputDataProc,
                                                       &userParam, &packetSize,
                                                       &outputBuffers, outputPacketDescriptions);
        if (ret == noErr) {
            if (outputBuffers.mBuffers[0].mDataByteSize > 0) {
                NSData *rawAAC = [NSData dataWithBytes:outputBuffers.mBuffers[0].mData
                                                length:outputBuffers.mBuffers[0].mDataByteSize];
                *outdata = malloc([rawAAC length]);
                memcpy(*outdata, [rawAAC bytes], [rawAAC length]);
                *outlen = (int)[rawAAC length];
                // For testing: prepend an ADTS header and append the frame to an adts-aac file.
#if 1
                int headerLength = 0;
                char *packetHeader = newAdtsDataForPacketLength((int)[rawAAC length], convertCxt->samplerate, convertCxt->channels, &headerLength);
                NSData *adtsPacketHeader = [NSData dataWithBytes:packetHeader length:headerLength];
                free(packetHeader);
                NSMutableData *fullData = [NSMutableData dataWithData:adtsPacketHeader];
                [fullData appendData:rawAAC];
                NSFileManager *fileMgr = [NSFileManager defaultManager];
                NSString *filepath = [NSHomeDirectory() stringByAppendingFormat:@"/Documents/test%p.aac", convertCxt->converter];
                if (![fileMgr fileExistsAtPath:filepath]) {
                    [fileMgr createFileAtPath:filepath contents:nil attributes:nil];
                }
                NSFileHandle *file = [NSFileHandle fileHandleForWritingAtPath:filepath];
                [file seekToEndOfFile];
                [file writeData:fullData];
                [file closeFile];
#endif
            }
        }
        free(outBuffer);
        if (outputPacketDescriptions) {
            free(outputPacketDescriptions);
        }
    }
}
```
```objc
// uninit: dispose of the converter and free the context.
void convert_uninit(void* convertContext)
{
    ConvertContext *convertCxt = (ConvertContext *)convertContext;
    if (convertCxt) {
        if (convertCxt->converter) {
            AudioConverterDispose(convertCxt->converter);
        }
        free(convertCxt);
    }
}
```
```objc
int freqIdxForAdtsHeader(int samplerate)
{
    /**
     0: 96000 Hz    1: 88200 Hz    2: 64000 Hz    3: 48000 Hz
     4: 44100 Hz    5: 32000 Hz    6: 24000 Hz    7: 22050 Hz
     8: 16000 Hz    9: 12000 Hz   10: 11025 Hz   11: 8000 Hz
    12: 7350 Hz    13: Reserved   14: Reserved   15: frequency is written explicitly
     */
    static const int rates[] = {96000, 88200, 64000, 48000, 44100, 32000, 24000,
                                22050, 16000, 12000, 11025, 8000, 7350};
    int idx = 4;  // default: 44100 Hz
    for (int pos = 0; pos <= 12; pos++) {
        // rates[] is sorted descending; the first entry <= samplerate is the index.
        if (samplerate >= rates[pos]) {
            idx = pos;
            break;
        }
    }
    return idx;
}

int channelIdxForAdtsHeader(int channelCount)
{
    /**
     0: Defined in AOT Specific Config
     1: 1 channel: front-center
     2: 2 channels: front-left, front-right
     3: 3 channels: front-center, front-left, front-right
     4: 4 channels: front-center, front-left, front-right, back-center
     5: 5 channels: front-center, front-left, front-right, back-left, back-right
     6: 6 channels: front-center, front-left, front-right, back-left, back-right, LFE-channel
     7: 8 channels: front-center, front-left, front-right, side-left, side-right, back-left, back-right, LFE-channel
     8-15: Reserved
     */
    int ret = 2;
    if (channelCount == 1) {
        ret = 1;
    }
    else if (channelCount == 2) {
        ret = 2;
    }
    return ret;
}
```
```objc
/**
 * Add an ADTS header at the beginning of each and every AAC packet.
 * This is needed because the encoder generates packets of raw AAC data.
 *
 * Note the packetLength must count in the ADTS header itself.
 * See: http://wiki.multimedia.cx/index.php?title=ADTS
 * Also: http://wiki.multimedia.cx/index.php?title=MPEG-4_Audio#Channel_Configurations
 **/
char* newAdtsDataForPacketLength(int packetLength, int samplerate, int channelCount, int* ioHeaderLen)
{
    int adtsLength = 7;
    char *packet = malloc(sizeof(char) * adtsLength);
    int profile = 2;  // AAC LC
    int freqIdx = freqIdxForAdtsHeader(samplerate);
    int chanCfg = channelIdxForAdtsHeader(channelCount);  // MPEG-4 Audio Channel Configuration
    NSUInteger fullLength = adtsLength + packetLength;
    // fill in ADTS data
    packet[0] = (char)0xFF;  // first 8 bits of the syncword
    packet[1] = (char)0xF9;  // 1111 1 00 1 = rest of syncword, MPEG-2 ID, layer, no CRC
    packet[2] = (char)(((profile - 1) << 6) + (freqIdx << 2) + (chanCfg >> 2));
    packet[3] = (char)(((chanCfg & 3) << 6) + (fullLength >> 11));
    packet[4] = (char)((fullLength & 0x7FF) >> 3);
    packet[5] = (char)(((fullLength & 7) << 5) + 0x1F);
    packet[6] = (char)0xFC;
    *ioHeaderLen = adtsLength;
    return packet;
}
```
Two functions in the code above matter most: the initialization function, which creates the AudioConverterRef, and the conversion function, which should be called repeatedly on successive chunks of PCM data.
The example also saves the AAC data converted from the PCM, and the resulting file can be played back.
Note that AudioConverter outputs raw audio data only; whether to combine it into adts-aac or package it into Apple's m4a container is up to the program.
To explain: adts-aac is one way of representing AAC data. A frame header (carrying the frame length, sample rate, channel count, and so on) is placed in front of each frame of raw AAC data; with that header, each AAC frame can be played on its own. adts-aac has no container around it, i.e. no particular file header or file structure.
ADTS stands for Audio Data Transport Stream.
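To check an ADTS header, you can read the fields back out of its 7 bytes. The sketch below builds a header with the same bit layout as newAdtsDataForPacketLength above (profile 2 = AAC LC; 44100 Hz is frequency index 4; stereo is channel configuration 2) and parses the 13-bit frame length back out. The helper names are mine, for illustration only:

```c
#include <stdint.h>

// Build a 7-byte ADTS header (same bit layout as newAdtsDataForPacketLength above).
// frame_len is the whole frame length: the 7 header bytes plus the AAC payload.
static void adts_build(uint8_t h[7], int frame_len, int freq_idx, int chan_cfg)
{
    int profile = 2; /* AAC LC */
    h[0] = 0xFF;     /* first 8 bits of the syncword */
    h[1] = 0xF9;     /* rest of syncword, MPEG-2 ID, layer, no CRC */
    h[2] = (uint8_t)(((profile - 1) << 6) + (freq_idx << 2) + (chan_cfg >> 2));
    h[3] = (uint8_t)(((chan_cfg & 3) << 6) + (frame_len >> 11));
    h[4] = (uint8_t)((frame_len & 0x7FF) >> 3);
    h[5] = (uint8_t)(((frame_len & 7) << 5) + 0x1F); /* buffer fullness: all ones */
    h[6] = 0xFC;
}

// Read the 13-bit frame length (header + payload) back out of an ADTS header.
static int adts_frame_length(const uint8_t h[7])
{
    return ((h[3] & 0x03) << 11) | (h[4] << 3) | ((h[5] & 0xE0) >> 5);
}
```

For a 200-byte AAC payload the frame length is 207, and parsing the built header returns exactly that value, which is a quick sanity check when a saved adts-aac file refuses to play.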
Of course, you can also package the converted AAC data into the m4a format. In that container the file header comes first and the raw audio follows, roughly {packet-table}{audio_data}{trailer}; the audio data after the header carries no per-packet information.
That completes the walkthrough of converting PCM to AAC on iOS.
To summarize, this article showed how to use the AudioConverter interface provided by iOS to convert PCM data into the AAC format. It also showed how to save the result as an adts-aac file, which lets you verify that the converted AAC data is correct.
Source: http://www.bubuko.com/infodetail-2574220.html