Parsing PNG Files with Node.js

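Before writing my previous post I skimmed the official documentation for Node's Stream, and afterwards I wanted to keep writing demos with streams. So I chose to write a small program that reads and parses PNG images with Node (my thinking was that if PNG images could be parsed and generated easily, it would be easy to generate CAPTCHA images and send them to the front end). That turned out to be a trap I dug for myself... PNG is fairly complex (in my digital image processing course we mostly dealt with bmp and tiff, or simply read all kinds of formats through OpenCV or GDAL, so I had never looked closely at the PNG format itself). Because of time constraints, I only parse PNG images that are non-interlaced, use a non-indexed colour type, and have FilterMethod 0 -_-||
Node's fs.createReadStream() creates a readable file stream. Here I use paused mode (see the previous post for paused versus flowing mode); in this mode stream.read() gives fairly fine-grained control over how data is read from the readable stream. Note that stream.read(size) returns null when fewer than size bytes are currently buffered and the stream has not ended, so the calls in this post assume a small file that fits in the stream's internal buffer: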

const fs = require('fs');

this.path = path;
this.stream = fs.createReadStream(this.path);
// Use paused mode
this.stream.pause();
this.stream.once('readable', () => {
    // Consume the readable stream with stream.read()
    // ......
});

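Many blog posts describe the PNG format in detail, but almost all of them skip how the data in IDAT chunks is decompressed and how the filtering is undone; in the end I worked it out from the official PNG specification. Here is the link to the document first: W3C - Portable Network Graphics (PNG) Specification (Second Edition)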

PNG stands for Portable Network Graphics, a bitmap image format with lossless compression. It was designed as an attempt to replace the GIF and TIFF file formats while adding features that GIF lacks.

PNG file structure

A complete PNG datastream consists of a PNG signature followed by a series of chunks. The first chunk is IHDR and the last chunk is IEND.

PNG structure:
signature
chunk (IHDR)
…
chunk
…
chunk (IEND)

The official specification describes it as follows: This signature indicates that the remainder of the datastream contains a single PNG image, consisting of a series of chunks beginning with an IHDR chunk and ending with an IEND chunk.

PNG Signature

The PNG signature sits at the very start of a PNG file and is 8 bytes long. In decimal, the bytes are [137, 80, 78, 71, 13, 10, 26, 10]. The following function checks that the signature is correct:

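checkSignature(){
     // The PNG signature is 8 bytes long
     let buffer = this.stream.read(8);
     // read() returns null (or a shorter buffer at end of file) if 8 bytes are not available
     if(buffer === null || buffer.length < 8)
         throw new Error('Not enough data for the PNG signature');
     let signature = [137, 80, 78, 71, 13, 10, 26, 10];
     for(let i=0; i<signature.length; i++){
         let v = buffer.readUInt8(i);
         if(v !== signature[i])
             throw new Error('Not a PNG file!');
     }
     return true;
 }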
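
As a variation on the loop above, the same check can be sketched on a plain Buffer with Buffer.equals(). The helper name isPngSignature is hypothetical and not part of the original demo; it only assumes the caller already holds the first bytes of the file.

```javascript
// The 8-byte PNG signature in hex (137 80 78 71 13 10 26 10 in decimal).
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Hypothetical helper: true if `buffer` starts with the PNG signature.
function isPngSignature(buffer) {
  return Buffer.isBuffer(buffer) &&
    buffer.length >= PNG_SIGNATURE.length &&
    buffer.subarray(0, PNG_SIGNATURE.length).equals(PNG_SIGNATURE);
}

console.log(isPngSignature(Buffer.from([137, 80, 78, 71, 13, 10, 26, 10]))); // true
console.log(isPngSignature(Buffer.from('GIF89a')));                           // false
```

Returning a boolean instead of throwing leaves the error handling to the caller, and a null result from stream.read() is simply reported as false.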

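PNG Chunk

PNG defines two kinds of chunks: critical chunks, which are the standard chunks, and ancillary chunks, which are optional. There are four critical chunks (IHDR, PLTE, IDAT, IEND). Every PNG file must contain IHDR, at least one IDAT, and IEND; PLTE is required only for indexed-colour images (colour type 3). All PNG readers and writers must support the critical chunks. The PNG specification does not require encoders and decoders to handle ancillary chunks, but it encourages them to do so.
The table below lists the PNG chunk types; the first four are the critical chunks.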

Chunk name	Multiple allowed	Ordering constraints	Description
IHDR	No	Shall be first	Image header
PLTE	No	Before first IDAT	Palette
IDAT	Yes	Multiple IDAT chunks shall be consecutive	Image data
IEND	No	Shall be last	Image trailer (end of image)

cHRM	No	Before PLTE and IDAT	Primary chromaticities and white point
gAMA	No	Before PLTE and IDAT	Image gamma
iCCP	No	Before PLTE and IDAT. If the iCCP chunk is present, the sRGB chunk should not be present.	Embedded ICC profile
sBIT	No	Before PLTE and IDAT	Significant bits
sRGB	No	Before PLTE and IDAT. If the sRGB chunk is present, the iCCP chunk should not be present.	Standard RGB colour space
bKGD	No	After PLTE; before IDAT	Background colour
hIST	No	After PLTE; before IDAT	Image histogram
tRNS	No	After PLTE; before IDAT	Transparency
pHYs	No	Before IDAT	Physical pixel dimensions
sPLT	Yes	Before IDAT	Suggested palette
tIME	No	None	Image last-modification time
iTXt	Yes	None	International textual data
tEXt	Yes	None	Textual data
zTXt	Yes	None	Compressed textual data
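
The critical/ancillary split in the table is also encoded in the chunk name itself: according to the PNG specification, bit 5 (the ASCII lowercase bit) of each of the four type bytes carries one property. The sketch below decodes those bits; the helper name chunkProperties and its field names are made up for illustration and do not appear in the original demo.

```javascript
// Hypothetical helper: decode the four property bits of a chunk type.
// Bit 5 (value 0x20) of each byte distinguishes lowercase from uppercase ASCII.
function chunkProperties(type) {
  if (!/^[A-Za-z]{4}$/.test(type)) {
    throw new Error('Chunk type must be four ASCII letters');
  }
  const isLower = (i) => (type.charCodeAt(i) & 0x20) !== 0;
  return {
    ancillary: isLower(0),      // uppercase first letter means critical
    privateChunk: isLower(1),   // uppercase second letter means publicly defined
    reservedBitSet: isLower(2), // should be uppercase in files following this spec
    safeToCopy: isLower(3),     // only relevant to PNG editors
  };
}

console.log(chunkProperties('IHDR').ancillary); // false
console.log(chunkProperties('tEXt').ancillary); // true
```

A decoder can use the first bit to decide whether an unknown chunk may be skipped (ancillary) or must be treated as an error (critical).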

Each chunk consists of four parts (when Length is 0 there is no chunk data), as follows:

Name	Meaning
Length	A four-byte unsigned integer giving the number of bytes in the chunk's data field. The length counts only the data field, not itself, the chunk type, or the CRC. Zero is a valid length. Although encoders and decoders should treat the length as unsigned, its value shall not exceed 2^31-1 bytes.
Chunk Type	A sequence of four bytes defining the chunk type. Each byte of a chunk type is restricted to the decimal values 65 to 90 and 97 to 122. These correspond to the uppercase and lowercase ISO 646 letters (A-Z and a-z) respectively for convenience in description and examination of PNG datastreams. Encoders and decoders shall treat the chunk types as fixed binary values, not character strings. For example, it would not be correct to represent the chunk type IDAT by the equivalents of those letters in the UCS 2 character set.
Chunk Data	The data bytes appropriate to the chunk type, if any. This field can be of zero length.
CRC	A four-byte CRC (Cyclic Redundancy Code) calculated on the preceding bytes in the chunk, including the chunk type field and chunk data fields, but not including the length field. The CRC can be used to check for corruption of the data. The CRC is always present, even for chunks containing no data.
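
The demo in this post skips the CRC field, but the check described in the table can be sketched as follows. This is the standard table-driven CRC-32 (reflected polynomial 0xEDB88320) that PNG uses; the function names crc32 and chunkCrcMatches are illustrative, not taken from the original code.

```javascript
// Build the 256-entry lookup table for CRC-32 (reflected polynomial 0xEDB88320).
const CRC_TABLE = new Uint32Array(256);
for (let n = 0; n < 256; n++) {
  let c = n;
  for (let k = 0; k < 8; k++) {
    c = (c & 1) ? (0xedb88320 ^ (c >>> 1)) : (c >>> 1);
  }
  CRC_TABLE[n] = c >>> 0;
}

// CRC over the given bytes. For a chunk, pass type + data (not the length field).
function crc32(bytes) {
  let c = 0xffffffff;
  for (const b of bytes) {
    c = CRC_TABLE[(c ^ b) & 0xff] ^ (c >>> 8);
  }
  return (c ^ 0xffffffff) >>> 0;
}

// Hypothetical helper: compare the stored CRC with a freshly computed one.
function chunkCrcMatches(typeAndData, storedCrc) {
  return crc32(typeAndData) === storedCrc;
}

// IEND has no data, so its CRC covers only the four type bytes.
console.log(crc32(Buffer.from('IEND', 'ascii')).toString(16)); // ae426082
```

The stored CRC of a chunk can be read with readUInt32BE() from the four bytes that follow the chunk data and passed in as storedCrc.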

Length, Chunk Type and CRC all have a fixed size (4 bytes each), while the size of Chunk Data is given by the value of Length. So to parse a chunk you first need to read its type and the length of its data.

  /**
   * Read the name and length of a chunk.
   * Length and Name (chunk type) sit at the start of every chunk
   * and take 4 bytes each.
   * @returns {{name: string, length: number}}
   */
  readHeadAndLength(){
      let buffer = this.stream.read(8);
      // Length is a 4-byte unsigned big-endian integer
      let length = buffer.readUInt32BE(0);
      // The chunk type is four ASCII letters
      let name = buffer.toString('ascii', 4, 8);
      return {name, length};
  }
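
To make the Length / Chunk Type / Chunk Data / CRC layout concrete, here is a minimal sketch that walks the chunks of a PNG held entirely in a Buffer instead of a stream. listChunks and makeChunk are hypothetical helpers written for this illustration, and the sample datastream uses dummy all-zero CRCs because the walk does not verify them.

```javascript
// Hypothetical helper: list the chunks of a complete PNG held in a Buffer.
function listChunks(png) {
  const chunks = [];
  let offset = 8; // skip the 8-byte signature
  while (offset + 12 <= png.length) {
    const length = png.readUInt32BE(offset);
    const type = png.toString('latin1', offset + 4, offset + 8);
    if (offset + 12 + length > png.length) {
      throw new Error(`Chunk ${type} runs past the end of the data`);
    }
    chunks.push({ type, length, dataOffset: offset + 8 });
    offset += 12 + length; // 4 (length) + 4 (type) + data + 4 (CRC)
    if (type === 'IEND') break;
  }
  return chunks;
}

// Build a chunk with a dummy all-zero CRC (listChunks does not check it).
function makeChunk(type, data) {
  const head = Buffer.alloc(8);
  head.writeUInt32BE(data.length, 0);
  head.write(type, 4, 'latin1');
  return Buffer.concat([head, data, Buffer.alloc(4)]);
}

const sample = Buffer.concat([
  Buffer.from([137, 80, 78, 71, 13, 10, 26, 10]),
  makeChunk('IHDR', Buffer.alloc(13)),
  makeChunk('IEND', Buffer.alloc(0)),
]);
console.log(listChunks(sample).map((c) => `${c.type}:${c.length}`)); // [ 'IHDR:13', 'IEND:0' ]
```

Working on a Buffer avoids the question of how many bytes the stream has buffered, at the cost of loading the whole file into memory.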

The chunks my demo actually parses are IHDR and IDAT; the latter is somewhat more involved. Chunks are parsed one at a time through recursion:

 readChunk({name, length}){
     // IEND carries no data and marks the end of the datastream
     if(!name || name === 'IEND'){
         console.log(name, length);
         return;
     }
 
     switch(name){
         case 'IHDR':
             this.readChunk(this.readIHDR(name, length));
             break;
         case 'IDAT':
             this.readChunk(this.readIDAT(name, length));
             break;
         case 'PLTE':
             // PLTE (palette) chunks are not supported yet
             throw new Error('PLTE chunks are not supported');
         default:
             // Skip all other chunks
             console.log('Skip',name,length);
             // length+4 is the size of data + CRC
             this.stream.read(length+4);
             this.readChunk(this.readHeadAndLength());
     }
 }

IHDR chunk

IHDR is the first chunk in a PNG datastream and holds the image header. Its Chunk Data consists of the following fields:

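Name	Length	Description
Width	4 bytes	Image width in pixels
Height	4 bytes	Image height in pixels
Bit depth	1 byte	Bit depth. Indexed-colour images: 1, 2, 4 or 8; greyscale images: 1, 2, 4, 8 or 16; truecolour images: 8 or 16
Colour type	1 byte	Colour type. 0: greyscale; 2: truecolour; 3: indexed-colour; 4: greyscale with alpha channel; 6: truecolour with alpha channel
Compression method	1 byte	Compression method (used to compress the Chunk Data of IDAT)
Filter method	1 byte	Filter method
Interlace method	1 byte	Interlace method. 0: no interlace; 1: Adam7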

Once the layout of the IHDR data is known, the following code parses it. This information is essential for parsing the IDAT data:

readIHDR(name, length) {
    if (name !== 'IHDR') throw new Error('IHDR ERROR !');
 
    this.info = {};
    // Width and height are 4-byte unsigned big-endian integers
    this.info.width = this.stream.read(4).readUInt32BE(0);
    this.info.height = this.stream.read(4).readUInt32BE(0);
    this.info.bitDepth = this.stream.read(1).readUInt8(0);
    this.info.coloType = this.stream.read(1).readUInt8(0);
    this.info.compression = this.stream.read(1).readUInt8(0);
    this.info.filter = this.stream.read(1).readUInt8(0);
    this.info.interlace = this.stream.read(1).readUInt8(0);
    console.log(this.info);
    // bands is the number of channels per pixel (e.g. 4 for RGBA)
    switch (this.info.coloType) {
    case 0:
        this.info.bands = 1;
        break;
    case 2:
        this.info.bands = 3;
        break;
    case 3:
        // Indexed colour is not supported
        throw new Error('Do not support this color type !');
    case 4:
        this.info.bands = 2;
        break;
    case 6:
        this.info.bands = 4;
        break;
    default:
        throw new Error('Unknown color type !');
    }
    // Skip the CRC
    this.stream.read(4);
    // Hand the next chunk header back to readChunk(), as readIDAT() does
    return this.readHeadAndLength();
}

Take the image in the screenshot as an example: a 5*5 PNG image with a transparency channel. The code above yields the following IHDR information:

{ width: 5,
  height: 5,
  bitDepth: 8,
  coloType: 6,
  compression: 0,
  filter: 0,
  interlace: 0 }

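From the IHDR information we can tell that this image is non-interlaced, uses filter method 0, and is a truecolour image with an alpha channel. Each channel takes 8 bits, so one pixel takes 4*8 = 32 bits.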
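
The same reading of IHDR can be sketched on a 13-byte Buffer, deriving the bits per pixel along the way. parseIHDR and the CHANNELS table are hypothetical names for this illustration; the sample bytes are built by hand to match the 5*5 image described above rather than read from the actual file.

```javascript
// Channels per pixel for each colour type (type 3 stores one palette index).
const CHANNELS = { 0: 1, 2: 3, 3: 1, 4: 2, 6: 4 };

// Hypothetical helper: decode the 13 bytes of IHDR chunk data.
function parseIHDR(data) {
  if (data.length !== 13) throw new Error('IHDR data must be 13 bytes');
  const info = {
    width: data.readUInt32BE(0),
    height: data.readUInt32BE(4),
    bitDepth: data.readUInt8(8),
    colourType: data.readUInt8(9),
    compression: data.readUInt8(10),
    filter: data.readUInt8(11),
    interlace: data.readUInt8(12),
  };
  const channels = CHANNELS[info.colourType];
  if (channels === undefined) throw new Error('Unknown colour type');
  info.channels = channels;
  info.bitsPerPixel = channels * info.bitDepth;
  return info;
}

// Hand-built IHDR data: 5 x 5, bit depth 8, colour type 6, methods all 0.
const ihdr = Buffer.from([0, 0, 0, 5, 0, 0, 0, 5, 8, 6, 0, 0, 0]);
console.log(parseIHDR(ihdr).bitsPerPixel); // 32
```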

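IDAT chunk

IDAT is the image data chunk. It stores the actual image data of the PNG, and a datastream may contain several consecutive IDAT chunks. So once we understand the structure of the Chunk Data in IDAT, parsing and generating PNG images becomes straightforward. The steps involved include decompression and filtering.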

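Decompressing IDAT chunks

The image data in IDAT chunks is compressed with DEFLATE, a variant of LZ77 coding. For a detailed description of DEFLATE, see "DEFLATE Compressed Data Format Specification version 1.3": http://www.ietf.org/rfc/rfc1951.txt . Node's zlib module can decompress it directly. The zlib module provides compression and decompression implemented with Gzip and Deflate/Inflate, and is loaded like this: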

const zlib = require('zlib');

The following code decompresses the Chunk Data into the filtered data:

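readIDAT(name, length) {
    if (name !== 'IDAT') throw new Error('IDAT ERROR !');
 
    let buffer = this.stream.read(length);
    // Decompress the chunk data to get the filtered image data.
    // This assumes the image has a single IDAT chunk; with several,
    // their data must be concatenated before decompressing.
    this.data = zlib.unzipSync(buffer);
    console.log("Unzip length", this.data.length);
 
    // Skip the CRC
    this.stream.read(4);
    return this.readHeadAndLength();
}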
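
Since a datastream may contain several consecutive IDAT chunks, and together they carry a single compressed stream, a more general reader would collect all IDAT payloads first and inflate them in one go. The sketch below shows that idea under the assumption that the payloads have already been extracted into an array; inflateIdat is a hypothetical helper, and the demo data is an arbitrary byte sequence rather than a real image.

```javascript
const zlib = require('zlib');

// Hypothetical helper: `idatPayloads` is an array of Buffers, one per IDAT
// chunk, in file order. Together they form one zlib stream.
function inflateIdat(idatPayloads) {
  return zlib.inflateSync(Buffer.concat(idatPayloads));
}

// Demo: compress some bytes, split the result at an arbitrary point
// (as an encoder may do across IDAT chunks), then inflate the pieces.
const raw = Buffer.from([1, 255, 0, 0, 255, 0, 255, 255, 0]);
const compressed = zlib.deflateSync(raw);
const pieces = [compressed.subarray(0, 5), compressed.subarray(5)];
console.log(inflateIdat(pieces).equals(raw)); // true
```

The split point does not have to respect any boundary inside the compressed stream, which is why the pieces cannot be inflated one by one.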

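For the image mentioned above, the Chunk Data of IDAT is 49 bytes before decompression and 105 bytes after. The decompressed data starts at the top-left corner of the image. For this image (non-interlaced, filter method 0, truecolour with alpha channel), the data is laid out as RGBA RGBA RGBA ..., and each row starts with a 1-byte Filter Type marker, which gives 5 * (1 + 5 * 4) = 105 bytes. The following code returns the Filter Type of a row: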
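
The arithmetic behind the 105 bytes can be written down as a small helper. This is only a sketch for non-interlaced images; scanlineLayout is a hypothetical name, and the field names follow the info object used in this post (bands is the number of channels per pixel).

```javascript
// Hypothetical helper: sizes of the inflated data for a non-interlaced image.
function scanlineLayout({ width, height, bitDepth, bands }) {
  const rowBytes = Math.ceil((width * bands * bitDepth) / 8); // pixel bytes per row
  const stride = rowBytes + 1;                                // plus 1 filter-type byte
  return { rowBytes, stride, total: stride * height };
}

console.log(scanlineLayout({ width: 5, height: 5, bitDepth: 8, bands: 4 }));
// { rowBytes: 20, stride: 21, total: 105 }
```

Row r therefore starts at offset r * stride, and the byte at that offset is the row's filter type.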

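/**
  * Get the filter type of a row.
  * Each row starts with a 1-byte filter type.
  * @param row  zero-based row index
  * @returns {*}
  */
getFilterType(row) {
    // Bytes per sample
    let offset = this.info.bitDepth / 8;
    // Each row holds width * offset * bands pixel bytes plus 1 filter-type byte
    let pointer = row * this.info.width * offset * this.info.bands + row;
    // Read the first byte of the row
    return this.readNum(this.data, pointer, 8);
}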

Below is the decompressed IDAT Chunk Data (the filtered channel values and the Filter Type of each row):

------Row0------
Filter type:1
[ 255, 0, 0, 255 ]
[ 0, 255, 255, 0 ]
[ 0, 1, 1, 0 ]
[ 0, 0, 0, 0 ]
[ 0, 0, 0, 0 ]
------Row1------
Filter type:2
[ 0, 0, 0, 0 ]
[ 0, 0, 0, 0 ]
[ 0, 0, 0, 0 ]
[ 0, 0, 0, 0 ]
[ 0, 0, 0, 0 ]
------Row2------
Filter type:4
[ 0, 255, 255, 0 ]
[ 0, 0, 0, 0 ]
[ 1, 0, 0, 0 ]
[ 0, 0, 0, 0 ]
[ 0, 0, 0, 0 ]
------Row3------
Filter type:1
[ 0, 0, 0, 255 ]
[ 252, 0, 0, 0 ]
[ 0, 0, 0, 0 ]
[ 0, 0, 0, 0 ]
[ 3, 255, 255, 1 ]
------Row4------
Filter type:4
[ 255, 255, 255, 0 ]
[ 0, 0, 0, 1 ]
[ 0, 0, 0, 0 ]
[ 0, 0, 0, 0 ]
[ 1, 1, 1, 255 ]

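Notice that the second row should be identical to the first, yet here it is all zeros. Its Filter Type is 2, the Up filter, which means each byte is stored as its difference from the corresponding byte in the row above. The benefit is that the data compresses better and takes less space.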
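
To see why a repeated row turns into zeros, here is a minimal sketch of the encoder side of the Up filter, Filt(x) = Orig(x) - Orig(b) modulo 256, where b is the byte directly above x. The function name filterUp is hypothetical, and the row values are the red and white pixels of the first row of the example image.

```javascript
// Encoder-side Up filter: subtract the byte directly above, modulo 256.
function filterUp(row, prevRow) {
  const out = Buffer.alloc(row.length);
  for (let i = 0; i < row.length; i++) {
    out[i] = (row[i] - prevRow[i]) & 0xff;
  }
  return out;
}

const red = [255, 0, 0, 255];
const white = [255, 255, 255, 255];
const row0 = Buffer.from([...red, ...white, ...red, ...red, ...red]);
const row1 = Buffer.from(row0); // identical to the row above

console.log(filterUp(row1, row0).every((v) => v === 0)); // true
```

A long run of zeros is exactly the kind of input DEFLATE compresses well.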

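Unfiltering IDAT data

For the details of PNG filtering, see the official specification: PNG Filtering
Once you know how PNG filtering works, you can recover the real image data. Filter method 0 defines five filter types: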

Type	Name
0	None
1	Sub
2	Up
3	Average
4	Paeth

Following the official specification, I wrote the method below to recover the data as it was before filtering:

/**
 * Reconstruct one byte of a row when filterMethod=0.
 * In this case every row has its own FilterType.
 * @param index          position of the byte in this.data
 * @param start          position of the row's filter-type byte in this.data
 * @param filterType     filter type of the row (0 to 4)
 * @param colByteLength  bytes per row in this.data, including the filter-type byte
 * @returns {*}
 */
reconForNoneFilter(index, start, filterType, colByteLength) {
    let pixelByteLength = this.info.bands * this.info.bitDepth / 8;
    // a: byte to the left, b: byte above, c: byte above-left.
    // Bytes outside the image count as 0. The spec defines a, b and c as
    // reconstructed bytes, so this assumes the caller writes each result
    // back into this.data before moving on.
    let hasLeft = index - start - 1 >= pixelByteLength;
    let hasUp = start >= colByteLength;
    let a = hasLeft ? this.data[index - pixelByteLength] : 0;
    let b = hasUp ? this.data[index - colByteLength] : 0;
    let c = hasLeft && hasUp ? this.data[index - pixelByteLength - colByteLength] : 0;
    let predictor = 0;
    switch (filterType) {
    case 0:
        //None
        predictor = 0;
        break;
    case 1:
        //Sub
        predictor = a;
        break;
    case 2:
        //Up
        predictor = b;
        break;
    case 3:
        //Average
        predictor = Math.floor((a + b) / 2);
        break;
    case 4:
        //Paeth
        {
            //PaethPredictor function
            let p = a + b - c;
            let pa = Math.abs(p - a),
            pb = Math.abs(p - b),
            pc = Math.abs(p - c);
            if (pa <= pb && pa <= pc) predictor = a;
            else if (pb <= pc) predictor = b;
            else predictor = c;
            break;
        }
    default:
        throw new Error('recon failed');
    }
    // Recon(x) = Filt(x) + predictor, modulo 256, for every filter type
    return (this.data[index] + predictor) & 0xff;
}
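
For comparison, here is a self-contained sketch of the same reconstruction that works on a whole Buffer and writes into a separate output, so the left, above and above-left neighbours are always reconstructed values. The names unfilter and paeth are hypothetical; the sample input is rows 0 and 1 (Sub and Up) of the filtered data listed earlier in this post.

```javascript
// Paeth predictor as defined in the PNG specification.
function paeth(a, b, c) {
  const p = a + b - c;
  const pa = Math.abs(p - a);
  const pb = Math.abs(p - b);
  const pc = Math.abs(p - c);
  if (pa <= pb && pa <= pc) return a;
  return pb <= pc ? b : c;
}

// Hypothetical helper: `data` is the inflated IDAT data of a non-interlaced
// image, `rowBytes` the pixel bytes per row, `bpp` the bytes per pixel (>= 1).
function unfilter(data, height, rowBytes, bpp) {
  const out = Buffer.alloc(height * rowBytes);
  for (let y = 0; y < height; y++) {
    const type = data[y * (rowBytes + 1)];
    const src = y * (rowBytes + 1) + 1; // first filtered byte of this row
    const dst = y * rowBytes;           // first reconstructed byte of this row
    for (let x = 0; x < rowBytes; x++) {
      const a = x >= bpp ? out[dst + x - bpp] : 0;                     // left
      const b = y > 0 ? out[dst + x - rowBytes] : 0;                   // above
      const c = x >= bpp && y > 0 ? out[dst + x - rowBytes - bpp] : 0; // above-left
      let predictor;
      switch (type) {
        case 0: predictor = 0; break;                 // None
        case 1: predictor = a; break;                 // Sub
        case 2: predictor = b; break;                 // Up
        case 3: predictor = (a + b) >> 1; break;      // Average
        case 4: predictor = paeth(a, b, c); break;    // Paeth
        default: throw new Error(`Unknown filter type ${type}`);
      }
      out[dst + x] = (data[src + x] + predictor) & 0xff;
    }
  }
  return out;
}

// Rows 0 (Sub) and 1 (Up) of the filtered listing, each led by its filter type.
const filtered = Buffer.from([
  1, 255, 0, 0, 255,  0, 255, 255, 0,  0, 1, 1, 0,  0, 0, 0, 0,  0, 0, 0, 0,
  2, 0, 0, 0, 0,  0, 0, 0, 0,  0, 0, 0, 0,  0, 0, 0, 0,  0, 0, 0, 0,
]);
const pixels = unfilter(filtered, 2, 20, 4);
// The first three pixels come out as (255,0,0,255), (255,255,255,255), (255,0,0,255).
console.log([...pixels.subarray(0, 12)]);
```

Writing into a separate output buffer also drops the filter-type bytes, so the result is a plain array of pixel bytes.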

The reconstructed data looks like this:

------Row0------
Filter type:1
[ 255, 0, 0, 255 ]
[ 255, 255, 255, 255 ]
[ 255, 0, 0, 255 ]
[ 255, 0, 0, 255 ]
[ 255, 0, 0, 255 ]
------Row1------
Filter type:2
[ 255, 0, 0, 255 ]
[ 255, 255, 255, 255 ]
[ 255, 0, 0, 255 ]
[ 255, 0, 0, 255 ]
[ 255, 0, 0, 255 ]
------Row2------
Filter type:4
[ 255, 0, 0, 255 ]
[ 255, 255, 255, 255 ]
[ 255, 0, 0, 255 ]
[ 255, 0, 0, 255 ]
[ 255, 0, 0, 255 ]
------Row3------
Filter type:1
[ 0, 0, 0, 255 ]
[ 252, 0, 0, 255 ]
[ 252, 0, 0, 255 ]
[ 252, 0, 0, 255 ]
[ 255, 255, 255, 0 ]
------Row4------
Filter type:4
[ 0, 0, 0, 255 ]
[ 252, 0, 0, 255 ]
[ 252, 0, 0, 255 ]
[ 252, 0, 0, 255 ]
[ 255, 255, 255, 0 ]

This now matches the image mentioned earlier exactly. ^_^

References
Analysing the PNG image structure (分析 PNG 图像结构)
W3C - Portable Network Graphics (PNG) Specification (Second Edition)

Code: https://git.oschina.net/liuyaqi/JSPNG/

Source: http://blog.csdn.net/liuyaqi1993/article/details/77531328
