Replacing new and delete in C++ is harder than you'd think

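Memory management in C++ is a topic that never gets old. When I was just starting out, I was already thinking about how to detect memory leaks (https://www.cnblogs.com/coding-my-life/p/3985164.html). Detecting leaks with a small piece of code that barely affects performance is not realistic. Back then I felt that replacing new and delete had little value, but I later did it anyway in my own project, https://github.com/changnet/MServer, by adding a counter. When the program exits, it checks whether the number of new calls matches the number of delete calls. If they don't match, something is wrong, and I turn to a tool such as valgrind to find it. This doesn't catch every case, but it has solved real problems. Fixing an issue right after you write a feature is much easier than tracking it down after thousands of features have shipped.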

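On Windows there is another option: the C Run-time Library (CRT) debug support and its _CrtDumpMemoryLeaks() function. It too only reports that a leak exists; to locate it you need another tool, Visual Leak Detector. While chasing a memory leak in my company's program recently, I found that there was no leak at all. The program called _CrtDumpMemoryLeaks() from an atexit handler, and memory owned by static variables may not be released until after the atexit callback has run. That made me realize that parts of my earlier new/delete replacement were wrong, so I'm writing it up again here.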

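new and delete are not functions. A new expression is compiled into three steps: 1. call operator new to allocate memory; 2. call the constructor; 3. return the pointer converted to the right type. The part you can replace is the operator new function.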
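
The three steps can be written out by hand. The following is a minimal sketch, not what any particular compiler emits: manual_new and manual_delete are hypothetical helper names, and over-aligned types are ignored.

```cpp
#include <cassert>
#include <new>      // ::operator new, ::operator delete, placement new
#include <string>
#include <utility>  // std::forward

// Hypothetical helper: roughly what the expression `new T(args...)` does.
template <typename T, typename... Args>
T *manual_new(Args &&...args)
{
    void *raw = ::operator new(sizeof(T)); // step 1: allocate raw memory
    assert(raw != nullptr);                // the throwing form never returns null
    try {
        // steps 2 and 3: run the constructor in place and get a typed pointer
        return ::new (raw) T(std::forward<Args>(args)...);
    } catch (...) {
        ::operator delete(raw); // a new expression also frees the memory if the constructor throws
        throw;
    }
}

// Hypothetical helper: roughly what the expression `delete p` does.
template <typename T>
void manual_delete(T *p)
{
    if (!p) return;       // deleting a null pointer does nothing
    p->~T();              // run the destructor
    ::operator delete(p); // release the raw memory
}
```

For example, manual_new<std::string>(3, 'x') yields a pointer to a string holding "xxx", and manual_delete releases it. Only the ::operator new and ::operator delete calls are replaceable; the constructor and destructor calls are generated by the compiler.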

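#include <cstdlib>
#include <iostream>
#include <new>
int g_counter = 0; // allocations that have not been freed yet
void *operator new(size_t size)
{
    void *ptr = ::malloc(size ? size : 1); // malloc(0) may return NULL
    if (!ptr) throw std::bad_alloc();      // operator new must not return NULL
    ++g_counter;
    std::cout << "new mem:" << g_counter << std::endl;
    return ptr;
}
void operator delete(void* ptr) noexcept
{
    if (!ptr) return; // deleting NULL is a no-op and must not be counted
    --g_counter;
    std::cout << "delete mem:" << g_counter << std::endl;
    ::free(ptr);
}
// not named on_exit, because glibc's <stdlib.h> already declares an on_exit()
void report_at_exit()
{
    std::cout << "exit,mem counter = " << g_counter << std::endl;
}
int main()
{
    atexit(report_at_exit);
    char *ptr = new char[8]; // never deleted on purpose, so the counter stays at 1
    return 0;
}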
$ g++ main.cpp
$ ./a.out
new mem:1
exit,mem counter = 1

The code above is a simple replacement of operator new and operator delete. At exit it reports that one allocation was never freed. But this code has many problems.

1. Replace every overload you can

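C++ has more operator new and operator delete functions than you might expect, and each new standard brings more: C++17 and C++20 both added some. If you don't replace all of them, the code still compiles, but the result may not be what you intended. In the code above, new char[8] should call operator new[]. Because operator new[] was not replaced, the default operator new[] from libstdc++ was called, and that default in turn called operator new. This is not necessarily a problem, but it becomes one in projects that treat allocation specially or do something unusual, for example a memory pool that replaces operator new[] but not operator delete[] when the two return memory to different places.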
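
As a rough sketch of what "replacing all of them" looks like, the block below routes the scalar, array, nothrow and sized forms through one pair of helpers. The names g_live, counted_alloc, counted_free and live_allocations are made up for this example. The C++17 aligned overloads are deliberately left out, and the counter is a plain long, so this version is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

static long g_live = 0; // hypothetical counter: allocations not yet freed

// Single allocation path shared by every replaced form.
static void *counted_alloc(std::size_t size) noexcept
{
    void *p = std::malloc(size ? size : 1); // malloc(0) may return NULL
    if (p) ++g_live;
    return p;
}

// Single release path shared by every replaced form.
static void counted_free(void *p) noexcept
{
    if (!p) return; // deleting a null pointer is a no-op
    assert(g_live > 0);
    --g_live;
    std::free(p);
}

// Throwing forms: must never return null.
void *operator new(std::size_t size)
{
    if (void *p = counted_alloc(size)) return p;
    throw std::bad_alloc();
}
void *operator new[](std::size_t size)
{
    if (void *p = counted_alloc(size)) return p;
    throw std::bad_alloc();
}
// Nothrow forms: return null on failure.
void *operator new(std::size_t size, const std::nothrow_t &) noexcept { return counted_alloc(size); }
void *operator new[](std::size_t size, const std::nothrow_t &) noexcept { return counted_alloc(size); }

void operator delete(void *p) noexcept { counted_free(p); }
void operator delete[](void *p) noexcept { counted_free(p); }
void operator delete(void *p, std::size_t) noexcept { counted_free(p); }   // sized forms (C++14)
void operator delete[](void *p, std::size_t) noexcept { counted_free(p); }
void operator delete(void *p, const std::nothrow_t &) noexcept { counted_free(p); }
void operator delete[](void *p, const std::nothrow_t &) noexcept { counted_free(p); }

long live_allocations() { return g_live; }
```

Because every form ends in the same two helpers, memory allocated through one form and released through its matching delete always goes back to the same place, which avoids the mismatched-pool situation described above.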

2. Counting memory in an atexit handler is not accurate

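An atexit handler runs when the program exits. That is fine for most variables, but not necessarily for static and global ones. According to the C++ standard (section 3.6.3, Termination, of https://isocpp.org/files/papers/N3690.pdf), a variable whose construction finished before the atexit registration is destroyed after the atexit handler runs. So any static or global variable that allocates with new must be constructed after the call to atexit, or the handler will count its memory as leaked. That leads to another C++ problem, the static initialization order fiasco (https://isocpp.org/wiki/faq/ctors). There are several ways to deal with it, such as putting every static and global variable in a single .cpp file, or releasing the memory by hand before the program exits. In addition, as far as I have seen with gcc, globals in the object file listed last at link time are initialized first, and gcc's __attribute__ ((init_priority (N))) can set the initialization priority explicitly. Neither is standard behavior, but both are worth knowing about.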
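
The ordering rule can be observed with a small sketch. Tracked, g_alive, report and register_then_construct are hypothetical names chosen for this illustration. One object is constructed before atexit is called and one after.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

static int g_alive = 0; // hypothetical counter of live Tracked objects

struct Tracked {
    const char *name;
    explicit Tracked(const char *n) : name(n) { ++g_alive; }
    ~Tracked() { --g_alive; std::printf("destroy %s\n", name); }
};

// Constructed during startup, so before any atexit() call made from main.
static Tracked g_early("early global");

// The atexit handler: runs after objects constructed later than the
// registration are destroyed, and before objects constructed earlier are.
static void report()
{
    std::printf("atexit report: %d object(s) still alive\n", g_alive);
}

// Registers the handler first, then constructs a function-local static.
int register_then_construct()
{
    int rc = std::atexit(report);       // 0 on success
    static Tracked late("late static"); // constructed after the registration
    assert(g_alive == 2);               // both objects exist at this point
    return rc;
}
```

If main calls register_then_construct() and returns, the exit sequence is: "destroy late static", then "atexit report: 1 object(s) still alive", then "destroy early global". The handler sees the early global as still alive even though nothing has leaked.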

3. Thread safety

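The code above takes no lock, so it cannot be used from multiple threads. Hardly any program today is single-threaded, so a lock has to be added. The locking code is simple.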

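#include <cassert>
#include <cstdlib>
#include <iostream>
#include <pthread.h>
static pthread_mutex_t *counter_mutex()
{
    static pthread_mutex_t _mutex;
    // keep the call outside assert(), or NDEBUG builds would skip the init
    int rc = pthread_mutex_init( &_mutex,NULL );
    assert( 0 == rc ); (void)rc;
    return &_mutex;
}
static pthread_mutex_t *_mem_mutex_ = counter_mutex();
int g_counter  = 0;
void *operator new(size_t size)
{
    assert( _mem_mutex_ );
    pthread_mutex_lock( _mem_mutex_ );
    int current = ++g_counter; // read the value while the lock is held
    pthread_mutex_unlock( _mem_mutex_ );
    std::cout << "new mem:" << current << std::endl;
    return ::malloc(size);
}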
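
For a plain counter, an alternative to the mutex is std::atomic. This is a different technique from the one above, shown only as a sketch under the assumption that the counter is the only shared state. outstanding is a hypothetical accessor name.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Constant-initialized, so it is ready before any dynamic initialization runs.
static std::atomic<int> g_counter{0};

void *operator new(std::size_t size)
{
    void *p = std::malloc(size ? size : 1); // malloc(0) may return NULL
    if (!p) throw std::bad_alloc();
    g_counter.fetch_add(1, std::memory_order_relaxed); // atomic increment, no lock
    return p;
}

void operator delete(void *p) noexcept
{
    if (!p) return; // deleting a null pointer is a no-op
    int previous = g_counter.fetch_sub(1, std::memory_order_relaxed);
    assert(previous > 0); // more frees than allocations would be a bug
    (void)previous;       // silence the unused warning in NDEBUG builds
    std::free(p);
}

// Hypothetical accessor: allocations that have not been freed yet.
int outstanding() { return g_counter.load(std::memory_order_relaxed); }
```

An atomic only protects the counter itself. Anything more elaborate, such as a table of live pointers, would still need a lock.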

4. Thread safety brings an initialization problem

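The static initialization order fiasco (https://isocpp.org/wiki/faq/ctors), mentioned above when discussing why atexit counting is inaccurate, gets worse here. Thread safety relies on a static pthread_mutex_t pointer, so if another global variable calls new during its own construction, that pointer may not have been initialized yet. If you have already dealt with initialization order as described above, the problem does not arise. Otherwise, since the C++ standard requires static initialization to finish before any dynamic initialization starts, we can write: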

/* Static initialization */
static pthread_mutex_t *_mem_mutex_ = NULL;
class global_static
{
public:
    global_static()
    {
        assert( !_mem_mutex_ );
        _mem_mutex_ = counter_mutex();
    }
    ~global_static() {_mem_mutex_ = NULL;}
};
/* Dynamic initialization */
const static global_static gs;

This does not solve the problem, but because operator new asserts that _mem_mutex_ is not NULL, at least the problem gets detected.

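Since this is C++, the Construct On First Use Idiom is also available: check whether _mem_mutex_ has been initialized each time it is used, and initialize it if not, instead of initializing it once globally as above and never checking again.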
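
A minimal sketch of that idiom, assuming a C++11 compiler (which makes the initialization of function-local statics thread-safe). mem_mutex and outstanding are hypothetical names. The "is it initialized yet" check is the one the compiler generates for the local static.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <pthread.h>

// Construct On First Use: the mutex is set up on the first call, whichever
// global constructor or thread happens to make it.
static pthread_mutex_t *mem_mutex()
{
    static pthread_mutex_t *mutex = [] {
        static pthread_mutex_t storage;
        int rc = pthread_mutex_init(&storage, nullptr);
        assert(rc == 0);
        (void)rc; // silence the unused warning in NDEBUG builds
        return &storage;
    }();
    return mutex; // never destroyed, so still usable during static destruction
}

static int g_counter = 0; // protected by mem_mutex()

void *operator new(std::size_t size)
{
    void *p = std::malloc(size ? size : 1); // malloc(0) may return NULL
    if (!p) throw std::bad_alloc();
    pthread_mutex_lock(mem_mutex());
    ++g_counter;
    pthread_mutex_unlock(mem_mutex());
    return p;
}

void operator delete(void *p) noexcept
{
    if (!p) return; // deleting a null pointer is a no-op
    pthread_mutex_lock(mem_mutex());
    --g_counter;
    pthread_mutex_unlock(mem_mutex());
    std::free(p);
}

// Hypothetical accessor: allocations that have not been freed yet.
int outstanding()
{
    pthread_mutex_lock(mem_mutex());
    int value = g_counter;
    pthread_mutex_unlock(mem_mutex());
    return value;
}
```

The trade-off is a guard check on every call to mem_mutex(), in exchange for no longer depending on the order in which globals are initialized.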

5. Are new and delete in external code such as the STL, Boost, .so and .a files replaced as well?

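The STL and Boost are mostly templates, that is, source code compiled just like your own project's code, so they use the replacement too. A .so shared library is separate from the program. When your program loads the .so, the dynamic linker looks in your program first for the symbols the library needs, and uses the ones it finds there. This is the same mechanism as LD_PRELOAD, so the replacement applies here as well. For a .a static library, gcc resolves symbols at link time in the order the libraries are passed, and symbols in your project generally take precedence over those in standard libraries such as libstdc++, so the replacement applies again.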

To understand all this you need to know how gcc compiles and links a program, and in particular how symbols are handled. See https://akkadia.org/drepper/dsohowto.pdf

6. Use nm to check whether the replacement took effect

xzc@xzc-HP-ProBook-4446s:~/Documents/code/test$ nm -C a.out | grep new
0000000000400f60 T test_static_new()
                 U operator new[](unsigned long)@@GLIBCXX_3.4
0000000000400d34 T operator new(unsigned long)
0000000000401120 r operator new(unsigned long)::__PRETTY_FUNCTION__

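T means text: the symbol is defined in your program, so you have replaced it. U means undefined: the symbol is not replaced, and at run time the program has to look it up in a library.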

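Most people replace operator new and operator delete either to detect memory leaks or to implement their own memory management. For leak detection built on replacing operator new, see http://wyw.dcweb.cn/leakage.htm; the project that is still maintained is https://github.com/adah1972/nvwa. I haven't used it, but the logic looks sound. As for replacing operator new inside application code to implement memory management, I have not seen it done. A general-purpose allocator is hard to write, and once written it is a library in its own right, like jemalloc.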

Source: https://www.cnblogs.com/coding-my-life/p/10125538.html
