STL源码剖析笔记

本文主要探讨与STL实现相关，参考侯捷的《STL源码剖析》和博客，每一个知识点讨论力求简洁，便于记忆，但讨论深度有限，如要深入研究可点击参考链接。

第一章 STL概述(六大组件)

容器(container)：常用数据结构，大致分为两类，序列容器，如vector，list，deque，关联容器，如set，map。在实现上，是类模板(class template)
迭代器(iterator)：一套访问容器的接口，行为类似于“泛型指针”。它为不同算法提供的相对统一的容器访问方式，使得设计算法时无需关注过多关注数据。（“算法”指广义的算法，操作数据的逻辑代码都可认为是算法）
算法(algorithm)：提供一套常用的算法，如sort，search，copy，erase … 在实现上，可以认为是一种函数模板(function template)，只有算法是函数模版其他都是类模版。
配置器(allocator)：为容器提供空间配置和释放，对象构造和析构的服务，也是一个class template。
仿函数(functor)：重载了operator()的class，使class当函数使用。
配接器(adapter)：将一种容器修饰为功能不同的另一种容器，如以容器vector为基础，在其上实现stack，stack的行为也是一种容器。这就是一种配接器。除此之外，还有迭代器配接器和仿函数配接器。

第二章空间配置器

1、allocator简述

#include <vector>
int main()
{
    std::vector<int> v;
    // Do something;
    return 0;
}

allocator隐藏在所有容器（包括vector）身后，完成1.内存配置与释放；2.对象构造和析构的工作。

2、allocator接口规范

STL是一个标准，只对接口进行规范，接口背后的实现可以有不同版本，所以目前流行的STL如Visual C++采用的P. J. Plauger 版本，GCC编译器采用的SGI STL版本等都是常见的STL。SGI STL是最流行的版本，我们主要关注SGI STL的allocator实现。

STL标准要求的allocator必要接口：

// 以下几种自定义类型是一种type_traits技巧，暂时不需要了解
allocator::value_type
allocator::pointer
allocator::const_pointer
allocator::reference
allocator::const_reference
allocator::size_type
allocator::difference

// 一个嵌套的(nested)class template，class rebind<U>拥有唯一成员other，那是一个typedef，代表allocator<U>
allocator::rebind

allocator::allocator() // 默认构造函数
allocator::allocator(const allocator&) // 拷贝构造函数
template <class U>allocator::allocator(const allocator<U>&) // 泛化的拷贝构造函数
allocator::~allocator() // 析构函数

// 返回某个对象的地址，a.address(x)等同于&x
pointer allocator::address(reference x) const
// 返回某个const对象的地址，a.address(x)等同于&x
const_pointer allocator::address(const_reference x) const

// 配置空间，足以存储n个T对象。第二个参数是个提示。实现上可能会利用它来增进区域性(locality)，或完全忽略之
pointer allocator::allocate(size_type n, const void* = 0)
// 释放先前配置的空间
void allocator::deallocate(pointer p, size_type n)

// 返回可成功配置的最大量
size_type allocator:maxsize() const

// 调用对象的构造函数，等同于 new((void*)p) T(x)
void allocator::construct(pointer p, const T& x)
// 调用对象的析构函数，等同于 p->~T()
void allocator::destroy(pointer p)

上面的标准接口实际只需要关注最后的allocate，deallocate，construct和destroy函数的实现即可。整个allocator最重要的函数就是这四个。从功能上，这四者可以简单理解为C++的::operator new和::operator delete，构造函数和析构函数，实际上，一个最简单的allocator就可以理解为对new，delete的简单封装,以及对构造函数和析构函数的直接调用。

书中给出的一个最简单的allocator实现，符合STL标准

#ifndef _MYALLOC_
#define _MYALLOC_

#include <new>
#include <cstddef>
#include <cstdlib>
#include <climits>
#include <iostream>

namespace my_alloc
{
    // allocate的实际实现，简单封装new，当无法获得内存时，报错并退出
    template <class T>
    inline T* _allocate(ptrdiff_t size, T*) {
        set_new_handler(0);
        T* tmp = (T*)(::operator new((size_t)(size * sizeof(T))));
        if (tmp == 0) {
            cerr << "out of memory" << endl;
            exit(1);
        }
        return tmp;
    }

    // deallocate的实际实现，简单封装delete
    template <class T>
    inline void _deallocate(T* buffer) { ::operator delete(buffer); }

    // construct的实际实现，直接调用对象的构造函数
    template <class T1, class T2>
    inline void _construct(T1* p, const T2& value) { new(p) T1(value); }

    // destroy的实际实现，直接调用对象的析构函数
    template <class T>
    inline void _destroy(T* ptr) { ptr->~T(); }

    template <class T>
    class allocator {
    public:
        typedef T           value_type;
        typedef T*          pointer;
        typedef const T*    const_pointer;
        typedef T&          reference;
        typedef const T&    const_reference;
        typedef size_t      size_type;
        typedef ptrdiff_t   difference_type;

        // 构造函数
        allocator(){ return; }
        template <class U>
        allocator(const allocator<U>& c){}

        // rebind allocator of type U
        template <class U>
        struct rebind { typedef allocator<U> other; };

        // allocate，deallocate，construct和destroy函数均调用上面的实际实现
        // hint used for locality. ref.[Austern],p189
        pointer allocate(size_type n, const void* hint = 0) {
            return _allocate((difference_type)n, (pointer)0);
        }
        void deallocate(pointer p, size_type n) { _deallocate(p); }
        void construct(pointer p, const T& value) { _construct(p, value); }
        void destroy(pointer p) { _destroy(p); }

        pointer address(reference x) { return (pointer)&x; }
        const_pointer const_address(const_reference x) { return (const_pointer)&x; }

        size_type max_size() const { return size_type(UINT_MAX / sizeof(T)); }   
    };
} // end of namespace myalloc

#endif // _MYALLOC_

现在，我们可以使用自己编写的allocator来为vector分配空间：

#include <vector>
int main()
{
    std::vector<int, my_alloc::allocator<int> > v;
    // Do something;
    return 0;
}

然而，这样并不能通过编译（如果你使用的是GCC编译器），因为SGI STL的allocator并不完全符合STL规范，我们编写的符合规范的自然不能搭配使用，SGI STL版本的allocator具体实现在下文阐述。

3、SGI STL 的两种allocator

SGI STL 实现了两个allocator：一个是标准的std::allocator，标准就是对new和construtc的简单包装，为了兼容老版本而存在；另一个是特殊的std::alloc，分为两级配置，小于128kb直接向1级申请，2级维护空闲链表方便小内存的申请，不同于之前标准的allocator，这里不需要实例化一个空间配置器，类内声明了静态全局函数，只需要调用alloc的静态函数就行了。

参考：https://zhuanlan.zhihu.com/p/34725232

二级配置器细节：https://blog.csdn.net/weixin_40673608/article/details/86703934

第三章：迭代器

1、allocator简述

迭代器是一种接口定义，作用是不暴露所指对象信息的前提下能够遍历所指数据结构的能力。实现方法是将原生指针用类进行一层包装(类似智能指针)，对一些操作符(最重要的重载*,->)进行重载，为了不暴露所指对象的信息，迭代器的实现由各个容器的设计者来实现，每种容器在内部以嵌套的方式定义专属的迭代器，接口相同，内部实现却不同，体现了泛型编程的概念。

template<typename T>  
class Iterator  
{  
public:  
    Iterator& operator++();  

    //...  

private:   
    T *m_ptr;  
};

对于不同的数据容器，以上Iterator类中的成员函数operator++的实现会各不相同，例如，对于数组的可能实现如下：

//对于数组的实现  
template<typename T>  
Iterator& operator++()  
{   
   ++m_ptr;   
   retrun *this;  
}

对于链表，它会有一个类似于next的成员函数用于获取下一个结点，其可能实现如下：

//对于链表的实现  
template<typename T>  
Iterator& operator++()  
{  
   m_ptr = m_ptr->next();//next()用于获取链表的下一个节点   
   return *this;  
}

在STL源码中，operator*返回对象的引用，operator->返回对象的指针；此处的对象即对应容器中的元素。

有关*和->运算符的解释：https://blog.csdn.net/pngynghay/article/details/42679757

2、traits编程技法

traits是模板编程里面的一个编程技法。可能因为不是面向对象的，所以算不上一种设计模式。虽然traits本身一般实现为模板（itrator_traites，以及__type_traits等），但和智能指针（auto_ptr）这种比较大众的模板相比，traits的用法又有点宽泛，所以感觉说traits是一种编程技法是比较保险的。

为什么需要traits？

下面是一个以迭代器为模板形参的函数模板：

template<typename Iterator>
void func(Iterator iter)
{
    //函数体
}

假如现在算法中需要声明一个变量，而变量的类型是迭代器所指对象的类型，应该怎么处理呢？

template<typename Iterator>
void func(Iterator iter)
{
    *Iterator var;//这样定义变量可以吗？
}

上面的代码是不可以通过编译的，虽然C++支持sizeof()，但是并不支持typeof()，就算是用到RTTI性质中的typeid()，获取到的也仅仅是类型的名字，因此不能直接用来声明变量。此时可以利用函数模板的参数类型推导机制解决问题，例如：

template<typename Iterator, typename T>
void func_impl(Iterator iter, T t)
{
    T temp;//这里就解决了问题
    //这里做原本func()的工作
}
 
template<typename Iterator>
void func(Iterator iter)
{
    func_impl(iter, *iter);//func的工作全部都移到func_impl里面了
}
 
int main(int argc, const char *argv[])
{
    int i;
    func(&i);
}

函数func作为对外接口，实际的操作却由函数func_impl执行，通过函数func_impl的参数类型推导，获取到Iterator指向对象的类型T，从而解决了问题。

在c++11中有了auto和decltype类型推导功能. 就不用用这么多间接层去实现其实很简单的需求了.直接像下面这样写就可以了.

template&lt;typename Iterator&gt;
auto func(Iterator iter)
    -&gt; decltype(*iter)
{
    return *iter;
}

traits是“萃取”还是“特性”的意思？侯捷说是“特性萃取机”和“榨汁机”。我的理解是，通过偏特化的机制，把特性信息过滤出来——榨汁。

3、迭代器最常用到的五种类型

value_type、difference_type、pointer、reference、iterator_category，任何开发者如果想将自己开发的容器与STL结合在一起，就一定要为自己开发的容器的迭代器定义这五种类型，这样都可以通过统一接口iterator_traits萃取出相应的类型，下面列出STL中iterator_traits的完整定义

tempalte<typename I>
struct iterator_traits
{
    typedef typename I::iterator_category iterator_category;
    typedef typename I::value_type value_type;
    typedef typeanme I:difference_type difference_type;
    typedef typename I::pointer pointer;
    typedef typename I::reference reference;
};

更详细的traits：https://blog.csdn.net/shudou/article/details/10270971

第一章 STL概述(六大组件)

第二章 空间配置器

1、allocator简述

2、allocator接口规范

3、SGI STL 的 两种allocator

第三章：迭代器

1、allocator简述

2、traits编程技法

3、迭代器最常用到的五种类型

第四章：序列容器

第二章空间配置器

3、SGI STL 的两种allocator