当前位置：

首页
/
IT
/
程序
/
Java
/
JAVA 基础第四章 - 集合框架 Collection 篇

JAVA 基础第四章 - 集合框架 Collection 篇

业内经常说的一句话是不要重复造轮子, 但是有时候, 只有自己造一个轮子了, 才会深刻明白什么样的轮子适合山路, 什么样的轮子适合平地!

往期章节:

JAVA 基础第一章 - 初识 java

JAVA 基础第二章 - java 三大特性: 封装, 继承, 多态

JAVA 基础第三章 - 类与对象, 抽象类, 接口

说起集合框架, 很多面试官在面试初级 javaer 的时候也是很喜欢问的一个知识点

我们先上一张图看看

从上面的关系图中, 我们可以看到从上往下分呢~ 最上面的是接口, 中间是抽象类, 最下面就是各个具体的实现类, 这个在我们上一章节中说到的抽象类与接口之间的关系的时候有提到过.

再从左往右看呢, 也是大致分三块, Iterator,Collection,Map 三个最顶级的接口, 但是最主要的部分还是 Collection,Map 2 个接口, 而 Iterator 更多的像是附加产品.

Collection

我们先看看 Collection 接口, 从这个接口往下, 他的子接口有 List,Set,Queue.

List

List 的具体实现类有 ArrayList,LinkedList,Vector.

ArrayList

顾名思义, 是一个 "数组型" 的集合, 对于数组, 我们应该知道的是, 数组在定义的时候就确定了大小, 不可更改. 那优点就是数组的元素是通过索引访问的, 效率较高, 因为数组是一块连续的存储空间.

所以呢, ArrayList 就是在数组的基础上, 增加了可以改变大小的接口 (方法), 如 add ,remove 等方法, 方便我们去操作修改当前集合中的数据元素, 当集合中新添加的数据超过了当前的存储空间大小时

会申请一个新的存储空间, 然后将这些已有的数据拷贝过去, 再添加新的数据 , 扩容后的集合的大小等于扩容前集合的 1.5 倍.

当我们删除一个元素的时候, 会将当前被删除元素之后的元素统一向前移动一位, 然后将最后的一位元素置为 null, 以便于 gc 回收.

所以, 如果我们对一个 ArrayList 有频繁的增删操作, 这样对性能是一个极大的损耗.

ArrayList 的数据存储结构示意图如下:

假设上图中每一个黄色的格子都代表一个 ArrayList 的存储空间

步骤 1: 在我们第一次调用 add 方法增加一个元素 1 的时候, 那么 list 会直接扩容为默认的大小 10, 我们也可以在调用 ArrayList 构造函数的时候传入参数, 指定初始化空间大小;

步骤 2: 我们再继续添加数据, 直到添加到 11 时, 会判断当前的存储空间放不下要增加的数据了, 这个时候会继续扩容, 之后再放入数据 11;

步骤 3: 在这一步, 我们决定删除数据 2,2 的下标为 1(数组的下标都是从 0 开始), 也就是调用 remove 方法;

注意: 当我们调用 size 方法获取到的是实际的存储的数据大小, 而不是整个 ArrayList 获得的存储空间大小, 例如 , 步骤 2 中调用 size 方法返回的会是 11, 而不是 15.

LinkedList

从这个名字上, 我们也可以大概知道, link 是关联的意思. LinkedList 和 ArrayList 不同的一点是, 他实现了 Deque 接口这是一个双向链表的接口.

我们先看下存储结构示意图:

如上图中所示, 每一个节点都是一个 Node 对象, 其中每个 Node 都有三个属性, item 实际存储的数据元素, 如上图中的绿色格子, next 和 prev, 这样就构成了一个链表结构.

而要注意的是 next 和 prev 也是一个 Node 对象, 而 Node 是 LinkedList 中的静态内部类. 如下图中代码所示:

在这个链表中还存在 2 个属性 first 和 last, 分别用于存放整个集合链表中的头和尾, 如果只有一个元素, 那么 first 和 last 就指向同一个元素.

数据的添加

当我们在链表中添加一个元素的时候, 最后一个元素的 null 位置会设置引用新的 Node 节点, 然后新添加的节点的 prev 会指向前一个元素的位置

我们从 LinkedList 源码中做一些简单的分析

/**
      * Appends the specified element to the end of this list.
      *
      * <p>This method is equivalent to {@link #addLast}.
      *
      * @param e element to be appended to this list
      * @return {@code true} (as specified by {@link Collection#add})
      */
     public boolean add(E e) {
         linkLast(e);
         return true;
     }

如上所示, 从 "Appends the specified element to the end of this list." 这句注释中, 我们就大致可以明白其意, 当我们调用 add 方法增加元素的时候, 默认是在末尾追加数据.

这个时候 add 方法中会调用 linkLast 方法, 具体代码如下:

/**
      * Links e as last element.
      */
     void linkLast(E e) {
         final Node<E> l = last;
         final Node<E> newNode = new Node<>(l, e, null);
         last = newNode;
         if (l == null)
             first = newNode;
         else
             l.next = newNode;
         size++;
         modCount++;
     }

上述代码中, 首先会将当前的 last 赋给 l, 然后新建一个 Node 对象, 传入新添加的数据, 以及将当前集合中的 last 赋值给新添加节点的 prev 属性.

然后将新建的对象赋值给 last, 之后再判断最开始的 last, 也就是当前的 l 是否为 null, 如果是 null, 也就代表集合是空的, 这是第一个元素, 那么就把它赋给 frist, 否则, 那么就说明已经有元素存在了, 让上一个元素的 next 指向当前新建的对象.

最后再进行调整大小等操作.

数据添加的操作示意图如下:

数据的删除

下面我们再来看一下, LinkedList 的删除操作, 也就是我们默认调用 remove 方法.

源码如下所示:

/**
      * Retrieves and removes the head (first element) of this list.
      *
      * @return the head of this list
      * @throws NoSuchElementException if this list is empty
      * @since 1.5
      */
     public E remove() {
         return removeFirst();
     }

同样, 从 "Retrieves and removes the head (first element) of this list." 注释中, 我们大致可以明白, 大意是检索并移除 list 头部的元素.

在这个方法中直接调用了 removeFirst 方法, 下面我们看一下 removeFirst 代码:

/**
      * Removes and returns the first element from this list.
      *
      * @return the first element from this list
      * @throws NoSuchElementException if this list is empty
      */
     public E removeFirst() {
         final Node<E> f = first;
         if (f == null)
             throw new NoSuchElementException();
         return unlinkFirst(f);
     }

如上所示, 在这个代码中, 直接判断是不是存在 first, 也就是集合是不是空的, 不是那就继续调用 unlinkFirst 方法,

unlinkFirst 代码如下所示:

/**
      * Unlinks non-null first node f.
      */
     private E unlinkFirst(Node<E> f) {
         // assert f == first && f != null;
         final E element = f.item;
         final Node<E> next = f.next;
         f.item = null;
         f.next = null; // help GC
         first = next;
         if (next == null)
             last = null;
         else
             next.prev = null;
         size--;
         modCount++;
         return element;
     }

如上所示, 从 "Unlinks non-null first node f" 注释中, 可知, 解开非空的第一个节点的关联. 首先将 first 节点的 f.item 以及 f.next 设置为 null, 以便于 gc 回收. 在将 f.next 置为 null 之前赋值给了临时的 next.

然后判断 next 是否为 null, 如果是, 则说明后面没有元素了, 这是集合中的唯一一个元素, 将 last 也设置为 null; 否则, 将 next 中的 prev 设置为 null.

数据删除操作示意图如下:

所以呢, 当我们对 linkedList 进行增删操作的时候只需要对 2 个节点进行修改, 而对其他节点没有任何影响.

Vector

这个类和 ArrayList 基本相似, 不同的点在于他是线程安全的, 也就是在同一个时刻, 只能有一个线程访问 Vector; 另外 Vector 扩容不同于 ArrayList, 他每次扩容默认都是按 2 倍, 而 ArrayList 是 1.5 倍.

ArrayList,LinkedList, Vector 三者之间的异同

ArrayList 与 LinkedList 相比查询速度快, 增删速度慢.

所以如果只是查询, 建议用前者, 反之建议用后者, 因为后者再增删的时候, 只需要修改 2 个节点的 prev 和 next , 而不存在复制当前已有的元素到新的存储空间.

Vecor 和 ArrayList 基本相似, 区别是前者是线程安全的, 后者不是. 但是 2 个底层实现都是数组, LinkedList 底层实现是链表.

集合的遍历

经常用到有三种方式, 代码示意如下:

1    /* 第一种遍历方式
 2         for 循环的遍历方式
 3    */
 4         for (int i = 0; i <lists.size(); i++) {
 5             System.out.print(lists.get(i));
 6         }
 7  8
 9         /* 第二种遍历方式
10           foreach 的遍历方式
11         */
12         for (Integer list : lists) {
13             System.out.print(list);
14         }
15 16
17         /* 第三种遍历方式
18          Iterator 的遍历方式
19       */
20         for (Iterator<Integer> list = lists.iterator(); list.hasNext();) {
21             System.out.print(list.next());
22         }

for 循环效率高于 Iterator 循环, 高于 foreach 循环, 因为我们都知道他们的底层实现都是数组, 而 for 循环是通过下标查询是最适合的遍历方式; 而 foreach 循环是在 Iterator 基础上进行的, 所以最慢.

另外, 迭代器遍历方式, 适用于连续内存存储方式, 比如数组, ArrayList,Vector. 缺点是只能从头开始遍历, 优点是可以一边遍历一边删除.

for 循环这种方式遍历比较灵活, 可以指定位置开始遍历. 性能最高, 但有一个缺点就是遍历过程中不允许删除元素, 否则会抛 ConcurrentModificationException.

(注: 但是曾经发现在删除倒数第 2 个元素的时候, 并不会抛出异常, 详见 记一次 list 循环删除元素的突发事件!)
Set

Set 不允许包含相同的元素, 如果试图把两个值相同元素加入同一个集合中, add 方法返回 false.

Set 判断两个对象相同不是使用 == 运算符, 而是根据 equals 方法. 也就是说, 只要两个对象用 equals 方法比较返回 true,Set 就不会再存储第二个元素.

set 的实现类有 HashSet,TreeSet,LinkedHashSet

HashSet

不能保证元素的排列顺序; 不是线程安全的; 集合元素可以是 null, 但只能放入一个 null, 其他相同数据也只能有一份存在;

对于 HashSet 我们要知道的是, 他是依靠 HashMap 中的 key 去维护存放的数据, 所以 HashSet 的这些特性都是和 HashMap 的 key 相关的.

hashSet 默认构造函数代码如下:

/**
      * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
      * default initial capacity (16) and load factor (0.75).
      */
     public HashSet() {
         map = new HashMap<>();
     }

如上代码所示, 他是调用了 HashMap 的默认构造函数.

LinkedHashSet

LinkedHashSet 和 HashSet 的区别是前者是有序的, 也就是当你插入数据的时候会按顺序排放, 这样我们遍历时就可以按照之前插入的顺序获取数据.

和 HashSet 相似也是在构造函数中调用了 LinkedHashMap 构造方法, 代码如下所示:

/**
      * Constructs a new, empty linked hash set with the default initial
      * capacity (16) and load factor (0.75).
      */
     public LinkedHashSet() {
         super(16, .75f, true);
     }

因为 LinkedHashSet 继承了 HashSet, 所以他调用 super, 就是调用的 HashSet 的构造器, 在 HashSet 中再调用了 LinkedHashMap, 代码如下所示:

/**
      * Constructs a new, empty linked hash set.  (This package private
      * constructor is only used by LinkedHashSet.) The backing
      * HashMap instance is a LinkedHashMap with the specified initial
      * capacity and the specified load factor.
      *
      * @param      initialCapacity   the initial capacity of the hash map
      * @param      loadFactor        the load factor of the hash map
      * @param      dummy             ignored (distinguishes this
      *             constructor from other int, float constructor.)
      * @throws     IllegalArgumentException if the initial capacity is Less
      *             than zero, or if the load factor is nonpositive
      */
     HashSet(int initialCapacity, float loadFactor, boolean dummy) {
         map = new LinkedHashMap<>(initialCapacity, loadFactor);
     }
TreeSet

TreeSet 是 SortedSet 接口的唯一实现类, TreeSet 可以确保集合元素处于排序状态.

TreeSet 支持两种排序方式, 自然排序和定制排序, 其中自然排序为默认的排序方式. 向 TreeSet 中加入的应该是同一个类的对象.

自然排序不用多说, 那定制排序的意思就是, 我们可以自己通过实现 Comparator 接口覆写其中比较方法, 然后按照我们意愿进行排序, 比如自然排序是升序, 我们通过覆写这个排序方法, 可以修改成降序.

TreeSet 带比较器的构造函数代码如下:

/**
      * Constructs a new, empty tree set, sorted according to the specified
      * comparator.  All elements inserted into the set must be <i>mutually
      * comparable</i> by the specified comparator: {@code comparator.compare(e1,
      * e2)} must not throw a {@code ClassCastException} for any elements
      * {@code e1} and {@code e2} in the set.  If the user attempts to add
      * an element to the set that violates this constraint, the
      * {@code add} call will throw a {@code ClassCastException}.
      *
      * @param comparator the comparator that will be used to order this set.
      *        If {@code null}, the {@linkplain Comparable natural
      *        ordering} of the elements will be used.
      */
     public TreeSet(Comparator<? super E> comparator) {
         this(new TreeMap<>(comparator));
     }

TreeSet 默认构造函数代码如下:

/**
      * Constructs a new, empty tree set, sorted according to the
      * natural ordering of its elements.  All elements inserted into
      * the set must implement the {@link Comparable} interface.
      * Furthermore, all such elements must be <i>mutually
      * comparable</i>: {@code e1.compareTo(e2)} must not throw a
      * {@code ClassCastException} for any elements {@code e1} and
      * {@code e2} in the set.  If the user attempts to add an element
      * to the set that violates this constraint (for example, the user
      * attempts to add a string element to a set whose elements are
      * integers), the {@code add} call will throw a
      * {@code ClassCastException}.
      */
     public TreeSet() {
         this(new TreeMap<E,Object>());
     }

从上面的源码注释中, 我们大致可以明白, 其意是构造一个新的空的树形 set, 排序按照元素的自然顺序排序, 所有要插入到 set 中的元素必须实现 Comparable 接口, 同时, 这些元素还必须是互相可以比较的.

如果使用者尝试添加一个 string 类型的数据到 integer 类型的 set 中, 那么会抛出 ClassCastException 异常.

关于 Map 接口, 我们将在下一章节中做一个详细的分析

文中若有不正之处, 欢迎批评指正!

来源: https://www.cnblogs.com/JJJ1990/p/10109521.html

与本文相关文章

暂无,快来抢沙发吧！