How Java HashMap Works

This article is reposted from: http://www.importnew.com/10620.html

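Interview questions such as "How does HashMap work in Java?" or "How do get and put work internally in HashMap?" come up often. This article explains the internals of HashMap with a simple example. We start from an example rather than from theory alone, because that makes the mechanism easier to understand, and then look at how get and put actually work.

Let's look at a very simple example. We have a Country class; we will use Country objects as keys and the name of each country's capital (a String) as the value. The example below helps show how key-value pairs are stored in a HashMap.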

1. Country.java
package org.arpit.javapostsforlearning;
public class Country {
 String name;
 long population;
 public Country(String name, long population) {
  super();
  this.name = name;
  this.population = population;
 }
 public String getName() {
  return name;
 }
 public void setName(String name) {
  this.name = name;
 }
 public long getPopulation() {
  return population;
 }
 public void setPopulation(long population) {
  this.population = population;
 }
 // If length of name in country object is even then return 31(any random number) and if odd then return 95(any random number).
 // This is not a good practice to generate hashcode as below method but I am doing so to give better and easy understanding of hashmap.
 @Override
 public int hashCode() {
  if(this.name.length()%2==0)
   return 31;
  else
   return 95;
 }
 @Override
 public boolean equals(Object obj) {
  // Guard against null and foreign types instead of casting blindly.
  if (!(obj instanceof Country))
   return false;
  Country other = (Country) obj;
  return name.equalsIgnoreCase(other.name);
 }
}

If you want to learn more about the hashCode and equals methods of Object, see:

hashCode() and equals() methods in Java

2. HashMapStructure.java (main class)
package org.arpit.javapostsforlearning;
import java.util.HashMap;
import java.util.Iterator;
public class HashMapStructure {
    /**
     * @author Arpit Mandliya
     */
    public static void main(String[] args) {
        Country india = new Country("India", 1000);
        Country japan = new Country("Japan", 10000);
        Country france = new Country("France", 2000);
        Country russia = new Country("Russia", 20000);
        // Type parameters are required: with raw types, next() and get() return Object and the code does not compile.
        HashMap<Country, String> countryCapitalMap = new HashMap<Country, String>();
        countryCapitalMap.put(india, "Delhi");
        countryCapitalMap.put(japan, "Tokyo");
        countryCapitalMap.put(france, "Paris");
        countryCapitalMap.put(russia, "Moscow");
        Iterator<Country> countryCapitalIter = countryCapitalMap.keySet().iterator(); //put debug point at this line
        while (countryCapitalIter.hasNext()) {
            Country countryObj = countryCapitalIter.next();
            String capital = countryCapitalMap.get(countryObj);
            System.out.println(countryObj.getName() + "----" + capital);
        }
    }
}

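Now set a breakpoint on the line marked "put debug point at this line" (line 23 of the original listing), right-click the project, and choose Debug As -> Java Application. Execution stops at that line. Then right-click countryCapitalMap and select Watch to see the internal structure of the map.

From that view, we can observe the following:

There is an Entry array named table, with a size of 16.

This table array stores objects of the Entry class. HashMap has an inner class called Entry, which holds the key and the value as instance variables. Let's look at the structure of the Entry class: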

static class Entry<K,V> implements Map.Entry<K,V>
{
        final K key;
        V value;
        Entry<K,V> next;
        final int hash;
        ...//More code goes here
}

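Whenever a key-value pair is put into a HashMap, an Entry object is instantiated for it and stored in the Entry array table mentioned above. You are probably wondering exactly where in table that Entry ends up. The answer: the position is determined by the hash value computed from the key's hashCode() method. That hash value is used to calculate the index in the Entry array.

Now, if you look at index 10 of the array in the debugger view, it holds an Entry object shown as HashMap$Entry.

We put 4 key-value pairs into the HashMap, yet it looks as if there are only 2 elements. That is because two keys with the same hashcode are placed at the same index. So how are they stored there? Logically, they are stored as a linked list.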

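Here is how the hashCode of each Country key above works out:

Japan has hashCode 95, because the length of its name is odd.

India has hashCode 95, because the length of its name is odd.

Russia has hashCode 31, because the length of its name is even.

France has hashCode 31, because the length of its name is even.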
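The step from hashCode to array index can be sketched as follows. This is a minimal sketch, not the library source: countryHashCode is a hypothetical helper that repeats the rule from Country.hashCode(), and hash() and indexFor() are written here on the assumption that the JDK 6/7-era HashMap implementation is in use. Under that assumption, hashCode 95 lands at index 10, which is consistent with the Entry seen at index 10 in the debugger.

```java
public class HashIndexDemo {
    // Same rule as Country.hashCode() in this article: even name length -> 31, odd -> 95.
    static int countryHashCode(String name) {
        return name.length() % 2 == 0 ? 31 : 95;
    }

    // Supplemental hash, assumed to match the JDK 6/7-era HashMap.hash(int).
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Bucket index; the bit mask works because the table length is a power of two.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        String[] names = {"India", "Japan", "France", "Russia"};
        for (String name : names) {
            int hashCode = countryHashCode(name);
            int index = indexFor(hash(hashCode), 16);
            System.out.println(name + ": hashCode=" + hashCode + ", index=" + index);
        }
        // India and Japan print index=10; France and Russia print index=14.
    }
}
```

Four keys therefore occupy only two slots of the 16-element table, which is why the debugger appears to show just two elements.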

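Conceptually, each occupied index holds the head of a linked list: Japan and India are chained at one index, and Russia and France at another.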

Now that you have a good picture of the structure of a HashMap, let's look at the put and get methods.

Put:

Let's look at the implementation of the put method:

/**
  * Associates the specified value with the specified key in this map. If the
  * map previously contained a mapping for the key, the old value is
  * replaced.
  *
  * @param key
  *            key with which the specified value is to be associated
  * @param value
  *            value to be associated with the specified key
  * @return the previous value associated with key, or null
  *         if there was no mapping for key. (A null return
  *         can also indicate that the map previously associated
  *         null with key.)
  */
public V put(K key, V value) {
    if (key == null) return putForNullKey(value);
    int hash = hash(key.hashCode());
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(hash, key, value, i);
    return null;
}

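Now let's walk through the code above step by step.

The key is checked for null. If the key is null, the entry is stored at table[0], because the hash of null is always 0.

The key's hashCode() method is called and a hash value is computed from it. This hash value is used to find the index of the array where the Entry object is stored. Because a key's hashCode() may be poorly written, the JDK designers added another method, hash(), which takes the value returned by hashCode() as its argument. To learn more about hash(), see: the hash and indexFor methods in HashMap

indexFor(hash, table.length) computes the exact index in the table array where the Entry object is stored.

As we saw in our example, if two keys have the same hash value (a collision), they are stored as a linked list. So at this point we iterate over that list.

If there is no element at the computed index, the Entry object is placed directly at that index.

If the index already holds elements, the list is walked until Entry->next is null, comparing each existing key with the new one. If no equal key is found, addEntry links the new Entry object into the list at that index.

What if we put the same key again? Logically, the old value should be replaced, and that is exactly what happens. During the iteration, equals() is called to check key equality (key.equals(k)); if it returns true, the value of the existing Entry is replaced with the new value.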
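The replace-on-equal-key behaviour is easy to observe from the outside through the return value of put. The following is a small sketch using java.util.HashMap directly; the class name PutReturnDemo and the sample values are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class PutReturnDemo {
    public static void main(String[] args) {
        Map<String, String> capitals = new HashMap<String, String>();

        // First insertion of a key: there is no previous mapping, so put returns null.
        String first = capitals.put("India", "Delhi");
        // Same key again: the value is replaced and the old value is returned.
        String second = capitals.put("India", "New Delhi");

        System.out.println(first);                 // null
        System.out.println(second);                // Delhi
        System.out.println(capitals.get("India")); // New Delhi
        System.out.println(capitals.size());       // 1
    }
}
```

The size stays at 1 because the second call found an equal key in the chain and only swapped the value.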

Get:

Now let's look at the implementation of the get method:

/**
  * Returns the value to which the specified key is mapped, or {@code null}
  * if this map contains no mapping for the key.
  *
  *
  * More formally, if this map contains a mapping from a key {@code k} to a
  * value {@code v} such that {@code (key==null ? k==null :
  * key.equals(k))}, then this method returns {@code v}; otherwise it returns
  * {@code null}. (There can be at most one such mapping.)
  *
  *
  * A return value of {@code null} does not necessarily indicate that
  * the map contains no mapping for the key; its also possible that the map
  * explicitly maps the key to {@code null}. The {@link #containsKey
  * containsKey} operation may be used to distinguish these two cases.
  *
  * @see #put(Object, Object)
  */
public V get(Object key) {
    if (key == null) return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) return e.value;
    }
    return null;
}
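The Javadoc above points out that a null return from get is ambiguous: the key may be absent, or it may be explicitly mapped to null. A short sketch of how containsKey tells the two cases apart; the class name NullLookupDemo and the keys "Atlantis" and "Narnia" are placeholders chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class NullLookupDemo {
    public static void main(String[] args) {
        Map<String, String> capitals = new HashMap<String, String>();
        capitals.put("Atlantis", null);  // key present, explicitly mapped to null
        capitals.put(null, "Nowhere");   // HashMap permits one null key

        System.out.println(capitals.get("Atlantis"));         // null (mapped to null)
        System.out.println(capitals.get("Narnia"));           // null (no mapping)
        System.out.println(capitals.containsKey("Atlantis")); // true
        System.out.println(capitals.containsKey("Narnia"));   // false
        System.out.println(capitals.get(null));               // Nowhere
    }
}
```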

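Once you understand how put works, get is very easy to follow. When you pass a key to retrieve a value from the HashMap:

The key is checked for null. If the key is null, the value of the entry stored at table[0] for the null key is returned.

The key's hashCode() method is called and the hash value is computed from it.

indexFor(hash, table.length) uses that hash value to compute the exact position in the table array of the Entry object to retrieve.

With the index into the table array in hand, the linked list at that index is traversed and equals() is called to check key equality. If equals() returns true for an entry, get returns that Entry's value; if no entry in the list matches, it returns null.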
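To tie the put and get walkthroughs together, here is a deliberately simplified chained hash map. It is a teaching sketch under stated assumptions, not the JDK implementation: MiniHashMap and Node are hypothetical names, the table is fixed at 16 slots with no resizing, null keys are not supported, the supplemental hash() step is omitted, and new nodes are linked at the head of the chain as a simplification.

```java
public class MiniHashMap<K, V> {
    // One node of a bucket's chain, mirroring the key/value/next/hash fields of Entry.
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;
        final int hash;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    // The mask works because the table length (16) is a power of two.
    private int indexFor(int hash) {
        return hash & (table.length - 1);
    }

    public V put(K key, V value) {
        int hash = key.hashCode();
        int i = indexFor(hash);
        // Walk the chain: an equal key means "replace the value and return the old one".
        for (Node<K, V> e = table[i]; e != null; e = e.next) {
            if (e.hash == hash && (e.key == key || key.equals(e.key))) {
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        // No equal key found: link a new node in front of the current chain.
        table[i] = new Node<K, V>(hash, key, value, table[i]);
        return null;
    }

    public V get(K key) {
        int hash = key.hashCode();
        for (Node<K, V> e = table[indexFor(hash)]; e != null; e = e.next) {
            if (e.hash == hash && (e.key == key || key.equals(e.key))) {
                return e.value;
            }
        }
        return null; // nothing in this bucket's chain matched
    }

    public static void main(String[] args) {
        MiniHashMap<String, String> map = new MiniHashMap<String, String>();
        map.put("Aa", "first");
        map.put("BB", "second"); // "Aa" and "BB" have the same hashCode, so they share a bucket
        System.out.println(map.get("Aa"));             // first
        System.out.println(map.get("BB"));             // second
        System.out.println(map.put("Aa", "replaced")); // first
        System.out.println(map.get("Aa"));             // replaced
        System.out.println(map.get("Cc"));             // null
    }
}
```

The strings "Aa" and "BB" are used because they are a well-known pair with identical String.hashCode() values, so the second put has to walk a chain that already contains an entry.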

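Key points to remember:

HashMap has an inner class called Entry, which stores a key-value pair.

These Entry objects are stored in an Entry array called table.

Each index of table is logically called a bucket; it stores the first element of a linked list.

The key's hashCode() method is used to find the bucket that holds the Entry object.

If two keys have the same hash value, they are placed in the same bucket of the table array.

The key's equals() method is used to ensure that keys are unique.

The equals() and hashCode() methods of the value objects are not used at all when storing or looking up entries.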
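These points can be checked against java.util.HashMap itself. In the sketch below, Key is a hypothetical, stripped-down stand-in for the article's Country class (name only, same hashCode rule, case-insensitive equals); it is an illustration, not the original class.

```java
import java.util.HashMap;
import java.util.Map;

public class KeyContractDemo {
    // Hypothetical stand-in for Country: only two possible hashCodes, so keys collide heavily.
    static final class Key {
        final String name;

        Key(String name) {
            this.name = name;
        }

        @Override
        public int hashCode() {
            return name.length() % 2 == 0 ? 31 : 95;
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof Key && name.equalsIgnoreCase(((Key) obj).name);
        }
    }

    public static void main(String[] args) {
        Map<Key, String> map = new HashMap<Key, String>();
        map.put(new Key("India"), "Delhi");
        map.put(new Key("Japan"), "Tokyo");
        map.put(new Key("France"), "Paris");
        map.put(new Key("Russia"), "Moscow");
        // Colliding keys share a bucket, but equals() keeps them distinct.
        System.out.println(map.size()); // 4

        // An equal key (ignoring case) replaces the value instead of adding an entry.
        System.out.println(map.put(new Key("INDIA"), "New Delhi")); // Delhi
        System.out.println(map.size());                             // 4
        System.out.println(map.get(new Key("japan")));              // Tokyo
    }
}
```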

Source: http://www.bubuko.com/infodetail-2497220.html
