ArrayList Source Code Study Notes (3)

2023-12-24 20:28:26

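Two years on, I reread the ArrayList source and found it much easier going. These notes record what I learned, organized as a series of questions.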

Decorator pattern
The class comment points out that ArrayList itself is not thread-safe. The comment reads:

 * <p><strong>Note that this implementation is not synchronized.</strong>
 * If multiple threads access an <tt>ArrayList</tt> instance concurrently,
 * and at least one of the threads modifies the list structurally, it
 * <i>must</i> be synchronized externally.  (A structural modification is
 * any operation that adds or deletes one or more elements, or explicitly
 * resizes the backing array; merely setting the value of an element is not
 * a structural modification.)  This is typically accomplished by
 * synchronizing on some object that naturally encapsulates the list.
 *
 * If no such object exists, the list should be "wrapped" using the
 * {@link Collections#synchronizedList Collections.synchronizedList}
 * method.  This is best done at creation time, to prevent accidental
 * unsynchronized access to the list:<pre>
 *   List list = Collections.synchronizedList(new ArrayList(...));</pre>

To make the list thread-safe, access has to be synchronized on some object, along these lines:

synchronized (obj) {
    list.add(item);
}

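If no such object exists, the ArrayList can be wrapped with Collections.synchronizedList instead. It is best to wrap it when the list is first defined, so that no code ends up using the unwrapped original list: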

 List list = Collections.synchronizedList(new ArrayList(...));
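
To make the wrapping advice concrete, here is a minimal sketch. The class name SynchronizedListSketch and the helper concurrentAdds are made up for illustration. It wraps the list at creation time, adds from several threads, and then iterates while holding the wrapper's lock, since the Collections.synchronizedList Javadoc requires manual synchronization for iteration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class SynchronizedListSketch {

    // Adds `perThread` items from each of `threads` threads and returns the final size.
    static int concurrentAdds(List<Integer> list, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i); // each call goes through the wrapper's lock
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return list.size();
    }

    public static void main(String[] args) {
        // Wrap at creation time so no reference to the raw ArrayList escapes.
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        System.out.println(concurrentAdds(list, 4, 10_000)); // 40000

        // Iteration is a compound operation, so lock on the wrapper manually.
        long sum = 0;
        synchronized (list) {
            for (Integer value : list) {
                sum += value;
            }
        }
        System.out.println(sum);
    }
}
```

Passing a plain ArrayList to the same helper may lose updates or throw, which is exactly the failure the class comment warns about.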

Since the class signature already extends AbstractList, why does it also declare implements List?

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable {

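This was most likely a slip by the author. It was never reverted, presumably because changing it was unnecessary and leaving it kept the signature consistent with older versions. See the referenced blog post and the Stack Overflow Q&A.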

DEFAULTCAPACITY_EMPTY_ELEMENTDATA
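This is the backing storage used by the no-argument constructor. No array is allocated by default, and even the empty array is a single shared instance, so memory use is cut to the bare minimum. Worth learning from.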

 	/**
     * Shared empty array instance used for default sized empty instances. We
     * distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when
     * first element is added.
     */
    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
    
	/**
     * Constructs an empty list with an initial capacity of ten.
     */
    public ArrayList() {
        this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
    }

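Storage structure: Object[] elementData
Why Object? Java places some restrictions on generics. Creating a T[] array directly is a compile error: because of type erasure, the concrete type of T is not known at runtime. So an Object[] array is created instead. (This point comes from the article "ArrayList 解析".)
Because of erasure, a list can only enforce element types at compile time. Nothing is checked at runtime unless a cast is involved.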

	public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        list.add(1);
        List list1 = list;
        list1.add("xx");
        System.out.println(list);
    }

Output: [1, xx]
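
The two points above can be sketched together. The names ErasureSketch and SimpleBag are hypothetical. SimpleBag stores its elements in an Object[] and casts on read, in the same spirit as elementData, while readingPollutedElementFails shows that the wrongly typed element is only detected once a cast to Integer happens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class ErasureSketch {

    // A tiny generic container backed by Object[], because `new T[10]` does not compile.
    static class SimpleBag<T> {
        private Object[] elements = new Object[10];
        private int size;

        void add(T item) {
            if (size == elements.length) {
                elements = Arrays.copyOf(elements, size * 2);
            }
            elements[size++] = item;
        }

        @SuppressWarnings("unchecked")
        T get(int index) {
            if (index < 0 || index >= size) {
                throw new IndexOutOfBoundsException("Index: " + index);
            }
            return (T) elements[index]; // unchecked cast, erased at runtime
        }
    }

    // Returns true if reading the String through the List<Integer> view throws.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean readingPollutedElementFails() {
        List<Integer> list = new ArrayList<>();
        list.add(1);
        List raw = list;
        raw.add("xx"); // accepted: no runtime type check on add
        try {
            Integer second = list.get(1); // the compiler inserts a cast to Integer here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        SimpleBag<String> bag = new SimpleBag<>();
        bag.add("a");
        System.out.println(bag.get(0));                    // a
        System.out.println(readingPollutedElementFails()); // true
    }
}
```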

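The transient keyword on elementData
It means the field is skipped by default serialization, and writeObject and readObject are implemented separately. Both methods must be declared private. java.io.ObjectStreamClass#getPrivateMethod() looks up writeObject() via reflection.
The advantage of making elementData transient: the class serializes only the size real elements rather than every slot of the array, which reduces the space used.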
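
A minimal sketch of the same idea, using a made-up class CompactBuffer rather than ArrayList itself: the backing array is transient, and the private writeObject and readObject hooks write and read only the first size slots. The roundTrip helper is also an assumption added for the demo.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
class CompactBuffer implements Serializable {
    private static final long serialVersionUID = 1L;

    private transient Object[] slots = new Object[16]; // skipped by default serialization
    private int size;

    void add(Object item) {
        if (size == slots.length) {
            slots = Arrays.copyOf(slots, size * 2);
        }
        slots[size++] = item;
    }

    Object get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
        return slots[index];
    }

    int size() { return size; }

    int capacity() { return slots.length; }

    // Must be private; the serialization runtime finds it reflectively.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject(); // writes the non-transient field `size`
        for (int i = 0; i < size; i++) {
            out.writeObject(slots[i]); // only the used slots, not the whole array
        }
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject(); // restores `size`
        slots = new Object[Math.max(size, 1)];
        for (int i = 0; i < size; i++) {
            slots[i] = in.readObject();
        }
    }

    // Serializes to memory and reads the copy back.
    static CompactBuffer roundTrip(CompactBuffer original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (CompactBuffer) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After a round trip of a buffer holding two elements, the copy holds the same two elements in an array of length 2, while the original still has 16 slots.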
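ensureExplicitCapacity increments modCount unconditionally, which seems wrong to me
The source is below. When the if condition is false, grow is not executed and the list is not modified, so in principle modCount should not be incremented.
Looking at the add and addAll methods explains why it is written this way: they reach ensureExplicitCapacity through ensureCapacityInternal, so the modCount++ was pushed down into it. (remove does not go through this path; it increments modCount itself.)
But the public method ensureCapacity also calls ensureExplicitCapacity, and that does not necessarily cause a structural modification unless the backing array actually has to grow. So the semantics here are questionable.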

	/**
	* Public method: call it before writing a large batch of data to guarantee capacity up front and avoid repeated resizing
	**/
	public void ensureCapacity(int minCapacity) {
        int minExpand = (elementData != DEFAULTCAPACITY_EMPTY_ELEMENTDATA)
            // any size if not default element table
            ? 0
            // larger than default for default empty table. It's already
            // supposed to be at default size.
            : DEFAULT_CAPACITY;

        if (minCapacity > minExpand) {
            ensureExplicitCapacity(minCapacity);
        }
    }
    
    /**
	* Internal private implementation
	**/
	private void ensureExplicitCapacity(int minCapacity) {
        modCount++;

        // overflow-conscious code
        if (minCapacity - elementData.length > 0)
            grow(minCapacity);
    }
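
The visible consequence of the modCount++ can be sketched as follows. The class EnsureCapacitySketch is made up for illustration. It only exercises the case where the backing array really grows. As far as I know, the no-growth case behaves differently across JDK versions, so it is deliberately left out.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;

// Hypothetical demo class, not part of the JDK.
class EnsureCapacitySketch {

    // Returns true if an iterator created before ensureCapacity fails afterwards.
    static boolean growthInvalidatesIterator() {
        ArrayList<Integer> list = new ArrayList<>();
        list.add(1);
        Iterator<Integer> it = list.iterator();
        list.ensureCapacity(1_000); // no element added or removed, but modCount changes
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(growthInvalidatesIterator()); // true
    }
}
```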

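MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8
The maximum size is set to the int maximum minus 8. There is a lot of discussion about why 8 is subtracted, and about the fact that the capacity can in practice still reach the int maximum. I am skipping that for now.
This raised another question: size() returns an int, so what happens beyond the int range? Why not use long? The reason for not using long is easy to see: it is unnecessary, since lists are rarely that large, and a long would take more memory. So what about exceeding int? I found no way to do it. Perhaps one would have to design a more elaborate structure oneself.
Growth is newCapacity = oldCapacity + (oldCapacity >> 1), i.e. the capacity increases by 50% each time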
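
A quick worked example of that formula, in a made-up class GrowthSketch. It covers only the 1.5x step. The real grow method additionally raises the result to minCapacity when 1.5x is not enough and handles the region near MAX_ARRAY_SIZE, and both of those are omitted here.

```java
// Hypothetical demo class, not part of the JDK.
class GrowthSketch {

    // The 1.5x step only: old capacity plus half of it (integer shift rounds down).
    static int nextCapacity(int oldCapacity) {
        return oldCapacity + (oldCapacity >> 1);
    }

    public static void main(String[] args) {
        int capacity = 10; // DEFAULT_CAPACITY
        StringBuilder steps = new StringBuilder().append(capacity);
        for (int i = 0; i < 5; i++) {
            capacity = nextCapacity(capacity);
            steps.append(" -> ").append(capacity);
        }
        System.out.println(steps); // 10 -> 15 -> 22 -> 33 -> 49 -> 73
    }
}
```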
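rangeCheckForAdd
add and addAll additionally check index < 0, so why does remove not need that check? I did not understand this at first. The comment on rangeCheck gives the reason: it is always called immediately before an array access, and the array access itself throws ArrayIndexOutOfBoundsException for a negative index.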
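
A small sketch of the observable behavior, with a made-up class IndexCheckSketch. Both calls reject a negative index. The exact exception subclass may depend on the JDK version, so the sketch only relies on IndexOutOfBoundsException, which is the parent of ArrayIndexOutOfBoundsException.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class IndexCheckSketch {

    // Runs the action and reports whether it threw IndexOutOfBoundsException (or a subclass).
    static boolean throwsIndexOutOfBounds(Runnable action) {
        try {
            action.run();
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a"));
        System.out.println(throwsIndexOutOfBounds(() -> list.add(-1, "x"))); // true
        System.out.println(throwsIndexOutOfBounds(() -> list.remove(-1)));   // true
        System.out.println(list);                                            // [a]
    }
}
```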
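Code style: when method a calls method b, b is written after a, which makes the code easier to read.
fastRemove
It drops the index check and is for internal use only. Really meticulous, squeezing out every last bit of performance.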

	/**
	* Public method: removes the element at the given index, with a range check
	**/
	public E remove(int index) {
        rangeCheck(index);

        modCount++;
        E oldValue = elementData(index);

        int numMoved = size - index - 1;
        if (numMoved > 0)
            System.arraycopy(elementData, index+1, elementData, index,
                             numMoved);
        elementData[--size] = null; // clear to let GC do its work

        return oldValue;
    }
	/**
	* Public method: removes the given object; once it is found, takes its index and delegates to fastRemove
	**/
	public boolean remove(Object o) {
        if (o == null) {
            for (int index = 0; index < size; index++)
                if (elementData[index] == null) {
                    fastRemove(index);
                    return true;
                }
        } else {
            for (int index = 0; index < size; index++)
                if (o.equals(elementData[index])) {
                    fastRemove(index);
                    return true;
                }
        }
        return false;
    }
	/**
	* Private method: removes the element at the given index; internal use only, hence no range check
	**/
	private void fastRemove(int index) {
        modCount++;
        int numMoved = size - index - 1;
        if (numMoved > 0)
            System.arraycopy(elementData, index+1, elementData, index,
                             numMoved);
        elementData[--size] = null; // clear to let GC do its work
    }

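batchRemove uses a read pointer and a write pointer to shift elements in place while removing
Two-pointer solutions come up now and then in algorithm problems, and the JDK source contains a matching implementation.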

	/**
	* Removes the elements contained in the given collection; delegates to batchRemove
	**/
	public boolean removeAll(Collection<?> c) {
        Objects.requireNonNull(c);
        return batchRemove(c, false);
    }
    /**
	* Retains only the elements contained in the given collection; delegates to batchRemove
	**/
    public boolean retainAll(Collection<?> c) {
        Objects.requireNonNull(c);
        return batchRemove(c, true);
    }
	/**
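	* Batch removal; complement controls whether elements found in the given collection are retained (true) or removed (false)
	* The r and w pointers implement the in-place update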
	**/
    private boolean batchRemove(Collection<?> c, boolean complement) {
        final Object[] elementData = this.elementData;
        int r = 0, w = 0;
        boolean modified = false;
        try {
            for (; r < size; r++)
                if (c.contains(elementData[r]) == complement)
                    elementData[w++] = elementData[r];
        } finally {
            // Preserve behavioral compatibility with AbstractCollection,
            // even if c.contains() throws.
            if (r != size) {
                System.arraycopy(elementData, r,
                                 elementData, w,
                                 size - r);
                w += size - r;
            }
            if (w != size) {
                // clear to let GC do its work
                for (int i = w; i < size; i++)
                    elementData[i] = null;
                modCount += size - w;
                size = w;
                modified = true;
            }
        }
        return modified;
    }
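
Stripped of the exception handling and modCount bookkeeping, the read/write pointer technique in batchRemove can be sketched on a plain int array. The class TwoPointerSketch and the method retainIf are hypothetical names chosen for the demo.

```java
import java.util.Arrays;
import java.util.function.IntPredicate;

// Hypothetical demo class, not part of the JDK.
class TwoPointerSketch {

    // Compacts the first `size` values in place, keeping those that match `keep`.
    // r scans every element; w marks where the next kept element is written.
    // Returns the new logical length; slots from that index on are stale.
    static int retainIf(int[] data, int size, IntPredicate keep) {
        int w = 0;
        for (int r = 0; r < size; r++) {
            if (keep.test(data[r])) {
                data[w++] = data[r];
            }
        }
        return w;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5, 6};
        int newSize = retainIf(data, data.length, value -> value % 2 == 0);
        System.out.println(Arrays.toString(Arrays.copyOf(data, newSize))); // [2, 4, 6]
    }
}
```

Because w never runs ahead of r, each kept element is written to a slot that has already been read, so no extra array is needed.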

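modCount is a system-wide concern: it is what makes iteration fail fast. Every place that affects correctness has to update it, so how is that guaranteed? According to the comments, every operation that structurally modifies the list must update modCount. That gives a clear criterion for deciding whether a given method needs to touch modCount.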
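
The fail-fast behavior that modCount enables can be sketched like this, in a made-up class FailFastSketch. Removing through the list during a for-each loop trips the check on the next call to next(), while removing through the iterator keeps the iterator's expected count in step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class FailFastSketch {

    // Returns true if removing via the list inside a for-each loop fails fast.
    static boolean removeInsideForEachFails() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        try {
            for (Integer value : list) {
                if (value == 1) {
                    list.remove(value); // remove(Object): structural change behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes even values through the iterator, which is the supported way.
    static List<Integer> removeEvensViaIterator() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeInsideForEachFails()); // true
        System.out.println(removeEvensViaIterator());   // [1, 3]
    }
}
```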

Source: https://blog.csdn.net/lijianqingfeng/article/details/135185422