Zero-copy in Netty: How It Is Implemented (Part 1)

sf_wangchong, published 2019-08-15 12:23 / 3,237 views

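Wikipedia explains zero-copy as follows:

Zero-copy describes computer operations in which the CPU does not need to first copy data from one memory area to another specific area. The technique is commonly used to save CPU cycles and memory bandwidth when transmitting files over a network.

The zero-copy that Wikipedia describes works at the hardware and operating-system level, whereas this article is mainly about the optimizations Netty makes at the application level. Note that zero-copy does not literally mean that no memory is ever copied; it means avoiding redundant copies. Even OS-level zero-copy still copies data from the device into memory and from memory to the device.

Netty's zero-copy shows up in the following ways:

The slice operation on a ByteBuf does not copy the data into a new ByteBuf. It reuses the memory of the original ByteBuf and only keeps its own independent reader and writer indices.

Netty provides the CompositeByteBuf class, which combines multiple ByteBuf instances into a single logical ByteBuf.

Netty's FileRegion wraps NIO's FileChannel.transferTo() method. Where the underlying operating system supports it, this method uses the sendfile system call, so file transfers avoid copying the data through user space.

Netty classes such as PooledDirectByteBuf wrap NIO's DirectByteBuffer. A DirectByteBuffer is allocated outside the JVM heap, which saves the extra copy between heap memory and off-heap memory during I/O.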
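As a rough illustration of the FileChannel.transferTo() call that FileRegion wraps, here is a minimal sketch that uses only the JDK, not Netty. The class TransferToSketch and its methods sendFile and roundTrip are hypothetical names introduced for this example. The target is another file channel rather than a socket, so whether the operating system actually uses sendfile for it depends on the platform.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class TransferToSketch {
    /** Hands the whole file to the target channel without staging it in a user-space buffer. */
    static long sendFile(Path source, WritableByteChannel target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ)) {
            long position = 0;
            long size = in.size();
            // transferTo may move fewer bytes than requested, so loop until done.
            while (position < size) {
                long n = in.transferTo(position, size - position, target);
                if (n <= 0) {
                    // The target accepted nothing (for example a full non-blocking
                    // socket). This sketch simply stops instead of waiting.
                    break;
                }
                position += n;
            }
            return position;
        }
    }

    /** Hypothetical helper: writes text to a temp file, transfers it, reads the copy back. */
    static String roundTrip(String text) {
        try {
            Path src = Files.createTempFile("zero-copy-src", ".txt");
            Path dst = Files.createTempFile("zero-copy-dst", ".txt");
            try {
                Files.write(src, text.getBytes(StandardCharsets.UTF_8));
                try (FileChannel out = FileChannel.open(dst, StandardOpenOption.WRITE)) {
                    sendFile(src, out);
                }
                return new String(Files.readAllBytes(dst), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(src);
                Files.deleteIfExists(dst);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a server the target would typically be a SocketChannel; the application code never touches the file's bytes either way.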

Let's briefly go through each of them.

slice

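The following uses AbstractUnpooledSlicedByteBuf to explain how slice achieves zero-copy. The pooled implementation, PooledSlicedByteBuf, has to manage memory release through reference counting, so its code contains a lot of logic unrelated to this article and is not used as the example here.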

// Constructor of the sliced ByteBuf. The adjustment field is the offset of the slice
// relative to the ByteBuf being sliced; both share the same memory. The buffer field
// is the ByteBuf that actually stores the data.
AbstractUnpooledSlicedByteBuf(ByteBuf buffer, int index, int length) {
    super(length);
    checkSliceOutOfBounds(index, length, buffer); // check that the slice is within bounds

    if (buffer instanceof AbstractUnpooledSlicedByteBuf) {
        // The ByteBuf being sliced is itself an AbstractUnpooledSlicedByteBuf:
        // reuse its underlying buffer and accumulate the offsets
        this.buffer = ((AbstractUnpooledSlicedByteBuf) buffer).buffer;
        adjustment = ((AbstractUnpooledSlicedByteBuf) buffer).adjustment + index;
    } else if (buffer instanceof DuplicatedByteBuf) {
        // The ByteBuf being sliced is a DuplicatedByteBuf: use unwrap() to get
        // the ByteBuf that actually stores the data and assign it to buffer
        this.buffer = buffer.unwrap();
        adjustment = index;
    } else {
        // The ByteBuf being sliced is an ordinary ByteBuf: assign it directly
        this.buffer = buffer;
        adjustment = index;
    }

    initLength(length);
    writerIndex(length);
}

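That is the constructor of AbstractUnpooledSlicedByteBuf. It is simple enough that it needs no detailed walkthrough. The one point worth noting is that slicing a slice does not stack wrappers: the new slice points straight at the underlying buffer and accumulates the offsets.

Next, look at how AbstractUnpooledSlicedByteBuf implements the ByteBuf API, taking the getBytes method as an example: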

@Override
public ByteBuf getBytes(int index, ByteBuffer dst) {
    checkIndex0(index, dst.remaining()); // check that the read is within bounds
    unwrap().getBytes(idx(index), dst);
    return this;
}

@Override
public ByteBuf unwrap() {
    return buffer;
}

private int idx(int index) {
    return index + adjustment;
}

This is the getBytes method that AbstractUnpooledSlicedByteBuf overrides. It reads the bytes directly from the wrapped ByteBuf, but first recomputes the index by adding the relative offset.
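To make the offset arithmetic concrete, here is a minimal sketch of the same idea in plain Java, assuming a byte array as the backing store. SliceSketch and View are hypothetical names and are not part of Netty; the sketch only mirrors the adjustment logic and leaves out reader and writer indices and reference counting.

```java
final class SliceSketch {
    /** Hypothetical read-only view over a shared byte array (not a Netty class). */
    static final class View {
        private final byte[] backing;  // shared storage, never copied
        private final int adjustment;  // where this view starts inside backing
        private final int length;      // number of bytes visible through this view

        View(byte[] backing, int adjustment, int length) {
            if (adjustment < 0 || length < 0 || adjustment > backing.length - length) {
                throw new IndexOutOfBoundsException("slice out of bounds");
            }
            this.backing = backing;
            this.adjustment = adjustment;
            this.length = length;
        }

        /** Slicing a view points straight at the root array and accumulates the offsets. */
        View slice(int index, int length) {
            if (index < 0 || length < 0 || index > this.length - length) {
                throw new IndexOutOfBoundsException("slice out of bounds");
            }
            return new View(backing, adjustment + index, length);
        }

        byte getByte(int index) {
            if (index < 0 || index >= length) {
                throw new IndexOutOfBoundsException("index: " + index);
            }
            return backing[index + adjustment]; // same idea as idx(index)
        }

        int length() {
            return length;
        }
    }

    public static void main(String[] args) {
        byte[] data = {10, 20, 30, 40, 50, 60};
        View whole = new View(data, 0, data.length);
        View inner = whole.slice(2, 3).slice(1, 2); // covers data[3] and data[4]
        data[3] = 99;                               // write through the original array
        System.out.println(inner.getByte(0));       // the view sees the new value
    }
}
```

Because no bytes are copied, a write to the original array is visible through every view that covers that position, which is exactly the sharing behaviour a slice has.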

CompositeByteBuf

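In some scenarios the data is scattered across several ByteBuf instances, yet we want to process it as one ByteBuf. The most direct approach is to copy everything into a single ByteBuf, but that requires a lot of memory copying and costs considerable CPU time.

CompositeByteBuf solves this problem well. As the name suggests, it is a composite ByteBuf made up of many ByteBuf instances, wrapped in a layer that lets you operate on them through the regular ByteBuf API.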

/**
 * Precondition is that {@code buffer != null}.
 */
private int addComponent0(boolean increaseWriterIndex, int cIndex, ByteBuf buffer) {
    assert buffer != null;
    boolean wasAdded = false;
    try {
        // Check that the index of the new component is valid
        checkComponentIndex(cIndex);

        // Number of readable bytes in buffer
        int readableBytes = buffer.readableBytes();

        // No need to consolidate - just add a component to the list.
        @SuppressWarnings("deprecation")
        // Normalize to a big-endian ByteBuf
        Component c = new Component(buffer.order(ByteOrder.BIG_ENDIAN).slice());
        if (cIndex == components.size()) {
            // The index equals the size of components, so append at the tail
            wasAdded = components.add(c);
            if (cIndex == 0) {
                // This is the first (and only) element in components
                c.endOffset = readableBytes;
            } else {
                // components already holds other elements
                Component prev = components.get(cIndex - 1);
                c.offset = prev.endOffset;
                c.endOffset = c.offset + readableBytes;
            }
        } else {
            // The new ByteBuf is inserted in the middle of components
            components.add(cIndex, c);
            wasAdded = true;
            if (readableBytes != 0) {
                // The new buffer is not empty, so update offset and endOffset
                // of every component from cIndex onward
                updateComponentOffsets(cIndex);
            }
        }
        if (increaseWriterIndex) {
            // The caller asked for writerIndex to be advanced
            writerIndex(writerIndex() + buffer.readableBytes());
        }
        return cIndex;
    } finally {
        if (!wasAdded) {
            // Release the ByteBuf if it was not added
            buffer.release();
        }
    }
}

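This is the logic for adding a new ByteBuf. The key fields are offset and endOffset, the start index (inclusive) and end index (exclusive) of a ByteBuf within the CompositeByteBuf. Together they uniquely identify where that ByteBuf sits in the CompositeByteBuf.

With that in mind, the code above does just two things:

Wrap the ByteBuf in a Component and add it at the right position in components.

Make sure the offset and endOffset of every Component in components are correct.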
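The two steps can be sketched in plain Java as follows. OffsetSketch and its nested Component are hypothetical stand-ins that hold byte arrays instead of ByteBuf instances, and for simplicity the sketch always recomputes the offsets from the insertion point instead of special-casing an append at the tail.

```java
import java.util.ArrayList;
import java.util.List;

final class OffsetSketch {
    /** Hypothetical component: a chunk of data plus its position in the composite. */
    static final class Component {
        final byte[] data;
        int offset;     // start index in the composite, inclusive
        int endOffset;  // end index in the composite, exclusive

        Component(byte[] data) {
            this.data = data;
        }
    }

    private final List<Component> components = new ArrayList<>();

    /** Step 1: insert at cIndex. Step 2: fix offsets from cIndex onward. */
    void add(int cIndex, byte[] data) {
        if (cIndex < 0 || cIndex > components.size()) {
            throw new IndexOutOfBoundsException("cIndex: " + cIndex);
        }
        components.add(cIndex, new Component(data));
        for (int i = cIndex; i < components.size(); i++) {
            Component c = components.get(i);
            // Each component starts where the previous one ends.
            c.offset = (i == 0) ? 0 : components.get(i - 1).endOffset;
            c.endOffset = c.offset + c.data.length;
        }
    }

    int offset(int cIndex) {
        return components.get(cIndex).offset;
    }

    int endOffset(int cIndex) {
        return components.get(cIndex).endOffset;
    }
}
```

Adding chunks of 3 and 5 bytes and then inserting a 2-byte chunk between them yields the ranges [0, 3), [3, 5) and [5, 10): only the components at or after the insertion point are touched.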

Now look at how CompositeByteBuf implements the ByteBuf API, again taking getBytes as the example:

@Override
public CompositeByteBuf getBytes(int index, ByteBuf dst, int dstIndex, int length) {
    // Check that the indices are within bounds
    checkDstIndex(index, length, dstIndex, dst.capacity());
    if (length == 0) {
        return this;
    }

    // Binary-search components for the index of the Component that contains index
    int i = toComponentIndex(index);
    // Keep reading until length reaches 0
    while (length > 0) {
        Component c = components.get(i);
        ByteBuf s = c.buf;
        int adjustment = c.offset;
        // Take the smaller of length and the bytes remaining in this ByteBuf
        int localLength = Math.min(length, s.capacity() - (index - adjustment));
        // The start index within the component is index - c.offset, not 0
        s.getBytes(index - adjustment, dst, dstIndex, localLength);
        index += localLength;
        dstIndex += localLength;
        length -= localLength;
        i ++;
    }
    return this;
}

/**
 * Return the index for the given offset
 */
public int toComponentIndex(int offset) {
    checkIndex(offset);

    for (int low = 0, high = components.size(); low <= high;) {
        int mid = low + high >>> 1;
        Component c = components.get(mid);
        if (offset >= c.endOffset) {
            low = mid + 1;
        } else if (offset < c.offset) {
            high = mid - 1;
        } else {
            return mid;
        }
    }

    throw new Error("should not reach here");
}

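As shown, CompositeByteBuf handles an index by first converting it into the index of the corresponding Component within components plus an offset inside that Component. Starting from that offset, it loops forward through the components, reading bytes until the requested length has been read.

NOTE: there is a small trick here. Because components is ordered by offset, toComponentIndex does not scan linearly when converting the index; it uses binary search.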
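The lookup and the read loop can be tried out in isolation with the sketch below. CompositeReadSketch is a hypothetical class over plain byte arrays, not Netty code. It stores the component boundaries in one int array, so offsets[i] and offsets[i + 1] play the roles of offset and endOffset.

```java
final class CompositeReadSketch {
    private final byte[][] parts;
    private final int[] offsets; // offsets[i] = start of parts[i]; last entry = total length

    CompositeReadSketch(byte[]... parts) {
        this.parts = parts;
        this.offsets = new int[parts.length + 1];
        for (int i = 0; i < parts.length; i++) {
            offsets[i + 1] = offsets[i] + parts[i].length;
        }
    }

    int capacity() {
        return offsets[parts.length];
    }

    /** Binary search for the part whose range [offset, endOffset) contains index. */
    int toComponentIndex(int index) {
        if (index < 0 || index >= capacity()) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        int low = 0;
        int high = parts.length - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            if (index >= offsets[mid + 1]) {
                low = mid + 1;   // index lies after this part
            } else if (index < offsets[mid]) {
                high = mid - 1;  // index lies before this part
            } else {
                return mid;
            }
        }
        throw new AssertionError("unreachable for a valid index");
    }

    /** Reads length bytes starting at index, walking across part boundaries. */
    byte[] getBytes(int index, int length) {
        if (index < 0 || length < 0 || index > capacity() - length) {
            throw new IndexOutOfBoundsException("index: " + index + ", length: " + length);
        }
        byte[] dst = new byte[length];
        if (length == 0) {
            return dst;
        }
        int i = toComponentIndex(index);
        int dstIndex = 0;
        while (length > 0) {
            byte[] part = parts[i];
            int local = index - offsets[i]; // position inside this part
            int n = Math.min(length, part.length - local);
            System.arraycopy(part, local, dst, dstIndex, n);
            index += n;
            dstIndex += n;
            length -= n;
            i++;
        }
        return dst;
    }
}
```

With parts of 3, 2 and 4 bytes, index 4 resolves to the second part, and a read of 5 bytes starting at index 2 takes one byte from the first part, two from the second and two from the third. The only copy made is the one into the caller's destination array.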

I'm a bit worn out from writing today, so I'll stop here and pick up the rest in the next post.

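Copyright belongs to the author. Do not repost without permission. If this article violates any rules, you can contact the administrator to have it removed.

When reposting, please cite the address of this article: https://www.ucloud.cn/yun/67904.html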
