CanalServerWithEmbedded

含多个instance，成员变量canalInstances记录了instance名称与实例映射关系

同server不出现相同的instance名称：Map结构

public class CanalServerWithEmbedded extends AbstractCanalLifeCycle implements CanalServer, CanalService {
    private Map<String, CanalInstance> canalInstances;
    private CanalInstanceGenerator     canalInstanceGenerator;
}

canal详解

每个client对应一个instance，启动时指定destination（instance名称）所以canalserverwithembedded处理请求的参数都有clientIdentity，从中取destination，这样可获取对应的canalinstance：

canal client通过destination找出canal server对应的canal iinstance

public void subscribe(ClientIdentity clientIdentity) throws CanalServerException {
    // ClientIdentity表示Canal Client客户端，从中可以获取出客户端指定连接的Destination
    // 由于CanalServerWithEmbedded记录了每个Destination对应的Instance，可以获取客户端对应的Instance
    CanalInstance canalInstance = canalInstances.get(clientIdentity.getDestination());
    if (!canalInstance.getMetaManager().isStart()) {
        canalInstance.getMetaManager().start(); // 启动Instance的元数据管理器
    }
    canalInstance.getMetaManager().subscribe(clientIdentity); // 执行一下meta订阅
    Position position = canalInstance.getMetaManager().getCursor(clientIdentity);
    if (position == null) {
        position = canalInstance.getEventStore().getFirstPosition();// 获取一下store中的第一条
        if (position != null) {
            canalInstance.getMetaManager().updateCursor(clientIdentity, position); // 更新一下cursor
        }
    }
    // 通知下订阅关系变化
    canalInstance.subscribeChange(clientIdentity);
}

每个CanalInstance中包括了四个组件：EventParser、EventSink、EventStore、MetaManager。

服务端主要的处理方法包括get/ack/rollback，这三个方法都会用到Instance上面的几个内部组件，主要还是EventStore和MetaManager：

在这之前，要先理解EventStore的含义，EventStore是一个RingBuffer，有三个指针：Put、Get、Ack。

Put: Canal Server从MySQL拉取到数据后，放到内存中，Put增加
Get: 消费者（Canal Client）从内存中消费数据，Get增加
Ack: 消费者消费完成，Ack增加。并且会删除Put中已经被Ack的数据

canal详解

客户端通过canal server获取mysql binlog有几种方式（get方法和getWithoutAck）：

如果timeout为null，则采用tryGet方式，即时获取
如果timeout不为null
1. timeout为0，则采用get阻塞方式，获取数据，不设置超时，直到有足够的batchSize数据才返回
2. timeout不为0，则采用get+timeout方式，获取数据，超时还没有batchSize足够的数据，有多少返回多少

private Events<Event> getEvents(CanalEventStore eventStore, Position start, int batchSize, Long timeout,
                                TimeUnit unit) {
    if (timeout == null) {
        return eventStore.tryGet(start, batchSize); // 即时获取
    } else if (timeout <= 0){
        return eventStore.get(start, batchSize); // 阻塞获取
    } else {
        return eventStore.get(start, batchSize, timeout, unit); // 异步获取
    }
}

注意：EventStore的实现采用了类似Disruptor的RingBuffer环形缓冲区。RingBuffer的实现类是MemoryEventStoreWithBuffer

get方法和getWithoutAck方法的区别是：

get方法会立即调用ack
getWithoutAck方法不会调用ack

EventStore

假设环形缓冲区填满后无可用slot，put操作阻塞直到被消费掉，源码中大小=16M

public class MemoryEventStoreWithBuffer extends AbstractCanalStoreScavenge implements CanalEventStore<Event>, CanalStoreScavenge {
    private static final long INIT_SQEUENCE = -1;
    private int               bufferSize    = 16 * 1024;
    private int               bufferMemUnit = 1024;                         // memsize的单位，默认为1kb大小
    private int               indexMask;
    private Event[]           entries;

    // 记录下put/get/ack操作的三个下标
    private AtomicLong        putSequence   = new AtomicLong(INIT_SQEUENCE); // 代表当前put操作最后一次写操作发生的位置
    private AtomicLong        getSequence   = new AtomicLong(INIT_SQEUENCE); // 代表当前get操作读取的最后一条的位置
    private AtomicLong        ackSequence   = new AtomicLong(INIT_SQEUENCE); // 代表当前ack操作的最后一条的位置

    // 启动EventStore时，创建指定大小的缓冲区，Event数组的大小是16*1024
    // 也就是说算个数的话，数组可以容纳16000个事件。算内存的话，大小为16MB
    public void start() throws CanalStoreException {
        super.start();
        indexMask = bufferSize - 1;
        entries = new Event[bufferSize];
    }

    // EventParser解析后，会放入内存中（Event数组，缓冲区）
    private void doPut(List<Event> data) {
        long current = putSequence.get(); // 取得当前的位置，初始时为-1，第一个元素为-1+1=0
        long end = current + data.size(); // 最末尾的位置，假设Put了10条数据，end=-1+10=9
        // 先写数据，再更新对应的cursor,并发度高的情况，putSequence会被get请求可见，拿出了ringbuffer中的老的Entry值
        for (long next = current + 1; next <= end; next++) {
            entries[getIndex(next)] = data.get((int) (next - current - 1));
        }
        putSequence.set(end);
    } 
}

Put是生产数据，Get是消费数据，Get一定不会超过Put。比如Put了10条数据，Get最多只能获取到10条数据。但有时候为了保证Get处理的速度，Put和Get并不会相等。
可以把Put看做是生产者，Get看做是消费者。生产者速度可以很快，消费者则可以慢慢地消费。比如Put了1000条，而Get我们只需要每次处理10条数据。

仍然以前面的示例来说明Get的流程，初始时current=-1，假设Put了两批数据一共15条，maxAbleSequence=14，而Get的BatchSize假设为10。
初始时next=current=-1，end=-1。通过startPosition，会设置next=0。最后end又被赋值为9，即循环缓冲区[0,9]一共10个元素。

private Events<Event> doGet(Position start, int batchSize) throws CanalStoreException {
    LogPosition startPosition = (LogPosition) start;

    long current = getSequence.get();
    long maxAbleSequence = putSequence.get();
    long next = current;
    long end = current;
    // 如果startPosition为null，说明是第一次，默认+1处理
    if (startPosition == null || !startPosition.getPostion().isIncluded()) { // 第一次订阅之后，需要包含一下start位置，防止丢失第一条记录
        next = next + 1;
    }

    end = (next + batchSize - 1) < maxAbleSequence ? (next + batchSize - 1) : maxAbleSequence;
    // 提取数据并返回
    for (; next <= end; next++) {
        Event event = entries[getIndex(next)];
        if (ddlIsolation && isDdl(event.getEntry().getHeader().getEventType())) {
            // 如果是ddl隔离，直接返回
            if (entrys.size() == 0) {
                entrys.add(event);// 如果没有DML事件，加入当前的DDL事件
                end = next; // 更新end为当前
            } else {
                // 如果之前已经有DML事件，直接返回了，因为不包含当前next这记录，需要回退一个位置
                end = next - 1; // next-1一定大于current，不需要判断
            }
            break;
        } else {
            entrys.add(event);
        }
    }
    // 处理PositionRange，然后设置getSequence为end
    getSequence.compareAndSet(current, end)
}

ack操作的上限是Get，假设Put了15条数据，Get了10条数据，最多也只能Ack10条数据。Ack的目的是清空缓冲区中已经被Get过的数据

public void ack(Position position) throws CanalStoreException {
    cleanUntil(position);
}

public void cleanUntil(Position position) throws CanalStoreException {
    long sequence = ackSequence.get();
    long maxSequence = getSequence.get();

    boolean hasMatch = false;
    long memsize = 0;
    for (long next = sequence + 1; next <= maxSequence; next++) {
        Event event = entries[getIndex(next)];
        memsize += calculateSize(event);
        boolean match = CanalEventUtils.checkPosition(event, (LogPosition) position);
        if (match) {// 找到对应的position，更新ack seq
            hasMatch = true;

            if (batchMode.isMemSize()) {
                ackMemSize.addAndGet(memsize);
                // 尝试清空buffer中的内存，将ack之前的内存全部释放掉
                for (long index = sequence + 1; index < next; index++) {
                    entries[getIndex(index)] = null;// 设置为null
                }
            }

            ackSequence.compareAndSet(sequence, next)
        }
    }
}

rollback回滚方法的实现则比较简单，将getSequence回退到ack位置

public void rollback() throws CanalStoreException {
    getSequence.set(ackSequence.get());
    getMemSize.set(ackMemSize.get());
}

canal详解

谢谢https://blog.****.net/varyall/article/details/79208574的分享，差不多copy过来的，为了自己学习

canal详解

CanalServerWithEmbedded

EventStore

相关推荐