MySQL中一条简单SQL的加锁分析
一、问题引入
看如下一条sql语句:
# table T (id int, name varchar(20)) delete from T where id = 10;
MySQL在执行的过程中,是如何加锁呢?
再看下面这条语句:
select * from T where id = 10;
那这条语句呢?其实这其中包含太多知识点了。要回答这两个问题,首先需要了解一些知识。
二、相关知识回顾
2.1 多版本并发控制
在MySQL默认存储引擎InnoDB中,实现的是基于多版本的并发控制协议——MVCC(Multi-Version Concurrency Control)(注:与MVVC相对的,是基于锁的并发控制,Lock-Based Concurrency Control)。
其中MVCC最大的好处是:读不加锁,读写不冲突。在读多写少的OLTP应用中,读写不冲突是非常重要的,极大的提高了系统的并发性能,在现阶段,几乎所有的RDBMS,都支持MVCC。其实,MVCC就一句话总结:同一份数据临时保存多个版本的一种方式,进而实现并发控制。
2.2 当前读和快照读
在MVCC并发控制中,读操作可以分为两类:快照读与当前读。
- 快照读(简单的select操作):读取的是记录中的可见版本(可能是历史版本),不用加锁。这你就知道第二个问题的答案了吧。
- 当前读(特殊的select操作、insert、delete和update):读取的是记录中最新版本,并且当前读返回的记录都会加上锁,这样保证了了其他事务不会再并发修改这条记录。
2.3 聚集索引
也叫做聚簇索引。在InnoDB中,数据的组织方式就是聚簇索引:完整的记录,储存在主键索引中,通过主键索引,就可以获取记录中所有的列。
2.4 最左前缀原则
也就是最左优先,这条原则针对的是组合索引和前缀索引,理解:
1、在MySQL中,进行条件过滤时,是按照向右匹配直到遇到范围查询(>,<,between,like)就停止匹配,比如说a = 1 and b = 2 and c > 3 and d = 4 如果建立(a, b, c, d)顺序的索引,d是用不到索引的,如果建立(a, b, d, c)索引就都会用上,其中a,b,d的顺序可以任意调整。
2、= 和 in 可以乱序,比如 a = 1 and b = 2 and c = 3 建立(a, b, c)索引可以任意顺序,MySQL的查询优化器会优化索引可以识别的形式。
2.5 两阶段锁
传统的RDMS加锁的一个原则,就是2PL(Two-Phase Locking,二阶段锁)。也就是说锁操作分为两个阶段:加锁阶段和解锁阶段,并且保证加锁阶段和解锁阶段不想交。也就是说在一个事务中,不管有多少条增删改,都是在加锁阶段加锁,在 commit 后,进入解锁阶段,才会全部解锁。
2.6 隔离级别
MySQL/InnoDB中,定义了四种隔离级别:
- Read Uncommitted:可以读取未提交记录。此隔离级别不会使用。
- Read Committed(RC):针对当前读,RC隔离级别保证了对读取到的记录加锁(记录锁),存在幻读现象。
- Repeatable Read(RR):针对当前读,RR隔离级别保证对读取到的记录加锁(记录锁),同时保证对读取的范围加锁,新的满足查询条件的记录不能够插入(间隙锁),不存在幻读现象。
- Serializable:从MVCC并发控制退化为基于锁的并发控制。不区别快照读和当前读,所有的读操作都是当前读,读加读锁(S锁),写加写锁(X锁)。在该隔离级别下,读写冲突,因此并发性能急剧下降,在MySQL/InnoDB中不建议使用。
2.7 Gap锁和Next-Key锁
在InnoDB中完整行锁包含三部分:
- 记录锁(Record Lock):记录锁锁定索引中的一条记录。
- 间隙锁(Gap Lock):间隙锁要么锁住索引记录中间的值,要么锁住第一个索引记录前面的值或最后一个索引记录后面的值。
- Next-Key Lock:Next-Key锁时索引记录上的记录锁和在记录之前的间隙锁的组合。
三、案例分析过程
SQL1: select * from t1 where id = 10;(不加锁。因为MySQL是使用多版本并发控制的,读不加锁。)
SQL2: delete from t1 where id = 10;(需根据多种情况进行分析)
假设t1表上有索引,执行计划一定会选择使用索引进行过滤 (索引扫描),根据以下组合,来进行分析。
- 组合一:id列是主键,RC隔离级别
- 组合二:id列是二级唯一索引,RC隔离级别
- 组合三:id列是二级非唯一索引,RC隔离级别
- 组合四:id列上没有索引,RC隔离级别
- 组合五:id列是主键,RR隔离级别
- 组合六:id列是二级唯一索引,RR隔离级别
- 组合七:id列是二级非唯一索引,RR隔离级别
- 组合八:id列上没有索引,RR隔离级别
- 组合九:Serializable隔离级别
注:在前面八种组合下,也就是RC,RR隔离级别下,SQL1:select操作均不加锁,采用的是快照读,因此在下面的讨论中就忽略了,主要讨论SQL2:delete操作的加锁。
组合一: id主键 + RC
id是主键,Read Committed隔离级别,给定SQL:delete from t1 where id = 10; 只需要将主键上,id = 10的记录加上X锁即可。如下图所示:
结论:id是主键时,此SQL只需要在id=10这条记录上加X锁即可。
示例:
#准备数据 mysql> create table t1 (id int,name varchar(10)); mysql> alter table t1 add primary key (id); mysql> insert into t1 values(1,'a'),(4,'c'),(7,'b'),(10,'a'),(20,'d'),(30,'b'); mysql> select * from t1; +----+------+ | id | name | +----+------+ | 1 | a | | 4 | c | | 7 | b | | 10 | a | | 20 | d | | 30 | b | +----+------+ 6 rows in set (0.00 sec) 会话1 mysql> select @@tx_isolation; +----------------+ | @@tx_isolation | +----------------+ | READ-COMMITTED | +----------------+ 1 row in set, 1 warning (0.00 sec) mysql> begin; Query OK, 0 rows affected (0.00 sec) mysql> delete from t1 where id=10; Query OK, 1 row affected (0.00 sec) 会话2 mysql> begin; Query OK, 0 rows affected (0.00 sec) mysql> select * from t1; +----+------+ | id | name | +----+------+ | 1 | a | | 4 | c | | 7 | b | | 10 | a | | 20 | d | | 30 | b | +----+------+ 6 rows in set (0.00 sec) mysql> update t1 set name='a1' where id=10; ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction mysql> update t1 set name='a1' where id=11; Query OK, 0 rows affected (0.00 sec) Rows matched: 0 Changed: 0 Warnings: 0 mysql> update t1 set name='a1' where id=7; Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0 从示例中可以看到会话1执行的delete操作,只对id=10加了X锁。
组合二:id唯一索引 + RC
id不是主键,而是一个Unique的二级索引键值。那么在RC隔离级别下,delete from t1 where id = 10; 需要加什么锁呢?见下图:
此组合中,id是unique索引,而主键是name列。此时,加锁的情况由于组合一有所不同。由于id是unique索引,因此delete语句会选择走id列的索引进行where条件的过滤,在找到id=10的记录后,首先会将unique索引上的id=10索引记录加上X锁,同时,会根据读取到的name列,回主键索引(聚簇索引),然后将聚簇索引上的name = ‘d’ 对应的主键索引项加X锁。
为什么聚簇索引上的记录也要加锁?试想一下,如果并发的一个SQL,是通过主键索引来更新:update t1 set id = 100 where name = 'd';此时,如果delete语句没有将主键索引上的记录加锁,那么并发的update就会感知不到delete语句的存在,违背了同一记录上的更新/删除需要串行执行的约束。
示例:
准备数据 mysql> create table t1 (id int,name varchar(10)); Query OK, 0 rows affected (0.06 sec) mysql> ALTER TABLE test.t1 ADD UNIQUE INDEX idx_id (id); Query OK, 0 rows affected (0.07 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> ALTER TABLE test.t1 ADD PRIMARY KEY (name); Query OK, 0 rows affected (0.11 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> insert into t1 values(1,'f'),(2,'zz'),(3,'b'),(5,'a'),(6,'c'),(10,'d'); Query OK, 6 rows affected (0.01 sec) Records: 6 Duplicates: 0 Warnings: 0 会话1 mysql> begin; Query OK, 0 rows affected (0.01 sec) mysql> delete from t1 where id=10; Query OK, 1 row affected (0.00 sec) 会话2 mysql> begin; Query OK, 0 rows affected (0.00 sec) mysql> select * from t1; +------+------+ | id | name | +------+------+ | 1 | f | | 2 | zz | | 3 | b | | 5 | a | | 6 | c | | 10 | d | +------+------+ 6 rows in set (0.00 sec) mysql> update t1 set id =100 where name='d'; ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction mysql> update t1 set id =100 where name='c'; Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql> update t1 set id =101 where name='a'; Query OK, 1 row affected (0.01 sec) Rows matched: 1 Changed: 1 Warnings: 0
结论:若id列是unique列,其上有unique索引。那么SQL需要加两个X锁,一个对应于id unique索引上的id = 10的记录,另一把锁对应于聚簇索引上的[name=’d’,id=10]的记录。
组合三:id非唯一索引 + RC
id列是一个普通索引。假设delete from t1 where id = 10; 语句,仍旧选择id列上的索引进行过滤where条件,那么此时会持有哪些锁?同样见下图:
由上图可以看出,首先,id列索引上,满足id = 10查询的记录,均加上X锁。同时,这些记录对应的主键索引上的记录也加上X锁。与组合er的唯一区别,组合二最多只有一个满足条件的记录,而在组合三中会将所有满足条件的记录全部加上锁。
结论:若id列上有非唯一索引,那么对应的所有满足SQL查询条件的记录,都会加上锁。同时,这些记录在主键索引上也会加上锁。
示例:
准备数据 mysql> create table t1 (id int,name varchar(10)); mysql> ALTER TABLE test.t1 ADD PRIMARY KEY (name); mysql> alter table t1 add index idx_id (id); mysql> insert into t1 values(2,'zz'),(6,'c'),(10,'b'),(10,'d'),(11,'f'),(15,'a'); 会话1 mysql> begin; Query OK, 0 rows affected (0.00 sec) mysql> delete from t1 where id=10; Query OK, 2 rows affected (0.00 sec) 会话2 mysql> select * from t1; +------+------+ | id | name | +------+------+ | 2 | zz | | 6 | c | | 10 | b | | 10 | d | | 11 | f | | 15 | a | +------+------+ 6 rows in set (0.00 sec) mysql> begin; Query OK, 0 rows affected (0.00 sec) mysql> update t1 set id=11 where name='b'; ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction mysql> update t1 set id=11 where name='d'; ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction mysql> update t1 set id=11 where name='f'; Query OK, 0 rows affected (0.00 sec) Rows matched: 1 Changed: 0 Warnings: 0 mysql> update t1 set id=11 where name='c'; Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0
组合四:id无索引+RC
相对于前面的组合,该组合相对特殊,因为id列上无索引,所以在 where id = 10 这个查询条件下,没法通过索引来过滤,因此只能全表扫描做过滤。对于该组合,MySQL又会进行怎样的加锁呢?看下图:
由于id列上无索引,因此只能走聚簇索引,进行全表扫描。由图可以看出满足条件的记录只有两条,但是,聚簇索引上的记录都会加上X锁。但在实际操作中,MySQL进行了改进,在进行过滤条件时,发现不满足条件后,会调用 unlock_row 方法,把不满足条件的记录放锁(违背了2PL原则)。这样做,保证了最后满足条件的记录加上锁,但是每条记录的加锁操作是不能省略的。
结论:若id列上没有索引,MySQL会走聚簇索引进行全表扫描过滤。由于是在MySQl Server层面进行的。因此每条记录无论是否满足条件,都会加上X锁,但是,为了效率考虑,MySQL在这方面进行了改进,在扫描过程中,若记录不满足过滤条件,会进行解锁操作。同时优化违背了2PL原则。
示例:
准备数据 mysql> create table t1 (id int,name varchar(10)); mysql> ALTER TABLE test.t1 ADD PRIMARY KEY (name); mysql> insert into t1 values(5,'a'),(3,'b'),(10,'d'),(2,'f'),(10,'g'),(9,'zz'); 会话1 mysql> begin; Query OK, 0 rows affected (0.00 sec) mysql> delete from t1 where id=10; Query OK, 2 rows affected (0.00 sec) 会话2 mysql> select * from t1; +------+------+ | id | name | +------+------+ | 5 | a | | 3 | b | | 10 | d | | 2 | f | | 10 | g | | 9 | zz | +------+------+ 6 rows in set (0.00 sec) mysql> begin; Query OK, 0 rows affected (0.00 sec) mysql> update t1 set id=6 where name='a'; Query OK, 1 row affected (0.01 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql> update t1 set id=6 where name='b'; Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql> update t1 set id=6 where name='d'; ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction mysql> update t1 set id=6 where name='f'; Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql> update t1 set id=6 where name='g'; ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction mysql> update t1 set id=6 where name='zz'; Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql> update t1 set id=6 where name='zzf'; Query OK, 0 rows affected (0.00 sec) Rows matched: 0 Changed: 0 Warnings: 0 实验结果与推倒的结论不一致, 实验结果看出只锁住了id=10的两行。
组合五:id主键+RR
id列是主键列,Repeatable Read隔离级别,针对delete from t1 where id = 10; 这条SQL,加锁与组合一:”id主键 + RC“一致。
结论:id是主键是,此SQL语句只需要在id = 10这条记录上加上X锁即可。
示例:
mysql> create table t1 (id int,name varchar(10)); mysql> alter table t1 add primary key (id); mysql> insert into t1 values(1,'a'),(4,'c'),(7,'b'),(10,'a'),(20,'d'),(30,'b'); mysql> select * from t1; +----+------+ | id | name | +----+------+ | 1 | a | | 4 | c | | 7 | b | | 10 | a | | 20 | d | | 30 | b | +----+------+ 6 rows in set (0.00 sec) mysql> select @@tx_isolation; +-----------------+ | @@tx_isolation | +-----------------+ | REPEATABLE-READ | +-----------------+ 1 row in set, 1 warning (0.00 sec) 会话1 mysql> begin; Query OK, 0 rows affected (0.01 sec) mysql> delete from t1 where id=10; Query OK, 1 row affected (0.00 sec) 会话2 mysql> select * from t1; +----+------+ | id | name | +----+------+ | 1 | a | | 4 | c | | 7 | b | | 10 | a | | 20 | d | | 30 | b | +----+------+ 6 rows in set (0.00 sec) mysql> update t1 set name='a1' where id=10; ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction mysql> update t1 set name='a1' where id=11; Query OK, 0 rows affected (0.00 sec) Rows matched: 0 Changed: 0 Warnings: 0 mysql> update t1 set name='a1' where id=7; Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0
组合六:id唯一索引+RR
id唯一索引 + RR的加锁与id唯一索引,RC一致。两个X锁,id唯一索引满足条件的记录上一个,对应的聚簇索引上的记录一个。
示例:
准备数据 mysql> create table t1 (id int,name varchar(10)); mysql> ALTER TABLE test.t1 ADD UNIQUE INDEX idx_id (id); mysql> ALTER TABLE test.t1 ADD PRIMARY KEY (name); mysql> insert into t1 values(1,'f'),(2,'zz'),(3,'b'),(5,'a'),(6,'c'),(10,'d'); 会话1 mysql> begin; Query OK, 0 rows affected (0.00 sec) mysql> delete from t1 where id=10; Query OK, 1 row affected (0.01 sec) 会话2 mysql> begin; Query OK, 0 rows affected (0.00 sec) mysql> select * from t1; +------+------+ | id | name | +------+------+ | 1 | f | | 2 | zz | | 3 | b | | 5 | a | | 6 | c | | 10 | d | +------+------+ 6 rows in set (0.00 sec) mysql> update t1 set id =100 where name='d'; ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction mysql> update t1 set id =100 where name='c'; Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql> update t1 set id =101 where name='a'; Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0
组合七:id不唯一索引+RR
在组合一到组合四中,隔离级别是Read Committed下,会出现幻读情况,但是在该组合Repeatable Read级别下,不会出现幻读情况,这是怎么回事呢?而MySQL又是如何给上述语句加锁呢?看下图:
该组合和组合三看起来很相似,但差别很大,在改组合中加入了一个间隙锁(Gap锁)。这个Gap锁就是相对于RC级别下,RR级别下不会出现幻读情况的关键。实质上,Gap锁不是针对于记录本身的,而是记录之间的Gap。所谓幻读,就是同一事务下,连续进行多次当前读,且读取一个范围内的记录(包括直接查询所有记录结果或者做聚合统计), 发现结果不一致(标准档案一般指记录增多, 记录的减少应该也算是幻读)。
那么该如何解决这个问题呢?如何保证多次当前读返回一致的记录,那么就需要在多个当前读之间,其他事务不会插入新的满足条件的记录并提交。为了实现该结果,Gap锁就应运而生。
如图所示,有些位置可以插入新的满足条件的记录,考虑到B+树的有序性,满足条件的记录一定是具有连续性的。因此会在 [4, b], [10, c], [10, d], [20, e] 之间加上Gap锁。
Insert操作时,如insert(10, aa),首先定位到 [4, b], [10, c]间,然后插入在插入之前,会检查该Gap是否加锁了,如果被锁上了,则Insert不能加入记录。因此通过第一次当前读,会把满足条件的记录加上X锁,还会加上三把Gap锁,将可能插入满足条件记录的3个Gap锁上,保证后续的Insert不能插入新的满足 id = 10 的记录,也就解决了幻读问题。
而在组合五,组合六中,同样是RR级别,但是不用加上Gap锁,在组合五中id是主键,组合六中id是Unique键,都能保证唯一性。一个等值查询,最多只能返回一条满足条件的记录,而且新的相同取值的记录是无法插入的。
结论:在RR隔离级别下,id列上有非唯一索引,对于上述的SQL语句;首先,通过id索引定位到第一条满足条件的记录,给记录加上X锁,并且给Gap加上Gap锁,然后在主键聚簇索引上满足相同条件的记录加上X锁,然后返回;之后读取下一条记录重复进行。直至第一条出现不满足条件的记录,此时,不需要给记录加上X锁,但是需要给Gap加上Gap锁吗,最后返回结果。
示例:
准备数据 mysql> create table t1 (id int,name varchar(10)); mysql> ALTER TABLE test.t1 ADD PRIMARY KEY (name); mysql> alter table t1 add index idx_id (id); mysql> insert into t1 values(1,'a'),(4,'b'),(10,'c'),(20,'e'),(10,'d'); 会话1 mysql> begin; Query OK, 0 rows affected (0.00 sec) mysql> delete from t1 where id=10; Query OK, 2 rows affected (0.00 sec) 会话2 mysql> select * from t1; +------+------+ | id | name | +------+------+ | 1 | a | | 4 | b | | 10 | c | | 10 | d | | 20 | e | +------+------+ 5 rows in set (0.00 sec) mysql> begin; Query OK, 0 rows affected (0.00 sec) mysql> insert into t1 values(6,'aa'); Query OK, 1 row affected (0.00 sec) mysql> insert into t1 values(6,'bb'); Query OK, 1 row affected (0.01 sec) mysql> insert into t1 values(6,'cc'); ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction mysql> insert into t1 values(7,'cc'); ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction mysql> insert into t1 values(8,'cc'); ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction mysql> insert into t1 values(9,'cc'); ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction mysql> insert into t1 values(10,'cc'); ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction mysql> insert into t1 values(11,'cc'); ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction mysql> insert into t1 values(11,'ff'); Query OK, 1 row affected (0.00 sec) mysql> insert into t1 values(11,'g'); Query OK, 1 row affected (0.00 sec)
组合八:id无索引+RR
该组合中,id列上无索引,只能进行全表扫描,那么该如何加锁,看下图:
如图,可以看出这是一个很恐怖的事情,全表每条记录要加X锁,每个Gap加上Gap锁,如果表上存在大量数据时,又是什么情景呢?这种情况下,这个表,除了不加锁的快照读,其他任何加锁的并发SQL,均不能执行,不能更新,删除,插入,这样,全表锁死。
当然,和组合四一样,MySQL进行了优化,就是semi-consistent read。semi-consistent read开启的情况下,对于不满足条件的记录,MySQL会提前放锁,同时Gap锁也会释放。而semi-consistent read是如何触发:要么在Read Committed隔离级别下;要么在Repeatable Read隔离级别下,设置了 innodb_locks_unsafe_for_binlog 参数。
示例:
准备数据 mysql> create database cgwtest; mysql> CREATE TABLE `t` ( -> `id` int(11) NOT NULL AUTO_INCREMENT, -> `d` int(11) DEFAULT NULL, -> PRIMARY KEY (`id`) -> ) ENGINE=InnoDB; //mysql> insert into t1 values(5,'a'),(3,'b'),(10,'d'),(2,'f'),(10,'g'),(9,'zz'); mysql> insert into t values(1,1),(5,5),(10,10); 会话1 mysql> begin; Query OK, 0 rows affected (0.00 sec) mysql> delete from t where id=5; Query OK, 1 rows affected (0.00 sec) 会话2 mysql> begin; Query OK, 0 rows affected (0.00 sec) mysql> select * from t; +----+------+ | id | d | +----+------+ | 1 | 1 | | 5 | 5 | | 10 | 10 | +----+------+ 3 rows in set (0.00 sec) mysql> insert into t values(2,2); ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
1,delete过程:
/* Basic lock modes */ enum lock_mode { LOCK_IS = 0, /* intention shared */ LOCK_IX, /* intention exclusive */ LOCK_S, /* shared */ LOCK_X, /* exclusive */ LOCK_AUTO_INC, /* locks the auto-inc counter of a table in an exclusive mode */ LOCK_NONE, /* this is used elsewhere to note consistent read */ LOCK_NUM = LOCK_NONE, /* number of lock modes */ LOCK_NONE_UNSET = 255 };
ut_ad(gap_mode == LOCK_ORDINARY || gap_mode == LOCK_GAP || gap_mode == LOCK_REC_NOT_GAP); #define ut_ad(EXPR) ut_a(EXPR) /** Debug statement. Does nothing unless UNIV_DEBUG is defined. */调试断言
核心方法:
/** Sets a lock on a record. mostly due to we cannot reposition a record in R-Tree (with the nature of splitting) @param[in] pcur cursor @param[in] rec record @param[in] index index @param[in] offsets rec_get_offsets(rec, index) @param[in] sel_mode select mode: SELECT_ORDINARY, SELECT_SKIP_LOKCED, or SELECT_NO_WAIT @param[in] mode lock mode @param[in] type LOCK_ORDINARY, LOCK_GAP, or LOC_REC_NOT_GAP @param[in] thr query thread @param[in] mtr mtr @return DB_SUCCESS, DB_SUCCESS_LOCKED_REC, or error code */ UNIV_INLINE dberr_t sel_set_rec_lock(btr_pcur_t *pcur, const rec_t *rec, dict_index_t *index, const ulint *offsets, select_mode sel_mode, ulint mode, ulint type, que_thr_t *thr, mtr_t *mtr) { trx_t *trx; dberr_t err = DB_SUCCESS; const buf_block_t *block; block = btr_pcur_get_block(pcur); trx = thr_get_trx(thr); trx_mutex_enter(trx); ut_ad(trx_can_be_handled_by_current_thread(trx)); bool too_many_locks = (UT_LIST_GET_LEN(trx->lock.trx_locks) > 10000); trx_mutex_exit(trx); if (too_many_locks) { if (buf_LRU_buf_pool_running_out()) { return (DB_LOCK_TABLE_FULL); } } if (index->is_clustered()) { err = lock_clust_rec_read_check_and_lock( lock_duration_t::REGULAR, block, rec, index, offsets, sel_mode, static_cast<lock_mode>(mode), type, thr); } else { if (dict_index_is_spatial(index)) { if (type == LOCK_GAP || type == LOCK_ORDINARY) { ut_ad(0); ib::error(ER_IB_MSG_1026) << "Incorrectly request GAP lock " "on RTree"; return (DB_SUCCESS); } err = sel_set_rtr_rec_lock(pcur, rec, index, offsets, sel_mode, mode, type, thr, mtr); } else { err = lock_sec_rec_read_check_and_lock( lock_duration_t::REGULAR, block, rec, index, offsets, sel_mode, static_cast<lock_mode>(mode), type, thr); } } return (err); }
dberr_t lock_clust_rec_read_check_and_lock( const lock_duration_t duration, const buf_block_t *block, const rec_t *rec, dict_index_t *index, const ulint *offsets, const select_mode sel_mode, const lock_mode mode, const ulint gap_mode, que_thr_t *thr) { dberr_t err; ulint heap_no; ut_ad(rec_offs_validate(rec, index, offsets)); if (srv_read_only_mode || index->table->is_temporary()) { return (DB_SUCCESS); } heap_no = page_rec_get_heap_no(rec); if (heap_no != PAGE_HEAP_NO_SUPREMUM) { lock_rec_convert_impl_to_expl(block, rec, index, offsets);//隐示锁转显示锁 } DEBUG_SYNC_C("after_lock_clust_rec_read_check_and_lock_impl_to_expl"); lock_mutex_enter();//系统锁 if (duration == lock_duration_t::AT_LEAST_STATEMENT) { lock_protect_locks_till_statement_end(thr); } ut_ad(mode != LOCK_X || lock_table_has(thr_get_trx(thr), index->table, LOCK_IX)); ut_ad(mode != LOCK_S || lock_table_has(thr_get_trx(thr), index->table, LOCK_IS)); err = lock_rec_lock(false, sel_mode, mode | gap_mode, block, heap_no, index, thr); MONITOR_INC(MONITOR_NUM_RECLOCK_REQ); lock_mutex_exit(); ut_ad(lock_rec_queue_validate(false, block, rec, index, offsets)); DEBUG_SYNC_C("after_lock_clust_rec_read_check_and_lock"); ut_ad(err == DB_SUCCESS || err == DB_SUCCESS_LOCKED_REC || err == DB_LOCK_WAIT || err == DB_DEADLOCK || err == DB_SKIP_LOCKED || err == DB_LOCK_NOWAIT); return (err); }
/** Tries to lock the specified record in the mode requested. If not immediately possible, enqueues a waiting lock request. This is a low-level function which does NOT look at implicit locks! Checks lock compatibility within explicit locks. This function sets a normal next-key lock, or in the case of a page supremum record, a gap type lock. @param[in] impl if true, no lock is set if no wait is necessary: we assume that the caller will set an implicit lock @param[in] sel_mode select mode: SELECT_ORDINARY, SELECT_SKIP_LOCKED, or SELECT_NO_WAIT @param[in] mode lock mode: LOCK_X or LOCK_S possibly ORed to either LOCK_GAP or LOCK_REC_NOT_GAP @param[in] block buffer block containing the record @param[in] heap_no heap number of record @param[in] index index of record @param[in,out] thr query thread @return DB_SUCCESS, DB_SUCCESS_LOCKED_REC, DB_LOCK_WAIT, DB_DEADLOCK, DB_SKIP_LOCKED, or DB_LOCK_NOWAIT */ static dberr_t lock_rec_lock(bool impl, select_mode sel_mode, ulint mode, const buf_block_t *block, ulint heap_no, dict_index_t *index, que_thr_t *thr) { ut_ad(lock_mutex_own()); ut_ad(!srv_read_only_mode); /* Implicit locks are equivalent to LOCK_X|LOCK_REC_NOT_GAP, so we can omit creation of explicit lock only if the requested mode was LOCK_REC_NOT_GAP */ ut_ad(!impl || ((mode & LOCK_REC_NOT_GAP) == LOCK_REC_NOT_GAP)); /* We try a simplified and faster subroutine for the most common cases */ switch (lock_rec_lock_fast(impl, mode, block, heap_no, index, thr)) { case LOCK_REC_SUCCESS return (DB_SUCCESS); case LOCK_REC_SUCCESS_CREATED: return (DB_SUCCESS_LOCKED_REC); case LOCK_REC_FAIL: return ( lock_rec_lock_slow(impl, sel_mode, mode, block, heap_no, index, thr)); default: ut_error; } }
delete语句调用堆栈:
lock_rec_lock(bool impl, select_mode sel_mode, ulint mode, const buf_block_t * block, ulint heap_no, dict_index_t * index, que_thr_t * thr) (\root\mysql-8.0.20\storage\innobase\lock\lock0lock.cc:1667) lock_clust_rec_read_check_and_lock(const lock_duration_t duration, const buf_block_t * block, const rec_t * rec, dict_index_t * index, const ulint * offsets, const select_mode sel_mode, const lock_mode mode, const ulint gap_mode, que_thr_t * thr) (\root\mysql-8.0.20\storage\innobase\lock\lock0lock.cc:5701) sel_set_rec_lock(btr_pcur_t * pcur, const rec_t * rec, dict_index_t * index, const ulint * offsets, select_mode sel_mode, ulint mode, ulint type, que_thr_t * thr, mtr_t * mtr) (\root\mysql-8.0.20\storage\innobase\row\row0sel.cc:1184) row_search_mvcc(unsigned char * buf, page_cur_mode_t mode, row_prebuilt_t * prebuilt, ulint match_mode, const ulint direction) (\root\mysql-8.0.20\storage\innobase\row\row0sel.cc:5214) ha_innobase::general_fetch(ha_innobase * const this, uchar * buf, uint direction, uint match_mode) (\root\mysql-8.0.20\storage\innobase\handler\ha_innodb.cc:9949) ha_innobase::rnd_next(ha_innobase * const this, uchar * buf) (\root\mysql-8.0.20\storage\innobase\handler\ha_innodb.cc:10226) handler::ha_rnd_next(handler * const this, uchar * buf) (\root\mysql-8.0.20\sql\handler.cc:2966) TableScanIterator::Read(TableScanIterator * const this) (\root\mysql-8.0.20\sql\records.cc:423) Sql_cmd_delete::delete_from_single_table(Sql_cmd_delete * const this, THD * thd) (\root\mysql-8.0.20\sql\sql_delete.cc:503) Sql_cmd_delete::execute_inner(Sql_cmd_delete * const this, THD * thd) (\root\mysql-8.0.20\sql\sql_delete.cc:823) Sql_cmd_dml::execute(Sql_cmd_dml * const this, THD * thd) (\root\mysql-8.0.20\sql\sql_select.cc:725) mysql_execute_command(THD * thd, bool first_level) (\root\mysql-8.0.20\sql\sql_parse.cc:3471) mysql_parse(THD * thd, Parser_state * parser_state) (\root\mysql-8.0.20\sql\sql_parse.cc:5306) dispatch_command(THD * thd, const COM_DATA * com_data, enum_server_command command) (\root\mysql-8.0.20\sql\sql_parse.cc:1776) do_command(THD * thd) (\root\mysql-8.0.20\sql\sql_parse.cc:1274) handle_connection(void * arg) (\root\mysql-8.0.20\sql\conn_handler\connection_handler_per_thread.cc:302) pfs_spawn_thread(void * arg) (\root\mysql-8.0.20\storage\perfschema\pfs.cc:2854) libpthread.so.0!start_thread (未知源:0) libc.so.6!clone (未知源:0)
执行删除:
row_upd_clust_step(upd_node_t * node, que_thr_t * const thr) (\root\mysql-8.0.20\storage\innobase\row\row0upd.cc:2982) row_upd(upd_node_t * node, que_thr_t * thr) (\root\mysql-8.0.20\storage\innobase\row\row0upd.cc:3175) row_upd_step(que_thr_t * thr) (\root\mysql-8.0.20\storage\innobase\row\row0upd.cc:3306) row_update_for_mysql_using_upd_graph(const unsigned char * mysql_rec, row_prebuilt_t * prebuilt) (\root\mysql-8.0.20\storage\innobase\row\row0mysql.cc:2347) row_update_for_mysql(const unsigned char * mysql_rec, row_prebuilt_t * prebuilt) (\root\mysql-8.0.20\storage\innobase\row\row0mysql.cc:2443) ha_innobase::delete_row(ha_innobase * const this, const uchar * record) (\root\mysql-8.0.20\storage\innobase\handler\ha_innodb.cc:9374) handler::ha_delete_row(handler * const this, const uchar * buf) (\root\mysql-8.0.20\sql\handler.cc:7894) Sql_cmd_delete::delete_from_single_table(Sql_cmd_delete * const this, THD * thd) (\root\mysql-8.0.20\sql\sql_delete.cc:528) Sql_cmd_delete::execute_inner(Sql_cmd_delete * const this, THD * thd) (\root\mysql-8.0.20\sql\sql_delete.cc:823) Sql_cmd_dml::execute(Sql_cmd_dml * const this, THD * thd) (\root\mysql-8.0.20\sql\sql_select.cc:725) mysql_execute_command(THD * thd, bool first_level) (\root\mysql-8.0.20\sql\sql_parse.cc:3471) mysql_parse(THD * thd, Parser_state * parser_state) (\root\mysql-8.0.20\sql\sql_parse.cc:5306) dispatch_command(THD * thd, const COM_DATA * com_data, enum_server_command command) (\root\mysql-8.0.20\sql\sql_parse.cc:1776) do_command(THD * thd) (\root\mysql-8.0.20\sql\sql_parse.cc:1274) handle_connection(void * arg) (\root\mysql-8.0.20\sql\conn_handler\connection_handler_per_thread.cc:302) pfs_spawn_thread(void * arg) (\root\mysql-8.0.20\storage\perfschema\pfs.cc:2854) libpthread.so.0!start_thread (未知源:0) libc.so.6!clone (未知源:0)
2,insert过程:
核心方法:
/** Checks if some other transaction has a conflicting explicit lock request in the queue, so that we have to wait. @return lock or NULL */ static const lock_t *lock_rec_other_has_conflicting( ulint mode, /*!< in: LOCK_S or LOCK_X, possibly ORed to LOCK_GAP or LOC_REC_NOT_GAP, LOCK_INSERT_INTENTION */ const buf_block_t *block, /*!< in: buffer block containing the record */ ulint heap_no, /*!< in: heap number of the record */ const trx_t *trx) /*!< in: our transaction */ { ut_ad(lock_mutex_own()); ut_ad(!(mode & ~(ulint)(LOCK_MODE_MASK | LOCK_GAP | LOCK_REC_NOT_GAP | LOCK_INSERT_INTENTION))); ut_ad(!(mode & LOCK_PREDICATE)); ut_ad(!(mode & LOCK_PRDT_PAGE)); RecID rec_id{block, heap_no}; const bool is_supremum = rec_id.is_supremum(); return (Lock_iter::for_each(rec_id, [=](const lock_t *lock) { return (!(lock_rec_has_to_wait(trx, mode, lock, is_supremum))); })); }
/** Checks if a lock request for a new lock has to wait for request lock2. @return true if new lock has to wait for lock2 to be removed */ UNIV_INLINE bool lock_rec_has_to_wait( const trx_t *trx, /*!< in: trx of new lock */ ulint type_mode, /*!< in: precise mode of the new lock to set: LOCK_S or LOCK_X, possibly ORed to LOCK_GAP or LOCK_REC_NOT_GAP, LOCK_INSERT_INTENTION */ const lock_t *lock2, /*!< in: another record lock; NOTE that it is assumed that this has a lock bit set on the same record as in the new lock we are setting */ bool lock_is_on_supremum) /*!< in: true if we are setting the lock on the 'supremum' record of an index page: we know then that the lock request is really for a 'gap' type lock */ { ut_ad(trx && lock2); ut_ad(lock_get_type_low(lock2) == LOCK_REC); const bool is_hp = trx_is_high_priority(trx); if (trx != lock2->trx && !lock_mode_compatible(static_cast<lock_mode>(LOCK_MODE_MASK & type_mode), lock_get_mode(lock2))) { /* If our trx is High Priority and the existing lock is WAITING and not high priority, then we can ignore it. */ if (is_hp && lock2->is_waiting() && !trx_is_high_priority(lock2->trx)) { return (false); } /* We have somewhat complex rules when gap type record locks cause waits */ if ((lock_is_on_supremum || (type_mode & LOCK_GAP)) && !(type_mode & LOCK_INSERT_INTENTION)) { /* Gap type locks without LOCK_INSERT_INTENTION flag do not need to wait for anything. This is because different users can have conflicting lock types on gaps. */ return (false); } if (!(type_mode & LOCK_INSERT_INTENTION) && lock_rec_get_gap(lock2)) { /* Record lock (LOCK_ORDINARY or LOCK_REC_NOT_GAP does not need to wait for a gap type lock */ return (false); } if ((type_mode & LOCK_GAP) && lock_rec_get_rec_not_gap(lock2)) { /* Lock on gap does not need to wait for a LOCK_REC_NOT_GAP type lock */ return (false); } if (lock_rec_get_insert_intention(lock2)) { /* No lock request needs to wait for an insert intention lock to be removed. This is ok since our rules allow conflicting locks on gaps. This eliminates a spurious deadlock caused by a next-key lock waiting for an insert intention lock; when the insert intention lock was granted, the insert deadlocked on the waiting next-key lock. Also, insert intention locks do not disturb each other. */ return (false); } return (true); } return (false); }
调用堆栈
lock_rec_other_has_conflicting(ulint mode, const buf_block_t * block, ulint heap_no, const trx_t * trx) (\root\mysql-8.0.20\storage\innobase\lock\lock0lock.cc:805) lock_rec_insert_check_and_lock(ulint flags, const rec_t * rec, buf_block_t * block, dict_index_t * index, que_thr_t * thr, mtr_t * mtr, ulint * inherit) (\root\mysql-8.0.20\storage\innobase\lock\lock0lock.cc:5291) btr_cur_ins_lock_and_undo(ulint flags, btr_cur_t * cursor, dtuple_t * entry, que_thr_t * thr, mtr_t * mtr, ulint * inherit) (\root\mysql-8.0.20\storage\innobase\btr\btr0cur.cc:2621) btr_cur_optimistic_insert(ulint flags, btr_cur_t * cursor, ulint ** offsets, mem_heap_t ** heap, dtuple_t * entry, rec_t ** rec, big_rec_t ** big_rec, que_thr_t * thr, mtr_t * mtr) (\root\mysql-8.0.20\storage\innobase\btr\btr0cur.cc:2841) row_ins_clust_index_entry_low(uint32_t flags, ulint mode, dict_index_t * index, ulint n_uniq, dtuple_t * entry, que_thr_t * thr, bool dup_chk_only) (\root\mysql-8.0.20\storage\innobase\row\row0ins.cc:2515) row_ins_clust_index_entry(dict_index_t * index, dtuple_t * entry, que_thr_t * thr, bool dup_chk_only) (\root\mysql-8.0.20\storage\innobase\row\row0ins.cc:3095) row_ins_index_entry(dict_index_t * index, dtuple_t * entry, uint32_t & multi_val_pos, que_thr_t * thr) (\root\mysql-8.0.20\storage\innobase\row\row0ins.cc:3286) row_ins_index_entry_step(ins_node_t * node, que_thr_t * thr) (\root\mysql-8.0.20\storage\innobase\row\row0ins.cc:3424) row_ins(ins_node_t * node, que_thr_t * thr) (\root\mysql-8.0.20\storage\innobase\row\row0ins.cc:3542) row_ins_step(que_thr_t * thr) (\root\mysql-8.0.20\storage\innobase\row\row0ins.cc:3666) row_insert_for_mysql_using_ins_graph(const unsigned char * mysql_rec, row_prebuilt_t * prebuilt) (\root\mysql-8.0.20\storage\innobase\row\row0mysql.cc:1585) row_insert_for_mysql(const unsigned char * mysql_rec, row_prebuilt_t * prebuilt) (\root\mysql-8.0.20\storage\innobase\row\row0mysql.cc:1715) ha_innobase::write_row(ha_innobase * const this, uchar * record) (\root\mysql-8.0.20\storage\innobase\handler\ha_innodb.cc:8530) handler::ha_write_row(handler * const this, uchar * buf) (\root\mysql-8.0.20\sql\handler.cc:7837) write_record(THD * thd, TABLE * table, COPY_INFO * info, COPY_INFO * update) (\root\mysql-8.0.20\sql\sql_insert.cc:2111) Sql_cmd_insert_values::execute_inner(Sql_cmd_insert_values * const this, THD * thd) (\root\mysql-8.0.20\sql\sql_insert.cc:621) Sql_cmd_dml::execute(Sql_cmd_dml * const this, THD * thd) (\root\mysql-8.0.20\sql\sql_select.cc:725) mysql_execute_command(THD * thd, bool first_level) (\root\mysql-8.0.20\sql\sql_parse.cc:3471) mysql_parse(THD * thd, Parser_state * parser_state) (\root\mysql-8.0.20\sql\sql_parse.cc:5306) dispatch_command(THD * thd, const COM_DATA * com_data, enum_server_command command) (\root\mysql-8.0.20\sql\sql_parse.cc:1776)
结论:在Repeatable Read隔离级别下,如果进行全表扫描的当前读,那么会锁上表上的所有记录,并且所有的Gap加上Gap锁,杜绝所有的 delete/update/insert 操作。当然在MySQL中,可以触发 semi-consistent read来缓解锁开销与并发影响,但是semi-consistent read本身也会带来其他的问题,不建议使用。
组合九:Serializable
在最后组合中,对于上诉的删除SQL语句,加锁过程和组合八一致。但是,对于查询语句(比如select * from T1 where id = 10)来说,在RC,RR隔离级别下,都是快照读,不加锁。在Serializable隔离级别下,无论是查询语句也会加锁,也就是说快照读不存在了,MVCC降级为Lock-Based CC。
结论:在MySQL/InnoDB中,所谓的读不加锁,并不适用于所有的情况,而是和隔离级别有关。在Serializable隔离级别下,所有的操作都会加锁。
四、其它:
1. 数据库事务ACID特性
数据库事务的4个特性:
原子性(Atomic): 事务中的多个操作,不可分割,要么都成功,要么都失败; All or Nothing.
一致性(Consistency): 事务操作之后, 数据库所处的状态和业务规则是一致的; 比如a,b账户相互转账之后,总金额不变;
隔离性(Isolation): 多个事务之间就像是串行执行一样,不相互影响;
持久性(Durability): 事务提交后被持久化到永久存储.
2. 隔离性
其中 隔离性 分为了四种:
READ UNCOMMITTED:可以读取未提交的数据,未提交的数据称为脏数据,所以又称脏读。此时:幻读,不可重复读和脏读均允许;
READ COMMITTED:只能读取已经提交的数据;此时:允许幻读和不可重复读,但不允许脏读,所以RC隔离级别要求解决脏读;
REPEATABLE READ:同一个事务中多次执行同一个select,读取到的数据没有发生改变;此时:允许幻读,但不允许不可重复读和脏读,所以RR隔离级别要求解决不可重复读;
SERIALIZABLE: 幻读,不可重复读和脏读都不允许,所以serializable要求解决幻读;
3. 几个概念
脏读:可以读取未提交的数据。RC 要求解决脏读;
不可重复读:同一个事务中多次执行同一个select, 读取到的数据发生了改变(被其它事务update并且提交);
可重复读:同一个事务中多次执行同一个select, 读取到的数据没有发生改变(一般使用MVCC实现);RR各级级别要求达到可重复读的标准;
幻读:同一个事务中多次执行同一个select, 读取到的数据行发生改变。也就是行数减少或者增加了(被其它事务delete/insert并且提交)。SERIALIZABLE要求解决幻读问题;
这里一定要区分 不可重复读 和 幻读:
不可重复读的重点是修改:
同样的条件的select, 你读取过的数据, 再次读取出来发现值不一样了
幻读的重点在于新增或者删除:
同样的条件的select, 第1次和第2次读出来的记录数不一样
从结果上来看, 两者都是为多次读取的结果不一致。但如果你从实现的角度来看, 它们的区别就比较大:
对于前者, 在RC下只需要锁住满足条件的记录,就可以避免被其它事务修改,也就是 select for update, select in share mode; RR隔离下使用MVCC实现可重复读;
对于后者, 要锁住满足条件的记录及所有这些记录之间的gap,也就是需要 gap lock。
而ANSI SQL标准没有从隔离程度进行定义,而是定义了事务的隔离级别,同时定义了不同事务隔离级别解决的三大并发问题:
Isolation Level |
Dirty Read |
Unrepeatable Read |
Phantom Read |
Read UNCOMMITTED |
YES |
YES |
YES |
READ COMMITTED |
NO |
YES |
YES |
READ REPEATABLE |
NO |
NO |
YES |
SERIALIZABLE |
NO |
NO |
NO |
4. 数据库的默认隔离级别
除了MySQL默认采用RR隔离级别之外,其它几大数据库都是采用RC隔离级别。
但是他们的实现也是极其不一样的。Oracle仅仅实现了RC 和 SERIALIZABLE隔离级别。默认采用RC隔离级别,解决了脏读。但是允许不可重复读和幻读。其SERIALIZABLE则解决了脏读、不可重复读、幻读。
MySQL的实现:MySQL默认采用RR隔离级别,SQL标准是要求RR解决不可重复读的问题,但是因为MySQL采用了gap lock,所以实际上MySQL的RR隔离级别也解决了幻读的问题。那么MySQL的SERIALIZABLE是怎么回事呢?其实MySQL的SERIALIZABLE采用了经典的实现方式,对读和写都加锁。
5. MySQL 中RC和RR隔离级别的区别
MySQL数据库中默认隔离级别为RR,但是实际情况是使用RC 和 RR隔离级别的都不少。好像淘宝、网易都是使用的 RC 隔离级别。那么在MySQL中 RC 和 RR有什么区别呢?我们该如何选择呢?为什么MySQL将RR作为默认的隔离级别呢?
5.1 RC 与 RR 在锁方面的区别
1> 显然 RR 支持 gap lock(next-key lock),而RC则没有gap lock。因为MySQL的RR需要gap lock来解决幻读问题。而RC隔离级别则是允许存在不可重复读和幻读的。所以RC的并发一般要好于RR;
2> RC 隔离级别,通过 where 条件过滤之后,不符合条件的记录上的行锁,会释放掉(虽然这里破坏了“两阶段加锁原则”);但是RR隔离级别,即使不符合where条件的记录,也不会是否行锁和gap lock;所以从锁方面来看,RC的并发应该要好于RR;另外 insert into t select ... from s where 语句在s表上的锁也是不一样的。
2563 2048 + 512 + 3
LOCK_INSERT_INTENTION
/* Basic lock modes */
enum lock_mode {
LOCK_IS = 0, /* intention shared */
LOCK_IX, /* intention exclusive */
LOCK_S, /* shared */
LOCK_X, /* exclusive */
LOCK_AUTO_INC, /* locks the auto-inc counter of a table
in an exclusive mode */
LOCK_NONE, /* this is used elsewhere to note consistent read */
LOCK_NUM = LOCK_NONE, /* number of lock modes */
LOCK_NONE_UNSET = 255
};