# MySQL长事务导致的Table Metadata Lock

## 什么是Table Metadata Lock

在MySQL以前的版本中，存在这样一个bug：

> Description: If user1 has an active transaction on a table and then user2 drops this table, then user1 does COMMIT, then in the binlog we have something like: DROP TABLE t; BEGIN; INSERT INTO t ... ; COMMIT; which is wrong.

MySQL官方文档链接： <http://bugs.mysql.com/bug.php?id=989>

这个bug大致意思是说：当一个会话在主库执行DML操作还没提交时，另一个会话对同一个对象执行了`DDL`操作如`drop table`，而由于MySQL的`binlog`是基于事务提交的先后顺序进行记录的，因此在从库上应用时，就出现了先`drop table`，然后再向`table`中`insert`的情况，导致从库应用出错。

因此，MySQL在5.5.3版本后引入了`Metadata lock`，**只有在事务结束后才会释放Metadata lock，因此在事务提交或回滚前，是无法进行DDL操作的。**

MySQL官方文档位置： <http://dev.mysql.com/doc/refman/5.6/en/metadata-locking.html>

## 遇到的Metadata Lock问题

前几天时候，公司的开发同事让对一张表添加字段，由于该表数据量很大，因此使用了`pt-online-change-schema`工具进行字段的添加，在添加的过程中发现进度非常慢，通过`shop processlist`发现以及积累了大量的`Metadata lock`：**Waiting for table metadata lock**。

这些语句很明显是被添加字段的`DDL`所阻塞，但是`DDL`又是被谁阻塞了呢？

查询当前正在进行的事务：

```
mysql> select * from information_schema.innodb_trx\G
*************************** 1. row ***************************
                    trx_id: 7202
                 trx_state: RUNNING
               trx_started: 2016-07-20 23:02:53
     trx_requested_lock_id: NULL
          trx_wait_started: NULL
                trx_weight: 0
       trx_mysql_thread_id: 52402
                 trx_query: NULL
       trx_operation_state: NULL
         trx_tables_in_use: 0
         trx_tables_locked: 0
          trx_lock_structs: 0
     trx_lock_memory_bytes: 360
           trx_rows_locked: 0
         trx_rows_modified: 0
   trx_concurrency_tickets: 0
       trx_isolation_level: READ COMMITTED
         trx_unique_checks: 1
    trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
 trx_adaptive_hash_latched: 0
 trx_adaptive_hash_timeout: 10000
          trx_is_read_only: 0
trx_autocommit_non_locking: 0
1 row in set (0.00 sec)
```

发现一个正在运行的事务，从`trx_started`字段可以判断出，该事务已经运行了很久，一直没有结束，看来就是这个事务阻塞了添加字段的DDL语句。

根据查询到的`trx_started`时间以及`trx_mysql_thread_id`到MySQL的`general log`中查找，当然前提是开启了`general log`的功能，在`general`日志中对应的时间发现该`thread`执行了语句： `set autocommit=0;` 关闭了自动提交，再往下看，oh my god......下面居然是一堆`SELECT`语句！

好了，终于找到原因，`kill`掉先：

```
mysql> kill 52402;
Query OK, 0 rows affected (0.00 sec)
```

之后便可以正常执行下去了。

## 测试验证

为了进一步确认问题的原因并验证，进行模拟测试：

* 会话1：显式开启事务，执行`SELECT`：

```
mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from test;
+----------+
| date     |
+----------+
| 20150616 |
| 20150617 |
| 20150619 |
+----------+
3 rows in set (0.00 sec)
```

* 会话2：对test表执行DDL：

```
mysql> alter table test add index `date`(`date`);
```

语句被阻塞，show processlist查看状态：

```
mysql> show processlist;
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+-------------------------------------------+
| Id    | User        | Host      | db   | Command | Time   | State                                                                       | Info                                      |
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+-------------------------------------------+
|    16 | system user |           | NULL | Connect | 540155 | Waiting for master to send event                                            | NULL                                      |
|    17 | system user |           | NULL | Connect | 529732 | Slave has read all relay log; waiting for the slave I/O thread to update it | NULL                                      |
| 51673 | root        | localhost | test | Sleep   |     55 |                                                                             | NULL                                      |
| 51681 | root        | localhost | test | Query   |      0 | init                                                                        | show processlist                          |
| 51683 | root        | localhost | test | Query   |     29 | Waiting for table metadata lock                                             | alter table test add index `date`(`date`) |
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+-------------------------------------------+
5 rows in set (0.00 sec)
```

可以看到`alter table`语句的状态为`Waiting for table metadata lock`

* 会话3：对test表进行查询：

`mysql> select * from test;` 同样被阻塞：

```
mysql> show processlist;
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+-------------------------------------------+
| Id    | User        | Host      | db   | Command | Time   | State                                                                       | Info                                      |
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+-------------------------------------------+
|    16 | system user |           | NULL | Connect | 540305 | Waiting for master to send event                                            | NULL                                      |
|    17 | system user |           | NULL | Connect | 529882 | Slave has read all relay log; waiting for the slave I/O thread to update it | NULL                                      |
| 51673 | root        | localhost | test | Sleep   |    205 |                                                                             | NULL                                      |
| 51681 | root        | localhost | test | Query   |      0 | init                                                                        | show processlist                          |
| 51683 | root        | localhost | test | Query   |    179 | Waiting for table metadata lock                                             | alter table test add index `date`(`date`) |
| 51703 | root        | localhost | test | Query   |     18 | Waiting for table metadata lock                                             | select * from test                        |
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+-------------------------------------------+
6 rows in set (0.00 sec)
```

接下来我们将 会话1 的事务提交，效果如下：

* 会话1：

```
mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from test;
+----------+
| date     |
+----------+
| 20150616 |
| 20150617 |
| 20150619 |
+----------+
3 rows in set (0.00 sec)

mysql> commit;
Query OK, 0 rows affected (0.00 sec)
```

* 会话2：

```
mysql> alter table test add index `date`(`date`);
Query OK, 0 rows affected (3 min 49.87 sec)
Records: 0  Duplicates: 0  Warnings: 0
```

* 会话3：

```
mysql> select * from test;
+----------+
| date     |
+----------+
| 20150616 |
| 20150617 |
| 20150619 |
+----------+
3 rows in set (1 min 8.27 sec)
```

* `show processlist`：

```
mysql> show processlist;
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+------------------+
| Id    | User        | Host      | db   | Command | Time   | State                                                                       | Info             |
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+------------------+
|    16 | system user |           | NULL | Connect | 540411 | Waiting for master to send event                                            | NULL             |
|    17 | system user |           | NULL | Connect | 529988 | Slave has read all relay log; waiting for the slave I/O thread to update it | NULL             |
| 51673 | root        | localhost | test | Sleep   |     55 |                                                                             | NULL             |
| 51681 | root        | localhost | test | Query   |      0 | init                                                                        | show processlist |
| 51683 | root        | localhost | test | Sleep   |    285 |                                                                             | NULL             |
| 51703 | root        | localhost | test | Sleep   |    124 |                                                                             | NULL             |
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+------------------+
6 rows in set (0.00 sec)
```

可以看到，当会话1提交事务后，会话2和会话3的语句便可以正常执行了，由于被阻塞的原因，因此执行时间分别为 ( 3 min 49.87 sec ) 、( 1 min 8.27 sec )

## 总结

* 对于纯SELECT操作来说，完全没有必要添加事务，MySQL的`innodb`是基于`MVCC`多版本控制，加事务没有任何意义
* 需要使用到事务时，也要尽量缩小事务的运行时间，一个事务中不要包含太多的语句


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://note.1810.tech/other/mysql/mysqlchang-shi-wu-dao-zhi-de-table-metadata-lock.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
