我有三张表;document, category 和 doc_cat_map(document 和 category 的映射表), document 和 category 的映射关系
存储在doc_cat_map中。
document 中的flag 标志着document 的状态,比如 flag=0 标志着document 的状态为active, flag =1 为inactive 等。
表结构如下:
document:+---------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| flag | tinyint(4) | YES | MUL | NULL | |
+---------------+------------------+------+-----+---------+----------------+category:
+---------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| name | varchar(255) | NO | MUL | NULL | |
| taxonomies_id | int(10) unsigned | NO | MUL | NULL | |
+---------------+------------------+------+-----+---------+----------------+doc_cat_map:
+-------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| category_id | int(10) unsigned | NO | MUL | NULL | |
| doc_id | int(10) unsigned | NO | MUL | NULL | |
+-------------+------------------+------+-----+---------+----------------+ 我现在想得到对应每个category 的 active 和 inactive 的 document 数目, 我的sql 是:
select d.flag as flag, c.id as cid, count(*) as num from category as c left join doc_cat_map as dcm on dcm.category_id = c.id left join document as d on d.id = dcm.doc_id group by cid, flag order by cid ASC;我的意思很明白:把 3个表join 然后group by cid, flag(根据 类别和文档标志分类)
explain 一下:
explain select d.flag as flag, c.id as cid, count(*) as num from category as c left join doc_cat_map as dcm on dcm.category_id = c.id left join document as d on d.id = dcm.doc_id group by cid, flag order by cid ASC;
得到结果是:
+----+-------------+-------+--------+--------------------------------------+--------------------+---------+---------------------------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+--------------------------------------+--------------------+---------+---------------------------+------+---------------------------------+
| 1 | SIMPLE | c | ALL | NULL | NULL | NULL | NULL | 133 | Using temporary; Using filesort |
| 1 | SIMPLE | cdr | ref | category_doc_index,category_id_index | category_doc_index | 4 | wc2_classifier.c.id | 2521 | Using index |
| 1 | SIMPLE | d | eq_ref | PRIMARY | PRIMARY | 4 | wc2_classifier.cdr.doc_id | 1 | |
+----+-------------+-------+--------+--------------------------------------+--------------------+---------+---------------------------+------+---------------------------------+在 extra 中有 using temporary; Using filesort
请问高手:如何优化这个查询?
存储在doc_cat_map中。
document 中的flag 标志着document 的状态,比如 flag=0 标志着document 的状态为active, flag =1 为inactive 等。
表结构如下:
document:+---------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| flag | tinyint(4) | YES | MUL | NULL | |
+---------------+------------------+------+-----+---------+----------------+category:
+---------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| name | varchar(255) | NO | MUL | NULL | |
| taxonomies_id | int(10) unsigned | NO | MUL | NULL | |
+---------------+------------------+------+-----+---------+----------------+doc_cat_map:
+-------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| category_id | int(10) unsigned | NO | MUL | NULL | |
| doc_id | int(10) unsigned | NO | MUL | NULL | |
+-------------+------------------+------+-----+---------+----------------+ 我现在想得到对应每个category 的 active 和 inactive 的 document 数目, 我的sql 是:
select d.flag as flag, c.id as cid, count(*) as num from category as c left join doc_cat_map as dcm on dcm.category_id = c.id left join document as d on d.id = dcm.doc_id group by cid, flag order by cid ASC;我的意思很明白:把 3个表join 然后group by cid, flag(根据 类别和文档标志分类)
explain 一下:
explain select d.flag as flag, c.id as cid, count(*) as num from category as c left join doc_cat_map as dcm on dcm.category_id = c.id left join document as d on d.id = dcm.doc_id group by cid, flag order by cid ASC;
得到结果是:
+----+-------------+-------+--------+--------------------------------------+--------------------+---------+---------------------------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+--------------------------------------+--------------------+---------+---------------------------+------+---------------------------------+
| 1 | SIMPLE | c | ALL | NULL | NULL | NULL | NULL | 133 | Using temporary; Using filesort |
| 1 | SIMPLE | cdr | ref | category_doc_index,category_id_index | category_doc_index | 4 | wc2_classifier.c.id | 2521 | Using index |
| 1 | SIMPLE | d | eq_ref | PRIMARY | PRIMARY | 4 | wc2_classifier.cdr.doc_id | 1 | |
+----+-------------+-------+--------+--------------------------------------+--------------------+---------+---------------------------+------+---------------------------------+在 extra 中有 using temporary; Using filesort
请问高手:如何优化这个查询?
把left join 改成 inner join, 效果是一样的, 这个从explain 就能看出来:
explain select d.flag as flag, c.id as cid, count(*) as num from categories as c inner join cat_doc_relation as cdr on cdr.category_id = c.id inner join documents as d on d.id = cdr.doc_id by cid, flag order by cid ASC;
得到的结果是:
+----+-------------+-------+--------+---------------------------------------------------+--------------------+---------+---------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------------------------------------------+--------------------+---------+---------------------------+------+----------------------------------------------+
| 1 | SIMPLE | c | ALL | PRIMARY,taxonomies_id_index | NULL | NULL | NULL | 100 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | cdr | ref | category_doc_index,doc_id_index,category_id_index | category_doc_index | 4 | wc2_classifier.c.id | 2521 | Using index |
| 1 | SIMPLE | d | eq_ref | PRIMARY | PRIMARY | 4 | wc2_classifier.cdr.doc_id | 1 | |
+----+-------------+-------+--------+---------------------------------------------------+--------------------+---------+---------------------------+------+----------------------------------------------+
基本上没有什么改变。关键是queuy id 为1的type蜕变成 ALL,而且我建的索引(category 那张表),它根本没法用, 是不是group by 的影响?
有没有解决方法?
连接字段上都建索引了。
三张表的索引结构如下:
show index from documents;
+-----------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-----------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| documents | 0 | PRIMARY | 1 | id | A | 282528 | NULL | NULL | | BTREE | |
| documents | 0 | tax_doc_index | 1 | tax_id | A | NULL | NULL | NULL | | BTREE | |
| documents | 0 | tax_doc_index | 2 | scd_id | A | NULL | NULL | NULL | YES | BTREE | |
| documents | 1 | doc_title_index | 1 | title | A | NULL | NULL | NULL | | BTREE | |
| documents | 1 | doc_flag | 1 | flag | A | NULL | NULL | NULL | YES | BTREE | |
+-----------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+show index from categories;
+------------+------------+---------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+------------+------------+---------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
| categories | 0 | PRIMARY | 1 | id | A | 128 | NULL | NULL | | BTREE | |
| categories | 0 | category_name_index | 1 | taxonomies_id | A | NULL | NULL | NULL | | BTREE | |
| categories | 0 | category_name_index | 2 | name | A | 128 | NULL | NULL | | BTREE | |
| categories | 1 | tax_id_index | 1 | taxonomies_id | A | 1 | NULL | NULL | | BTREE | |
+------------+------------+---------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+
show index from cat_doc_relation;
+------------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+------------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| cat_doc_relation | 0 | PRIMARY | 1 | id | A | 252089 | NULL | NULL | | BTREE | |
| cat_doc_relation | 0 | category_doc_index | 1 | doc_id | A | NULL | NULL | NULL | | BTREE | |
| cat_doc_relation | 0 | category_doc_index | 2 | category_id | A | 252089 | NULL | NULL | | BTREE | |
| cat_doc_relation | 1 | category_id_index | 1 | category_id | A | 128 | NULL | NULL | | BTREE | |
+------------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+该有的索引都有了 ,速度依然不快。
explian select d.flag as flag, c.id as cid from categories as c inner join cat_doc_relation as cdr on cdr.category_id = c.id inner join documents as d on d.id = cdr.doc_id by cid, flag ;
explian select d.flag as flag, c.id as cid from categories as c inner join cat_doc_relation as cdr on cdr.category_id = c.id inner join documents as d on d.id = cdr.doc_id by cid, flag order by cid ASC;
看一下差异。
count(*) as num, d.flag as flag from categories as c left join cat_doc_relation as cdr on cdr.category_id = c.id left join documents as d on d.id = cdr.doc_id where c.taxonomies_id = 1 group by cid, flag ;
+----+-------------+-------+--------+----------------------------------+-------------------+---------+---------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+----------------------------------+-------------------+---------+---------------------------+------+----------------------------------------------+
| 1 | SIMPLE | c | ALL | category_name_index,tax_id_index | NULL | NULL | NULL | 97 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | cdr | ref | category_id_index | category_id_index | 4 | wc2_classifier.c.id | 1969 | |
| 1 | SIMPLE | d | eq_ref | PRIMARY | PRIMARY | 4 | wc2_classifier.cdr.doc_id | 1 | |
+----+-------------+-------+--------+----------------------------------+-------------------+---------+---------------------------+------+----------------------------------------------explain select c.id as cid, c.name as name, count(*) as num, d.flag as flag from categories as c left join cat_doc_relation as cdr on cdr.category_id = c.id left join documents as d on d.id = cdr.doc_id where c.taxonomies_id = 1 group by cid, flag order by cid ASC;
+----+-------------+-------+--------+----------------------------------+-------------------+---------+---------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+----------------------------------+-------------------+---------+---------------------------+------+----------------------------------------------+
| 1 | SIMPLE | c | ALL | category_name_index,tax_id_index | NULL | NULL | NULL | 97 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | cdr | ref | category_id_index | category_id_index | 4 | wc2_classifier.c.id | 1969 | |
| 1 | SIMPLE | d | eq_ref | PRIMARY | PRIMARY | 4 | wc2_classifier.cdr.doc_id | 1 | |
+----+-------------+-------+--------+----------------------------------+-------------------+---------+---------------------------+------+----------------------------------------------+从 explain 的结果看,两者差异不大
事实上, order by cid, cid 是 category 的primary key, 肯定是索引了。这是什么原因?