分别有2个表:region1和region2 每个表中包含sampleID,chromosome,start,stop等四列;
需求是查找在两个表中满足 region1中的记录与region2中的记录位置交集(overlap)长度均达到或超过他们自身长度的50%以上的记录。下面是我写的sql语句,可以执行,但是速度比较慢。求优化sql
谢谢!
select region1.*,region2.*
from region1,region2 
where region1.sampleID=region2.sampleID and region1.chromosome=region2.chromosome
and

( #region1的头与region2的尾相交
(region1.`start` between region2.`start` and region2.`stop`) and (region2.`stop` between region1.`start` and region1.`stop`) 
and (region2.`stop`-region1.`start`+1)/(region2.`stop`-region2.`start`+1)>=0.5
 and (region2.`stop`-region1.`start`+1)/(region1.`stop` -region1.`start`+1)>=0.5
)
OR
(#region1的尾与region2的头相交
(region1.`stop` between region2.`start` and region2.`stop`) and (region2.`start` between region1.`start` and region1.`stop`) 
and (region1.`stop`-region2.`start`+1)/(region2.`stop`-region2.`start`+1)>=0.5
and (region1.`stop`-region2.`start`+1)/(region1.`stop` -region1.`start`+1)>=0.5

OR
(#region1落在region2中
(region1.`start` between region2.`start` and region2.`stop`) and (region1.`stop` between region2.`start` and region2.`stop`) 
and (region1.`stop`-region1.`start`+1)/(region2.`stop`-region2.`start`+1)>=0.5
and (region1.`stop`-region1.`start`+1)/(region1.`stop` -region1.`start`+1)>=0.5
)
OR
(#region2落在region1中
(region2.`start` between region1.`start` and region1.`stop`) and (region2.`stop` between region1.`start` and region1.`stop`) 
and (region2.`stop`-region2.`start`+1)/(region2.`stop`-region2.`start`+1)>=0.5
and (region2.`stop`-region2.`start`+1)/(region1.`stop` -region1.`start`+1)>=0.5
)
)