数据仓库中有种拉链算法来实现历史数据的存储,一般是通过hash比较来实现的,现在实际问题如下,请教各位,如果不用传统的拉链算法,还有什么别的办法实现这个功能。
    member表,字段  member_id,member_status,start_date,end_date
                     A,enabled,2010-03-04,2010-03-05
                     A,diabled,2010-03-06,2010-03-08
                     A,tbd,   2010-03-09,2010-04-15
                     A,delete, 2010-04-16,2010-06-07
                     A,enabled,2010-06-08,3000-12-31
                     B,diabled,2010-03-02,2010-03-08
                     B,enabled,2010-03-09,2010-04-01
                     B,unknown,2010-06-08,2010-04-10
                     B,enabled,2010-04-11,2010-04-30
                     B,tbd,    2010-05-01,3000-12-31
   product表,字段  product_id,member_id,product_status,start_date,end_date
                     001,A,online, 2010-03-04,2010-03-21
                     001,A,wait,   2010-03-22,2010-03-26
                     001,A,enabled,2010-03-26,2010-04-11
                     001,A,online, 2010-04-12,2010-06-05
                     001,A,online, 2010-06-06,3000-12-31
                     002,B,delete, 2010-03-02,2010-03-19
                     002,B,online, 2010-03-20,2010-04-12
                     002,B,delete, 2010-04-13,2010-06-03
                     002,B,enabled,2010-06-04,3000-12-31意思就是每个会员每段时间的状态不一样,产品每段时间状态也不一样,现在通过member_id关联得到产品所属会员的状态信息,同样以时间段来区分product_status和member_status,就像两条时间轴交叉。结果要求如下:
                     product_id,product_status,member_id,member_status,start_date,end_date
                     001,online,A,enabled,2010-03-04,2010-03-05
                     001,online,A,diabled,2010-03-06,2010-03-08
                     001,online,A,tbd,   2010-03-09,2010-03-21
                     001,wait,  A,tbd,    2010-03-22,2010-03-26
                     ……请问这个该如何实现?
    建表脚本如下:
create table tab_member(member_id varchar2(20),member_status varchar2(20),start_date date,end_date date);
create table tab_product(product_id varchar2(20),member_id varchar2(20),product_status varchar2(20),start_date date,end_date date);
                    
insert into tab_member  values('A','enabled',to_date('2010-03-04','yyyy-mm-dd'),to_date('2010-03-05','yyyy-mm-dd'));
insert into tab_member  values('A','diabled',to_date('2010-03-06','yyyy-mm-dd'),to_date('2010-03-08','yyyy-mm-dd'));
insert into tab_member  values('A','tbd'    ,to_date('2010-03-09','yyyy-mm-dd'),to_date('2010-04-15','yyyy-mm-dd'));
insert into tab_member  values('A','delete' ,to_date('2010-04-16','yyyy-mm-dd'),to_date('2010-06-07','yyyy-mm-dd'));
insert into tab_member  values('A','enabled',to_date('2010-06-08','yyyy-mm-dd'),to_date('3000-12-31','yyyy-mm-dd'));
insert into tab_member  values('B','diabled',to_date('2010-03-02','yyyy-mm-dd'),to_date('2010-03-08','yyyy-mm-dd'));
insert into tab_member  values('B','enabled',to_date('2010-03-09','yyyy-mm-dd'),to_date('2010-04-01','yyyy-mm-dd'));
insert into tab_member  values('B','unknown',to_date('2010-06-08','yyyy-mm-dd'),to_date('2010-04-10','yyyy-mm-dd'));
insert into tab_member  values('B','enabled',to_date('2010-04-11','yyyy-mm-dd'),to_date('2010-04-30','yyyy-mm-dd'));
insert into tab_member  values('B','tbd'    ,to_date('2010-05-01','yyyy-mm-dd'),to_date('3000-12-31','yyyy-mm-dd'));
insert into tab_product values('001','A','online' , to_date('2010-03-04','yyyy-mm-dd'),to_date('2010-03-21','yyyy-mm-dd'));
insert into tab_product values('001','A','wait'   , to_date('2010-03-22','yyyy-mm-dd'),to_date('2010-03-26','yyyy-mm-dd'));
insert into tab_product values('001','A','enabled', to_date('2010-03-26','yyyy-mm-dd'),to_date('2010-04-11','yyyy-mm-dd'));
insert into tab_product values('001','A','online' , to_date('2010-04-12','yyyy-mm-dd'),to_date('2010-06-05','yyyy-mm-dd'));
insert into tab_product values('001','A','online' , to_date('2010-06-06','yyyy-mm-dd'),to_date('3000-12-31','yyyy-mm-dd'));
insert into tab_product values('002','B','delete' , to_date('2010-03-02','yyyy-mm-dd'),to_date('2010-03-19','yyyy-mm-dd'));
insert into tab_product values('002','B','online' , to_date('2010-03-20','yyyy-mm-dd'),to_date('2010-04-12','yyyy-mm-dd'));
insert into tab_product values('002','B','delete' , to_date('2010-04-13','yyyy-mm-dd'),to_date('2010-06-03','yyyy-mm-dd'));
insert into tab_product values('002','B','enabled', to_date('2010-06-04','yyyy-mm-dd'),to_date('3000-12-31','yyyy-mm-dd'));
commit;
      

解决方案 »

  1.   

    -- try it ..
    SQL> select /*+ ordered use_hash(a b) */
      2         b.product_id,
      3         b.product_status,
      4         b.member_id,
      5         a.member_status,
      6         greatest(a.start_date,b.start_date) as start_date,
      7         least(a.end_date,b.end_date) as end_date
      8  from   tab_member a,
      9         tab_product b
     10  where  a.member_id = b.member_id
     11  and    (a.start_date between b.start_date and b.end_date
     12   or    b.start_date between a.start_date and a.end_date
     13   or    a.end_date between b.start_date and b.end_date
     14   or    b.end_date between a.start_date and a.end_date)
     15  order by 1,3,5,6
     16  ;
     
    PRODUCT_ID           PRODUCT_STATUS       MEMBER_ID            MEMBER_STATUS        START_DATE  END_DATE
    -------------------- -------------------- -------------------- -------------------- ----------- -----------
    001                  online               A                    enabled              2010-3-4    2010-3-5
    001                  online               A                    diabled              2010-3-6    2010-3-8
    001                  online               A                    tbd                  2010-3-9    2010-3-21
    001                  wait                 A                    tbd                  2010-3-22   2010-3-26
    001                  enabled              A                    tbd                  2010-3-26   2010-4-11
    001                  online               A                    tbd                  2010-4-12   2010-4-15
    001                  online               A                    delete               2010-4-16   2010-6-5
    001                  online               A                    delete               2010-6-6    2010-6-7
    001                  online               A                    enabled              2010-6-8    3000-12-31
    002                  delete               B                    diabled              2010-3-2    2010-3-8
    002                  delete               B                    enabled              2010-3-9    2010-3-19
    002                  online               B                    enabled              2010-3-20   2010-4-1
    002                  online               B                    enabled              2010-4-11   2010-4-12
    002                  delete               B                    enabled              2010-4-13   2010-4-30
    002                  delete               B                    tbd                  2010-5-1    2010-6-3
    002                  enabled              B                    tbd                  2010-6-4    3000-12-31
    002                  enabled              B                    unknown              2010-6-8    2010-4-10
    002                  online               B                    unknown              2010-6-8    2010-4-10
     
    18 rows selected
      

  2.   

    牛,LS的思想很简洁,结果应该也没问题,等我验证一下就OK。