I've been working on this for two days and still haven't gotten this compression algorithm to work~~~~~~

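Here is what I found: a comparison of GIF-LZW and PDF-LZW.

Data source (BitsPerComp)
  GIF-LZW: 1, 2, 4, 8 (bits)
  PDF-LZW: 8 (bits) (text compression)
  Warning: Images in a PDF file that use the LZW algorithm have problems when printed, possibly because the printer driver does not support it.

Use of the clear code (FlushCode)
  GIF-LZW: yes
  PDF-LZW: yes

Use of the end code (EndStream)
  GIF-LZW: yes
  PDF-LZW: no
  Warning: GIF compressed data is stored in a block structure and uses an end code, which makes it easy to tell where decoding ends. PDF compressed text content, by contrast, has no end code.

Bit order
  GIF-LZW: 87654321
  PDF-LZW: 12345678
  Warning: This is the biggest difference between the two kinds of compressed data. It is not the same kind of difference as big-endian versus little-endian; it is explained in more detail below.

Data storage structure
  GIF-LZW: structured
  PDF-LZW: unstructured
  Warning: The GIF compressed data structure is (data bit length, data block length, data content, ----)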


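A few extra remarks:
1. The maximum LZW code length is 12 bits. Once that length would be exceeded, encoding and decoding start over.
2. In the compressed data, both GIF and PDF place a clear code at the start of the data and again whenever the code length is reset. Looking at some LZWDecode compressed data in a PDF file is enough to confirm this.
3. Initial dictionary contents. These depend on the data source: if the source is 3 bits, the dictionary initially holds only codes 0 through (1<<3) + 1, where 8 is the FlushCode and 9 is the EndStreamCode. Although PDF compressed text content does not use the EndStreamCode, the code is still reserved and not used for anything else. (I have not investigated the details very precisely.)

Bit order, explained with an example:
E1: PDF text content compression (LZW); E2: GIF (8 bits).
Data source content (unit: bit):
E1: 10000000,00000110,10000010,10…… (a simple carriage return and line feed)
E2: 00000000,00000001,….. (a simple clear code)
The PDF content can be read quite naturally, but the E2 content has to be transformed as follows:
Reverse the bits in each byte: 00000000,10000000,…
Read one code: 00000000,1
Reverse the code again: 100000000 (the clear code)

It has to be said that LZW compression and decompression are very mature by now. The method is sketched below.

////////////////////////// Compression //////////////////////////
Dictionary[j] ← all n single-character, j = 1, 2, …, n
j ← n+1
Prefix ← read first Character in Charstream
while((C ← next Character) != NULL)
Begin
    If Prefix.C is in Dictionary
        Prefix ← Prefix.C
    else
        Codestream ← cW for Prefix
        Dictionary[j] ← Prefix.C
        j ← j+1
        Prefix ← C
end
Codestream ← cW for Prefix

////////////////////////// Decompression //////////////////////////
Dictionary[j] ← all n single-character, j = 1, 2, …, n
j ← n+1
cW ← first code from Codestream
Charstream ← Dictionary[cW]
pW ← cW
While((cW ← next Code word) != NULL)
Begin
    If cW is in Dictionary
        Charstream ← Dictionary[cW]
        Prefix ← Dictionary[pW]
        C ← first Character of Dictionary[cW]
        Dictionary[j] ← Prefix.C
        j ← j+1
        pW ← cW
    else
        Prefix ← Dictionary[pW]
        C ← first Character of Prefix
        Charstream ← Prefix.C
        Dictionary[j] ← Prefix.C
        j ← j+1
        pW ← cW
end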
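The bit-order difference between E1 and E2 may be easier to see in code. What follows is only a minimal Java sketch under some assumptions: the class and method names (BitOrderDemo, readMsbFirst, readLsbFirst) are made up for illustration, the code width is fixed at 9 bits, and the PDF-style bytes in main were packed by hand from the codes 256, 13, 10 (clear code, CR, LF) rather than copied from E1. Packed this way those three codes come out as 10000000,00000011,01000001,010…, which is not quite the E1 bit string quoted above, so E1 may contain a transcription slip.

```java
final class BitOrderDemo {

    /** PDF style: each code is read starting from the most significant bit of a byte. */
    static int readMsbFirst(byte[] data, int index, int width) {
        int code = 0;
        int start = index * width;              // bit offset of the wanted code
        for (int i = 0; i < width; i++) {
            int pos = start + i;
            int bit = (data[pos / 8] >> (7 - pos % 8)) & 1;
            code = (code << 1) | bit;           // first bit read is the highest bit
        }
        return code;
    }

    /** GIF style: each code is read starting from the least significant bit of a byte. */
    static int readLsbFirst(byte[] data, int index, int width) {
        int code = 0;
        int start = index * width;
        for (int i = 0; i < width; i++) {
            int pos = start + i;
            int bit = (data[pos / 8] >> (pos % 8)) & 1;
            code |= bit << i;                   // first bit read is the lowest bit
        }
        return code;
    }

    public static void main(String[] args) {
        // Hand-packed bytes (an assumption for illustration): 256, 13, 10 as 9-bit codes.
        byte[] pdfStyle = {(byte) 0x80, 0x03, 0x41, 0x40};
        System.out.println(readMsbFirst(pdfStyle, 0, 9) + " "
                + readMsbFirst(pdfStyle, 1, 9) + " "
                + readMsbFirst(pdfStyle, 2, 9));          // prints "256 13 10"

        // The two E2 bytes from the post: 00000000, 00000001.
        byte[] gifStyle = {0x00, 0x01};
        System.out.println(readLsbFirst(gifStyle, 0, 9)); // prints "256"
    }
}
```

Reading least-significant-bit first gives the same result as the three manual steps listed for E2 (reverse each byte, read the code, reverse the code), without actually reversing anything.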

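There is also this: http://lzj0470.javaeye.com/blog/359390
LZW compression algorithm principles and a Java implementation: a Java simulation of the LZW algorithm.

package com.anywhere;
import java.io.*;
/** An implementation of encoding and decoding for the LZW compression algorithm.
* Compresses an existing file (sourcefile) into a target file (targetfile), then reads back the compressed codes;
* this program uses 12-bit codes, so the dictionary can hold at most 2^12 entries;
* the compressed codes can be decompressed to restore the original file;
* the compression ratio for text files is about 60%; Chinese-language input files are not supported yet :)
* @author Lai Yongxuan     2003.3.12
* @version 1.0
*/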
public class lzwCode
{
  /**
  @see Dictionary
  */
   Dictionary dic=new Dictionary();
   /**count1: the bytes of input file,count2:the bytes of output  file
   */
   int count1=0,count2=0;
   /** the max number of the dictionary;
   *this number can be add to the codebuf[] if
   * the file has only odd words to be treated ;
   */
   /** the input file  : character file or coding file
   */
   BufferedInputStream in;
   /** the output file: character file or coding file
   */
   BufferedOutputStream out;
   final short END=4095;

/**the entry of the class,and check the arguments first
@param args array of string arguments
  -c sourceFile [targetFile] create a compressed file
  -d sourceFile [targetFile] decompress a file
@return No return value
@exception No exceptions thrown
*/
  public static void main(String []args)
    {
        if ( args.length<=1 || args.length>4 )
        {
            System.out.println("-c sourceFile [targetFile] [-dic]  create a compressed file\n");
            System.out.println("-d sourceFile [targetFile] [-dic]  decompress a file\n");
        }
        else if(! ( args[0].equals("-c")||args[0].equals("-d") )  )
        {
            System.out.println("-c sourceFile [targetFile]  create a compressed file\n");
            System.out.println("-d sourceFile [targetFile]  decompress a file\n");
        }
        else if(args.length>=2)
        {
          lzwCode a=new lzwCode(args);
          a.run(args);

        }
        return ;
    }


/** the constructor of the class of "lzwCode "
*@param args array of string arguments input at the main()
*
*
*/
    public lzwCode(String []args)
    {


        try{
                String f=new String();
                in =new BufferedInputStream(
                  new FileInputStream(
                   new File(args[1])));
                  if(args.length==3 && !args[2].equals(new String("-dic")))
                  {
                    f=args[2];
                  }
                  else
                  {
                    int i=args[1].lastIndexOf(".");
                    // keep the whole name when the source file has no extension
                    String base=(i<0)?args[1]:args[1].substring(0,i);
                    f=base+((args[0].equals("-c"))?".lzw":".dlzw");
                  }
                 out=new BufferedOutputStream(
                     new FileOutputStream(
                      new File(f)));



          }//try
          catch(FileNotFoundException e )
              {
                System.err.println(e);
                    return;
              }

           catch(IOException e )
            {
                System.err.println(e);
                    return;
            }
    }
/** the entry of the process;
@param String args[]: array of string arguments input at the main()
         BufferedInputStream in: the input charstream file
         BufferedOutputStream out:the output code stream file
* @return No return value
*/
public void run(String args[] )
{

     if(args[0].equals(new String("-c"))   )
          {
            code(in,out);
            }
            else
            {
            decode(in,out);
            }
  if(args[args.length-1].equals(new String("-dic") ))
       System.out.println(dic.toString ());

}
/** input the charstream from a file,and output the code stream to another file
* @param BufferedInputStream in: the input charstream file
         BufferedOutputStream out:the output code stream file
* @return No return value
*/
  public void code(BufferedInputStream in,BufferedOutputStream out)
  {
    System.out.println("coding...\n"+ ".......\n");

    //a: the buffer byte read from the input file, then converted to String
    //buf: the codestream to store in the code file
    //prefix: the prefix string looked up in the dictionary
    //indexbuf[]: the dictionary indexes to be written to the code file
    //str: the prefix plus the current character of the input stream
    byte a[]=new byte[1],buf[]=new byte[3];

    String prefix="",cur="";
    byte i=0;
    short indexbuf[]=new short[2];

    String str=null;
    try{
    short m=0;
    int b; // read() returns an int, so -1 (end of file) stays distinct from the byte 0xFF
    while(  (b=in.read() )  != -1 )
      {
        a[0]=(byte)b;
        cur=new String(a);// be converted
        count1++; // the number of bytes of  input file
        str=prefix;
        str=str.concat(cur);
        m=(short)dic.indexOf(str);

        if( m!=-1)//the prefix is in the dictionary,
        {
            prefix=str;
         }
        else//
        {

            if(i==0)//the first indexbuf,store in codebuf[]
            {
               indexbuf[0]=(short)dic.indexOf(prefix);
               i=1;
            }
            else// now have 2 index number,then ouput to the code file
            {
              indexbuf[1]=(short)dic.indexOf(prefix);
              zipOutput(out,indexbuf);

              count2+=3;//3 bytes stored to the code file
              i=0;
            }

            dic.add(str);
            prefix=cur;

        }//else


      }//while

    //  System.out.println("i="+i);
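      // Flush the last pending prefix; the loop above never writes it out.
      if(prefix.length()>0)
      {
        if(i==0) //an odd number of indexes: pad the pair with the special
                 //index number END (the max number of the dictionary)
        {
          indexbuf[0]=(short)dic.indexOf(prefix);
          indexbuf[1]=END;
        }
        else //the pending first index plus this last one make a full pair
        {
          indexbuf[1]=(short)dic.indexOf(prefix);
        }
        zipOutput(out,indexbuf);
        count2+=3;

      }
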
      in.close ();
      out.close ();


      System.out.println("zip rate:"+(float)count2*100/count1+"% ");
     }catch(IOException e )
            {
                System.err.println(e);
                    return;
            }
       catch(OutDictionaryException e)
       {
               System.err.println(e);
                  return;
        }
              }
/** input the code stream from a file,and output the char stream to another file
* @param BufferedInputStream in: the input code file
         BufferedOutputStream out:the output charstream file
* @return No return value
* @exception No return Exception
*/
  public void decode(BufferedInputStream in,BufferedOutputStream out)
  {
    System.out.println("decoding...\n"+".......\n");

    short precode=0,curcode=0;
    String prefix=null;
    short i=0;
    short bufcode[]=new short[2];//2 code read from the code file
    boolean more=true;//false at the end of the file or on an input error



  //    DataOutputStream out2=new DataOutputStream(out);
    try{

    more=zipInput(in,bufcode);//first input 2 code
    if(more)
    {
     curcode=bufcode[0];
  // out2.writeChars(dic.getString(curcode));
     stringOut(out,dic.getString(curcode) );


    }
    else
     System.out.println("error in the beginning...");
    while(more)
      {
         precode=curcode;


         if(i==0)
         {
          curcode=bufcode[1];
          if(curcode==END) //the first code of this pair was the last real code
              break;
          i=1;
         }
        else
        {
            more=zipInput(in,bufcode);
            if(!more) //end of file: bufcode still holds the old pair, so stop
                break;

            curcode=bufcode[0];
             i=0;
        }


         if(curcode<dic.length())//if the code can be found in the dictionary
         {
        //  out2.writeChars(dic.getString(curcode));
            stringOut(out,dic.getString(curcode) );
            prefix=dic.getString(precode);

            prefix+=(dic.getString(curcode)).substring(0,1);
            dic.add(prefix);

        }
        else
        {
            prefix=dic.getString(precode);
            prefix+=prefix.substring(0,1);
        //  out2.writeChars(prefix);
            stringOut(out,prefix );
            dic.add(prefix);


        }//else
      }//while


      in.close ();
      out.close ();

    }catch( OutDictionaryException e )
            {
               System.err.println(e);
                    return;
               }
  catch(IOException e)
    {
        System.err.println(e);
                    return;
    }

  }

/** output the index number of the dictionary to the code stream;
each index is converted to 12 bits, and 2 short numbers are output at a time
* @param  BufferedOutputStream out:the output charstream  stream file
          short index[]:the 2 short array to be converted to code form
* @return No return value
* @exception No return Exception
*
*
*/
  private void zipOutput(BufferedOutputStream out,short index[])
  {
    try{


    byte buf[]=new byte[3];

    buf[1]=(byte)(index[0]<<4);

    buf[0]=(byte)(index[0]>>4);

    buf[2]=(byte)index[1];
    buf[1]+=(byte)(index[1]>>8);

    out.write(buf,0,3);

    //out put the decoding
//  System.out.println(index[0]+"\t"+index[1]+"\t");

/*     short codebuf[]=new short[2];

    //codebuf[0]=(short)(buf[0]<<4);
    codebuf[0]=toRight(buf[0],4);
    codebuf[0]+=(short)(toRight(buf[1],0)>>4);

    //codebuf[1]=(short)buf[2];
      codebuf[1]=toRight(buf[2],0);
    //codebuf[1]=(byte)(buf[1]<<4);
    byte temp=(byte)(toRight(buf[1],4));

    codebuf[1]+=toRight(temp,4);


   // codebuf[1]+=(short)(buf[1]<<4);

    System.out.println("\t"+codebuf[0]+"\t"+codebuf[1]);
    */
     }catch( IOException e )
       {
            System.err.println(e);
                    return;
       }

  }

/** convert the code stream back to the original form;
* each time deal with 3 bytes,and return 2 index numbers
* @param  BufferedInputStream in :the input code stream file
          short index[]:the 2 short array buffer of index of dictionary
* @return boolean value: true if not at the end of the file and the
          converted code is right; otherwise false
* @exception No return Exception
*
*
*/
  private  boolean  zipInput(BufferedInputStream in,short codebuf[])
  {
    byte buf[]=new byte[3],temp;
    //int intbuf[]=new int[3],temp;
    short le=(short)dic.length();
    try{

    if(in.read(buf,0,3)!=3)
     {
         System.out.println("the end of the file!");
         return false;
      }
    //codebuf[0]=(short)(buf[0]<<4);
    codebuf[0]=toRight(buf[0],4);
    codebuf[0]+=(short)(toRight(buf[1],0)>>4);

    //codebuf[1]=(short)buf[2];
    codebuf[1]=toRight(buf[2],0);
    //codebuf[1]=(byte)(buf[1]<<4);
    temp=(byte)(toRight(buf[1],4));
    codebuf[1]+=toRight(temp,4);
  //  System.out.println(codebuf[0]+"\t"+codebuf[1]);


    if(codebuf[0]<-1 ||codebuf[1]<-1)
      {
        System.out.println("error while getting the code:"+codebuf[0]+"\t"+codebuf[1]);
        System.out.println(dic);
        return false;
      }
    //System.out.println(codebuf[0]+"\t"+codebuf[1]);
}
  catch(IOException e )
        {
             System.err.println(e);
                  return false;
        }
  return true;
}

/**convert a byte number to the short form, treating it as unsigned
  * (ignoring whether the byte is positive or negative), and
  * move its bits n positions toward the high end
  *@param byte:the byte you want to shift
  *            int :the number of bits to shift by
  *@return short :the shifted result
*/
 private short toRight(byte buf,int n)
{
    short s=0;
    for(short i=7;i>=0;i--)
    {
        if( ( (1L<<i)&buf )!=0 )
          s+=(short)(1L<<(i+n));
    }
    return s;
}
/**output the String to a file,but in a "byte" form;
* in order to be exactly the same as the original file, I deal with
*  the file in bytes form
*@param BufferedOutputStream out:the output file
*       String str:the buf of String to be output
*/
   private  void stringOut(BufferedOutputStream out,String str)
   {
      byte a[]=str.getBytes();
      try{
      out.write(a,0,a.length); // use the byte count, not the character count
    }
    catch(IOException e )
        {
             System.err.println(e);

        }

  }
}
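The 12-bit packing that zipOutput and zipInput perform above (two dictionary indexes in three bytes) can also be written with plain masks instead of the bit-by-bit toRight helper. This is only a sketch under assumptions: the class and method names (TwelveBitPairs, pack, unpack) are hypothetical, both codes are assumed to lie in 0..4095, and the byte layout is the one zipOutput appears to use (high 8 bits of the first code, then its low 4 bits plus the high 4 bits of the second code, then the low 8 bits of the second code).

```java
final class TwelveBitPairs {

    /** Packs two 12-bit codes (each 0..4095) into three bytes. */
    static byte[] pack(int first, int second) {
        return new byte[] {
            (byte) (first >> 4),                              // high 8 bits of first
            (byte) (((first & 0x0F) << 4) | (second >> 8)),   // low 4 of first, high 4 of second
            (byte) second                                     // low 8 bits of second
        };
    }

    /** Reverses pack(); "& 0xFF" turns each signed Java byte into 0..255. */
    static int[] unpack(byte[] buf) {
        int b0 = buf[0] & 0xFF;
        int b1 = buf[1] & 0xFF;
        int b2 = buf[2] & 0xFF;
        return new int[] {
            (b0 << 4) | (b1 >> 4),
            ((b1 & 0x0F) << 8) | b2
        };
    }

    public static void main(String[] args) {
        byte[] packed = pack(0xABC, 0x123);
        int[] codes = unpack(packed);
        // prints "AB C1 23 -> 2748, 291"
        System.out.printf("%02X %02X %02X -> %d, %d%n",
                packed[0] & 0xFF, packed[1] & 0xFF, packed[2] & 0xFF,
                codes[0], codes[1]);
    }
}
```

Masking with 0xFF does the same job as toRight: it keeps a byte such as 0xAB from being sign-extended into a negative number when it is widened.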

//Dictionary.java
package com.anywhere;
import java.util.*;
/**the Exception to indicate that the dictionary is too large
*/
class OutDictionaryException extends Exception
      {
        public  String toString()
            {
                return (super.toString ()+"out of the dictionary size!!");
            }
      }
/**
a dictionary that contains at most 2^12 words,and should be initialized
at the beginning; it can be looked up,can be added to,and returns its size
@author :Lai Yongxuan  2002.3.10
@version :1.0
*/
public class Dictionary
{
    /** the container of the dictionary,use ArrayList
    *@see java.util.ArrayList
    */
    ArrayList<String> ar=new ArrayList<String>();

    /**the constructor of the class,and put the 128 ASCII to the dictionary
    */
    public Dictionary()
    {
      // byte i[]=new byte[1];
       char c[]=new char[1];
       for( c[0]=0;c[0]<128;c[0]++)
       {

         ar.add(new String(c));

       }
    }
    /**return the index number of the word in the dictionary
    */
    public int indexOf(String a)
    {
        return ar.indexOf(a);
    }
    /**add a string to the dictionary
    @param String :the word to be added
    @return NO returned value
    @Exception OutDictionaryException is thrown if the dictionary is too
        large ,it only can contain 4096(2^12) words at most
    */
    public void add (String a) throws OutDictionaryException
    {

       if( length()<4096)
           ar.add(a);
       else
       {

          throw(new OutDictionaryException());

       }
   }
    /** the size  of the dictionary
    */
    public int length()
    {

       return (short)ar.size();
    }
    public String toString()
    {
        Integer le=new Integer(length() );

        String str="size of the dictionary: "+le.toString ()+"\n";
        for(int i=0;i<length();i++)
          str+=new String(i+": "+(String)ar.get(i)+"\t");
       return str;
    }
    /** return the word by the index pointer
    */
    public String getString(short i)
    {
        return (String)ar.get(i);
    }
    /** only to test the dictionary
    */
    public static void main(String []args )
    {
        Dictionary a=new Dictionary();
    /* try{

        for(int i=128;i<6000;i++)
        {
            a.add(new String("i am a student") );
        }

      }
      catch(Exception e)
      {

        System.err.println (e.toString());

        }*/
       System.out.println(a);
    }
}
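For comparison with the listing above, here is a much smaller in-memory round trip. It is a sketch of textbook LZW under my own assumptions, not a rewrite of the author's program: the class name LzwRoundTrip is made up, the dictionary starts with all 256 byte values instead of 128 ASCII characters, a HashMap replaces the linear ArrayList.indexOf search, codes are kept as a list of integers instead of being packed into 12 bits, and once the dictionary reaches 4096 entries it simply stops growing instead of throwing.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class LzwRoundTrip {
    static final int MAX_ENTRIES = 4096; // 12-bit codes, as in the listing above

    static List<Integer> compress(byte[] input) {
        Map<String, Integer> dict = new HashMap<>();
        for (int i = 0; i < 256; i++) dict.put(String.valueOf((char) i), i);
        List<Integer> codes = new ArrayList<>();
        String prefix = "";
        for (byte b : input) {
            String cur = String.valueOf((char) (b & 0xFF));
            String candidate = prefix + cur;
            if (dict.containsKey(candidate)) {
                prefix = candidate;                 // keep extending the match
            } else {
                codes.add(dict.get(prefix));        // emit the longest known string
                if (dict.size() < MAX_ENTRIES) dict.put(candidate, dict.size());
                prefix = cur;
            }
        }
        if (!prefix.isEmpty()) codes.add(dict.get(prefix)); // flush the last prefix
        return codes;
    }

    static byte[] decompress(List<Integer> codes) {
        List<String> dict = new ArrayList<>();
        for (int i = 0; i < 256; i++) dict.add(String.valueOf((char) i));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        String previous = null;
        for (int code : codes) {
            String entry;
            if (code < dict.size()) {
                entry = dict.get(code);
            } else if (code == dict.size() && previous != null) {
                entry = previous + previous.charAt(0); // code not in the table yet
            } else {
                throw new IllegalArgumentException("bad code " + code);
            }
            for (int i = 0; i < entry.length(); i++) out.write(entry.charAt(i));
            if (previous != null && dict.size() < MAX_ENTRIES) {
                dict.add(previous + entry.charAt(0)); // rebuild the encoder's entry
            }
            previous = entry;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = "ABABABA".getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        List<Integer> codes = compress(data);
        System.out.println(codes);                  // prints [65, 66, 256, 258]
        System.out.println(new String(decompress(codes),
                java.nio.charset.StandardCharsets.US_ASCII)); // prints ABABABA
    }
}
```

The last code in that output, 258, is not in the decoder's table yet when it arrives, which is exactly the "else" branch of the decompression pseudocode and of decode() above.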

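Here are the principles I found = =

1. Basic principle
    First, a string table is built. Each string that appears for the first time is put into the table and represented by a number; the number is determined by the string's position in the table, and it is written to the compressed file. When the string appears again, it can be replaced by that number, which is again written to the file. After compression is finished, the string table is discarded. For example, if the string "print" is represented by 266 during compression, then every later occurrence is represented by 266, and "print" is stored in the string table. When the decoder encounters the number 266, it can look up the string "print" that 266 stands for. During decompression, the string table can be rebuilt from the compressed data.
2. Implementation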
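  A. Initializing the string table
    When compressing image data, a string table must first be built to record each string that appears for the first time. A string table consists of at least two arrays, one called the current array and one called the prefix array. This works because in a GIF file each basic string usually has length 2 (although the actual string it represents can be hundreds or even thousands of characters long): a basic string consists of the current character and what comes before it, called the prefix. The prefix array stores the leading part of the string and the current array stores its last character, both at the same position, so a single index identifies the basic string stored there. This is why the index can stand in for the basic string during compression. The table usually has 4096 entries (2 to the 12th power), which means one table can hold at most 4096 basic strings. At initialization, depending on the number of colors in the image, the entries at the start of the table are assigned numbers. Normally each element of the current array holds its own index: the first element is 0, the second is 1, the 15th is 14, and so on, up to the element whose position is the number of colors plus 2. With 256 colors, initialization runs through the 258th element, whose value is 257. Here the number 256 is the clear code and 257 is the end-of-image code. The entries after that hold each string that appears for the first time in the file. The prefix array must be initialized as well; its elements can be any value, but usually all bits are set to 1, that is, the initial elements are set to 0xFF. The same number of elements is initialized as in the current array, and the elements after them are used for strings as they first appear. Making the string table larger can further improve compression, but slows down decoding.
  B. Compression method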
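    To understand the compression method, a few terms come first: the character stream, the code stream, the current code, and the current prefix. The character stream is the uncompressed image data from the source image; the code stream is the compressed image data written to the GIF file; the current code is the character just read from the character stream; the current prefix is what came before the character just read.
When a GIF file is compressed, regardless of the image's color depth, the color values are fed to the compressor one byte at a time, with each byte representing one color. Using a whole byte per pixel for 16-color, 4-color, or 2-color source images wastes 4 or more bits (4 bits of a byte are enough for 16 colors), but LZW compression recovers the unused bits. During compression, the first character is read from the character stream as the current prefix, and the second character is read as the current code. The current prefix and current code form the first basic string (for example, prefix A and code B give the string AB). The string table is searched; at this point the string certainly will not be found, so it is written to the table (the current prefix into the prefix array, the current code into the current array), the current prefix is sent to the code stream, and the current code becomes the new current prefix. The next character is then read and becomes the current code, forming a new basic string (if the current code is C, the basic string is BC). The table is searched again. If the string is found, the value in the current prefix is discarded and replaced by the string's position in the table (its index), and the next character is read as the current code to form a new basic string. If it is not found, the string is added to the table and the prefix is output as before. This continues until the whole image is compressed. It follows that, during compression, the values in the prefix array are exactly the codes that appear in the code stream; any code larger than the end code stands for a string, while any code smaller than the number of colors is the color itself.
  C. Clear code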
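    In practice, the string table often has to be initialized several times while compressing one image, because the number of basic strings appearing for the first time in an image often exceeds 4096. During compression, as soon as the number of strings in the table exceeds 4096, the current prefix and current code are written to the code stream, a clear code is added to the code stream, the table is reinitialized, and compression continues as described above.
  D. End code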
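    When all compression is finished, an end-of-image code is written to the code stream. Its value is the number of colors plus 1; in a 256-color file the end code is 257.
  E. Reclaiming byte space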
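    In the code stream written to a GIF file, apart from being stored in data packets, all codes are stored bit by bit, which saves storage space effectively. It is like a 4-bit (16-color) image: stored by byte, only 4 bits of each byte are used and the other 4 are wasted, whereas stored by bit, each byte can hold two color codes. In fact GIF files use a variable number of bits per code. As the compression process shows, the values in the prefix array follow a pattern. In a 256-color GIF, elements 258-511 hold values in the range 0-510, which fit exactly in 9 bits; elements 512-1023 hold values in the range 0-1022, which fit in 10 bits; elements 1024-2047 hold values in the range 0-2046, which fit in 11 bits; and elements 2048-4095 hold values in the range 0-4094, which fit in 12 bits. With variable-width codes, the starting width is the image's color depth plus 1, and the width grows as the number of codes grows, until it reaches 12 bits (at which point the table holds exactly 2 to the 12th power, or 4096, strings). The basic rule is: each time a code is added to the code stream, check whether the table position (index) of the string just stored exceeds 2 to the power of the current width; as soon as it does, the width increases by 1. For example, in a 4-bit image the first codes are stored in 5 bits: the low 5 bits of the first byte hold the first code, the high 3 bits hold the low 3 bits of the second code, the low 2 bits of the second byte hold the high 2 bits of the second code, and so on. For an 8-bit (256-color) image the starting width is 9, so a code spans at least two bytes.
  F. Compression example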
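    Below is an encoding example for a 256-color GIF file. If you look closely you will see what an ingenious encoding this is, and also why the string table is no longer needed once compression is done and can be rebuilt from the code stream during decoding.
Character stream: 1,2,1,1,1,1,2,3,4,1,2,3,4,5,9,…
Current code: 2,1,1,1,1,2,3,4,1,2,3,4,5,9,…
Current prefix: 1,2,1,1,260,1,258,3,4,1,258,262,4,5,…
Current array: 2,1,1,1,3,4,1,4,5,9,…
Array index: 258,259,260,261,262,263,264,265,266,267,…
Code stream: 1,2,1,260,258,3,4,262,4,5,…
    GIF is an important graphics and image file format. Although its encoding rules are quite complex, its compression efficiency is very high, especially for certain images with smooth transitions. And because the image information is preserved completely during compression, it is widely used in today's popular electronic pictures and e-books.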
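The 256-color encoding example above can be replayed with a few lines of Java. This is only a sketch with assumptions of my own: the class name GifLzwExample is hypothetical, the table is a HashMap keyed by (prefix code, current character) in the spirit of the prefix array and current array, and the clear code, the end code, the table reset at 4096 entries, and the bit packing are all left out so that only the table-building logic remains.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class GifLzwExample {
    // 0..255 are the colors, 256 is the clear code, 257 is the end code,
    // so the first free table index is 258.
    static final int FIRST_FREE = 258;

    static List<Integer> encode(int[] pixels) {
        List<Integer> codeStream = new ArrayList<>();
        if (pixels.length == 0) return codeStream;

        // key = (current prefix << 8) | current code, value = table index
        Map<Integer, Integer> table = new HashMap<>();
        int next = FIRST_FREE;
        int prefix = pixels[0];                 // first character is the first prefix

        for (int k = 1; k < pixels.length; k++) {
            int current = pixels[k];
            int key = (prefix << 8) | current;  // the "basic string" prefix + current
            Integer index = table.get(key);
            if (index != null) {
                prefix = index;                 // known string: its index becomes the prefix
            } else {
                codeStream.add(prefix);         // unknown: output the prefix,
                table.put(key, next++);         // store the new basic string,
                prefix = current;               // and restart from the current code
            }
        }
        codeStream.add(prefix);                 // flush the last prefix
        return codeStream;
    }

    public static void main(String[] args) {
        int[] pixels = {1, 2, 1, 1, 1, 1, 2, 3, 4, 1, 2, 3, 4, 5, 9};
        // prints [1, 2, 1, 260, 258, 3, 4, 262, 4, 5, 9]
        System.out.println(encode(pixels));
    }
}
```

The first ten codes match the code stream listed in the example; the trailing 9 appears here only because this input stops after 9 and the last prefix has to be flushed.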
