Olá,
Estou seguindo a implementação do algoritmo de huffman para compactação de arquivos abaixo:
import java.util.*;
abstract class HuffmanTree implements Comparable<HuffmanTree> {
public int frequency; // the frequency of this tree
public HuffmanTree(int freq) { frequency = freq; }
// compares on the frequency
public int compareTo(HuffmanTree tree) {
return frequency - tree.frequency;
}
}
class HuffmanLeaf extends HuffmanTree {
public char value; // the character this leaf represents
public HuffmanLeaf(int freq, char val) {
super(freq);
value = val;
}
}
class HuffmanNode extends HuffmanTree {
public HuffmanTree left, right; // subtrees
public HuffmanNode(HuffmanTree l, HuffmanTree r) {
super(l.frequency + r.frequency);
left = l;
right = r;
}
}
public class HuffmanCode {
// input is an array of frequencies, indexed by character code
public static HuffmanTree buildTree(int[] charFreqs) {
PriorityQueue<HuffmanTree> trees = new PriorityQueue<HuffmanTree>();
// initially, we have a forest of leaves
// one for each non-empty character
for (int i = 0; i < charFreqs.length; i++)
if (charFreqs[i] > 0)
trees.offer(new HuffmanLeaf(charFreqs[i], (char)i));
assert trees.size() > 0;
// loop until there is only one tree left
while (trees.size() > 1) {
// two trees with least frequency
HuffmanTree a = trees.poll();
HuffmanTree b = trees.poll();
// put into new node and re-insert into queue
trees.offer(new HuffmanNode(a, b));
}
return trees.poll();
}
public static void printCodes(HuffmanTree tree, Stack<Character> prefix) {
assert tree != null;
if (tree instanceof HuffmanLeaf) {
HuffmanLeaf leaf = (HuffmanLeaf)tree;
// print out character and frequency
System.out.print(leaf.value + "\t" + leaf.frequency + "\t");
// print out code for this leaf, which is just the prefix
for (char bit : prefix)
System.out.print(bit);
System.out.println();
} else if (tree instanceof HuffmanNode) {
HuffmanNode node = (HuffmanNode)tree;
// traverse left
prefix.push('0');
printCodes(node.left, prefix);
prefix.pop();
// traverse right
prefix.push('1');
printCodes(node.right, prefix);
prefix.pop();
}
}
public static void main(String[] args) {
String arquivo = "C:/teste.txt";
FileInputStream fis = new FileInputStream(arquivo);
BufferedInputStream buffReader = new BufferedInputStream(fis);
DataInputStream data = new DataInputStream(buffReader);
byte[] b = new byte[fis.available()];
data.read(b);
int[] charFreqs = new int[256];
for (char c : new String(b).toCharArray())
{
charFreqs[c]++;
}
int[] charFreqs = new int[256];
// read each character and record the frequencies
for (char c : test.toCharArray())
charFreqs[c]++;
// build tree
HuffmanTree tree = buildTree(charFreqs);
// print out results
System.out.println("SYMBOL\tWEIGHT\tHUFFMAN CODE");
printCodes(tree, new Stack<Character>());
}
}
Dentro deste arquivo teste.txt há uma frase: “Testando Huffman no Java”.
E o mesmo me retorna o seguinte resultado com os caracteres, frequência e os binários:
Caracter Frequência Binários
H 1 0000
v 1 0001
f 2 001
t 1 0100
m 1 0101
3 011
n 3 100
o 2 1010
s 1 10110
J 1 10111
u 1 11000
T 1 11001
d 1 11010
e 1 11011
a 4 111
Até ai tudo certo.
Agora como posso juntar todos estes bytes em ordem de acordo com o texto para salvar em um arquivo compactado?
Por exemplo:
T e s t a n d o = 11001 11011 10110 0100 111 100 11010 1010
Testando = 1100111011101100100111100110101010
E apartir disso gerar um arquivo compactado.
Segui as referências deste site: http://rosettacode.org/wiki/Huffman_coding#Java
Obrigado!