E2compr 0.4 User Manual – Clusters


E2compr works by conceptually dividing a file into sections of n blocks (where n is constant for a given file). We call these sections "clusters", and n the "cluster size". E2compr replaces non-hole data in a cluster with blocks of compressed data followed by one or more holes. Suppose we have a file that looks like this (11 blocks of data), and that the cluster size is 4 blocks:

    block#
    ------
     [0]    data
     [1]    data
     [2]    data
     [3]    data
     [4]    data
     [5]    data
     [6]    data
     [7]    data
     [8]    data
     [9]    data
    [10]    data

If we want to compress the file, we cut the file into clusters and compress every cluster individually:

    block#
    ------
     [0]    data    <- cluster #0
     [1]    data
     [2]    data
     [3]    data
     [4]    data    <- cluster #1
     [5]    data
     [6]    data
     [7]    data
     [8]    data    <- cluster #2
     [9]    data
    [10]    data
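The division into clusters is plain integer arithmetic: block b belongs to cluster b // n, and the last cluster is shorter when the file length is not a multiple of n. A minimal sketch in Python follows; the helper names `cluster_of` and `split_into_clusters` are hypothetical and chosen for illustration only, they are not part of E2compr.

```python
def cluster_of(block, cluster_size):
    """Return the index of the cluster that contains the given block number."""
    return block // cluster_size


def split_into_clusters(num_blocks, cluster_size):
    """Return a list of clusters, each a list of block numbers.

    The last cluster is shorter when num_blocks is not a multiple of
    cluster_size.
    """
    return [
        list(range(start, min(start + cluster_size, num_blocks)))
        for start in range(0, num_blocks, cluster_size)
    ]


if __name__ == "__main__":
    # The example from the text: 11 blocks, cluster size 4.
    for index, blocks in enumerate(split_into_clusters(11, 4)):
        print(f"cluster #{index}: blocks {blocks[0]}..{blocks[-1]}")
```

For the 11-block file this yields clusters covering blocks 0..3, 4..7 and 8..10, matching the diagram.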

Suppose cluster #0 compresses into 1 block, cluster #1 into 2 blocks and cluster #2 into 2 blocks. We just replace the uncompressed blocks with the compressed ones, at the same place, and remove the unneeded blocks, which become holes. The file then looks like this:

    block#  bytes
    ------  -----
     [0]    compressed data    <- cluster #0
     [1]    missing block
     [2]    missing block
     [3]    missing block
     [4]    compressed data    <- cluster #1
     [5]    compressed data
     [6]    missing block
     [7]    missing block
     [8]    compressed data    <- cluster #2
     [9]    compressed data
    [10]    missing block

The file now uses only 5 blocks on the disk.
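The block count in this example can be checked with a short sketch. It only labels blocks; it does not perform any compression. The compressed sizes (1, 2 and 2 blocks) are taken from the example, and the function names `compressed_layout` and `blocks_used` are hypothetical, not part of E2compr.

```python
def compressed_layout(num_blocks, cluster_size, compressed_sizes):
    """Return one label per block: 'compressed data' or 'missing block'.

    compressed_sizes[i] is the number of blocks cluster i occupies after
    compression. This is an assumed input: the real value depends on the
    data and on the compression algorithm. The sketch does not model
    clusters that are left uncompressed.
    """
    layout = []
    for index, start in enumerate(range(0, num_blocks, cluster_size)):
        # The last cluster may hold fewer than cluster_size blocks.
        length = min(cluster_size, num_blocks - start)
        used = compressed_sizes[index]
        if not 1 <= used <= length:
            raise ValueError(f"cluster #{index}: invalid compressed size {used}")
        # Compressed blocks come first, the remainder of the cluster is holes.
        layout += ["compressed data"] * used + ["missing block"] * (length - used)
    return layout


def blocks_used(layout):
    """Count the blocks that still occupy disk space."""
    return layout.count("compressed data")


if __name__ == "__main__":
    layout = compressed_layout(11, 4, [1, 2, 2])
    for block, label in enumerate(layout):
        print(f"[{block}] {label}")
    print("blocks used:", blocks_used(layout))  # blocks used: 5
```

The file keeps its length of 11 blocks; only the number of allocated blocks drops, from 11 to 1 + 2 + 2 = 5.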
