Tuesday, June 4, 2019
Modified Huffman Coding Schemes Information Technology Essay
CHAPTER 2

Document compression is a digital process. Therefore, before compressing the data, information about the document should be known. The CCITT algorithms deal with a page of size 8.5 x 11 inches. The page is divided into horizontal and vertical lines. The horizontal lines are known as scan lines.

Dots per inch and pixels per inch are two standards for image resolution. An 8.5 x 11 inch page is 1728 x 2200 pixels, so one scan line is 1728 pixels long. The normal resolution is 200 x 100 dpi and the fine resolution is 200 x 200 dpi.

Figure 2.1

Each pixel is represented by 1 bit, so the number of bits that form the above page is 1728 x 2200 = 3,801,600. Sending this data uncompressed over a 9600 bps line takes approximately 7 minutes. If the resolution of the page is increased, the time taken by the transmission increases as well. Hence it is not practical to transfer every bit of the binary page information as it is.

The encoding most commonly used for CCITT compression is Modified Huffman (MH), which is supported by all the fax compression techniques. The other options are Modified Read (MR) and Modified Modified Read (MMR). The following table gives an overview of these encoding/decoding techniques.

Characteristics          MH                 MR                                          MMR
Compression efficiency   Good               Better                                      Best
Standard                 T.4                T.4                                         T.6
Dimension                1-D                2-D                                         2-D (extended)
Algorithm                Huffman and RLE    Similarities between two successive lines   More efficient MR

Table 2.1 Comparison of MH, MR and MMR

2.1.1 Modified Huffman

Fax pages contain many runs of white and black pixels, which makes run-length encoding (RLE) efficient for reducing these runs to their lengths. The run lengths are then coded with Huffman coding. Combining RLE with Huffman coding gives an efficient and simple algorithm, and this combination is known as Modified Huffman.
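The run-length step can be illustrated with a short Python sketch. This is only an illustration of the idea, not the encoder defined in T.4: the function name `run_lengths` and the 0/1 pixel representation are assumptions made for the example. It follows the fax convention that every line starts with a white run, so a line that begins with black gets a leading white run of length 0.

```python
from itertools import groupby

WHITE, BLACK = 0, 1  # assumed pixel values for this sketch

def run_lengths(scan_line):
    """Return alternating white/black run lengths, always starting with white."""
    runs = [len(list(group)) for _, group in groupby(scan_line)]
    # A line that starts with black is given a leading white run of length 0,
    # so that the colours of the runs can be inferred from their position.
    if scan_line and scan_line[0] == BLACK:
        runs.insert(0, 0)
    return runs

# A 1728-pixel line: 4 black, 3 white, 2 black, 1719 white.
line = [BLACK] * 4 + [WHITE] * 3 + [BLACK] * 2 + [WHITE] * 1719
runs = run_lengths(line)
print(runs)       # [0, 4, 3, 2, 1719]
print(sum(runs))  # 1728
```

Five small numbers now stand for 1728 pixels, which is what makes the later Huffman step worthwhile.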
The run lengths are coded with terminating and makeup codes, and MH coding uses specified tables for both. Terminating codes represent shorter runs while makeup codes represent longer runs. White and black pixel runs from 0 to 63 are represented by terminating codes. Runs greater than 63 are represented with makeup codes, which are defined in multiples of 64 pixels, and are completed by a terminating code. These tables are given in chapter 4. A scan line with a long run gives a makeup code for the largest multiple of 64 that is less than or equal to the pixel run, and the difference is then given by the terminating code. The following example will help in understanding how it works.

There are three different types of bit pattern in MH coding:

- Pixel information (data)
- Fill
- EOL

The term Fill refers to the extra 0 bits that are added to a short data line to fill the space left in the data. The Fill pattern brings a highly compressed scan line up to a preferred minimum scan line time (MSLT), which makes it complete and transmittable. Consider a transmission rate of 4800 bps with an MSLT of 10 ms, so that the minimum number of bits per scan line is 48. A 1728-pixel scan line is compressed to 31 data bits, which with the 12 EOL bits gives a total of 43 bits. The remaining space is filled by 5 Fill bits, as follows:

Scan line (1728 pixels), RLE code:  4B  3W  2B  1719W

Bit pattern:  00110101 011 1000 11 01100001011000   00000       000000000001
              31 data bits                          fill bits   EOL (12 bits)
Total: 43 bits + 5 fill bits = 48 bits

Figure 2.2 Modified Huffman structure

The leading 00110101 is the code for a white run of length 0, which is needed because the line starts with a black run. The 1719-pixel white run is coded as the makeup code for 1664 (011000) followed by the terminating code for 55 (01011000).

In addition to this, another special bit pattern used in MH coding is EOL. EOL patterns have several different identification functions:

- An EOL at the start of a scan line indicates the start of that scan line.
- An EOL at the end of a scan line consists of eleven 0s followed by a 1.
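The Fill arithmetic from the MSLT example above can be checked with a small Python sketch. The helper names `min_bits_per_line` and `fill_bits_needed` are hypothetical, and the split of the data bits into individual code words reflects a reading of the T.4 code tables rather than anything stated by the figure itself.

```python
def min_bits_per_line(bit_rate_bps, mslt_ms):
    """Minimum number of bits one coded scan line must occupy on the wire."""
    return bit_rate_bps * mslt_ms // 1000

def fill_bits_needed(data_bits, eol_bits, bit_rate_bps, mslt_ms):
    """Number of 0 bits to insert between the data and the EOL."""
    shortfall = min_bits_per_line(bit_rate_bps, mslt_ms) - (data_bits + eol_bits)
    return max(0, shortfall)

EOL = "0" * 11 + "1"  # eleven 0s followed by a 1

# Code words for 0W, 4B, 3W, 2B, then 1719W as makeup 1664 + terminating 55.
data = "00110101" + "011" + "1000" + "11" + "011000" + "01011000"

fill = "0" * fill_bits_needed(len(data), len(EOL), 4800, 10)
coded_line = data + fill + EOL
print(len(data), len(fill), len(coded_line))  # 31 5 48
```

A line that is already longer than the minimum needs no Fill, which is why the helper clamps the result at zero.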
The EOL code helps stop an error in one scan line from propagating into other scan lines, since each line is coded independently.

At the end of each page an RTC signal is given, which holds six EOL patterns and identifies the end of the page.

MODIFIED READ

MR is also known as Modified Relative Element Address Designate (READ). MR exploits the correlation between successive lines. Because of the high resolution of the images, two consecutive lines usually differ by only a small number of pixel transitions. Using this property, instead of coding each scan line on its own as done in MH, MR takes a reference line into account and then encodes each scan line that follows relative to it. In fact it is more appropriate to say that MR is a more complex MH algorithm.

MR encoding encompasses both the MH and MR coding techniques. The reference line is encoded using MH and the subsequent lines are encoded using MR until the next reference line appears. How often a new reference line occurs is determined by a parameter K, whose value depends on the resolution of the image.

MR is a 2-dimensional algorithm. The value of K defines the number of lines that use 2-dimensional coding, which is K-1 lines, whereas the reference line, coded with the MH algorithm, is 1-dimensional. For a normal resolution image the value of K is set to 2, so a reference line is encoded every second scan line. For a higher resolution image the value of K is set to 4, so a reference line is MH encoded every fourth line, which makes the coding more complex and more compressed.
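The effect of K on the sequence of lines can be sketched in a few lines of Python. The function name `line_coding_modes` is an assumption made for this illustration; it only labels lines and does not perform any encoding.

```python
def line_coding_modes(num_lines, k):
    """Label each scan line 'MH' (1-D reference line) or 'MR' (2-D line)."""
    if k < 1:
        raise ValueError("K must be at least 1")
    # Every K-th line, starting with the first, is an MH-coded reference line;
    # the K-1 lines after it are MR-coded.
    return ["MH" if i % k == 0 else "MR" for i in range(num_lines)]

print(line_coding_modes(4, 2))  # ['MH', 'MR', 'MH', 'MR']
print(line_coding_modes(8, 4))  # ['MH', 'MR', 'MR', 'MR', 'MH', 'MR', 'MR', 'MR']
```

With K = 2 an error can affect at most one dependent line before the next reference line resets the coding; with K = 4 it can affect up to three.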
The following figure shows the scan lines for K set to 2 and to 4.

MH  MR  MH  MR
2 scan lines per group. For normal resolution, K = 2: 1 MH line, 1 MR line.

MH  MR  MR  MR  MH  MR  MR  MR
4 scan lines per group. For higher resolution, K = 4: 1 MH line, 3 MR lines.

Figure 2.3 Modified Read structure

The advantage of the lower value of K is that error propagation into subsequent lines is reduced, because fewer scan lines depend on each reference line. However, in MR encoding the value of K can be set as high as 24.

The change between two subsequent lines, i.e. the reference line and the next scan line, as used by MR is shown below.

Reference line:         b1   b2
Scan line:         a0   a1   a2

Figure 2.4 MR 2-D coding

The elements shown in the figure above are described as follows:

- a0 is the starting changing element on the coding line, which is also the reference for the next changing elements
- a1 is the first transition on the coding line to the right of a0
- a2 is the second transition on the coding line, to the right of a1
- b1 is the first transition on the reference line to the right of a0 that has the opposite color to a0
- b2 is the next transition on the reference line to the right of b1

In the above figure the reference line is coded with MH coding while the next scan line is coded with MR. It can be seen that there are only very minor changes between the two scan lines. MR takes advantage of these minor changes and encodes only the changing elements a0, a1 and a2 instead of the complete scan line. There are three encoding modes in MR, which decide how to code these changing elements of the scan line with respect to the reference line. These modes are:

- Pass mode
- Vertical mode
- Horizontal mode

It is these different modes that make MR a more complex algorithm. The MR modes are discussed in detail in chapter 3, after which one can refer back to this part to understand it completely.
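To make the changing elements concrete, here is a minimal Python sketch that locates a1, a2, b1 and b2 for a given a0. It does not choose between pass, vertical and horizontal mode. The function names and the convention of returning the line width when no further transition exists are assumptions of this sketch.

```python
WHITE, BLACK = 0, 1  # assumed pixel values for this sketch

def changing_elements(line):
    """Positions whose colour differs from the previous pixel on the same line.
    An imaginary white pixel is assumed just before the start of the line."""
    positions = []
    previous = WHITE
    for i, pixel in enumerate(line):
        if pixel != previous:
            positions.append(i)
        previous = pixel
    return positions

def next_change(changes, after, width):
    """First changing element strictly to the right of `after`, else `width`."""
    return next((p for p in changes if p > after), width)

def reference_points(ref_line, coding_line, a0, a0_colour):
    """Return (a1, a2, b1, b2) for the current position a0."""
    width = len(coding_line)
    coding_changes = changing_elements(coding_line)
    ref_changes = changing_elements(ref_line)
    a1 = next_change(coding_changes, a0, width)
    a2 = next_change(coding_changes, a1, width)
    # b1 lies to the right of a0 and has the colour opposite to a0.
    b1 = next((p for p in ref_changes if p > a0 and ref_line[p] != a0_colour), width)
    b2 = next_change(ref_changes, b1, width)
    return a1, a2, b1, b2

ref_line    = [0, 0, 1, 1, 1, 0, 0, 0]
coding_line = [0, 0, 0, 1, 1, 1, 0, 0]
# At the start of a line a0 sits just before the first pixel and is white.
print(reference_points(ref_line, coding_line, -1, WHITE))  # (3, 6, 2, 5)
```

In this example a1 is only one pixel to the right of b1, which is the kind of small offset that the 2-D coding exploits.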
The structure of MR is given as follows:

EOL+1 | Data 1-D | Fill | EOL+0 | Data 2-D | EOL+1 | Data 1-D | Fill | EOL+0 | Data 2-D | EOL+1 EOL+1 EOL+1 EOL+1 EOL+1 EOL+1

K = 2
EOL+1: MH coding of next line
EOL+0: MR coding of next line
FILL: extra 0 bits
RTC: end of page with 6 EOLs

Figure 2.5 Structure of MR data in a page

Modified Modified Read

ITU-T Recommendation T.6 gives the Modified Modified Read (MMR) encoding algorithm. MMR is an upgraded version of MR. Both are 2-dimensional algorithms, but MMR is an extended version of the 2-dimensional coding. The fundamentals of MMR are the same as those of MR apart from a few minor changes to the algorithm, and the modes of MR, i.e. pass mode, vertical mode and horizontal mode, are the same for MMR encoding.

The major change in MMR with respect to MR is the K parameter. The MMR algorithm does not use the K parameter or a recurring reference line. Instead, the MMR algorithm uses an imaginary scan line consisting of all white pixels as the first line at the start of each page, and 2-dimensional lines follow until the end of the page. This introduced all-white scan line plays the role of the reference line, as in MR.

An error in MMR has a very high probability of propagating because all the scan lines are coded in a connected way. Thus ECM must be enabled for MMR, and ECM guarantees error-free MMR data. MMR therefore does not require any EOL. However, an EOFB (end of facsimile block) is required at the end of the page, which plays the same role as RTC in MH. The organization of data in MMR and the EOFB bit sequence are given as follows.

Data 2-D | Data 2-D | Data 2-D | ... | Data 2-D | EOFB
(scan lines of page)

EOFB bit sequence: 000000000001 000000000001

Figure 2.6 Scan lines in MMR page

Tagged Image File Format

Tagged Image File Format (TIFF) is purely a graphical format, i.e. pixelated, bitmap or rasterized. TIFF is a common file format that is found in most imaging programs.
The discussion here mainly covers TIFF Revision 6.0, which is the latest. Revision 6.0 includes all the specifications of the earlier versions with a few additions. TIFF is flexible and powerful, but at the same time it is more complex. The extensibility of TIFF makes it more difficult to design for and to understand. TIFF is, as its name says, a tagged file that holds information about the image. The TIFF structure is organized into three parts:

- Image file header (IFH)
- Bitmap data (black and white pixels)
- Image file directory (IFD)

IFH | Bitmap data | IFD | EOB

Figure 2.7 File organization of TIFF

Consider an example of three TIFF file structures. These three structures hold the same data in three possible arrangements. The IFH, or header, of the TIFF comes first in all three arrangements. In the first arrangement the IFDs are written first and are followed by the image data, which is efficient if the IFD data needs to be read quickly. In the second structure each IFD is followed by its particular image, which is the most common internal structure of a TIFF. In the last example the image data is followed by its IFDs. This structure is applicable if the image data is available before the IFDs.

Header | IFD 0 | IFD 1 | IFD n | Image 0 | Image 1 | Image n
Header | IFD 0 | Image 0 | IFD 1 | Image 1 | IFD n | Image n
Header | Image 0 | Image 1 | Image n | IFD 0 | IFD 1 | IFD n

Figure 2.8 Different TIFF structures

Image File Header

A TIFF file header is an 8-byte structure at the start of a TIFF file. The bytes are organized in the following order:

- The first two bytes define the byte order, which is either little endian (II) or big endian (MM). Little endian byte order runs from the least significant byte to the most significant byte, and big endian is the reverse.
  II = 4949H
  MM = 4D4DH
- The third and fourth bytes hold the value 42 (002AH), which identifies the file as a TIFF file.
- The next four bytes hold the offset value for the first IFD.
The IFD may be at any location after the header but must begin on a word boundary.

Byte order | 42 | Byte offset for IFD

Figure 2.9 IFH structure

Image File Directory

An image file directory (IFD) is made up of 12-byte entries that hold information about the image, including the color, type of compression, length, width, physical dimension, location of the data and other such information.

The IFD starts with a 2-byte tag counter, which holds the number of IFD entries used. It is followed by the 12-byte IFD entries, and after the last entry by four bytes that hold the offset of the next IFD, or four 0 bytes if there is none. Each IFD entry has the following format:

- The first two bytes of the entry hold the identification field. This field tells which property of the image the entry describes. It is also known as the tag.
- The next two bytes give the type of the entry, i.e. short, long etc.
- The next four bytes hold the count of values of the defined type.
- The last four bytes hold the value offset, which is always an even number because the value it points to begins on a word boundary. This value offset can point anywhere in the file, even after the image data.

The IFD entries are sorted in ascending order according to the tag number. Thus a TIFF field is a logical entity which consists of a tag number and its value.

Tag entry count                   2 bytes
Tag 0                             12 bytes
Tag 1                             12 bytes
Tag n                             12 bytes
Next IFD offset or null bytes     4 bytes

Figure 2.10 IFD structure

The IFD is the basic tag structure that holds information about the image data in a complete TIFF file. The data is either found in the IFD itself or retrieved from an offset location pointed to by the IFD. This use of offsets to other locations, instead of fixed positions, makes TIFF more complex.
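The header and IFD layout described above can be illustrated with Python's standard `struct` module. This is a minimal sketch under stated assumptions: the function name `parse_tiff` is hypothetical, only the first IFD is read, the last four bytes of each entry are returned as a raw number without deciding whether they are a value or an offset, and the sample bytes are a hand-built fragment for illustration rather than a complete, valid TIFF file.

```python
import struct

def parse_tiff(data):
    """Parse the 8-byte header and the first IFD of a TIFF byte string."""
    byte_order = data[:2]
    if byte_order == b"II":
        endian = "<"  # little endian
    elif byte_order == b"MM":
        endian = ">"  # big endian
    else:
        raise ValueError("unknown byte-order mark")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("missing TIFF identifier 42")
    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    entries = []
    for i in range(entry_count):
        # Each 12-byte entry: tag (2), type (2), count (4), value or offset (4).
        entry = struct.unpack_from(endian + "HHII", data, ifd_offset + 2 + 12 * i)
        entries.append(entry)
    # The four bytes after the last entry give the next IFD offset (0 if none).
    (next_ifd,) = struct.unpack_from(endian + "I", data, ifd_offset + 2 + 12 * entry_count)
    return entries, next_ifd

# Hand-built sample: header, then one IFD with two LONG (type 4) entries.
sample = (
    b"II" + struct.pack("<HI", 42, 8)        # byte order, 42, first IFD at byte 8
    + struct.pack("<H", 2)                   # tag entry count
    + struct.pack("<HHII", 256, 4, 1, 1728)  # tag 256 (ImageWidth)
    + struct.pack("<HHII", 257, 4, 1, 2200)  # tag 257 (ImageLength)
    + struct.pack("<I", 0)                   # no further IFD
)
entries, next_ifd = parse_tiff(sample)
print(entries)   # [(256, 4, 1, 1728), (257, 4, 1, 2200)]
print(next_ifd)  # 0
```

A real reader would also have to look at the type and count of each entry to decide whether the last four bytes hold the value itself or an offset to it.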
The offset values in TIFF are found in three places:

- The last four bytes of the header, which indicate the position of the first IFD.
- The last four bytes of the IFD, which give the offset of the next IFD.
- The last four bytes of a tag entry, which may contain an offset to the data it represents or possibly the data itself.

Figure 2.11

CCITT compression

This type of compression is used for facsimile and document imaging files. It is a lossless type of image compression. The CCITT (International Telegraph and Telephone Consultative Committee) is an organization which provides standards for communication protocols for sending black and white images over telephone or other low data rate lines. The standards given by the ITU are T.4 and T.6. These standards are the CCITT Group 3 and Group 4 compression methods respectively. The CCITT group compression algorithms are designed specifically for encoding 1-bit images. CCITT is a non-adaptive compression algorithm: the CCITT algorithms use fixed tables. The coded values in these tables were taken from a reference set of documents containing both text and graphics.

CCITT algorithms typically compress an image to well under a quarter of its original size. The compression ratio achieved with Group 3 for a 200 x 200 dpi image is 5:1 to 8:1, which is much increased with Group 4, up to 15:1 at the same image resolution. However, the complexity of the algorithms increases with the compression ratio. Thus Group 4 is much more complex than Group 3.

The CCITT algorithms are specifically designed for typed or handwritten scanned images. Other images, whose composition differs from that of the target documents, will have different runs of black and white pixels. Such bi-level images will therefore not give the required results when compressed. The compression will either be minimal, or the compressed image will even be greater in size than the original image.
Such images can achieve a ratio of 3:1 at most, which is very low given that the time taken by the compression algorithm is very high.

The CCITT has three algorithms for compressing bi-level images:

- Group 3 one dimensional
- Group 3 two dimensional
- Group 4 two dimensional

When Group 3 one dimensional was first designed, it was targeted at the bi-level, black and white data processed by fax machines. Group 3 encoding and decoding tend to be fast and have a reputation for a very high compression ratio. Error correction in a Group 3 algorithm is done by the algorithm itself and no extra hardware is required. This is done with special data inside the Group 3 decoder. Group 3 makes use of the MH algorithm to encode.

MMR encoding tends to be much more efficient. Hence Group 4 achieves a much higher compression than Group 3, producing almost half the size of Group 3 data, but it is a much more time-consuming algorithm. The complexity of such an algorithm is much higher than that of Group 3. Group 4 also has no built-in error detection, so errors propagate, and special hardware configuration would be required for this purpose. This makes it a poor choice for image transfer protocols.

Document imaging systems that store these images have adopted CCITT compression algorithms to save disk space. Even in an age of good processing speeds and ample memory, CCITT encoding is still needed for printing and viewing data, as done with Adobe files.
Likewise, the transmission of data through modems with lower data rates still requires these algorithms.

Group 3 One Dimensional (G31D)

The main features of G31D are given as follows:

- G31D is a variation of Huffman encoding known as Modified Huffman encoding.
- G31D encodes a bi-level image of black and white pixels, with black pixels given by 1s and white pixels by 0s in the bitmap.
- G31D encodes the length of a run of same-colored pixels in a scan line with variable length binary codes.
- The variable length binary codes are taken from predefined tables, separate for black and white pixels.

The variable code tables are defined in the T.4 and T.6 specifications of the ITU-T. These tables were determined by taking a number of typed and handwritten documents, which were statistically analyzed to show the average frequency of these bi-level pixel runs. Run lengths occurring more frequently were assigned short codes while the others were given longer codes.

As G31D is the MH coding scheme explained earlier in the chapter, we will give some examples of how the coding is carried out for longer runs of the same pixel. The code tables have continuous values from 0 to 63, which are single terminating codes, while greater values are coded with the addition of makeup codes for the same pixel color, used only for values that are not in the terminating table for that color. A run from 64 to 2623 will have one makeup code and one terminating code, while a run greater than 2623 will have multiple makeup codes. Hence we have two types of tables: one from 0 to 63 and the other from 64 to 2560. The latter table was selected by statistical analysis as explained above.

Consider a run of 20 black pixels. It is less than the limit of 63 in the terminating table, so we look up the value 20 in the black pixel table, which gives 00001101000. This is the terminating code for the 20-pixel black run, and at 11 bits it is about half the size of the original 20 bits.
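The way a run is divided between makeup and terminating codes can be sketched in Python. The sketch works on run lengths only and deliberately does not reproduce the T.4 code tables; the function name `split_run` is hypothetical.

```python
def split_run(run_length):
    """Split a run into makeup lengths (multiples of 64) and one terminating length."""
    if run_length < 0:
        raise ValueError("run length must be non-negative")
    makeups = []
    remaining = run_length
    # Runs above 2623 (= 2560 + 63) need more than one makeup code.
    while remaining > 2623:
        makeups.append(2560)
        remaining -= 2560
    # One makeup code covers the largest multiple of 64 not exceeding the rest.
    if remaining >= 64:
        makeup = (remaining // 64) * 64
        makeups.append(makeup)
        remaining -= makeup
    # What is left (0 to 63) is covered by the terminating code.
    return makeups, remaining

print(split_run(20))    # ([], 20)
print(split_run(1719))  # ([1664], 55)
print(split_run(8800))  # ([2560, 2560, 2560, 1088], 32)
```

Each length returned here would then be looked up in the black or white table, depending on the color of the run, to obtain the actual bit pattern.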
Thus a ratio of about 2:1 is achieved.

Let us take a white pixel run of 120, which is greater than 63 and is not present in the terminating code table. Here we need a makeup code and a terminating code. The pixel run can be broken into 64, which is the largest makeup value that does not exceed this run, and 56, which together give the 120-pixel run:

120 = 64 + 56
64 coded value is 11011
56 coded value is 01011001

Hence 120 is coded as the makeup code 11011 followed by the terminating code 01011001, as shown in figure 2.11a.

Now consider a bigger run of black pixels, which is 8800. This can be given as a sum of 4 makeup codes and one terminating code:

8800 = 2560 + 2560 + 2560 + 1088 + 32

which is 000000011111, 000000011111, 000000011111, 0000001110101 and 000001101010, so it can be given as shown in figure 2.11b.

11011          01011001
Makeup code    Terminating code

Figure 2.11a Makeup and terminating codes for 120

000000011111   000000011111   000000011111   0000001110101   000001101010
makeup         makeup         makeup         makeup          terminating

Figure 2.11b Makeup and terminating codes for 8800

Group 3 Two Dimensional (G32D)

Group 4 Two Dimensional (G42D)