Image compression technology: How is it used and how can its efficiency be improved?

In this blog post, we will look at how digital image compression technology is used in various fields and how its efficiency can be improved.

 

Digital images refer to photographs or pictures expressed in digital form. These digital images are composed of pixels, which are the smallest units of a digital image, and each pixel is assigned a value that represents its brightness, color, and other characteristics. Generally, the higher the number of pixels, the higher the resolution, but the larger the amount of data stored. Therefore, digital image compression technology is necessary to reduce the amount of data in order to efficiently store and transmit these digital images.
There are two types of digital image compression: lossless and lossy. Lossless compression avoids any data loss during the compression process, so although the compression efficiency is lower, the image can be restored to exactly its original state. Lossy compression, on the other hand, removes redundant or perceptually unimportant data, so the image can no longer be restored to its original state, but it achieves compression ratios many times higher than lossless compression and is therefore the more commonly used approach.
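The defining property of lossless compression, exact restoration of the original, can be demonstrated with Python's standard zlib module (the DEFLATE algorithm, also used by the lossless PNG image format). This is a minimal sketch; the byte values stand in for highly redundant image data.

```python
import zlib

# Highly redundant "image" data: two long runs of identical values.
data = bytes([10] * 1000 + [20] * 1000)

packed = zlib.compress(data)
restored = zlib.decompress(packed)

assert restored == data          # lossless: bit-for-bit identical
assert len(packed) < len(data)   # redundant data compresses well
```

A lossy codec like JPEG gives up this exact-restoration guarantee in exchange for much smaller files.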
JPEG, which we commonly use, is a representative digital image file format that applies lossy compression technology. JPEG compression mainly involves preprocessing, DCT, quantization, and encoding. Each of these steps is designed to reduce the image size while maintaining quality close to the original.
The preprocessing stage changes the color model and performs "sampling." First, the color model of the digital image is converted from RGB to YCbCr. The RGB model expresses each pixel's color and brightness as a combination of the three primary colors of light, while the YCbCr model separates the pixel information into Y, which represents brightness, and Cb and Cr, which carry the color information.
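The RGB-to-YCbCr conversion can be written out directly. The sketch below uses the BT.601 full-range coefficients commonly used by JPEG (JFIF) encoders; the function name is mine.

```python
def rgb_to_ycbcr(r, g, b):
    # BT.601 full-range conversion, as commonly used in JPEG (JFIF).
    # Y carries brightness; Cb and Cr carry color, centered on 128.
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr
```

For a pure white pixel (255, 255, 255), all the brightness goes into Y (= 255) while Cb and Cr sit at their neutral value of 128, showing how the model separates brightness from color.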
Once the color model has been converted to YCbCr, sampling is performed to extract only some of the values from the pixels. The human eye is sensitive to changes in brightness but relatively insensitive to changes in color. Sampling therefore keeps every Y value representing brightness, but only some of the Cb and Cr values representing color, within the range where the human eye cannot perceive the color change.
This sampling extracts pixel information at a ratio of J:a:b from blocks of pixels grouped into fixed units. Here, J is the number of horizontal pixels in the block, a is the number of samples taken from the first row, and b is the number taken from the second row. For example, when color information is sampled at a ratio of 4:2:0, two color samples are taken from the first row of a block four pixels wide, and none are taken from the second row. As a result, only two of the eight color values in the 4×2 block are kept, reducing the amount of data.
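The 4:2:0 case above can be sketched as follows. This is my own illustration, not a particular encoder's code, and it assumes each kept sample is the average of a 2×2 neighborhood (another common choice is simply keeping the top-left value of each neighborhood).

```python
def subsample_420(chroma):
    """4:2:0 subsampling of one 4x2 block of Cb (or Cr) values.

    chroma: 2 rows x 4 columns of color values.
    Keeps 2 samples for the whole block (one per 2x2 neighborhood),
    so 8 color values are reduced to 2.
    """
    kept = []
    for col in (0, 2):
        neighborhood = [chroma[row][c]
                        for row in (0, 1)
                        for c in (col, col + 1)]
        kept.append(sum(neighborhood) / 4)  # average of the 2x2 area
    return kept
```

For a block like `[[10, 10, 20, 20], [10, 10, 20, 20]]` this keeps just `[10.0, 20.0]`, a quarter of the original color data.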
After preprocessing, a conversion process called the DCT is performed. The DCT converts the sampled pixel information into frequency components, representing it as data separated regularly by frequency range. For efficiency, the DCT is applied in blocks of 8 pixels horizontally by 8 pixels vertically. When the DCT is performed, the low-frequency components, which indicate small differences between adjacent pixels, gather at the top left of the resulting matrix, while the high-frequency components, which indicate large differences, gather at the bottom right, so the matrix values are laid out according to frequency domain. The low-frequency components generally have larger absolute values than the high-frequency components.
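A direct (unoptimized) implementation of the 2-D type-II DCT used by JPEG looks like this. It is a sketch for clarity, not speed; real encoders use fast factored algorithms, and JPEG also subtracts 128 from each pixel (a level shift) before transforming, which is omitted here.

```python
import math

def dct_8x8(block):
    """2-D type-II DCT of an 8x8 block, as used in JPEG."""
    N = 8
    def c(k):  # normalization factor
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):          # u, v index the frequency domain
        for v in range(N):
            s = 0.0
            for x in range(N):  # x, y index the pixel domain
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out
```

A completely flat block (no differences between adjacent pixels) produces a single large value at the top-left position, the lowest frequency, and zeros everywhere else, which matches the behavior described above.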
Next, the quantization process is performed. In quantization, each matrix value obtained by the DCT is divided by a predetermined constant and then rounded. The low-frequency values are divided by small constants, so they survive with reduced magnitude, while the high-frequency values are divided by large constants, so that after rounding most of them become zero. Because the human eye is sensitive to low-frequency components but much less sensitive to high-frequency ones, this reduces the data volume by shrinking the low-frequency values and discarding the high-frequency ones.
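Quantization itself is a one-line operation per coefficient. In the sketch below the table values are my own illustration (16 for the lowest frequency and 99 for the highest happen to match entries in the JPEG standard's example luminance table, but a real encoder scales full tables by a quality setting).

```python
def quantize(coeffs, qtable):
    """Divide each DCT coefficient by its table entry, then round."""
    return [[round(coeffs[u][v] / qtable[u][v]) for v in range(8)]
            for u in range(8)]

# Illustrative table: small constants at low frequencies (top left),
# large constants at high frequencies (bottom right).
qtable = [[16 + 12 * (u + v) for v in range(8)] for u in range(8)]
```

A large low-frequency coefficient such as 800 divided by 16 survives as 50, while a small high-frequency coefficient divided by a large constant rounds to zero and is discarded, exactly the behavior the paragraph above describes.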
Finally, the encoding process is performed. Encoding expresses the quantized matrix values as binary codes, and Huffman coding is typically used. Huffman coding assigns fewer bits to frequently occurring values and more bits to infrequent ones. Because this step is itself lossless, it reduces the amount of digital image data without discarding any further information.
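A minimal Huffman code builder, sketched with Python's standard heapq module, shows the shorter-codes-for-frequent-values property. This is a generic textbook construction, not JPEG's exact table format (JPEG constrains its Huffman tables in additional ways).

```python
import heapq
from collections import Counter

def huffman_codes(data):
    """Build a prefix-free Huffman code for the symbols in data."""
    freq = Counter(data)
    # Heap entries: [weight, tiebreak, [symbol, code], ...]
    heap = [[w, i, [sym, ""]] for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:              # edge case: one distinct symbol
        heap[0][2][1] = "0"
    while len(heap) > 1:
        lo = heapq.heappop(heap)    # two least-frequent subtrees
        hi = heapq.heappop(heap)
        for pair in lo[2:]:         # prefix 0 onto the lighter subtree
            pair[1] = "0" + pair[1]
        for pair in hi[2:]:         # prefix 1 onto the heavier subtree
            pair[1] = "1" + pair[1]
        heapq.heappush(heap, [lo[0] + hi[0], lo[1], *lo[2:], *hi[2:]])
    return {sym: code for sym, code in heap[0][2:]}
```

For input like `"aaaabbc"`, the frequent symbol `a` receives a 1-bit code while the rarer `b` and `c` receive 2-bit codes, so the encoded stream is shorter overall yet fully decodable.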
Digital image compression technology enables data to be stored and transmitted efficiently through these processes and is used in various fields. For example, in the medical field, it is important to compress high-resolution images without loss for transmission. In addition, lossy compression technology is widely used for fast image transmission and storage space savings on the Internet. Such compression technology helps to make digital images more efficient and will continue to evolve in the future.

 

About the author

Writer

I'm a "Cat Detective." I help reunite lost cats with their families.
I recharge over a cup of café latte, enjoy walking and traveling, and expand my thoughts through writing. By observing the world closely and following my intellectual curiosity as a blog writer, I hope my words can offer help and comfort to others.