Digital image processing

{{short description|Algorithmic processing of digitally-represented images}}

{{About|mathematical processing of digital images|artistic processing of images|Image editing|compression algorithms|Image compression}}

{{Use dmy dates|date=January 2022}}

{{redirect-distinguish|Image processing|Analog image processing}}

Digital image processing is the use of a digital computer to process digital images through an algorithm.{{Cite journal |doi = 10.1109/MSP.2018.2832195|title = What is a Signal? [Lecture Notes]|journal = IEEE Signal Processing Magazine|volume = 35|issue = 5|pages = 175–177|year = 2018|last1 = Chakravorty|first1 = Pragnan| bibcode=2018ISPM...35e.175C |s2cid = 52164353}}{{cite book | last=Gonzalez | first=Rafael | title=Digital image processing | publisher=Pearson | location=New York, NY | year=2018 | isbn=978-0-13-335672-4 | oclc=966609831 }} As a subcategory or field of digital signal processing, digital image processing has many advantages over analog image processing. It allows a much wider range of algorithms to be applied to the input data and can avoid problems such as the build-up of noise and distortion during processing. Since images are defined over two dimensions (perhaps more), digital image processing may be modeled in the form of multidimensional systems. The generation and development of digital image processing have been driven mainly by three factors: first, the development of computers;{{Cite journal |last1=Nagornov |first1=Nikolay N. |last2=Lyakhov |first2=Pavel A. |last3=Bergerman |first3=Maxim V. |last4=Kalita |first4=Diana I. |date=2024 |title=Modern Trends in Improving the Technical Characteristics of Devices and Systems for Digital Image Processing |journal=IEEE Access |volume=12 |pages=44659–44681 |doi=10.1109/ACCESS.2024.3381493 |issn=2169-3536|doi-access=free |bibcode=2024IEEEA..1244659N }} second, the development of mathematics (especially the creation and improvement of discrete mathematics theory);{{Cite journal |last1=Yamni |first1=Mohamed |last2=Daoui |first2=Achraf |last3=Abd El-Latif |first3=Ahmed A. |date=February 2024 |title=Efficient color image steganography based on new adapted chaotic dynamical system with discrete orthogonal moment transforms |url=https://linkinghub.elsevier.com/retrieve/pii/S0378475424000351 |journal=Mathematics and Computers in Simulation |volume=225 |pages=1170–1198 |language=en |doi=10.1016/j.matcom.2024.01.023|url-access=subscription }} and third, the growing demand for a wide range of applications in the environment, agriculture, the military, industry, and medical science.{{Cite journal |last=Hung |first=Che-Lun |date=2020-05-28 |title=Computational Algorithms on Medical Image Processing |url=https://www.eurekaselect.com/180828/article |journal=Current Medical Imaging |language=en |volume=16 |issue=5 |pages=467–468 |doi=10.2174/157340561605200410144743|pmid=32484080 |url-access=subscription |doi-access=free }}

History

{{See|Digital image#History|Digital imaging#History}}

Many of the techniques of digital image processing, or digital picture processing as it was often called, were developed in the 1960s at Bell Laboratories, the Jet Propulsion Laboratory, the Massachusetts Institute of Technology, the University of Maryland, and a few other research facilities, with application to satellite imagery, wire-photo standards conversion, medical imaging, videophone, character recognition, and photograph enhancement.Azriel Rosenfeld, Picture Processing by Computer, New York: Academic Press, 1969 The purpose of early image processing was to improve the quality of an image for human viewers, that is, to improve its visual effect. In image processing, the input is a low-quality image, and the output is an image with improved quality. Common image processing tasks include image enhancement, restoration, encoding, and compression. The first successful application was at the American Jet Propulsion Laboratory (JPL), which used image processing techniques such as geometric correction, gradation transformation, and noise removal on the thousands of lunar photos sent back by the space probe Ranger 7 in 1964, taking into account the position of the Sun and the environment of the Moon. The successful computer mapping of the Moon's surface was a landmark result. Later, more complex image processing was performed on the nearly 100,000 photos sent back by the spacecraft, yielding a topographic map, a color map, and a panoramic mosaic of the Moon, which achieved extraordinary results and laid a solid foundation for the human landing on the Moon.{{Cite book|title=Digital image processing|last=Gonzalez, Rafael C.|date=2008|publisher=Prentice Hall|others=Woods, Richard E. (Richard Eugene), 1954–|isbn=978-0-13-168728-8|edition= 3rd|location=Upper Saddle River, N.J.|pages=23–28|oclc=137312858}}

The cost of processing was fairly high, however, with the computing equipment of that era. That changed in the 1970s, when digital image processing proliferated as cheaper computers and dedicated hardware became available. This led to images being processed in real-time, for some dedicated problems such as television standards conversion. As general-purpose computers became faster, they started to take over the role of dedicated hardware for all but the most specialized and computer-intensive operations. With the fast computers and signal processors available in the 2000s, digital image processing has become the most common form of image processing, and is generally used because it is not only the most versatile method, but also the cheapest.

=Image sensors=

{{Main|Image sensor}}

The basis for modern image sensors is metal–oxide–semiconductor (MOS) technology,{{cite book |last1=Williams |first1=J. B. |title=The Electronics Revolution: Inventing the Future |date=2017 |publisher=Springer |isbn=978-3-319-49088-5 |pages=245–8 |url=https://books.google.com/books?id=v4QlDwAAQBAJ&pg=PA245}} invented at Bell Labs between 1955 and 1960.{{Cite patent|number=US2802760A|title=Oxidation of semiconductive surfaces for controlled diffusion|gdate=1957-08-13|invent1=Lincoln|invent2=Frosch|inventor1-first=Derick|inventor2-first=Carl J.|url=https://patents.google.com/patent/US2802760A}}{{Cite journal |last1=Frosch |first1=C. J. |last2=Derick |first2=L |date=1957 |title=Surface Protection and Selective Masking during Diffusion in Silicon |url=https://iopscience.iop.org/article/10.1149/1.2428650 |journal=Journal of the Electrochemical Society |language=en |volume=104 |issue=9 |pages=547 |doi=10.1149/1.2428650|url-access=subscription }}{{Cite journal |last=KAHNG |first=D. |date=1961 |title=Silicon-Silicon Dioxide Surface Device |url=https://doi.org/10.1142/9789814503464_0076 |journal=Technical Memorandum of Bell Laboratories |pages=583–596 |doi=10.1142/9789814503464_0076 |isbn=978-981-02-0209-5|url-access=subscription }}{{Cite book |last=Lojek |first=Bo |title=History of Semiconductor Engineering |date=2007 |publisher=Springer-Verlag Berlin Heidelberg |isbn=978-3-540-34258-8 |location=Berlin, Heidelberg |page=321}}{{Cite journal |last1=Ligenza |first1=J.R. |last2=Spitzer |first2=W.G. |date=1960 |title=The mechanisms for silicon oxidation in steam and oxygen |url=https://linkinghub.elsevier.com/retrieve/pii/0022369760902195 |journal=Journal of Physics and Chemistry of Solids |language=en |volume=14 |pages=131–136 |bibcode=1960JPCS...14..131L |doi=10.1016/0022-3697(60)90219-5|url-access=subscription }}{{cite book |last1=Lojek |first1=Bo |title=History of Semiconductor Engineering |date=2007 |publisher=Springer Science & Business Media |isbn=9783540342588 |page=120}} This led to the development of digital semiconductor image sensors, including the charge-coupled device (CCD) and later the CMOS sensor.

The charge-coupled device was invented by Willard S. Boyle and George E. Smith at Bell Labs in 1969.{{Cite book | title = Scientific charge-coupled devices | author = James R. Janesick | publisher = SPIE Press | year = 2001 | isbn = 978-0-8194-3698-6 | pages = 3–4 | url = https://books.google.com/books?id=3GyE4SWytn4C&pg=PA3 }} While researching MOS technology, they realized that an electric charge was the analogy of the magnetic bubble and that it could be stored on a tiny MOS capacitor. As it was fairly straightforward to fabricate a series of MOS capacitors in a row, they connected a suitable voltage to them so that the charge could be stepped along from one to the next. The CCD is a semiconductor circuit that was later used in the first digital video cameras for television broadcasting.{{cite journal|last1=Boyle|first1=William S|last2=Smith|first2=George E.|date=1970|title=Charge Coupled Semiconductor Devices|journal=Bell Syst. Tech. J.|volume=49|issue=4|pages=587–593|doi=10.1002/j.1538-7305.1970.tb01790.x|bibcode=1970BSTJ...49..587B }}

The NMOS active-pixel sensor (APS) was invented by Olympus in Japan during the mid-1980s. This was enabled by advances in MOS semiconductor device fabrication, with MOSFET scaling reaching smaller micron and then sub-micron levels.{{cite book |last1=Fossum |first1=Eric R. |author1-link=Eric Fossum |title= Charge-Coupled Devices and Solid State Optical Sensors III |series=Proceedings of the SPIE |volume=1900 |date=12 July 1993 |doi=10.1117/12.148585 |pages=2–14 |editor1-last=Blouke |editor1-first=Morley M.|citeseerx=10.1.1.408.6558 |bibcode=1993SPIE.1900....2F |chapter=Active pixel sensors: Are CCDS dinosaurs? |s2cid=10556755 }}{{cite web |last1=Fossum |first1=Eric R. |s2cid=18831792 |author1-link=Eric Fossum |title=Active Pixel Sensors |website=Eric Fossum |year=2007 |url=http://ericfossum.com/Publications/Papers/Active%20Pixel%20Sensors%20LASER%20FOCUS.pdf |archive-url=https://web.archive.org/web/20190829162855/http://ericfossum.com/Publications/Papers/Active%20Pixel%20Sensors%20LASER%20FOCUS.pdf |archive-date=2019-08-29 |url-status=live}} The NMOS APS was fabricated by Tsutomu Nakamura's team at Olympus in 1985.{{cite journal |last1=Matsumoto |first1=Kazuya |last2=Nakamura |first2=Tsutomu |last3=Yusa |first3=Atsushi |last4=Nagai |first4=Shohei |display-authors=1|date=1985 |title=A new MOS phototransistor operating in a non-destructive readout mode |journal=Japanese Journal of Applied Physics |volume=24 |issue=5A |page=L323|doi=10.1143/JJAP.24.L323 |bibcode=1985JaJAP..24L.323M |s2cid=108450116 }} The CMOS active-pixel sensor (CMOS sensor) was later developed by Eric Fossum's team at the NASA Jet Propulsion Laboratory in 1993.{{cite journal |last1=Fossum |first1=Eric R. |author1-link=Eric Fossum |last2=Hondongwa |first2=D. B. |title=A Review of the Pinned Photodiode for CCD and CMOS Image Sensors |journal=IEEE Journal of the Electron Devices Society |date=2014 |volume=2 |issue=3 |pages=33–43 |doi=10.1109/JEDS.2014.2306412 |doi-access=free }} By 2007, sales of CMOS sensors had surpassed CCD sensors.{{cite news |title=CMOS Image Sensor Sales Stay on Record-Breaking Pace |url=http://www.icinsights.com/news/bulletins/CMOS-Image-Sensor-Sales-Stay-On-RecordBreaking-Pace/ |access-date=6 October 2019 |work=IC Insights |date=8 May 2018 |archive-url=https://web.archive.org/web/20190621180401/http://www.icinsights.com/news/bulletins/CMOS-Image-Sensor-Sales-Stay-On-RecordBreaking-Pace/ |archive-date=21 June 2019 |url-status=live }}

MOS image sensors are widely used in optical mouse technology. The first optical mouse, invented by Richard F. Lyon at Xerox in 1980, used a 5{{nbsp}}μm NMOS integrated circuit sensor chip.{{cite book |last1=Lyon |first1=Richard F. |title=Advances in Embedded Computer Vision |date=2014 |publisher=Springer |isbn=9783319093871 |pages=3–22 (3) |chapter=The Optical Mouse: Early Biomimetic Embedded Vision |author1-link=Richard F. Lyon |chapter-url=https://books.google.com/books?id=p_GbBQAAQBAJ&pg=PA3}}{{cite book |last1=Lyon |first1=Richard F. |title=VLSI Systems and Computations |date=August 1981 |publisher=Computer Science Press |isbn=978-3-642-68404-3 |editor1=H. T. Kung |pages=1–19 |chapter=The Optical Mouse, and an Architectural Methodology for Smart Digital Sensors |doi=10.1007/978-3-642-68402-9_1 |author1-link=Richard F. Lyon |editor2=Robert F. Sproull |editor3=Guy L. Steele |chapter-url=http://bitsavers.trailing-edge.com/pdf/xerox/parc/techReports/VLSI-81-1_The_Optical_Mouse.pdf |archive-url=https://web.archive.org/web/20140226021235/http://bitsavers.trailing-edge.com/pdf/xerox/parc/techReports/VLSI-81-1_The_Optical_Mouse.pdf |archive-date=2014-02-26 |url-status=live |s2cid=60722329}} Since the first commercial optical mouse, the IntelliMouse introduced in 1999, most optical mouse devices use CMOS sensors.{{cite web |last1=Brain |first1=Marshall |last2=Carmack |first2=Carmen |date=24 April 2000 |title=How Computer Mice Work |url=https://computer.howstuffworks.com/mouse4.htm |access-date=9 October 2019 |website=HowStuffWorks |language=en}}{{cite web |last1=Benchoff |first1=Brian |date=17 April 2016 |title=Building the First Digital Camera |url=http://hackaday.com/2016/04/17/building-the-first-digital-camera/ |access-date=30 April 2016 |website=Hackaday |quote=the Cyclops was the first digital camera}}

=Image compression=

{{Main|Image compression}}

An important development in digital image compression technology was the discrete cosine transform (DCT), a lossy compression technique first proposed by Nasir Ahmed in 1972.{{cite journal |last=Ahmed |first=Nasir |author-link=N. Ahmed |title=How I Came Up With the Discrete Cosine Transform |journal=Digital Signal Processing |date=January 1991 |volume=1 |issue=1 |pages=4–5 |doi=10.1016/1051-2004(91)90086-Z |bibcode=1991DSP.....1....4A |url=https://www.scribd.com/doc/52879771/DCT-History-How-I-Came-Up-with-the-Discrete-Cosine-Transform |access-date=10 October 2019 |archive-url=https://web.archive.org/web/20160610013109/https://www.scribd.com/doc/52879771/DCT-History-How-I-Came-Up-with-the-Discrete-Cosine-Transform |archive-date=10 June 2016 |url-status=live |url-access=subscription }} DCT compression became the basis for JPEG, which was introduced by the Joint Photographic Experts Group in 1992.{{cite web |title=T.81 – Digital compression and coding of continuous-tone still images – requirements and guidelines |url=https://www.w3.org/Graphics/JPEG/itu-t81.pdf |publisher=CCITT |date=September 1992 |access-date=12 July 2019 |archive-url=https://web.archive.org/web/20190717052727/http://www.w3.org/Graphics/JPEG/itu-t81.pdf |archive-date=17 July 2019 |url-status=live }} JPEG compresses images down to much smaller file sizes, and has become the most widely used image file format on the Internet.{{cite web |title=The JPEG image format explained |url=https://home.bt.com/tech-gadgets/photography/what-is-a-jpeg-11364206889349 |publisher=BT Group |first1= Joe |last1=Svetlik |access-date=5 August 2019 |date=31 May 2018 |archive-url=https://web.archive.org/web/20190805194553/https://home.bt.com/tech-gadgets/photography/what-is-a-jpeg-11364206889349 |archive-date=5 August 2019 |url-status=dead }} Its highly efficient DCT compression algorithm was largely responsible for the wide proliferation of digital images and digital photos,{{cite web |date=24 September 2013 |title=What Is a JPEG? The Invisible Object You See Every Day |url=https://www.theatlantic.com/technology/archive/2013/09/what-is-a-jpeg-the-invisible-object-you-see-every-day/279954/ |first1=Paul |last1=Caplan |url-access=subscription |url-status=live |archive-url=https://web.archive.org/web/20191009054159/https://www.theatlantic.com/technology/archive/2013/09/what-is-a-jpeg-the-invisible-object-you-see-every-day/279954/ |archive-date=9 October 2019 |access-date=13 September 2019 |website=The Atlantic}} with several billion JPEG images produced every day {{as of|2015|lc=y}}.{{cite news |last1=Baraniuk |first1=Chris |title=JPeg lockdown: Restriction options sought by committee |url=https://www.bbc.co.uk/news/technology-34538705 |access-date=13 September 2019 |publisher=BBC News|date=15 October 2015 |archive-url=https://web.archive.org/web/20191009193610/https://www.bbc.co.uk/news/technology-34538705 |archive-date=9 October 2019 |url-status=live }}
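The following is a minimal sketch of DCT-based lossy compression on a single 8×8 block, in the spirit of JPEG but greatly simplified: it assumes MATLAB's Image Processing Toolbox (dct2/idct2) and simply discards small coefficients instead of applying JPEG's quantization tables and entropy coding.

<syntaxhighlight lang="matlab">
img = im2double(imread('cameraman.tif')); % built-in grayscale test image
block = img(1:8, 1:8);                    % one 8x8 block, as in JPEG
C = dct2(block);                          % forward 2-D DCT
C(abs(C) < 0.01) = 0;                     % crude lossy step: drop small coefficients
recon = idct2(C);                         % inverse DCT gives an approximation
</syntaxhighlight>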

Medical imaging techniques produce very large amounts of data, especially from CT, MRI and PET modalities. As a result, storage and communications of electronic image data are prohibitive without the use of compression.{{Cite journal |last1=Nagornov |first1=Nikolay N. |last2=Lyakhov |first2=Pavel A. |last3=Valueva |first3=Maria V. |last4=Bergerman |first4=Maxim V. |date=2022 |title=RNS-Based FPGA Accelerators for High-Quality 3D Medical Image Wavelet Processing Using Scaled Filter Coefficients |s2cid-access=free |journal=IEEE Access |volume=10 |pages=19215–19231 |doi=10.1109/ACCESS.2022.3151361 |issn=2169-3536 |s2cid=246895876 |quote=Medical imaging systems produce increasingly accurate images with improved quality using higher spatial resolutions and color bit-depth. Such improvements increase the amount of information that needs to be stored, processed, and transmitted. |doi-access=free|bibcode=2022IEEEA..1019215N }}{{Cite journal |last1=Dhouib |first1=D. |last2=Naït-Ali |first2=A. |last3=Olivier |first3=C. |last4=Naceur |first4=M.S. |date=June 2021 |title=ROI-Based Compression Strategy of 3D MRI Brain Datasets for Wireless Communications |url=https://linkinghub.elsevier.com/retrieve/pii/S1959031820300853 |journal=IRBM |language=en |volume=42 |issue=3 |pages=146–153 |doi=10.1016/j.irbm.2020.05.001 |s2cid=219437400 |quote=Because of the large amount of medical imaging data, the transmission process becomes complicated in telemedicine applications. Thus, in order to adapt the data bit streams to the constraints related to the limitation of the bandwidths a reduction of the size of the data by compression of the images is essential.|url-access=subscription }} JPEG 2000 image compression is used by the DICOM standard for storage and transmission of medical images. The cost and feasibility of accessing large image data sets over low or various bandwidths are further addressed by use of another DICOM standard, called JPIP, to enable efficient streaming of the JPEG 2000 compressed image data.{{Cite journal |last1=Xin |first1=Gangtao |last2=Fan |first2=Pingyi |date=2021-06-11 |title=A lossless compression method for multi-component medical images based on big data mining |journal=Scientific Reports |language=en |volume=11 |issue=1 |pages=12372 |doi=10.1038/s41598-021-91920-x |issn=2045-2322|doi-access=free |pmid=34117350 |pmc=8196061 }}

=Digital signal processor (DSP)=

{{Main|Digital signal processor}}

Electronic signal processing was revolutionized by the wide adoption of MOS technology in the 1970s.{{cite book |last1=Grant |first1=Duncan Andrew |last2=Gowar |first2=John |title=Power MOSFETS: theory and applications |date=1989 |publisher=Wiley |isbn=978-0-471-82867-9 |page=1 |url=https://books.google.com/books?id=ZiZTAAAAMAAJ |quote=The metal–oxide–semiconductor field-effect transistor (MOSFET) is the most commonly used active device in the very large-scale integration of digital integrated circuits (VLSI). During the 1970s these components revolutionized electronic signal processing, control systems and computers.}} MOS integrated circuit technology was the basis for the first single-chip microprocessors and microcontrollers in the early 1970s,{{cite journal |last1=Shirriff |first1=Ken |title=The Surprising Story of the First Microprocessors |journal=IEEE Spectrum |date=30 August 2016 |volume=53 |issue=9 |pages=48–54 |publisher=Institute of Electrical and Electronics Engineers |doi=10.1109/MSPEC.2016.7551353 |s2cid=32003640 |url=https://spectrum.ieee.org/the-surprising-story-of-the-first-microprocessors |access-date=13 October 2019 |archive-url=https://web.archive.org/web/20191013012248/https://spectrum.ieee.org/tech-history/silicon-revolution/the-surprising-story-of-the-first-microprocessors |archive-date=13 October 2019 |url-status=live |url-access=subscription }} and then the first single-chip digital signal processor (DSP) chips in the late 1970s.{{cite web |title=1979: Single Chip Digital Signal Processor Introduced |url=https://www.computerhistory.org/siliconengine/single-chip-digital-signal-processor-introduced/ |website=The Silicon Engine |publisher=Computer History Museum |access-date=14 October 2019 |archive-url=https://web.archive.org/web/20191003072500/https://www.computerhistory.org/siliconengine/single-chip-digital-signal-processor-introduced/ |archive-date=3 October 2019 |url-status=live }}{{cite web |last1=Taranovich |first1=Steve |title=30 years of DSP: From a child's toy to 4G and beyond |url=https://www.edn.com/design/systems-design/4394792/30-years-of-DSP--From-a-child-s-toy-to-4G-and-beyond |website=EDN |access-date=14 October 2019 |date=27 August 2012 |archive-url=https://web.archive.org/web/20191014044347/https://www.edn.com/design/systems-design/4394792/30-years-of-DSP--From-a-child-s-toy-to-4G-and-beyond |archive-date=14 October 2019 |url-status=live }} DSP chips have since been widely used in digital image processing.

The discrete cosine transform (DCT) image compression algorithm has been widely implemented in DSP chips, with many companies developing DSP chips based on DCT technology. DCTs are widely used for encoding, decoding, video coding, audio coding, multiplexing, control signals, signaling, analog-to-digital conversion, formatting luminance and color differences, and color formats such as YUV444 and YUV411. DCTs are also used for encoding operations such as motion estimation, motion compensation, inter-frame prediction, quantization, perceptual weighting, entropy encoding, variable encoding, and motion vectors, and decoding operations such as the inverse operation between different color formats (YIQ, YUV and RGB) for display purposes. DCTs are also commonly used for high-definition television (HDTV) encoder/decoder chips.{{cite journal |last1=Stanković |first1=Radomir S. |last2=Astola |first2=Jaakko T. |title=Reminiscences of the Early Work in DCT: Interview with K.R. Rao |journal=Reprints from the Early Days of Information Sciences |date=2012 |volume=60 |url=http://ticsp.cs.tut.fi/reports/ticsp-report-60-reprint-rao-corrected.pdf |access-date=13 October 2019 |archive-url=https://web.archive.org/web/20191013204147/http://ticsp.cs.tut.fi/reports/ticsp-report-60-reprint-rao-corrected.pdf |archive-date=13 October 2019 |url-status=live }}

Tasks

Digital image processing allows the use of much more complex algorithms, and hence, can offer both more sophisticated performance at simple tasks, and the implementation of methods which would be impossible by analogue means.

In particular, digital image processing is a concrete application of, and a practical technology based on:

  • Classification
  • Feature extraction
  • Multi-scale signal analysis
  • Pattern recognition
  • Projection

Some techniques which are used in digital image processing include:

  • Anisotropic diffusion
  • Hidden Markov models
  • Image editing
  • Image restoration
  • Independent component analysis
  • Linear filtering
  • Neural networks
  • Partial differential equations
  • Pixelation
  • Point feature matching
  • Principal components analysis
  • Self-organizing maps
  • Wavelets

Digital image transformations

= Filtering =

Digital filters are used to blur and sharpen digital images. Filtering can be performed by:

  • convolution with specifically designed kernels (filter array) in the spatial domain{{Cite journal|last1=Zhang|first1=M. Z.|last2=Livingston|first2=A. R.|last3=Asari|first3=V. K.|date=2008|journal=International Journal of Computers and Applications|volume=30|issue=4|pages=298–308|doi=10.1080/1206212x.2008.11441909|title=A High Performance Architecture for Implementation of 2-D Convolution with Quadrant Symmetric Kernels|s2cid=57289814}}
  • masking specific frequency regions in the frequency (Fourier) domain

The following examples show both methods:{{cite book |last=Gonzalez |first=Rafael |title=Digital Image Processing |edition=3rd |publisher=Prentice Hall |date=2008 |isbn=978-0-13-168728-8}}

class="wikitable"
Filter type

! Kernel or mask

! Example

Original Image

| align="center" |

\begin{bmatrix}

0 & 0 & 0 \\

0 & 1 & 0 \\

0 & 0 & 0

\end{bmatrix}

| File:Affine Transformation Original Checkerboard.jpg

Spatial Lowpass

| align="center" |

\frac{1}{9}\times

\begin{bmatrix}

1 & 1 & 1 \\

1 & 1 & 1 \\

1 & 1 & 1

\end{bmatrix}

| File:Spatial Mean Filter Checkerboard.png

Spatial Highpass

| align="center" |

\begin{bmatrix}

0 & -1 & 0 \\

-1 & 4 & -1 \\

0 & -1 & 0

\end{bmatrix}

| File:Spatial Laplacian Filter Checkerboard.png

Fourier Representation

| Pseudo-code:

image = checkerboard

F = Fourier Transform of image

Show Image: log(1+Absolute Value(F))

| align="center"| File:Fourier Space Checkerboard.png

Fourier Lowpass

| align="center"| File:Lowpass Butterworth Checkerboard.png

| align="center"| File:Lowpass FFT Filtered checkerboard.png

Fourier Highpass

| align="center"| File:Highpass Butterworth Checkerboard.png

| align="center"| File:Highpass FFT Filtered checkerboard.png

== Image padding in Fourier domain filtering ==

Images are typically padded before being transformed to the Fourier space; the highpass-filtered images below illustrate the consequences of different padding techniques:

class="wikitable"
Zero padded

! Repeated edge padded

File:Highpass FFT Filtered checkerboard.png

| File:Highpass FFT Replicate.png

Notice that the highpass filter shows extra edges when zero padded compared to the repeated edge padding.
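A brief sketch of the two padding choices before transforming (assuming the Image Processing Toolbox's padarray; the pad width of 20 pixels is illustrative):

<syntaxhighlight lang="matlab">
img = checkerboard(20);                      % test image
zpad = padarray(img, [20 20], 0);            % zero padding
rpad = padarray(img, [20 20], 'replicate');  % repeated edge padding
Fz = fftshift(fft2(zpad));                   % spectra to be masked, as above
Fr = fftshift(fft2(rpad));
</syntaxhighlight>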

== Filtering code examples ==

MATLAB example for spatial domain highpass filtering.

<syntaxhighlight lang="matlab">
img = checkerboard(20);               % generate checkerboard test image
% ************************** SPATIAL DOMAIN ***************************
klaplace = [0 -1 0; -1 5 -1; 0 -1 0]; % Laplacian filter kernel (center +5, i.e. Laplacian plus identity)
X = conv2(img, klaplace);             % convolve test image with 3x3 Laplacian kernel
figure()
imshow(X, [])                         % show Laplacian-filtered image
title('Laplacian Edge Detection')
</syntaxhighlight>
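A corresponding sketch for the frequency-domain method masks the centered spectrum with a Butterworth lowpass; the cutoff radius of 30 pixels and the filter order of 2 are illustrative choices, not the parameters used in the table above:

<syntaxhighlight lang="matlab">
img = checkerboard(20);                 % test image
F = fftshift(fft2(img));                % centered 2-D Fourier transform
[r, c] = size(img);
[u, v] = meshgrid(1:c, 1:r);
D = sqrt((u - c/2).^2 + (v - r/2).^2);  % distance from the spectrum center
H = 1 ./ (1 + (D / 30).^4);             % Butterworth lowpass (order 2, cutoff 30)
G = H .* F;                             % mask the frequency components
g = real(ifft2(ifftshift(G)));          % back to the spatial domain
figure()
imshow(g, [])
title('Fourier Lowpass')
</syntaxhighlight>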

= Affine transformations =

Affine transformations enable basic image transformations such as scaling, rotation, translation, mirroring and shearing, as shown in the following examples:

class="wikitable"
Transformation Name

! Affine Matrix

! Example

Identity

| align="center" |

\begin{bmatrix}

1 & 0 & 0 \\

0 & 1 & 0 \\

0 & 0 & 1

\end{bmatrix}

| File:Checkerboard identity.svg

Reflection

| align="center" |

\begin{bmatrix}

-1 & 0 & 0 \\

0 & 1 & 0 \\

0 & 0 & 1

\end{bmatrix}

| File:Checkerboard reflection.svg

Scale

| align="center" |

\begin{bmatrix}

c_x=2 & 0 & 0 \\

0 & c_y=1 & 0 \\

0 & 0 & 1

\end{bmatrix}

| File:Checkerboard scale.svg

Rotate

| align="center" |

\begin{bmatrix}

\cos(\theta) & \sin(\theta) & 0 \\

-\sin(\theta) & \cos(\theta) & 0 \\

0 & 0 & 1

\end{bmatrix}

| File:Checkerboard rotate.svg where {{math|θ {{=}} {{sfrac|π|6}} {{=}}30°}}

Shear

| align="center" |

\begin{bmatrix}

1 & c_x=0.5 & 0 \\

c_y=0 & 1 & 0 \\

0 & 0 & 1

\end{bmatrix}

| File:Checkerboard shear.svg

To apply the affine matrix to an image, the image is converted to a matrix in which each entry corresponds to the pixel intensity at that location. Then each pixel's location can be represented as a vector indicating the coordinates of that pixel in the image, {{math|[x, y]}}, where {{math|x}} and {{math|y}} are the row and column of a pixel in the image matrix. This allows the coordinate to be multiplied by an affine-transformation matrix, which gives the position to which the pixel value will be copied in the output image.

However, to allow transformations that require translations, 3-dimensional homogeneous coordinates are needed. The third dimension is set to a non-zero constant, usually {{math|1}}, so that the new coordinate is {{math|[x, y, 1]}}. This allows the coordinate vector to be multiplied by a 3×3 matrix, enabling translation shifts. Thus, the third dimension, i.e. the constant {{math|1}}, allows translation.

Because matrix multiplication is associative, multiple affine transformations can be combined into a single affine transformation by multiplying the matrix of each individual transformation in the order that the transformations are done. This results in a single matrix that, when applied to a point vector, gives the same result as all the individual transformations performed on the vector {{math|[x, y, 1]}} in sequence. Thus a sequence of affine transformation matrices can be reduced to a single affine transformation matrix.

For example, 2-dimensional coordinates only permit rotation about the origin {{math|(0, 0)}}. But 3-dimensional homogeneous coordinates can be used to first translate any point to {{math|(0, 0)}}, then perform the rotation, and lastly translate the origin {{math|(0, 0)}} back to the original point (the opposite of the first translation). These three affine transformations can be combined into a single matrix—thus allowing rotation around any point in the image.{{Cite book|url=https://people.cs.clemson.edu/~dhouse/courses/401/notes/affines-matrices.pdf|title=Affine Transformations|last=House, Keyser|date=6 December 2016|website=Clemson|series=Foundations of Physically Based Modeling & Animation|publisher=A K Peters/CRC Press|isbn=978-1-4822-3460-2|access-date=26 March 2019|archive-url=https://web.archive.org/web/20170830052734/https://people.cs.clemson.edu/~dhouse/courses/401/notes/affines-matrices.pdf|archive-date=30 August 2017|url-status=live}}
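A minimal MATLAB sketch of this composition follows; the rotation angle and the point {{math|(10, 10)}} are illustrative values, not taken from the examples above:

<syntaxhighlight lang="matlab">
% Rotate about an arbitrary point (cx, cy): translate it to the origin,
% rotate, then translate back; the three matrices collapse into one.
theta = pi/6; cx = 10; cy = 10;
T1 = [1 0 -cx; 0 1 -cy; 0 0 1];   % move (cx, cy) to the origin
R  = [cos(theta) sin(theta) 0; -sin(theta) cos(theta) 0; 0 0 1];
T2 = [1 0 cx; 0 1 cy; 0 0 1];     % move the origin back
M  = T2 * R * T1;                 % single combined affine matrix
p  = [12; 15; 1];                 % homogeneous pixel coordinate [x; y; 1]
q  = M * p;                       % transformed coordinate
</syntaxhighlight>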

= Image denoising with mathematical morphology =

Mathematical morphology (MM) is a nonlinear image processing framework that analyzes shapes within images by probing local pixel neighborhoods using a small, predefined function called a structuring element. In the context of grayscale images, MM is especially useful for denoising through dilation and erosion—primitive operators that can be combined to build more complex filters.

Suppose we have:

  • A discrete grayscale image:

<math>f =
\begin{bmatrix}
45 & 50 & 65 \\
40 & 60 & 55 \\
25 & 15 & 5
\end{bmatrix}, \quad f : \Omega \rightarrow \mathbb{R}, \quad \Omega = \{0, 1, 2\}^2,</math>

  • A structuring element:

<math>B =
\begin{bmatrix}
1 & 2 & 1 \\
2 & 1 & 1 \\
1 & 0 & 3
\end{bmatrix}, \quad B : \mathcal{S} \rightarrow \mathbb{R}, \quad \mathcal{S} = \{-1, 0, 1\}^2.</math>

Here, <math>\mathcal{S}</math> defines the neighborhood of relative coordinates <math>(m, n)</math> over which local operations are computed. The values of <math>B(m, n)</math> bias the image during dilation and erosion.

; Dilation : Grayscale dilation is defined as:

<math>(f \oplus B)(i, j) =
\max_{(m, n) \in \mathcal{S}}
\Bigl\{
f(i+m, j+n) + B(m,n)
\Bigr\}.</math>

:For example, the dilation at position {{math|(1, 1)}} is calculated as:

<math>\begin{aligned}
(f \oplus B)(1,1) =
\max\!\Bigl(
&f(0,0)+B(-1,-1), &\;45+1;&\\
&f(1,0)+B( 0,-1), &\;50+2;&\\
&f(2,0)+B( 1,-1), &\;65+1;&\\
&f(0,1)+B(-1, 0), &\;40+2;&\\
&f(1,1)+B( 0, 0), &\;60+1;&\\
&f(2,1)+B( 1, 0), &\;55+1;&\\
&f(0,2)+B(-1, 1), &\;25+1;&\\
&f(1,2)+B( 0, 1), &\;15+0;&\\
&f(2,2)+B( 1, 1) &\;5+3
\Bigr) = 66.
\end{aligned}</math>

; Erosion : Grayscale erosion is defined as:

<math>(f \ominus B)(i,j) =
\min_{(m,n) \in \mathcal{S}}
\Bigl\{
f(i+m, j+n) - B(m,n)
\Bigr\}.</math>

:For example, the erosion at position {{math|(1, 1)}} is calculated as:

<math>\begin{aligned}
(f \ominus B)(1,1) =
\min\!\Bigl(
&f(0,0)-B(-1,-1), &\;45-1;&\\
&f(1,0)-B( 0,-1), &\;50-2;&\\
&f(2,0)-B( 1,-1), &\;65-1;&\\
&f(0,1)-B(-1, 0), &\;40-2;&\\
&f(1,1)-B( 0, 0), &\;60-1;&\\
&f(2,1)-B( 1, 0), &\;55-1;&\\
&f(0,2)-B(-1, 1), &\;25-1;&\\
&f(1,2)-B( 0, 1), &\;15-0;&\\
&f(2,2)-B( 1, 1) &\;5-3
\Bigr) = 2.
\end{aligned}</math>

== Results ==

After applying dilation to <math>f</math>:

<math>\begin{bmatrix}
45 & 50 & 65 \\
40 & 66 & 55 \\
25 & 15 & 5
\end{bmatrix}</math>

After applying erosion to <math>f</math>:

<math>\begin{bmatrix}
45 & 50 & 65 \\
40 & 2 & 55 \\
25 & 15 & 5
\end{bmatrix}</math>
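These center values can be checked with a short MATLAB sketch; because the 3×3 neighborhood of the center pixel covers the entire example image, elementwise operations suffice (a minimal sketch, not a general implementation):

<syntaxhighlight lang="matlab">
% Reproduce the worked dilation and erosion at the center position (1, 1).
f = [45 50 65; 40 60 55; 25 15 5];
B = [1 2 1; 2 1 1; 1 0 3];
dil = max(f(:) + B(:));   % 66, as in the dilation example
ero = min(f(:) - B(:));   % 2, as in the erosion example
</syntaxhighlight>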

== Opening and Closing ==

MM operations, such as opening and closing, are composite processes that utilize both dilation and erosion to modify the structure of an image. These operations are particularly useful for tasks such as noise removal, shape smoothing, and object separation.

  • Opening: This operation is performed by applying erosion to an image first, followed by dilation. The purpose of opening is to remove small objects or noise from the foreground while preserving the overall structure of larger objects. It is especially effective in situations where noise appears as isolated bright pixels or small, disconnected features.

For example, applying opening to an image f with a structuring element B would first reduce small details (through erosion) and then restore the main shapes (through dilation). This ensures that unwanted noise is removed without significantly altering the size or shape of larger objects.

  • Closing: This operation is performed by applying dilation first, followed by erosion. Closing is typically used to fill small holes or gaps within objects and to connect broken parts of the foreground. It works by initially expanding the boundaries of objects (through dilation) and then refining the boundaries (through erosion).

For instance, applying closing to the same image f would fill in small gaps within objects, such as connecting breaks in thin lines or closing small holes, while ensuring that the surrounding areas are not significantly affected.

Both opening and closing can be visualized as ways of refining the structure of an image: opening simplifies and removes small, unnecessary details, while closing consolidates and connects objects to form more cohesive structures.

class="wikitable"
Structuring element

! Mask

! Code

! Example

Original Image

| None

| Use Matlab to read Original image

original = imread('scene.jpg');

image = rgb2gray(original);

[r, c, channel] = size(image);

se = logical([1 1 1 ; 1 1 1 ; 1 1 1]);

[p, q] = size(se);

halfH = floor(p/2);

halfW = floor(q/2);

time = 3; % denoising 3 times with all method

| File:Lotus free.jpg

Dilation

| align="center" |

\begin{bmatrix}

1 & 1 & 1 \\

1 & 1 & 1 \\

1 & 1 & 1

\end{bmatrix}

| Use Matlab to dilation

imwrite(image, "scene_dil.jpg")

extractmax = zeros(size(image), class(image));

for i = 1 : time

dil_image = imread('scene_dil.jpg');

for col = (halfW + 1): (c - halfW)

for row = (halfH + 1) : (r - halfH)

dpointD = row - halfH;

dpointU = row + halfH;

dpointL = col - halfW;

dpointR = col + halfW;

dneighbor = dil_image(dpointD:dpointU, dpointL:dpointR);

filter = dneighbor(se);

extractmax(row, col) = max(filter);

end

end

imwrite(extractmax, "scene_dil.jpg");

end

| File:Lotus free dil.jpg

Erosion

| align="center" |

\begin{bmatrix}

1 & 1 & 1 \\

1 & 1 & 1 \\

1 & 1 & 1

\end{bmatrix}

| Use Matlab to erosion

imwrite(image, 'scene_ero.jpg');

extractmin = zeros(size(image), class(image));

for i = 1: time

ero_image = imread('scene_ero.jpg');

for col = (halfW + 1): (c - halfW)

for row = (halfH +1): (r -halfH)

pointDown = row-halfH;

pointUp = row+halfH;

pointLeft = col-halfW;

pointRight = col+halfW;

neighbor = ero_image(pointDown:pointUp,pointLeft:pointRight);

filter = neighbor(se);

extractmin(row, col) = min(filter);

end

end

imwrite(extractmin, "scene_ero.jpg");

end

| File:Lotus free erosion.jpg

Opening

| align="center" |

\begin{bmatrix}

1 & 1 & 1 \\

1 & 1 & 1 \\

1 & 1 & 1

\end{bmatrix}

| Use Matlab to Opening

imwrite(extractmin, "scene_opening.jpg")

extractopen = zeros(size(image), class(image));

for i = 1 : time

dil_image = imread('scene_opening.jpg');

for col = (halfW + 1): (c - halfW)

for row = (halfH + 1) : (r - halfH)

dpointD = row - halfH;

dpointU = row + halfH;

dpointL = col - halfW;

dpointR = col + halfW;

dneighbor = dil_image(dpointD:dpointU, dpointL:dpointR);

filter = dneighbor(se);

extractopen(row, col) = max(filter);

end

end

imwrite(extractopen, "scene_opening.jpg");

end

| File:Lotus free opening.jpg

Closing

| align="center" |

\begin{bmatrix}

1 & 1 & 1 \\

1 & 1 & 1 \\

1 & 1 & 1

\end{bmatrix}

| Use Matlab to Closing

imwrite(extractmax, "scene_closing.jpg")

extractclose = zeros(size(image), class(image));

for i = 1 : time

ero_image = imread('scene_closing.jpg');

for col = (halfW + 1): (c - halfW)

for row = (halfH + 1) : (r - halfH)

dpointD = row - halfH;

dpointU = row + halfH;

dpointL = col - halfW;

dpointR = col + halfW;

dneighbor = ero_image(dpointD:dpointU, dpointL:dpointR);

filter = dneighbor(se);

extractclose(row, col) = min(filter);

end

end

imwrite(extractclose, "scene_closing.jpg");

end

| File:Lotus free closing.jpg
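For comparison, MATLAB's Image Processing Toolbox provides built-in morphological operators that handle image borders automatically; the following brief sketch assumes the same input file 'scene.jpg' as the code above:

<syntaxhighlight lang="matlab">
% Built-in morphology (a sketch; requires the Image Processing Toolbox).
image = rgb2gray(imread('scene.jpg'));
se = strel('square', 3);      % flat 3x3 structuring element
dilated = imdilate(image, se);
eroded  = imerode(image, se);
opened  = imopen(image, se);  % erosion followed by dilation
closed  = imclose(image, se); % dilation followed by erosion
</syntaxhighlight>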

Applications

{{further|Digital imaging|Applications of computer vision}}

=Digital camera images=

Digital cameras generally include specialized digital image processing hardware – either dedicated chips or added circuitry on other chips – to convert the raw data from their image sensor into a color-corrected image in a standard image file format. Additional post-processing techniques increase edge sharpness or color saturation to create more natural-looking images.

=Film=

Westworld (1973) was the first feature film to use digital image processing to pixellate photography to simulate an android's point of view.[http://www.beanblossom.in.us/larryy/cgi.html A Brief, Early History of Computer Graphics in Film] {{webarchive |url=https://web.archive.org/web/20120717074134/http://www.beanblossom.in.us/larryy/cgi.html |date=17 July 2012 }}, Larry Yaeger, 16 August 2002 (last update), retrieved 24 March 2010 Image processing is also vastly used to produce the chroma key effect that replaces the background of actors with natural or artistic scenery.

=Face detection=

[[File:Face detection process V1.jpg]]

Face detection can be implemented with mathematical morphology, the discrete cosine transform (DCT), and horizontal projection.

General feature-based method

The feature-based method of face detection uses skin tone, edge detection, face shape, and facial features (such as eyes and mouth) to achieve detection. Skin tone, face shape, and the unique elements found only in human faces can be described as features.

Process explanation

  1. Given a batch of face images, first extract the skin tone range by sampling them. The skin tone range is simply a skin filter.
  2. The structural similarity index measure (SSIM) can be applied to compare images in terms of extracting the skin tone.
  3. Normally, the HSV or RGB color spaces are suitable for the skin filter. For example, in HSV mode the skin tone range is [0, 48, 50] to [20, 255, 255] (see the sketch after this list).
  4. After filtering images with the skin tone, morphology and the DCT are used to remove noise and fill in missing skin areas, yielding the face edge.
  5. The opening or closing method can be used to fill in missing skin.
  6. The DCT helps reject objects with a skin-like tone, since human faces always have higher texture.
  7. The Sobel operator or other edge operators can be applied to detect the face edge.
  8. To position human features such as the eyes, projection is used: finding the peaks of the projection histogram helps locate detailed features such as the mouth, hair, and lips.
  9. Projection simply projects the image onto an axis to reveal the high-frequency regions, which usually indicate feature positions.
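The skin-filter step (step 3) can be sketched as follows; the range [0, 48, 50] to [20, 255, 255] is on 8-bit HSV scales with hue in 0–179 (an OpenCV-style convention, assumed here), so it is rescaled to MATLAB's [0, 1] HSV convention, and the input file name is hypothetical:

<syntaxhighlight lang="matlab">
rgb = imread('face.jpg');            % hypothetical input image
hsv = rgb2hsv(rgb);                  % H, S, V each in [0, 1]
lo = [0, 48, 50]    ./ [180, 255, 255];
hi = [20, 255, 255] ./ [180, 255, 255];
mask = hsv(:,:,1) >= lo(1) & hsv(:,:,1) <= hi(1) & ...
       hsv(:,:,2) >= lo(2) & hsv(:,:,2) <= hi(2) & ...
       hsv(:,:,3) >= lo(3) & hsv(:,:,3) <= hi(3);
skin = rgb .* uint8(mask);           % keep only skin-toned pixels
</syntaxhighlight>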

=Improvement of image quality=

Image quality can be degraded by camera vibration, over-exposure, an overly centralized gray level distribution, noise, and other factors. For example, the noise problem can be addressed by the smoothing method, while the gray level distribution problem can be improved by histogram equalization.

Smoothing method

In drawing, if a color is unsatisfactory, one can take the colors around it and average them; this is an easy way to think of the smoothing method.

The smoothing method can be implemented with a mask and convolution. Take the small image and mask below as an example.

The image is

<math>\begin{bmatrix}
2 & 5 & 6 & 5 \\
3 & 1 & 4 & 6 \\
1 & 28 & 30 & 2 \\
7 & 3 & 2 & 2
\end{bmatrix}</math>

and the mask is

<math>\begin{bmatrix}
1/9 & 1/9 & 1/9 \\
1/9 & 1/9 & 1/9 \\
1/9 & 1/9 & 1/9
\end{bmatrix}</math>

After convolution and smoothing, the image is

<math>\begin{bmatrix}
2 & 5 & 6 & 5 \\
3 & 9 & 10 & 6 \\
1 & 9 & 9 & 2 \\
7 & 3 & 2 & 2
\end{bmatrix}</math>

Observe image[1, 1], image[1, 2], image[2, 1], and image[2, 2]. The original pixel values are 1, 4, 28, and 30; after the smoothing mask they become 9, 10, 9, and 9 respectively (the neighborhood average, rounded to the nearest integer):

new image[1, 1] = <math>\tfrac{1}{9} \cdot (\text{image}[0,0]+\text{image}[0,1]+\text{image}[0,2]+\text{image}[1,0]+\text{image}[1,1]+\text{image}[1,2]+\text{image}[2,0]+\text{image}[2,1]+\text{image}[2,2])</math>

new image[1, 1] = <math>\operatorname{round}\bigl(\tfrac{1}{9} \cdot (2+5+6+3+1+4+1+28+30)\bigr) = 9</math>

new image[1, 2] = <math>\operatorname{round}\bigl(\tfrac{1}{9} \cdot (5+6+5+1+4+6+28+30+2)\bigr) = 10</math>

new image[2, 1] = <math>\operatorname{round}\bigl(\tfrac{1}{9} \cdot (3+1+4+1+28+30+7+3+2)\bigr) = 9</math>

new image[2, 2] = <math>\operatorname{round}\bigl(\tfrac{1}{9} \cdot (1+4+6+28+30+2+3+2+2)\bigr) = 9</math>
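These values can be verified with conv2 (a minimal sketch; the border values differ from the original image because conv2 zero-pads the edges, whereas the example above leaves the borders unchanged):

<syntaxhighlight lang="matlab">
img  = [2 5 6 5; 3 1 4 6; 1 28 30 2; 7 3 2 2];
mask = ones(3) / 9;                        % 3x3 averaging mask
smoothed = round(conv2(img, mask, 'same')) % interior is [9 10; 9 9], as above
</syntaxhighlight>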

Gray Level Histogram method

Generally, given a gray level histogram of an image as below, changing the histogram to a uniform distribution is what is called histogram equalization.

[[File:Gray level histogram.jpg]] (figure 1)

[[File:Uniform distribution.jpg]] (figure 2)

In discrete time, the area of the gray level histogram is <math>\sum_{i=0}^{k}H(p_i)</math> (see figure 1), while the area of the uniform distribution is <math>\sum_{i=0}^{k}G(q_i)</math> (see figure 2). It is clear that the area does not change, so <math>\sum_{i=0}^{k}H(p_i) = \sum_{i=0}^{k}G(q_i)</math>.

From the uniform distribution, the probability of <math>q_i</math> is <math>\tfrac{N^2}{q_k - q_0}</math> for <math>0 < i < k</math>.

In continuous time, the equation is <math>\int_{q_0}^{q} \tfrac{N^2}{q_k - q_0} \,ds = \int_{p_0}^{p} H(s) \,ds</math>.

Moreover, based on the definition of a function, the gray level histogram method is like finding a function <math>f</math> that satisfies <math>f(p)=q</math>.
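A minimal sketch of histogram equalization via the cumulative distribution, assuming an 8-bit grayscale image (pout.tif is a built-in low-contrast test image, and imhist/histeq are Image Processing Toolbox functions):

<syntaxhighlight lang="matlab">
img = imread('pout.tif');          % low-contrast test image
h = imhist(img);                   % 256-bin gray level histogram
cdf = cumsum(h) / numel(img);      % normalized cumulative distribution
map = uint8(round(255 * cdf));     % the mapping f with f(p) = q
equalized = map(double(img) + 1);  % apply the lookup table
% The toolbox built-in histeq(img) performs the same task.
</syntaxhighlight>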

class="wikitable"
Improvement method

! Issue

! Before improvement

! Process

! After improvement

Smoothing method

| noise

with Matlab, salt & pepper with 0.01 parameter is added
to the original image in order to create a noisy image.

| File:Helmet with noise.jpg

|

  1. read image and convert image into grayscale
  2. convolution the graysale image with the mask

\begin{bmatrix}

1/9 & 1/9 & 1/9 \\

1/9 & 1/9 & 1/9 \\

1/9 & 1/9 & 1/9

\end{bmatrix}

  1. denoisy image will be the result of step 2.

| File:Helmet without noise.jpg

Histogram Equalization

| Gray level distribution too centralized

| File:Cave scene before improvement.jpg

| Refer to the Histogram equalization

| File:Cave scene after improvement.jpg

Challenges

  1. Noise and Distortions: Imperfections in images due to poor lighting, limited sensors, and file compression can result in unclear images that impact accurate image conversion.
  2. Variability in Image Quality: Variations in image quality and resolution, including blurry images and incomplete details, can hinder uniform processing across a database.
  3. Object Detection and Recognition: Identifying and recognising objects within images, especially in complex scenarios with multiple objects and occlusions, poses a significant challenge.
  4. Data Annotation and Labelling: Labelling diverse and multiple images for machine recognition is crucial for further processing accuracy, as incorrect identification can lead to unrealistic results.
  5. Computational Resource Intensity: Accessing adequate computational resources for image processing can be challenging and costly, hindering progress without sufficient resources.

See also

References

{{Reflist}}

Further reading

  • {{cite book|author1=Solomon, C.J. |author2=Breckon, T.P. | title=Fundamentals of Digital Image Processing: A Practical Approach with Examples in Matlab| year=2010| publisher=Wiley-Blackwell| doi=10.1002/9780470689776| isbn=978-0-470-84473-1}}
  • {{Cite book|author1=Wilhelm Burger |author2=Mark J. Burge |title=Digital Image Processing: An Algorithmic Approach Using Java |publisher=Springer |year=2007 |url=http://www.imagingbook.com/ |isbn=978-1-84628-379-6}}

  • {{Cite book|author1=R. Fisher |author2=K Dawson-Howe |author3=A. Fitzgibbon |author4=C. Robertson |author5=E. Trucco |title=Dictionary of Computer Vision and Image Processing |publisher=John Wiley |year=2005 |isbn=978-0-470-01526-1}}

  • {{Cite book|author1=Rafael C. Gonzalez |author2=Richard E. Woods |author3=Steven L. Eddins |title=Digital Image Processing using MATLAB |publisher=Pearson Education |year=2004 |isbn=978-81-7758-898-9}}

  • {{Cite book|author=Tim Morris |title=Computer Vision and Image Processing |publisher=Palgrave Macmillan |year=2004 |isbn=978-0-333-99451-1}}

  • {{Cite book|author=Vipin Tyagi |title=Understanding Digital Image Processing |publisher=Taylor and Francis CRC Press |year=2018 |isbn=978-1-138-56684-2}}

  • {{Cite book|author1=Milan Sonka |author2=Vaclav Hlavac |author3=Roger Boyle |title=Image Processing, Analysis, and Machine Vision |publisher=PWS Publishing |year=1999 |isbn=978-0-534-95393-5}}

  • {{cite book | last1=Gonzalez | first1=Rafael C. | last2=Woods | first2=Richard E. | title=Digital image processing | publisher=Prentice Hall | publication-place=Upper Saddle River, N.J. | date=2008 | isbn=978-0-13-168728-8 | oclc=137312858}}
  • {{cite book | last=Kovalevsky | first=Vladimir | title=Modern algorithms for image processing: computer imagery by example using C# | publication-place=[New York, New York] | date=2019 | isbn=978-1-4842-4237-7 | oclc=1080084533}}