knowledge:dip:morphological-image-analysis [2016/11/03 10:06] pkleczek [TODO] |
— (current) | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Morphological Image Analysis ====== | ||
- | //Morphological Image Analysis -- Principles and Applications//, Pierre Soille | ||
- | |||
- | ===== Erosion and Dilation ===== | ||
- | |||
- | |||
- | ==== Erosion ==== | ||
- | |||
- | The first question that may arise when we probe a set with a structuring element is //"Does the structuring element fit the set?"// The eroded set is the locus of points where the answer to this question is affirmative. | ||
- | |||
- | The eroded value at a given pixel $x$ is the **minimum** value of the image in the window defined by the structuring element when its origin is at $x$: | ||
- | $$[\varepsilon_{B}(f)](x) = \min_{b \in B} f(x+b)$$ | ||
- | |||
- | FIXME 81, 82 | ||
- | |||
- | ==== Dilation ==== | ||
- | |||
- | The dilation is the dual operator of the erosion and is based on the following question: //"Does the structuring element hit the set?"// The dilated set is the locus of points where the answer to this question is affirmative. | ||
- | |||
- | The dilated value at a given pixel $x$ is the **maximum** value of the image in the window defined by the structuring element when its origin is at $x$: | ||
- | $$[\delta_{B}(f)](x) = \max_{b \in B} f(x+b)$$ | ||
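
The two definitions above translate almost literally into code. The following is a minimal sketch, not a reference implementation: the function names `erode` and `dilate`, the representation of the SE as a list of `(dy, dx)` offsets, and the border rule (offsets that fall outside the image are simply ignored) are all assumptions made for illustration.

```python
def erode(f, B):
    """Grey-scale erosion: minimum of f over the window B placed at each pixel."""
    h, w = len(f), len(f[0])
    return [[min(f[y + dy][x + dx] for dy, dx in B
                 if 0 <= y + dy < h and 0 <= x + dx < w)
             for x in range(w)] for y in range(h)]


def dilate(f, B):
    """Grey-scale dilation: maximum of f over the window B placed at each pixel."""
    h, w = len(f), len(f[0])
    return [[max(f[y + dy][x + dx] for dy, dx in B
                 if 0 <= y + dy < h and 0 <= x + dx < w)
             for x in range(w)] for y in range(h)]


# Cross-shaped SE containing its origin (so the window is never empty).
cross = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]

f = [[1, 1, 1, 1],
     [1, 5, 5, 1],
     [1, 5, 5, 1],
     [1, 1, 1, 1]]

eroded = erode(f, cross)    # the 2x2 bright square is too small for the cross
dilated = dilate(f, cross)  # the bright square grows by one pixel in 4 directions
```

In this example the erosion flattens the image to the background value 1, because the cross never fits inside the 2x2 bright square, while the dilation extends the square to every pixel that the cross can reach it from.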
- | |||
- | FIXME 83, 84 | ||
- | |||
- | ==== Basic morphological gradients ==== | ||
- | |||
- | Only symmetric structuring elements containing their origin are considered. By doing so, we make sure that the arithmetic difference is always nonnegative. | ||
- | |||
- | - arithmetic difference between the dilation and the erosion (//Beucher gradient//): $\rho_B = \delta_B - \varepsilon_B$ (= maximum variation of the grey level intensities within the neighbourhood) | ||
- | - arithmetic difference between the dilation and the original image (//half-gradient by dilation//, //external gradient//): $\rho_{B}^{+} = \delta_B - id$ | ||
- | - arithmetic difference between the original image and its erosion (//half-gradient by erosion//, //internal gradient//): $\rho_{B}^{-} = id - \varepsilon_B$ | ||
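
As an illustration of the three gradients, here is a small 1-D sketch on a step edge. The helper names, the SE given as a list of integer offsets, and the border rule (offsets outside the signal are ignored) are assumptions chosen for brevity.

```python
def erode(f, B):
    n = len(f)
    return [min(f[x + b] for b in B if 0 <= x + b < n) for x in range(n)]


def dilate(f, B):
    n = len(f)
    return [max(f[x + b] for b in B if 0 <= x + b < n) for x in range(n)]


B = [-1, 0, 1]            # symmetric SE containing its origin
f = [0, 0, 0, 4, 4, 4]    # step edge between index 2 and index 3

ero, dil = erode(f, B), dilate(f, B)

beucher = [d - e for d, e in zip(dil, ero)]   # delta_B - epsilon_B
external = [d - v for d, v in zip(dil, f)]    # delta_B - id
internal = [v - e for v, e in zip(f, ero)]    # id - epsilon_B
```

The external gradient marks the dark side of the step (index 2), the internal gradient marks the bright side (index 3), and the Beucher gradient is their sum.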
- | |||
- | The choice between internal or external gradient depends on the geometry and relative brightness of the objects to be extracted. For instance, an external gradient applied to a dark structure that is one or two pixels thick will provide a thin edge following the structure, whereas an internal gradient will output a double edge (one on each side of the structure). | ||
- | |||
- | FIXME 102 | ||
- | |||
- | If the size $n$ of the SE is greater than 1, morphological gradients are referred to as //thick gradients//: | ||
- | $$\rho_{nB} = \delta_{nB} - \varepsilon_{nB}$$ | ||
- | Thick gradients give the maximum variation of the function in a neighbourhood of size $n$. If the size $n$ equals the width $e$ of the transition between regions of homogeneous grey level, the thick gradient will output the contrast value $h$ between these regions. These gradients are therefore recommended when the transitions between objects are smooth. However, thick gradients output thick edges. | ||
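
A quick 1-D experiment shows this behaviour on a smooth ramp. This is only a sketch: `se(n)` models $nB$ as the offsets $-n, \dots, n$, the helper names are invented here, and offsets outside the signal are ignored.

```python
def erode(f, B):
    n = len(f)
    return [min(f[x + b] for b in B if 0 <= x + b < n) for x in range(n)]


def dilate(f, B):
    n = len(f)
    return [max(f[x + b] for b in B if 0 <= x + b < n) for x in range(n)]


def se(n):
    """1-D structuring element nB, modelled as the offsets -n..n."""
    return list(range(-n, n + 1))


def thick_gradient(f, n):
    return [d - e for d, e in zip(dilate(f, se(n)), erode(f, se(n)))]


f = [0, 0, 0, 1, 2, 3, 3, 3]   # smooth transition with contrast h = 3

# Strongest response of the thick gradient for sizes n = 1, 2, 3.
peaks = [max(thick_gradient(f, n)) for n in (1, 2, 3)]
```

Here the peak response grows from 2 at $n = 1$ to the full contrast 3 at $n = 2$ and stays there for $n = 3$: once the window covers the whole ramp, a larger SE only widens the edge.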
- | |||
- | FIXME 88 | ||
- | |||
- | FIXME A combination of thick gradients of increasing size avoiding thick edges is presented in Sec. 4.6. | ||
- | |||
- | ===== Opening and Closing ===== | ||
- | |||
- | |||
- | ==== Opening ==== | ||
- | |||
- | The opening $\gamma$ of an image $f$ by a structuring element $B$ is denoted by $\gamma_{B}(f)$ and is defined as the erosion of $f$ by $B$ followed by the dilation with the reflected SE $\check{B}$: | ||
- | $$\gamma_{B}(f) = \delta_{\check{B}}[\varepsilon_{B}(f)]$$ | ||
- | |||
- | It is essential to consider the reflected SE for the dilation. Indeed, an erosion corresponds to an intersection of translations. It follows that a union of translations in the opposite direction (i.e., a dilation by the reflected SE) must be considered when attempting to recover the original image. | ||
- | |||
- | The opening also has a geometric formulation in terms of SE fit, using the question already introduced for the erosion: //"Does the structuring element fit the set?"// Each time the answer to this question is affirmative, the whole SE must be kept (for the erosion, it is the origin of the SE that is kept). | ||
- | |||
- | FIXME 120, 121 | ||
- | |||
- | The shape and size of the structuring element must be set according to the image structures that are to be extracted. For instance, if we are interested in removing all elongated objects while keeping disc shaped objects, the appropriate structuring element is a disc having a diameter larger than the width of the elongated objects. | ||
- | |||
- | ==== Closing ==== | ||
- | |||
- | The closing of an image $f$ by a structuring element $B$ is denoted by $\phi_{B}(f)$ and is defined as the dilation of $f$ with a structuring element $B$ followed by the erosion with the reflected structuring element $\check{B}$: | ||
- | $$\phi_{B}(f) = \varepsilon_{\check{B}}[\delta_{B}(f)]$$ | ||
- | |||
- | Using set formalism, we have the following question for defining a closing: //"Does the SE fit the background of the set?"// If yes, then all points of the SE belong to the complement of the closing of the set. | ||
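
The role of the reflected SE in both definitions can be checked on a 1-D signal with a deliberately asymmetric SE. This is a minimal sketch; the helper names, the offset-list representation of the SE and the border rule (offsets outside the signal are ignored) are assumptions.

```python
def erode(f, B):
    n = len(f)
    return [min(f[x + b] for b in B if 0 <= x + b < n) for x in range(n)]


def dilate(f, B):
    n = len(f)
    return [max(f[x + b] for b in B if 0 <= x + b < n) for x in range(n)]


def reflect(B):
    """Reflected SE: every offset b becomes -b."""
    return [-b for b in B]


def opening(f, B):
    return dilate(erode(f, B), reflect(B))


def closing(f, B):
    return erode(dilate(f, B), reflect(B))


B = [0, 1]                        # two-pixel SE, origin on its left pixel
f = [0, 0, 3, 3, 0, 3, 0, 0]

opened = opening(f, B)   # the one-pixel peak at index 5 cannot hold B
closed = closing(f, B)   # the one-pixel gap at index 4 cannot hold B
```

The opening keeps the two-pixel plateau in place and removes the isolated peak; the closing fills the single-pixel gap. Dilating with `B` itself instead of `reflect(B)` in `opening` would shift the recovered plateau by one pixel.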
- | |||
- | FIXME 123, 124 | ||
- | |||
- | ==== Area opening and closing ==== | ||
- | |||
- | Area opening: removes all connected components whose area (in number of pixels) is smaller than a given threshold value $\lambda$. It equals the supremum of the openings by all connected SEs $B_i$ of at least $\lambda$ pixels: | ||
- | $$\gamma_{\lambda} = \bigvee_{i} \{ \gamma_{B_i} \mid B_i \text{ is connected and } \text{card}(B_i) \geq \lambda \}$$ | ||
- | |||
- | Area closing: the dual of the area opening: | ||
- | $$\phi_{\lambda} = \bigwedge_{i} \{ \phi_{B_i} \mid B_i \text{ is connected and } \text{card}(B_i) \geq \lambda \}$$ | ||
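
For binary images the area opening does not need the family of SEs at all: it is enough to label the connected components and keep the large ones. The sketch below assumes 4-connectivity and a binary image stored as nested lists; the function name `area_open` is hypothetical.

```python
def area_open(X, lam):
    """Keep only the 4-connected components of X with at least lam pixels."""
    h, w = len(X), len(X[0])
    out = [[0] * w for _ in range(h)]
    seen = [[False] * w for _ in range(h)]
    for y0 in range(h):
        for x0 in range(w):
            if not X[y0][x0] or seen[y0][x0]:
                continue
            # Flood-fill one component starting from (y0, x0).
            seen[y0][x0] = True
            stack, component = [(y0, x0)], []
            while stack:
                y, x = stack.pop()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and X[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if len(component) >= lam:
                for y, x in component:
                    out[y][x] = 1
    return out


X = [[1, 1, 0, 0],
     [1, 0, 0, 1],
     [0, 0, 0, 0],
     [0, 1, 1, 1]]

kept = area_open(X, 2)   # drops the single pixel at (1, 3)
```

The area closing of a binary image can be obtained by duality, that is, by applying the same function to the complement and complementing the result.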
- | |||
- | ==== Parametric opening and closing ==== | ||
- | |||
- | The parametric opening, denoted by $\gamma_{B, \lambda}$, relaxes the fit condition of the morphological opening: only $\lambda$ of the pixels of the considered structuring element $B$ have to fit the foreground pixels. | ||
- | |||
- | It can be shown that the parametric opening is equivalent to the intersection (point-wise minimum operator $\wedge$) between the identity transformation and the dilation by $\check{B}$ of the rank filter $\zeta$ using $B$ as kernel and $n - \lambda + 1$ as rank, where $n = \text{card}(B)$ (such an interpretation is easier to implement and performs much faster): | ||
- | $$\gamma_{B, \lambda} = id \wedge \delta_{\check{B}} \zeta_{B, n - \lambda + 1}$$ | ||
- | |||
- | Similarly, for parametric closing: | ||
- | $$\phi_{B, \lambda} = id \vee \varepsilon_{\check{B}} \zeta_{B, \lambda}$$ | ||
- | |||
- | Parametric openings and closings are very useful in practice because they are much more flexible and less sensitive to noise than the corresponding morphological openings and closings. | ||
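
The rank-filter formulation of the parametric opening can be sketched in 1-D as follows. Several things are assumptions here rather than part of the definition: ranks are counted from 1 in ascending order, $n$ is taken as the number of pixels of $B$, pixels outside the signal count as 0 in the rank filter, and all function names are illustrative.

```python
def rank_filter(f, B, k):
    """k-th smallest value (1-based) of f in the window B; outside counts as 0."""
    n = len(f)
    def value(i):
        return f[i] if 0 <= i < n else 0
    return [sorted(value(x + b) for b in B)[k - 1] for x in range(n)]


def dilate(f, B):
    n = len(f)
    return [max(f[x + b] for b in B if 0 <= x + b < n) for x in range(n)]


def parametric_opening(f, B, lam):
    n = len(B)                                   # n = card(B)
    ranked = rank_filter(f, B, n - lam + 1)      # zeta_{B, n - lam + 1}
    spread = dilate(ranked, [-b for b in B])     # dilation by the reflected SE
    return [min(a, b) for a, b in zip(f, spread)]  # id AND ...


B = [-2, -1, 0, 1, 2]
f = [0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0]   # 5-pixel segment with a noise hole, plus a speck

strict = parametric_opening(f, B, 5)    # lam = n: the ordinary opening
tolerant = parametric_opening(f, B, 4)  # one misfit pixel is tolerated
```

With $\lambda = n$ the rank filter is the erosion, so the ordinary opening is recovered and the noisy segment is wiped out. With $\lambda = 4$ the segment survives despite its one-pixel hole, while the isolated speck at index 8 is still removed.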
- | |||
- | FIXME 129 | ||
- | |||
- | ==== Annular opening ==== | ||
- | |||
- | The annular opening of an image is defined as the intersection between the dilation of the image with a ring-shaped SE and the original image: $\delta_{\bigcirc}(f) \wedge f$, where $\bigcirc$ is a ring-shaped structuring element. Since the ring SE does not contain its origin, the input image is not included in its dilation by the ring SE. | ||
- | |||
- | Annular openings are useful for extracting clusters in an image since isolated blobs are not covered by the dilation of other blobs. | ||
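
A 1-D analogue makes the cluster-extraction effect visible. In this sketch the "ring" is modelled, purely as an assumption, by the offsets at distance 2 or 3 from the origin; positions outside the signal contribute nothing to the dilation.

```python
def dilate(f, B):
    n = len(f)
    # default=0 covers the case where every offset falls outside the signal
    return [max((f[x + b] for b in B if 0 <= x + b < n), default=0)
            for x in range(n)]


def annular_opening(f, ring):
    return [min(d, v) for d, v in zip(dilate(f, ring), f)]


ring = [-3, -2, 2, 3]   # 1-D stand-in for a ring SE: the origin is excluded
f = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0]

clusters = annular_opening(f, ring)
```

The blobs at indices 0 and 3 lie within ring distance of each other and are kept; the blob at index 10 has no neighbour at distance 2 or 3 and disappears.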
- | |||
- | FIXME 129 | ||
- | |||
- | ==== Convex hull closing ==== | ||
- | |||
- | FIXME 119 | ||
- | |||
- | Used to detect concavity regions. | ||
- | |||
- | ==== Top-hats ==== | ||
- | |||
- | The choice of a given morphological filter is driven by the available knowledge about the shape, size, and orientation of the structures we would like to filter. Morphological top-hats proceed //a contrario//. Indeed, the approach undertaken with top-hats consists in using knowledge about the shape characteristics that are **not shared** by the relevant image structures. An opening or closing with a SE that does not fit the relevant image structures is then used to remove them from the image. These structures are recovered through the arithmetic difference between the image and its opening or between the closing and the image. | ||
- | |||
- | It is sometimes easier to remove relevant image objects than trying to directly suppress the irrelevant objects. | ||
- | |||
- | === White top-hat (WTH) === | ||
- | |||
- | WTH of an image $f$ is the difference between the original | ||
- | image $f$ and its opening $\gamma$: | ||
- | $$\text{WTH}(f) = f - \gamma(f)$$ | ||
- | |||
- | Since the opening is an anti-extensive image transformation, the grey scale values of the white top-hat are always greater than or equal to zero. | ||
- | |||
- | FIXME 136 | ||
- | |||
- | === Black top-hat (BTH) === | ||
- | |||
- | BTH of an image $f$ is the difference between its closing $\phi$ and the original | ||
- | image $f$: | ||
- | $$\text{BTH}(f) = \phi(f) - f$$ | ||
- | |||
- | Owing to the extensivity property of the closing operator, the values of the black top-hat images are always greater than or equal to zero. | ||
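
Both top-hats can be tried out on a 1-D signal containing one bright peak and one dark pit. The sketch assumes a symmetric SE (so reflection can be skipped), ignores offsets outside the signal, and uses illustrative function names.

```python
def erode(f, B):
    n = len(f)
    return [min(f[x + b] for b in B if 0 <= x + b < n) for x in range(n)]


def dilate(f, B):
    n = len(f)
    return [max(f[x + b] for b in B if 0 <= x + b < n) for x in range(n)]


def white_top_hat(f, B):
    opened = dilate(erode(f, B), B)          # B symmetric: reflected SE == B
    return [v - o for v, o in zip(f, opened)]


def black_top_hat(f, B):
    closed = erode(dilate(f, B), B)
    return [c - v for c, v in zip(closed, f)]


B = [-1, 0, 1]
f = [2, 2, 7, 2, 2, 2, 0, 2, 2]   # bright peak at index 2, dark pit at index 6

wth = white_top_hat(f, B)
bth = black_top_hat(f, B)
```

The white top-hat returns only the peak (height 5 above the background) and the black top-hat only the pit (depth 2); neither contains negative values.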
- | |||
- | FIXME 137 (4.16, 4.17) | ||
- | |||
- | === Remarks === | ||
- | |||
- | In situations where the input image is corrupted by a high frequency noise signal, it must be filtered out before using top-hat transforms to avoid side effects. For example, a closing by a small SE should be considered before computing a white top-hat and an opening before a black top-hat. | ||
- | |||
- | === Applications === | ||
- | |||
- | If the image objects all have the same local contrast, i.e., if they are either all darker or all brighter than the background, top-hat transforms can be used for mitigating illumination gradients. Indeed, a top-hat with a large isotropic structuring element acts as a high-pass filter. As the illumination gradient lies within the low frequencies of the image, it is removed by the top-hat. White top-hats are used for dark backgrounds and black top-hats for bright backgrounds. | ||
- | |||
- | FIXME 138 | ||
- | |||
- | If the contrast between the objects and the background is decreasing when the background is darkening, a better visual rendering may be obtained by dividing the input image by the closing (or opening). | ||
- | |||
- | FIXME 139 | ||
- | |||
- | In quality control applications where a series of objects are acquired at a fixed position, another solution consists in first capturing an image without any object and then performing the point-wise division of further image captures by this background image. | ||
- | |||
- | A simple neighbourhood-based morphological contrast operator can be obtained by computing in parallel the white and black top-hat of the image. The white top-hat is then added to the original image to enhance bright objects and the black top-hat is subtracted from the resulting image to enhance dark objects. We denote this top-hat contrast operator by $\kappa^{\text{TH}}$: | ||
- | $$\kappa^{\text{TH}} = id + \text{WTH}_{B} - \text{BTH}_{B} = 3id - \phi_{B} - \gamma_{B}$$ | ||
- | |||
- | The output values falling outside the dynamic range of the input image, i.e. $[t_{\min}, t_{\max}]$, are set to $t_{\min}$ or $t_{\max}$ depending on whether they fall below or above the dynamic range. | ||
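
The contrast operator, including the clipping to the dynamic range, can be sketched in 1-D as below. The symmetric SE, the border rule (offsets outside the signal are ignored), the choice of the signal's own minimum and maximum as $[t_{\min}, t_{\max}]$, and the function names are assumptions of this example.

```python
def erode(f, B):
    n = len(f)
    return [min(f[x + b] for b in B if 0 <= x + b < n) for x in range(n)]


def dilate(f, B):
    n = len(f)
    return [max(f[x + b] for b in B if 0 <= x + b < n) for x in range(n)]


def top_hat_contrast(f, B):
    opened = dilate(erode(f, B), B)   # gamma_B (B symmetric)
    closed = erode(dilate(f, B), B)   # phi_B
    t_min, t_max = min(f), max(f)
    raw = [3 * v - c - o for v, c, o in zip(f, closed, opened)]
    # Clip values that leave the dynamic range of the input.
    return [max(t_min, min(t_max, r)) for r in raw]


B = [-1, 0, 1]
f = [0, 0, 4, 4, 6, 4, 4, 3, 4, 4, 9, 9]

enhanced = top_hat_contrast(f, B)
```

The small bump at index 4 is raised from 6 to 8 and the small dip at index 7 is lowered from 3 to 2, while flat regions are left unchanged.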
- | |||
- | FIXME 141 | ||
- | |||
- | ==== Multiscale gradient ==== | ||
- | |||
- | Smooth edges detected using thick gradients are thick. Moreover, when the distance separating two boundaries of a region is smaller than the width of the SE, the resulting | ||
- | edges merge together. Both problems can be avoided by the morphological multiscale gradient. | ||
- | |||
- | The morphological gradient at scale $n$: | ||
- | $$\rho_{nB}^{\star} = \rho_{nB} \cdot T_{[1,t_{\max}]} \varepsilon_{(n-1)B}\text{WTH}_{nB} \rho_{nB}$$ | ||
- | (where $\cdot$ denotes the point-wise multiplication of two images) | ||
- | |||
- | FIXME 142 (4.22), 143 (4.23, 4.24) | ||
- | |||
- | The width of the transitions can be determined by analysing the output values of the gradient at each size $n$ since these values increase until the width of the transition is reached. If the width of the transition is smaller than the width of the object, there is of course no way to get a strong gradient value. | ||
- | |||
- | The //non-parametric multiscale morphological gradient// $\rho^{\star}$, an edge map at all scales, is obtained by computing the point-wise maximum of the $\rho_{nB}^{\star}$ over all $n$: | ||
- | $$\rho^{\star} = \bigvee_{n} \rho_{nB}^{\star}$$ | ||
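
The scale-$n$ gradient and its supremum over scales can be prototyped in 1-D. This is a sketch under several assumptions: $nB$ is modelled by the offsets $-n, \dots, n$ (so $0B$ is the origin alone), offsets outside the signal are ignored, the threshold $T_{[1, t_{\max}]}$ is implemented as "value at least 1", and the scales are limited to a user-chosen `n_max`.

```python
def erode(f, B):
    n = len(f)
    return [min(f[x + b] for b in B if 0 <= x + b < n) for x in range(n)]


def dilate(f, B):
    n = len(f)
    return [max(f[x + b] for b in B if 0 <= x + b < n) for x in range(n)]


def se(n):
    return list(range(-n, n + 1))


def thick_gradient(f, n):
    return [d - e for d, e in zip(dilate(f, se(n)), erode(f, se(n)))]


def white_top_hat(f, n):
    opened = dilate(erode(f, se(n)), se(n))
    return [v - o for v, o in zip(f, opened)]


def gradient_at_scale(f, n):
    rho = thick_gradient(f, n)
    # Erode the white top-hat of the thick gradient, then threshold at 1.
    marker = erode(white_top_hat(rho, n), se(n - 1))
    return [r if m >= 1 else 0 for r, m in zip(rho, marker)]


def multiscale_gradient(f, n_max):
    scales = [gradient_at_scale(f, n) for n in range(1, n_max + 1)]
    return [max(values) for values in zip(*scales)]


f = [0, 0, 0, 0, 5, 5, 5, 5]          # ideal step edge
thick = thick_gradient(f, 3)          # six-pixel-wide response
thin = multiscale_gradient(f, 3)      # two-pixel-wide response
```

On this step edge the thick gradient at size 3 spreads over six samples, whereas the multiscale gradient keeps the full contrast on the two samples adjacent to the edge only.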
- | |||
- | |||
- | ==== An industrial application ==== | ||
- | |||
- | Theoretical models predicting the deformation of metal sheets during the stamping process are validated by comparing expected with actual deformations. Conventional deformation measurement methods consist in drawing a grid of lines on the metal sheet before stamping and matching this grid with the one observed on the stamped sheet. These images have the following characteristics: | ||
- | * The orientation of the grid pattern is a priori unknown, i.e., there is an arbitrary angle between the grid lines and the x-y axes of the image plane. | ||
- | * A stretching of a metal sheet in one direction is mostly counterbalanced by a shrinking in the opposite direction, so that the areas of the original grid patterns are almost unchanged by the stamping process. | ||
- | * Metallic reflections and grid damage during stamping lead to a low signal-to-noise ratio. Consequently, automatic histogram thresholding techniques are not well suited to this kind of image. | ||
- | * It may happen that parts of the metal sheet in the field of view of the camera are not perpendicular to the optical axis of the camera. This may lead to illumination effects. | ||
- | |||
- | FIXME 145 | ||
- | |||
- | As the area of the grid within an image frame is known and not modified during stamping, one could automatically determine a threshold level for extracting the grid. But due to the high level of noise and the inhomogeneous illumination, the input images must be filtered beforehand. Nevertheless, thresholded images after filtering still contain a lot of irrelevant information. Finding the two main directions of the grid will help us filter the image along these directions. | ||
- | |||
- | \\ | ||
- | |||
- | * **Preliminary filtering.** First, small scale salt and pepper noise is removed using an opening with a square of size 1 followed by a closing with the same SE. The illumination function is then subtracted from the original image by a large black top-hat transformation. The complement of the black top-hat is considered for getting an image similar to the original image and not to its complement. | ||
- | |||
- | * **Determination of the two main directions.** The preliminary filtering allows us to use the same threshold value for the whole image. This threshold level is determined by the grey level whose value in the cumulative grey level frequency distribution equals $v$ (where $v$ is the ratio of the area of the grid to the area of the image frame). The resulting binary image is then used for finding the two main directions of the grid pattern. They are defined as the two maxima of the curve obtained by plotting the number of pixels remaining after the erosion of the grid by a pair of points while varying the orientation of these points. | ||
- | |||
- | * **Grid pattern extraction.** The thresholded image is then filtered with openings by line segments along the two main directions of the grid. This makes it possible to extract two masks of grid lines (i.e., one for each direction) and to remove all irrelevant information. Closings with line segments then connect disconnected grid lines. Additional filtering, such as the removal of holes, is also performed. The union of the filtered lines in both directions provides us with a mask of the grid lines. | ||
- | |||
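
The threshold selection used in the second step can be sketched as follows. The function name is hypothetical, and the example assumes that the grid lines are darker than the metal sheet, so that the grid corresponds to the lowest fraction $v$ of the grey levels.

```python
import math


def cumulative_threshold(image, v):
    """Smallest grey level t such that at least a fraction v of pixels is <= t."""
    pixels = sorted(p for row in image for p in row)
    k = max(1, math.ceil(v * len(pixels)))   # number of pixels to be covered
    return pixels[k - 1]


image = [[200,  10, 210],
         [ 20,  30, 220],
         [230, 240, 250],
         [205, 215, 225]]

v = 0.25                                 # grid area / frame area (assumed known)
t = cumulative_threshold(image, v)
grid_mask = [[int(p <= t) for p in row] for row in image]
```

In this toy image 3 of the 12 pixels are dark, so $v = 0.25$ selects the threshold 30 and the mask contains exactly those three pixels.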
- | ===== Hit-or-miss and Skeletons ===== | ||
- | |||
- | ==== Hit-or-miss transform ==== | ||
- | |||
- | In order to perform a hit-or-miss transform, the SE is set to every possible position of the image. At each position, the following question is considered: //"Does the first set fit the foreground while, simultaneously, the second set misses it (i.e., fits the background)?"// If the answer is affirmative, then the image point matched by the origin of the SE is a point of the hit-or-miss transformation of the image. | ||
- | |||
- | Notation: $\text{HMT}_{B}(X)$, $X \circledast B$ | ||
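
The question above maps directly onto two `all(...)` tests per pixel. In this sketch the composite SE is given as two offset lists, `B1` (must fit the foreground) and `B2` (must fit the background); treating pixels outside the image as background is an assumption, as are the names.

```python
def hit_or_miss(X, B1, B2):
    """Binary hit-or-miss: 1 where B1 fits the foreground and B2 the background."""
    h, w = len(X), len(X[0])

    def pixel(y, x):
        return X[y][x] if 0 <= y < h and 0 <= x < w else 0

    return [[int(all(pixel(y + dy, x + dx) == 1 for dy, dx in B1)
                 and all(pixel(y + dy, x + dx) == 0 for dy, dx in B2))
             for x in range(w)] for y in range(h)]


# Composite SE that detects pixels isolated with respect to 4-connectivity.
B1 = [(0, 0)]
B2 = [(-1, 0), (1, 0), (0, -1), (0, 1)]

X = [[1, 0, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 0, 0],
     [0, 1, 0, 0]]

isolated = hit_or_miss(X, B1, B2)
```

Only the pixels at (0, 0) and (3, 1) are reported; the two horizontally adjacent pixels in the second row hit each other's background set and are rejected.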
- | |||
- | FIXME TODO | ||
- | |||
- | ===== TODO ===== | ||
- | |||
- | FIXME 155 |