An adaptive approach for simultaneous classification of remote sensing scenes including rural and urban targets

GEOLOGY, ECOLOGY, AND LANDSCAPES 2021, VOL. 5, NO. 3, 217–226
INWASCON
https://doi.org/10.1080/24749508.2019.1706833

RESEARCH ARTICLE

Letícia Sartorio (a) and Daniel Zanotta (b)

(a) Center for human and information sciences, University of Rio Grande, Rio Grande, Brazil; (b) Advanced Visualization & Geoinformatics Laboratory, UNISINOS University, São Leopoldo, Brazil

ARTICLE HISTORY
Received 2 October 2019
Accepted 12 December 2019

KEYWORDS
Classification; remote sensing; image segmentation; pattern recognition; landscape

CONTACT Daniel Zanotta dzanotta@edu.unisinos.br UNISINOS University, São Leopoldo, Brazil

© 2019 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group on behalf of the International Water, Air & Soil Conservation Society (INWASCON). This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

In this paper, an automatic adaptive image classification framework designed to operate in multiresolution scenes including rural and urban targets is proposed and tested. Traditional image analysis commonly classifies images using a single strategy and a single source of data over the entire scene. Ideally, urban targets should be handled by specialized classification systems using high spatial resolution images, such as object-based image analysis and non-parametric classifiers. Conversely, rural targets should be handled with high spectral resolution images, pixel-based classification approaches, and parametric techniques. The formulation proposed in this study starts by performing a prior separation of rural and urban areas based on the Central Limit Theorem (CLT). Then, both kinds of targets are labelled in an automatic adaptive fashion, each one with the proper data and method previously selected. One experiment was performed using a dataset composed of a high spatial resolution true-colour image and a multispectral image, as well as preselected classification techniques particularly adjusted for each case. Visual and quantitative assessment by two accuracy metrics, comparing the proposed approach with traditional classification, confirms the soundness of the proposed framework.

Introduction

The increasing development of sophisticated remote sensing instruments has conferred considerable improvements in the quality of images acquired from space (Gholoobi & Kumar, 2015). Public administration increasingly depends on rapid and reliable monitoring of an increasingly dynamic and complex scenario. High spatial resolution imaging sensors onboard satellites are one example of instruments that have allowed detailed land use and land cover mapping at high efficiency and relatively low cost (Fisher, Eileen, James Dennedy-Frank, Kroeger, & Boucher, 2017), mainly over urban areas and other complex environments. Missions that brought advances in this direction are, in chronological order, IKONOS, QuickBird, RapidEye, GeoEye, and WorldView (Chuvieco, 2016), as well as aircraft and the recent unmanned aerial vehicle (UAV) images. Complex ground targets are common in this type of high spatial resolution image, but they can be adequately classified by modern computational techniques like Support Vector Machines (SVM), Random Forests (RF), and, more recently, Convolutional Neural Networks (CNN) (Jensen, 2009). Indeed, these are the most common image classification techniques recently found in the literature.

Automatic image classification can be performed at the pixel level (pixel-based), where each pixel is analyzed individually for labelling, or at the object level (object-based), where a set of pixels is previously merged and receives a single label (Moosavi, Talebi, & Shirmohammadi, 2014). Pixel-based classification is restricted to the spectral information of pixels as the only attribute and does not consider any other aspect in the process (Weih & Riggan, 2010). For many applications, this approach is able to retrieve a thematic map showing the elements of interest throughout the image with reasonable precision. However, in more complex applications, e.g. those involving small structures or elaborate shapes, like urban areas, or data including detailed targets, like some agricultural areas or lithological mapping, object-based classification is more suitable (Zhou, Troy, & Grove, 2008). The reason is that the object-based approach is performed in two basic steps: image segmentation, which aims to group similar pixels into objects, and classification, which aims to label the resulting objects (Whiteside, Boggs, & Maier, 2011). Working with objects allows the analyst to explore not only radiometric information, as in the pixel-based approach, but also attributes like texture, shape, size, and context, improving the classification process (Duro, Franklin, & Dublé, 2012). Indeed, the object-based approach is able to take advantage of surrounding and circumstantial characteristics like roughness, neighborhood, size, and morphology of the resulting objects.

For the above-mentioned reasons, the latest consensus of the specialized literature indicates that urban targets (buildings, roads, trees, small waterbodies) present in high spatial resolution images should be classified by objects (Bhaskaran, Paramananda, & Ramnarayan, 2010; Ma et al., 2017), since detailed information about shape, texture, and context is very important to differentiate among these targets (Mather & Tso, 2016). The object-based approach, along with a hierarchical and nonparametric classifier (no assumption about the probability distribution of classes), is expected to be more effective in urban scenes. Conversely, rural targets (fields, forests, medium and large waterbodies, rocks, and varied soils) should be better classified at pixel level, with images including as many spectral bands as possible (Aguirre-Gutiérrez, Seijmonsbergen, & Duivenvoorden, 2012; Ferreira, Zortea, Zanotta, Shimabukuro, & Souza Filho, 2016). Rural or natural targets present large areas and subtle radiometric variations over distance, which allows a correct description even by very low spatial resolution images. For this reason, sensors designed to monitor these areas usually have many spectral bands (Fisher et al., 2017), enabling a detailed description of the chemical composition of targets, which is crucial for lithological or vegetation mapping (Herold, Roberts, Gardner, & Dennison, 2004; Lillesand, Kiefer, & Chipman, 2014). Furthermore, for rural targets, parametric classifiers (which assume a well-known probability distribution of classes) might estimate the classes with greater efficiency, since they are designed to work with only radiometric information, which is very abundant in multispectral images of medium to low spatial resolution (Whiteside et al., 2011).

There are several strategies for improving and refining the classification of high spatial resolution and hyperspectral images (Zhao, Du, & Emery, 2017; Zhong et al., 2017; Zhong, Ma, Ong, Zhu, & Zhang, 2017). Despite the recent improvements, the classification of scenes simultaneously including urban and rural targets remains a challenge for classifiers traditionally used for remote sensing image recognition. This condition is very common and brings many challenges when mapping heterogeneous areas. Analysts usually rely on the time-consuming and labor-intensive prior separation of the different targets by visual interpretation, followed by independent classification of each area. An alternative is the selection of a single method retrieving the best trade-off using the high spatial resolution image available. However, the use of only one type of image data and classification strategy in these mixed scenarios hinders the optimization of the accuracy of the results.

The present study suggests an automatic adaptive framework to address the classification of scenes simultaneously including rural and urban classes. It assumes the use of two different images covering the same area: one true colour (RGB) image with low spectral and high spatial resolution, and another with high spectral and low spatial resolution. The proposed technique, which is described in detail in what follows, is based on the automatic prior separation of urban and rural targets through the well-known Central Limit Theorem (CLT). This prior identification relies on the fact that most rural or natural elements (e.g., fields, forests, soils, rocks) present classes with a normal (Gaussian) probability density function, whereas urban classes (e.g., buildings, roads, and other city structures) generally do not follow a well-defined probability density function (Billingsley, 1995). Once the two primary targets are parameterized, a maximum likelihood classifier can be used to identify urban and rural pixels in the low spatial resolution image available. Then, in an adaptive fashion, our strategy assigns pre-identified urban areas to classification using high spatial resolution image data associated with object-based analysis, whereas rural areas are assigned to low spatial resolution images associated with pixel-level analysis. The combination of two simultaneous approaches and images of different spatial and spectral resolutions aims at producing a robust tool for optimizing the classification of scenes covering complex and heterogeneous environments, for two main reasons: (1) consideration of multi-source data, attributes, and classification strategy for separate areas, and (2) limiting the number of classes for each classification problem, reducing overlap among classes.

Materials and methods

First stage: prior identification of urban and rural areas

The hybrid classification proposed here assumes the existence of two images covering the same region: a low spatial resolution multispectral image, which is used to classify the rural part of the scene, and a high spatial resolution image (not necessarily multispectral), which is used to classify the urban portion of the scene. For rural areas, it is very convenient to have a multispectral image with as many channels as needed for a correct characterization of the natural targets involved (Fisher et al., 2017). This assumption relies on the fact that the chemical composition is very important to characterize the subtle differences among spectral signatures. At the same time, for the urban area, the mandatory image parameter is the pixel size, not the number of channels (Myint, Gober, Brazel, Grossman-Clarke, & Weng, 2011). For example, for buildings and roads, the shape and texture of objects are crucial information for recognizing them.

We assume images with fine spatial registration (geometrical alignment) that are radiometrically corrected. The first stage is the core of the proposed technique. It uses the low spatial resolution image to separate the hereafter called primary classes: urban $\omega_U$ and rural $\omega_{Rj}$. It is important to stress that the aim at this stage is not to find the definitive classes of targets, but only to identify the nature of the targets in the scene, whether urban or rural. We understand as rural those classes representing natural elements like waterbodies, fields, rocks, bare soils, forests, crops, etc. Therefore, even in the early stages, the rural class $\omega_{Rj}$ can assume more than one subclass $j$.

We employ the Central Limit Theorem (CLT) to perform the task of separating $\omega_U$ and $\omega_{Rj}$ by considering these two primary classes as two different populations. The CLT states that, given a set of sufficiently large samples collected from a population with finite variance, the mean of all samples from the same population will be approximately equal to the mean of the population (Zhong et al., 2017). Furthermore, the set of sample means will approximate a Normal/Gaussian distribution, with variance approximately equal to the variance of the whole population divided by the sample size $n$. Following this theorem, images with sufficiently large pixel sizes (low spatial resolution data), in which each pixel includes contributions from many targets together, are expected to have classes presenting Normal/Gaussian probability density distributions, which allows us to determine beforehand the main nature of the pixels: whether urban or rural.

To better understand the proposed method and the adequacy of the CLT to the related problem, let $x_1, x_2, \ldots, x_n$ be a randomly selected sample of size $n$, a sequence of independent and identically distributed random variables drawn from distributions with expected value $\mu$ and finite variance $\sigma^2$. Consider that we are interested in the sample average $\bar{X}_n$ of these random variables:

$$\bar{X}_n = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i \qquad (1)$$

The theorem implies that sample averages converge to the expected value $\mu$ as $n \to \infty$ and that, for large enough $n$, the distribution of $\bar{X}_n$ is close to the Normal distribution with mean $\mu$ and variance $\sigma^2/n$. As $n$ approaches infinity, the random variables $\sqrt{n}\,(\bar{X}_n - \mu)$ converge in distribution to a normal distribution with mean 0, $N(0, \sigma^2)$:

$$\sqrt{n}\left(\frac{1}{n}\sum_{i=1}^{n} x_i - \mu\right) \xrightarrow{d} N(0, \sigma^2) \qquad (2)$$

Thus, the spectral generalization caused by pixels with large sizes is important and contributes to the proper operation of the proposed framework. To adapt the CLT to the real problem approached here, we transfer the concept of the one-dimensional sample average $\bar{X}_n$ to the multispectral response $r_i$ of each pixel over the low spatial resolution image. We assume each pixel's response $r_i$ to be a linear combination of the targets included inside it (Blaschke, Lang, & Hay, 2008), where $\{p_1, p_2, \ldots, p_m\}$ are the proportions occupied by the $m$ targets. Then, the spectral response $r_k$ for each channel $k$ can be written as:

$$r_k = r_{k1}\,p_1 + r_{k2}\,p_2 + \cdots + r_{km}\,p_m \qquad (3)$$

where $r_k$ is the spectral response of the pixel in the $k$th spectral band, and $r_{k1}, \ldots, r_{km}$ are the pure spectral responses, in the $k$th spectral band, of the targets present in the pixel.

The above formulation shows the multi-source nature of large pixels in images including varied targets. Exploratory experiments have indeed confirmed our initial assumption regarding the expected Normal distributions of primary classes in low spatial resolution images. Since the aim of the proposed method is not to estimate the proportions of each pixel's components (unmixing), the number and nature of targets in each pixel along the image can vary without causing any loss of validity.

The proposed method proceeds by collecting samples of pixels corresponding to the primary classes directly over the low spatial resolution image (i.e., urban $\omega_U$ and as many rural classes $\omega_{Rj}$ as exist). The statistical information assumed for these normally distributed classes (e.g., mean vectors $\mu$ and covariance matrices $\Sigma$) can then be derived and used to feed parametric classification rules. As stated before, data presenting this behaviour show a high level of differentiation by probabilistic classifiers, such as maximum likelihood, which can be expressed by the following membership function (Eq. 4), derived from Bayes' theorem:

$$\Phi_c(x_i) = P(x_i \mid \omega_c) = \frac{1}{(2\pi)^{d/2}\,|\Sigma_c|^{1/2}} \exp\left(-\frac{1}{2}\,(x_i - \mu_c)^{T}\,\Sigma_c^{-1}\,(x_i - \mu_c)\right) \qquad (4)$$

where $\Phi_c(x_i)$ is the probability density function of a pixel $x_i$ belonging to class $\omega_c$, which can initially be urban ($\omega_U$) or rural ($\omega_{Rj}$), $d$ is the dimensionality of the data, $x_i$ is the spectral response of pixel $i$, $\mu_c$ is the mean vector, and $\Sigma_c$ is the covariance matrix of class $\omega_c$. The collected samples are then used to train the supervised classifier, which is later used to determine the primary classes over the entire scene studied. The expected result is a mask separating rural and urban zones, able to direct what kind of classifier is applied in each area. The suggested mask can be built from the following rule:

$$M(i) \in \omega_c \quad \text{if} \quad \Phi_c(x_i) = \max\{\Phi(x_i)\}, \quad \forall c \in \{U, R_j\} \qquad (5)$$

where $M(i)$ corresponds to the position of the pixel $x_i$ in the mask $M$, and $\Phi(x_i)$ is the membership vector containing the $\Phi_c(x_i)$ values for each class. The urban ($\omega_U$) and rural ($\omega_{Rj}$) classes proceed to the second stage of the methodology. At this point, the user can consider performing a morphological dilation of a few pixels along the urban area perimeter to guarantee the enclosure of urban targets. The rationale behind this procedure is the assumption that it is preferable to include rural classes inside the urban area by mistake than to leave urban elements outside it to be wrongly classified as rural.

Second stage: adaptive classification system

As the literature suggests, due to the high-frequency spectral behavior verified in urban areas, these sites have shown better classification results when classified by specialized techniques (Zanotta, Haertel, Shimabukuro, & Renno, 2014), which are able to take into account many parameters and specificities of targets (Lu, Hetrick, & Moran, 2010). At the same time, to exploit the ability to handle many kinds of information simultaneously and to avoid multi-labelling of unique objects formed by groups of pixels, the most recent studies have suggested using object-based approaches for urban classification (Blaschke et al., 2008). Conversely, rural environments are more appropriately classified using detailed multispectral information, instead of data about the shape, texture, or spatial context of targets. Therefore, the larger the number of available spectral channels, the better the recognition of the target. Many kinds of land cover classes like vegetation, rocks, and soils present very similar characteristics, which are often differentiated only by detailed inspection of spectral signatures (Dinis et al., 2010).

The portion of the high spatial resolution image recognized in $M$ as the urban area is then segmented and directed to the complementary classification step. The segmentation process aggregates similar neighboring pixels to produce objects with improved attributes (Jensen & Lulla, 1987). For the sake of simplicity, we chose the widely used region growing segmentation technique available in many image processing packages. Region growing starts by merging individual pixels using spectral similarity criteria, producing objects which can be more adequately classified by using not only their spectral attributes, but also texture, shape, and spatial context features (Blaschke, Kelly, & Merschdorf, 2015). The resulting objects are then classified using one of the techniques suitable for urban environments. The most popular approaches for this type of application are those which can handle many classes at the same time, while avoiding overfitting and making optimized usage of the available attributes. One modern example is the hierarchical Random Forests (RF) classifier (Jiang, Wang, Yang, Xie, & Cheng, 2010), since it can manage different attributes according to the importance of each one to the specific problem addressed. Traditional decision tree classifiers tend to learn highly irregular patterns, which frequently causes overfitting of the training samples. RF is an alternative that avoids overfitting by averaging multiple deep decision trees trained on different parts of the same training dataset (Hastie, Tibshirani, & Friedman, 2009).

The area in $M$ recognized as rural can keep the original classes received at the primary stage, or can be classified again using the low spatial resolution image by a pixel-based approach and a generalist classification technique, such as Linear or Quadratic Discriminant Analysis (LDA, QDA) or Maximum Likelihood. The generalist classification technique operating on the low spatial resolution image is defined as $G$, whereas the classification technique chosen to operate on the segmented high spatial resolution image is defined as $H$. The adaptive classification of the entire scene obeys the following rule:

$$x_i \Rightarrow H \quad \text{if} \quad M(i) \in \omega_U$$
$$x_i \Rightarrow G \quad \text{if} \quad M(i) \in \omega_{Rj} \qquad (6)$$

where a pixel $x_i$ is classified with the high spatial resolution image and technique $H$ only if it is recognized as urban area in the first stage ($M(i) \in \omega_U$). Conversely, if the pixel $x_i$ is recognized as rural in the first stage ($M(i) \in \omega_{Rj}$), then it is classified with the low spatial resolution data and technique $G$. A flowchart of the proposed technique is presented in Figure 1.

Figure 1. Flowchart of the proposed approach.

The resulting classification map is expected to present an improvement relative to traditional methods, which apply a single rule throughout the scene and ignore the fact that it contains targets of distinct natures. As said before, the core of the proposed technique is to automatically exploit the advantages of each source of data and the potential of proper classification tools for every specific environment. Furthermore, the reduction in the number of classes available for each classification system is expected to avoid overlaps and confusion among classes, improving overall classification results.

Results

Data description

In order to test and exemplify the performance of the methodology suggested in this study, we performed one experiment with an area located in Cape Town, Western Cape Province, South Africa. The images are geometrically and radiometrically/atmospherically corrected. The low spatial resolution data is a Landsat 8-OLI image acquired on 3 September 2013 (Figure 2(a)). The image has 30 m spatial resolution for the spectral bands used in the experiment (1–7). The high spatial resolution image came from GeoEye-1, acquired on 31 July 2013, with 1.65 m spatial resolution (Figure 2(b)). Two images with different resolutions covering the same area were used: one low spatial resolution multispectral image covering the whole study area, and one high spatial resolution image covering at least the urban spots.

Figure 2. (a): True colour composition (4 3 2) of a Landsat 8-OLI subset covering Cape Town/South Africa, (b): the same subset imaged by GeoEye-1 in 3 2 1 true colour composition, (c): Ground truth made with information from both images simultaneously (ground truth second stage). The colour palette refers to the ground truth image. Black areas refer to very ambiguous or heavily mixed areas and were not considered in the accuracy assessment.

Validation dataset

Reliable ground truth was prepared by expert visual interpretation to allow accuracy assessment, which was made in two steps: first, based only on the low spatial resolution image, to test the prior identification of primary targets by CLT (rural targets and generic urban), and second, based on the low and high spatial resolution images simultaneously, to assess the final classification result (rural and urban targets in fine detail). It is important to stress that the second ground truth was built by drawing only urban subclasses (roofs, roads, trees, soils, etc.) directly over the high-resolution image, while avoiding areas considered as rural according to the first ground truth data. Then, this second ground truth, related only to the detailed urban area, was merged with the first (only rural areas) in order to produce the absolute ground truth, which was finally used to assess the performance of the entire classification. Ambiguous areas (black areas) were not labelled and were consequently disregarded during the accuracy assessment.

Experiment with Landsat 8-OLI combined to Geoeye-1

The selected area includes rocks surrounded by vegetation and some scattered portions of urban area. Primary samples of forest, field, rocks, and urban areas were collected directly on the image. Based on the CLT (Zhong et al., 2017), the Landsat 8-OLI image received the primary classification by maximum likelihood, assuming a Normal statistical distribution of classes, to separate the primary targets selected in the scene. The initial supposition of a Normal distribution of primary classes was confirmed by analyzing the histograms of Figure 3, which also overplots the estimated probability density functions (dotted lines). For this experiment, bands 4 and 7 were sufficient to separate all four primary classes.

Figure 3. Primary classes histograms and corresponding probability density functions (dotted lines) for channels 4 and 7. The result shows that the urban class can be adequately separated in the first stage from the remaining rural classes by means of both channels combined.

The resulting mask $M(i)$ separating the primary classes is shown in Figure 4(a). As can be seen, considering the trade-off mentioned at the end of section 2, we performed a morphological dilation of a few pixels along the perimeter of the urban area to expand it (rounded features at the edges of the urban area).

Proceeding to the second stage of the method (adaptive classification), the high-resolution GeoEye-1 image was segmented by a basic region growing technique only over the identified urban regions (grey colour in Figure 4(a)), and then classified using RF, the technique selected in this experiment to operate on the high-resolution image. The areas identified as rural (forest, grass, and rocks) were kept with the class resulting from the first stage.

Therefore, representative samples of roofs (clay, concrete, and fibrocement), forests, and paved roads were collected over the high-resolution GeoEye-1 image and used for training purposes. The RF classifier was trained by using the C4.5 algorithm (Jensen & Lulla, 1987) with the available spectral image features. Finally, the classification maps were merged to produce one final image for the RF classifier (Figure 4(c)).

Figure 4. (a): Primary classification mask resulting from the first stage, (b): Reference map for primary classes based on the low spatial resolution image by visual interpretation (ground truth first stage), (c): Final map using proposed adaptive classification and RF method for urban area, (d): Traditional object-based classification using only the Geoeye-1 image and RF method.

Analysis

The classification map resulting from the first stage (Figure 4(a)) was validated using the reference data built by vectorization from the Landsat 8-OLI image, shown in Figure 4(b). It is important to stress that this initial map aimed to test only the ability to separate rural (forest, grass, and rocks) from urban areas, in terms of overall and average accuracies. This result retrieved an overall accuracy of 82.9% and an average accuracy of 84.0%. Qualitatively, it is also possible to notice the spatial correspondence between Figure 4(a,b). Most importantly, the pre-identification of the urban area resulted in a high classification confidence, which was greatly aided by the post-classification morphological dilation process. It is worth noting that, to avoid propagating classification errors from the first to the second stage, it is preferable for the first stage to produce an excess rather than a shortage of urban area.

The final classification map was then obtained by refining the classification of the urban area through the high-resolution image (GeoEye-1). The end result was validated using the full ground truth presented in Figure 2(c). The performance of the proposed methodology was compared with the traditional classification, i.e., using only a single image and one classification technique. To allow comparison, we selected the same RF classification technique used to test the proposed technique, as well as an identical set of class samples. The results obtained in terms of overall and average accuracies are presented in Table 1. For overall accuracy, the scores achieved by the proposed method were significantly higher than those of the traditional approach. Tables 2 and 3 present the confusion matrices computed for both tested techniques. Figure 5 shows details in the urban area of the classified images compared to the same parts in the Geoeye image.

Table 1. Average and overall classification accuracies (%) measured using RF for the traditional classification paradigm and the proposed simultaneous methodology.

| Method | RF Traditional | RF Simultaneous |
| --- | --- | --- |
| Overall | 56.8 | 81.0 |
| Average | 54.1 | 67.7 |

Table 2. Confusion matrix regarding experiment applying the proposed hybrid method. Rows are classified classes; columns are reference classes.

| Classified \ Reference | Grass | Forest | Soil | Asphalt | Clay roof | Concrete roof | Fibrocement roof | ∑ (row total) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Grass | 46,141 | 3092 | 2884 | 787 | 2 | 15 | 0 | 52,921 |
| Forest | 14,504 | 58,435 | 204 | 3262 | 103 | 403 | 87 | 76,998 |
| Soil | 2568 | 129 | 27,516 | 578 | 0 | 0 | 0 | 30,791 |
| Asphalt | 1005 | 2789 | 0 | 557 | 44 | 367 | 138 | 4900 |
| Clay roof | 0 | 4 | 0 | 0 | 462 | 0 | 0 | 466 |
| Concrete roof | 3 | 134 | 0 | 91 | 280 | 2551 | 126 | 3185 |
| Fibrocement roof | 0 | 6 | 0 | 0 | 0 | 278 | 103 | 387 |
| ∑ (column total) | 64,221 | 64,589 | 30,604 | 5275 | 891 | 3614 | 454 | 169,648 |

Table 3. Confusion matrix regarding experiment applying the traditional classification method (object-based). Rows are classified classes; columns are reference classes.

| Classified \ Reference | Grass | Forest | Soil | Asphalt | Clay roof | Concrete roof | Fibrocement roof | ∑ (row total) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Grass | 4037 | 1058 | 381 | 441 | 8 | 22 | 0 | 5947 |
| Forest | 38,938 | 59,350 | 2346 | 1548 | 60 | 282 | 33 | 102,557 |
| Soil | 20,180 | 2295 | 27,829 | 480 | 134 | 231 | 74 | 51,223 |
| Asphalt | 786 | 362 | 11 | 2096 | 27 | 206 | 79 | 3567 |
| Clay roof | 0 | 76 | 0 | 21 | 480 | 2 | 0 | 579 |
| Concrete roof | 236 | 1363 | 37 | 682 | 180 | 2439 | 85 | 5022 |
| Fibrocement roof | 44 | 85 | 0 | 7 | 2 | 432 | 183 | 753 |
| ∑ (column total) | 64,221 | 64,589 | 30,604 | 5275 | 891 | 3614 | 454 | 169,648 |

Figure 5. (a): GeoEye-1 in 3 2 1 true colour composition, (b): Final map using proposed adaptive classification and RF for urban area, (c): Traditional object-based classification using only the Geoeye-1 image and RF. All figures have details depicting parts of interest located on the same areas.

We can also visually verify that the changes provided by the proposed approach caused localized increases in classification performance over the entire image and for all classes. However, due to the large difference between the number of pixels for each class, the most important measure in this scenario is the average accuracy, which was also higher for the proposed method. We can also see through the maps of Figure 4 that much of this result was due to the generalization caused by the classification at the first stage, which provided fine separation between forest and grass, a separation that was less efficient using image data with limited spectral range (traditional approach).

Discussion

We can certainly expect that, if only the low spatial resolution image were used to classify the entire area, the results over the urban area would be very poor and inaccurate due to the presence of small parcels and many kinds of materials in the cities. Conversely, using only the high spatial resolution image over the entire region would certainly prevent the rural area from being optimally classified. This is because rural/natural areas need as many spectral bands as possible to obtain the best representation of the spectral signature of the targets, which is the key to achieving the best classification result.

The comparison of results produced by the proposed method with results generated using only the object-based classification by RF has shown that the proposed approach presented encouraging qualitative and quantitative results, especially in regions where pixel-based classifiers tend to fail: urban areas or areas with high radiometric variability, and areas where the object-based approach is more suitable. Conversely, rural areas including fields, soils, minerals, rocks, and trees could be correctly classified due to the sufficient number of spectral bands available in the low spatial resolution image.

Using the predictions of the CLT, the elements corresponding to primary targets could be effectively separated using the probabilistic Gaussian maximum likelihood classifier. The union of the maps partially produced for each environment achieved an optimized result, combining the advantages of both methods in a single scene containing heterogeneous targets. As can be seen in Figure 5, the urban area presents a classification similar to that provided by the traditional approach. However, the absence of rural classes in the problem allowed classification with some improvements, with no rural targets shown in these areas. On the other hand, the rural area, previously classified by using the multispectral image and a parametric technique (maximum likelihood classifier), was more stable, showing fewer noise-like pixels when compared to the traditional object-based approach.

Perhaps one of the major drawbacks of the proposed method is the first stage, when the prior separation between rural and urban targets is performed. This crucial point can greatly affect the final results, since some urban areas can be wrongly confused with rural ones. This issue has encouraged us to investigate alternative procedures to find the urban region with even more precision. Nighttime images as auxiliary data are one of the possibilities. Nighttime data register artificial light coming from the surface of the Earth, potentially indicating areas covered by urban spots. Other possibilities are fixed maps and Synthetic Aperture Radar (SAR) images.

The proposed method is a practical way of mapping heterogeneous areas that reaches sound classification results, overcoming sensor limitations regarding spatial and spectral resolution. The main advantages achieved by the proposed method include:

(1) The optimization of the classification process by automatically selecting the most appropriate base image and classification technique to different areas in the same problem (high spatial

Blaschke, T., Kelly, M., & Merschdorf, H. (2015). Object-based image analysis: Evolution, history, state of the art, and future vision. In P. Thenkabail (Ed.), Remotely sensed data characterization, classification, and accuracies (pp. 277–294). Boca Raton, Florida: CRC Press.

Blaschke, T., Lang, S., & Hay, G. (2008). Object-based image analysis: Spatial concepts for knowledge-driven remote sensing applications. Heidelberg, Berlin: Springer Science & Business Media.

Chuvieco, E. (2016). Fundamentals of satellite remote sensing: An environmental approach. Boca Raton, Florida: CRC Press.

Dinis, J., Navarro, A., Soares, F., Santos, T., Freire, S., Fonseca, A., . . . Tenedó, J. (2010). Hierarchical object-based classification of dense urban areas by integrating high spatial resolution satellite images and Lidar elevation data. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 38(4), 6.

Duro, D. C., Franklin, S. E., & Dublé, M. G. (2012). A comparison of pixel-based and object-based image analysis with selected machine learning algorithms for the classification of agricultural landscapes using SPOT-5 HRG imagery. Remote Sensing of Environment, 118, 259–272.

Ferreira, M. P., Zortea, M., Zanotta, D. C., Shimabukuro, Y. E., & Souza Filho, C. R. (2016). Mapping tree species in tropical seasonal semi-deciduous forests with hyperspectral and multispectral data. Remote Sensing of Environment, 179, 66–78.

Fisher, J. R. B., Eileen, A., James Dennedy-Frank, P., Kroeger, T., & Boucher, T. M. (2017).
Impact of satellite resolution for urban and high-spectral resolu- imagery spatial resolution on land use classification accu- tion for rural). racy and modeled water quality. Remote Sensing in Ecology and Conservation. doi:10.1002/rse2.61 (2) The independence of available classes for dif- Gholoobi, M., & Kumar, L. (2015). Using object-based hier- ferent primary targets when the classification archical classification to extract land use land cover process is made in separate. As the rural areas classes from high-resolution satellite imagery in do not present the diversity of classes found in a complex urban area. Journal of Applied Remote urban sites, the classification tends to produce Sensing, 9(1), 096052—096052. more consistent results, since each problem is Hastie, T., Tibshirani, R., & Friedman, J. (2009). The ele- ments of statistical learning. New York, NY: Springer. posed with a separate set of samples. Herold, M., Roberts, D. A., Gardner, M. E., & Dennison, P. E. (2004). Spectrometry for urban area remote sensing—Development and analysis of a spectral Disclosure statement library from 350 to 2400 Nm. Remote Sensing of Environment, 91(3–4), 304–319. No potential conflict of interest was reported by the authors. Jensen, J. (2009). Remote sensing of the environment: An earth resource perspective (2th ed.). New Delhi, INDIA: Pearson Education India. ORCID Jensen, J. R., & Lulla, K. (1987). Introductory digital image processing: A remote sensing perspective. Milton Park, UK, Letícia Sartorio http://orcid.org/0000-0001-6936-9939 Taylor & Francis. Daniel Zanotta http://orcid.org/0000-0003-2959-6525 Jiang,L.,Wang,W., Yang, X.,Xie,N.,&Cheng,Y.(2010). Classification methods of remote sensing image based on deci- sion tree technologies, Paper presented at the International References Conference on Computer and Computing Technologies in Agriculture (353–358), Nanchang, China. Aguirre-Gutiérrez, J., Seijmonsbergen, A. C., & Lillesand, T., Kiefer, R. W., & Chipman, J. 
(2014). Remote Duivenvoorden, J. F. (2012). Optimizing land cover classifi- sensing and image interpretation. Hoboken, NJ: John cation accuracy for change detection, a combined Wiley & Sons. pixel-based and object-based approach in a mountainous Lu, D., Hetrick, S., & Moran, E. (2010). Land cover classifi- area in Mexico. Applied Geography, 34,29–37. cation in a complex urban-rural landscape with Bhaskaran, S., Paramananda, S., & Ramnarayan, M. (2010). Quickbird imagery. Photogrammetric Engineering\& Per-pixel and object-oriented classification methods for Remote Sensing, 76(10), 1159–1168. doi:0099-1112/ mapping urban features using Ikonos satellite Data. 10/7610–1159. Applied Geography, 30(4), 650–665. Ma, L., Li, M., Ma, X., Cheng, L., Du, P., & Liu, Y. (2017). Billingsley, P. (1995). Probability and measure. Wiley Series A review of supervised object-based land-cover image in Probability and Mathematical Statistics, Hoboken, NJ. 226 L. SARTORIO AND D. ZANOTTA classification. ISPRS Journal of Photogrammetry and Zanotta, D. C., Haertel, V., Shimabukuro, Y. E., & Remote Sensing, 130, 277–293. Renno, C. D. (2014). Linear spectral mixing model for Mather, P., & Tso, B. (2016). Classification methods for identifying potential missing endmembers in spectral remotely sensed data. Boca Raton, Florida: CRC press. mixture analysis. IEEE Transactions on Geoscience and Moosavi, V., Talebi, A., & Shirmohammadi, B. (2014). Remote Sensing, 52(5), 3005–3012. Producing a landslide inventory map using pixel-based Zhao, W.,Du,S.,&Emery, W. J.(2017). Object-based con- and object-oriented approaches optimized by Taguchi volutional neural network for high-resolution imagery method. Geomorphology, 204, 646–656. classification. IEEE Journal of Selected Topics in Applied Myint, S., Gober, P., Brazel, A., Grossman-Clarke, S., & Earth Observations and Remote Sensing, 10(7), 3386–3396. Weng,Q.(2011). Per-pixel Vs. 
object-based classification of Zhong, Y., Fei, F., Liu, Y., Zhao, B., Jiao, H., & Zhang, L. urban land cover extraction using high spatial resolution (2017). Satcnn: Satellite image dataset classification using imagery. Remote Sensing of Environment, 115(5), 1145–1161. Agile convolutional neural networks. Remote Sensing Weih, R. C., & Riggan, N. D. (2010). Object-based classifica- Letters, 8(2), 136–145. tion Vs. Pixel-based classification: Comparative impor- Zhong,Y.,Ma,A.,Ong,Y.,Zhu,Z., &Zhang,L. tance of multi-resolution imagery. The International (2017). Computational intelligence in optical remote Archives of the Photogrammetry, Remote Sensing and sensing image processing. Applied Soft Computing, Spatial Information Sciences, 38(4). 64,75–93. Whiteside, T. G., Boggs, G. S., & Maier, S. W. (2011). Zhou, W., Troy, A., & Grove, M. (2008). Object-based land Comparing object-based and pixel-based classifications cover classification and change analysis in the Baltimore for mapping savannas. International Journal of Applied metropolitan area using multitemporal high resolution Earth Observation and Geoinformation, 13(6), 884–893. remote sensing data. Sensors, 8(3), 1613–1636. http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Geology Ecology and Landscapes Taylor & Francis

An adaptive approach for simultaneous classification of remote sensing scenes including rural and urban targets


Publisher
Taylor & Francis
Copyright
© 2019 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group on behalf of the International Water, Air & Soil Conservation Society(INWASCON).
ISSN
2474-9508
DOI
10.1080/24749508.2019.1706833

Abstract

GEOLOGY, ECOLOGY, AND LANDSCAPES 2021, VOL. 5, NO. 3, 217–226
INWASCON
https://doi.org/10.1080/24749508.2019.1706833

RESEARCH ARTICLE

An adaptive approach for simultaneous classification of remote sensing scenes including rural and urban targets

Letícia Sartorio (a) and Daniel Zanotta (b)
(a) Center for human and information sciences, University of Rio Grande, Rio Grande, Brazil; (b) Advanced Visualization & Geoinformatics Laboratory, UNISINOS University, São Leopoldo, Brazil

ARTICLE HISTORY: Received 2 October 2019; Accepted 12 December 2019
KEYWORDS: Classification; remote sensing; image segmentation; pattern recognition; landscape

ABSTRACT
In this paper, an automatic adaptive image classification framework designed to operate in multiresolution scenes including rural and urban targets is proposed and tested. Traditional image analysis commonly classifies images using a single strategy and source of data over the entire scene. Ideally, urban targets should be handled by specialized classification systems using high spatial resolution images, such as object-based image analysis and non-parametric classifiers. Conversely, rural targets should be handled with high spectral resolution data, pixel-based classification approaches, and parametric techniques. The formulation proposed in this study starts by performing a prior separation of rural and urban areas based on the Central Limit Theorem (CLT). Then, both kinds of targets are labelled in an automatic adaptive fashion, each one with the proper data and method previously selected. One experiment was performed using a data set composed of a high spatial resolution true-colour image and a multispectral image, as well as preselected classification techniques particularly adjusted for each case. Visual and quantitative assessment with two accuracy metrics, comparing the proposed approach with traditional classification, confirms the soundness of the proposed framework.
CONTACT Daniel Zanotta dzanotta@edu.unisinos.br UNISINOS University, São Leopoldo, Brazil
© 2019 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group on behalf of the International Water, Air & Soil Conservation Society (INWASCON). This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Introduction

The increasing development of sophisticated remote sensing instruments has conferred considerable improvements in the quality of images acquired from space (Gholoobi & Kumar, 2015). Public administration has been facing a growing dependence on rapid and reliable monitoring of an increasingly dynamic and complex scenario. High spatial resolution imaging sensors onboard satellites are one of the examples which have allowed detailed land use and land cover mapping at a high efficiency and relatively low cost (Fisher, Eileen, James Dennedy-Frank, Kroeger, & Boucher, 2017), mainly over urban areas and other complex environments. Missions that brought advances in this direction are, in chronological order, IKONOS, QuickBird, RapidEye, GeoEye, WorldView (Chuvieco, 2016), as well as aircraft and the recent unmanned aerial vehicle (UAV) images. Complex ground targets are common in this type of high spatial resolution image but can be adequately classified by modern computational techniques like Support Vector Machines (SVM), Random Forests (RF), and, more recently, Convolutional Neural Networks (CNN) (Jensen, 2009). Indeed, these are the most common image classification techniques recently found in the literature.

Automatic image classification can be performed at the pixel level (pixel-based), where each pixel is analyzed individually for labelling, or at the object level (object-based), where a set of pixels is previously merged and receives a single label (Moosavi, Talebi, & Shirmohammadi, 2014). The pixel-based classification is restricted to using only the spectral information of pixels as the unique attribute, not considering any other aspect in the process (Weih & Riggan, 2010). For many applications, this approach is able to retrieve a thematic map showing the elements of interest throughout the image with reasonable precision. However, in more complex applications, e.g. those involving small structures or elaborate shapes, like urban areas, or data including detailed targets like some agricultural areas or lithological mapping, the object-based classification is more suitable (Zhou, Troy, & Grove, 2008). The reason is that the object-based approach is performed in two basic steps: image segmentation, which aims to group similar pixels into objects, and classification, which aims to label the resulting objects (Whiteside, Boggs, & Maier, 2011). Working with objects allows the analyst to explore not only radiometric information, as in the pixel-based approach, but also attributes like texture, shape, size, and context, improving the classification process (Duro, Franklin, & Dublé, 2012). Indeed, the object-based approach is able to take advantage of surrounding and circumstantial characteristics like roughness, neighborhood, size, and morphology of the resulting objects.

For the above-mentioned reasons, the latest consensus of the specialized literature indicates that urban targets (buildings, roads, trees, small waterbodies) present in high spatial resolution images should be classified by objects (Bhaskaran, Paramananda, & Ramnarayan, 2010; Ma et al., 2017), since detailed information about shape, texture, and context provides very important attributes to differentiate among these targets (Mather & Tso, 2016). The object-based approach, along with a hierarchical and nonparametric classifier (no assumption of probability distribution of classes), is expected to be more effective in urban scenes. Conversely, rural targets (fields, forests, medium and large waterbodies, rocks, and varied soils) are better classified at pixel level, with images including as many spectral bands as possible (Aguirre-Gutiérrez, Seijmonsbergen, & Duivenvoorden, 2012; Ferreira, Zortea, Zanotta, Shimabukuro, & Souza Filho, 2016). Rural or natural targets present large areas and subtle radiometric variations along distances, which allows correct description even by very low spatial resolution images. For this reason, sensors designed to monitor these areas usually have many spectral bands (Fisher et al., 2017), enabling detailed description of the chemical composition of targets, which is crucial for lithological or vegetation mapping (Herold, Roberts, Gardner, & Dennison, 2004; Lillesand, Kiefer, & Chipman, 2014). Furthermore, for rural targets, parametric classifiers (which assume a well-known probability distribution of classes) might estimate the classes with greater efficiency, since they are designed to work with only radiometric information, which is very abundant in multispectral images of medium to low spatial resolution (Whiteside et al., 2011).

There are several strategies for improving and refining the classification of high spatial resolution and hyperspectral images (Zhao, Du, & Emery, 2017; Zhong et al., 2017; Zhong, Ma, Ong, Zhu, & Zhang, 2017). Despite the recent improvements, the classification of scenes simultaneously including urban and rural targets remains a challenge for classifiers traditionally used for remote sensing image recognition. This condition is very common and brings many challenges when mapping heterogeneous areas. Analysts usually rely on the time-consuming and labor-intensive prior separation of the different targets by visual interpretation, followed by independent classification of each area. An alternative is the selection of a unique method retrieving the best trade-off using the high spatial resolution image available. However, the use of only one type of image data and classification strategy in these mixed scenarios hinders the optimization of the accuracy of the results.

The present study suggests an automatic adaptive framework to address the classification of scenes simultaneously including rural and urban classes. It assumes the use of two different images covering the same area: one true colour (RGB) image with low spectral/high spatial resolution and another with high spectral/low spatial resolution. The proposed technique, which will be thoroughly described in what follows, is based on the automatic prior separation of urban and rural targets through the well-known Central Limit Theorem (CLT). This prior identification relies on the fact that most rural or natural elements (i.e., fields, forests, soils, rocks) present classes with normal (Gaussian) probability density function, whereas urban classes (i.e., buildings, roads, and other city structures) generally present irregular probability density functions that are hard to describe (Billingsley, 1995). Once the two primary targets are parameterized, a maximum likelihood classifier can be used to identify urban and rural pixels in the low spatial resolution image available. Then, in an adaptive fashion, our strategy assigns pre-identified urban areas to classification using high spatial resolution image data associated with object-based analysis, whereas rural areas are assigned to low spatial resolution images associated with pixel-level analysis. The combination of two simultaneous approaches and images of different spatial and spectral resolutions aims at producing a robust tool for optimizing classification of scenes covering complex and heterogeneous environments, for two main reasons: (1) consideration of multi-source data, attributes, and classification strategy for separate areas, and (2) limiting the number of classes for each classification problem, reducing overlap among classes.

Materials and methods

First stage: prior identification of urban and rural areas

The hybrid classification proposed here assumes the existence of two images covering the same region: a low spatial resolution multispectral image, which will be used to classify the rural part of the scene, and a high spatial resolution image (not necessarily multispectral) to classify the urban portion of the scene. For rural areas, it is very convenient to have a multispectral image, with as many channels as needed for a correct characterization of the natural targets involved (Fisher et al., 2017). This assumption relies on the fact that the chemical composition is very important to characterize the subtle differences among spectral signatures. At the same time, for the urban area, the mandatory image parameter is the pixel size, not the number of channels (Myint, Gober, Brazel, Grossman-Clarke, & Weng, 2011). For example, for buildings and roads, the shape and texture of objects are crucial information for recognizing them.

We assume images with fine spatial registration (geometrical alignment) and radiometrically corrected. The first stage is the core of the proposed technique. It uses the low spatial resolution image to separate the hereafter called primary classes: urban $\omega_U$ and rural $\omega_{Rj}$. It is important to stress that the aim at this stage is not to find the definitive classes of targets, but only to identify the nature of targets on the scene, whether urban or rural. We understand as rural those classes representing natural elements like waterbodies, fields, rocks, bare soils, forests, crops, etc. Therefore, even in the early stages, rural class $\omega_{Rj}$ can assume more than one subclass $j$.

We employ the Central Limit Theorem (CLT) to perform the task of separating $\omega_U$ and $\omega_{Rj}$ by considering these two primary classes as two different populations. CLT states that, given a set of sufficiently large samples collected from a population with a finite level of variance, the mean of all samples from the same population will be approximately equal to the mean of the population (Zhong et al., 2017). Furthermore, the set of sample means will approximate a Normal/Gaussian distribution pattern, with variance approximately equal to the variance of the whole population divided by each sample's size $n$. Following this theorem, images with sufficiently large pixel sizes (low spatial resolution data) including contributions from many targets together are expected to have classes presenting Normal/Gaussian probability density distributions, which allow us to determine, in a prior fashion, the main nature of the pixels: whether urban or rural.

To better understand the proposed method and the adequacy of CLT to the related problem, let $x_1, x_2, \dots, x_n$ be a randomly selected sample of size $n$, a sequence of independent and identically distributed random variables drawn from distributions of expected values given by $\mu$ and finite variances given by $\sigma^2$. Consider we are interested in the sample average $\bar{X}_n$ of these random variables:

$$\bar{X}_n = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i \qquad (1)$$

The theorem implies that sample averages converge to the expected value $\mu$ as $n \to \infty$ and, for large enough size $n$, the distribution of $\bar{X}_n$ is close to the Normal distribution with mean $\mu$ and variance $\sigma^2/n$. As $n$ approaches infinity, the random variables $\sqrt{n}(\bar{X}_n - \mu)$ converge in distribution to a normal distribution with zero mean, $N(0, \sigma^2)$:

$$\sqrt{n}\left(\frac{1}{n}\sum_{i=1}^{n} x_i - \mu\right) \xrightarrow{d} N(0, \sigma^2) \qquad (2)$$

Thus, spectral generalization caused by pixels with large sizes is important and contributes to the proper operation of the proposed framework. To adapt the CLT for the real problem approached here, we transfer the concept of one-dimensional sample average $\bar{X}_n$ to the multispectral response $r$ of each pixel over the low spatial resolution image. We assume each pixel's response $r_i$ as a linear combination of the targets included inside it (Blaschke, Lang, & Hay, 2008), where $\{p_1, p_2, \dots, p_m\}$ are the proportions occupied by $m$ targets. Then, the spectral response $r_k$ for each channel $k$ can be depicted as:

$$r_k = r_{k1} p_1 + r_{k2} p_2 + \dots + r_{km} p_m \qquad (3)$$

where $r_k$ is the spectral response of the pixel in the $k$th spectral band and $r_{km}$ is the pure spectral response of each target present in the pixel in the $k$th spectral band.

The above demonstration shows the multi-source nature of large pixels in images including varied targets. Exploratory experiments have indeed supported our initial assumption regarding the expected Normal distributions of primary classes included in low spatial resolution images. Since the aim of the proposed method is not to estimate the proportions of the pixel's compounds (unmixing), the number and nature of targets in each pixel along the image can vary without causing any loss of validity.

The proposed method proceeds by collecting samples of pixels corresponding to the primary classes directly over the low spatial resolution image (i.e., urban $\omega_U$ and as many rural classes $\omega_{Rj}$ as exist). The statistical information with normal distribution assumed for these classes (e.g., mean vector $\mu$ and covariance matrix $\Sigma$) can then be derived and used to feed parametric classification rules. As stated before, data presenting this behaviour show a high level of differentiation by probabilistic classifiers, such as maximum likelihood, which can be expressed by the following membership function (Eq. 4), derived from Bayes theorem:

$$\Phi_c(x_i) = P(x_i \mid \omega_c) = \frac{1}{(2\pi)^{d/2}\,|\Sigma_c|^{1/2}} \exp\left[-\frac{1}{2}(x_i - \mu_c)^{T}\,\Sigma_c^{-1}\,(x_i - \mu_c)\right] \qquad (4)$$

where $\Phi_c(x_i)$ is the probability density function of a pixel $x_i$ belonging to class $\omega_c$, which can initially assume urban ($\omega_U$) or rural ($\omega_{Rj}$), $d$ is the dimensionality of the data, $x_i$ is the spectral response of pixel $i$, $\mu_c$ is the mean vector and $\Sigma_c$ the covariance matrix of class $\omega_c$. These samples are then used to train the supervised classifier, which will later be used to determine the primary classes over the entire scene studied. The expected result is a mask separating rural and urban zones, able to direct what kind of classifier is applied in each area. The suggested mask can be built from the following rule:

$$M(i) \in \omega_c \quad \text{if} \quad \Phi_c(x_i) = \max\{\Phi(x_i)\}, \quad \forall c \in \{U, Rj\} \qquad (5)$$

where $M(i)$ corresponds to the position of the pixel $x_i$ in the mask $M$ and $\Phi(x_i)$ is the membership vector containing the $\Phi_c(x_i)$ values for each class. The urban ($\omega_U$) and rural ($\omega_{Rj}$) classes proceed to the second stage of the methodology. At this point, the user can consider performing a morphologic dilation of a few pixels along the urban area perimeter to guarantee that urban targets are enclosed. The rationale behind this procedure is the assumption that it is preferable to include rural classes inside the urban area by mistake than to leave urban elements outside it, where they would be wrongly classified as rural.

Second stage: adaptive classification system

As the literature suggests, due to the high-frequency spectral behavior verified in urban areas, these sites have shown better classification results when classified by specialized techniques (Zanotta, Haertel, Shimabukuro, & Renno, 2014), which are able to take into account many parameters and specificities of targets (Lu, Hetrick, & Moran, 2010). At the same time, in order to handle many kinds of information simultaneously and to avoid multi-labelling of unique objects formed by groups of pixels, the most recent studies have suggested using object-based approaches for urban classification (Blaschke et al., 2008). Conversely, rural environments are more appropriately classified using detailed multispectral information, instead of data about the shape, texture, or spatial context of targets. Therefore, the larger the number of available spectral channels, the better the recognition of the target. Many kinds of land cover classes like vegetation, rocks, and soils present very similar characteristics, which are often differentiated only by detailed inspection of spectral signatures (Dinis et al., 2010).

The portion of the high spatial resolution image recognized in $M$ as the urban area is then segmented and directed to the complementary classification step. The segmentation process aggregates similar neighboring pixels to produce objects with improved attributes (Jensen & Lulla, 1987). For the sake of simplicity, we chose the widely used region growing segmentation technique available in many image processing packages. Region growing starts by merging individual pixels using spectral similarity criteria, producing objects which can be more adequately classified by using not only their spectral attributes, but also texture, shape, and spatial context features (Blaschke, Kelly, & Merschdorf, 2015). Resulting objects are then classified using one of the techniques suitable for urban environments. The most popular approaches for this type of application are those which can handle many classes at the same time, while avoiding overfitting and making optimized usage of the available attributes. One modern example is the hierarchical Random Forests (RF) (Jiang, Wang, Yang, Xie, & Cheng, 2010), since it can manage different attributes according to the importance of each one to the specific problem addressed. Traditional decision tree classifiers tend to learn highly irregular patterns, which frequently causes overfitting of the training samples. RF is an alternative that avoids overfitting by averaging multiple deep decision trees trained on different parts of the same training dataset (Hastie, Tibshirani, & Friedman, 2009).

The area in $M$ recognized as rural can keep the original classes received at the primary stage, or can be classified again using the low spatial resolution image by a pixel-based approach and a generalist classification technique, such as Linear or Quadratic Discriminant Analysis (LDA, QDA) or Maximum Likelihood. The generalist classification technique to operate on the low spatial resolution image is defined as $G$, whereas the classification technique chosen to operate on the segmented high spatial resolution image is defined as $H$. The adaptive classification of the entire scene proceeds obeying the following rule:

$$x_i \Rightarrow H \ \text{if} \ M(i) \in \omega_U; \qquad x_i \Rightarrow G \ \text{if} \ M(i) \in \omega_{Rj} \qquad (6)$$

where one pixel $x_i$ is expected to be classified using the high spatial resolution image and technique $H$ only if it is recognized as urban area in the first stage ($M(i) \in \omega_U$). Conversely, if the pixel $x_i$ is recognized as rural in the first stage ($M(i) \in \omega_{Rj}$), then this element is expected to be classified using the low spatial resolution data and technique $G$. A flowchart of the proposed technique is presented in Figure 1.

Figure 1. Flowchart of the proposed approach.

The resulting classification map is expected to present an improvement over traditional methods that apply a single rule throughout the scene, ignoring the fact that it contains targets of distinct natures. As said before, the core of the proposed technique is to automatically exploit the advantages of each source of data and the potential of proper classification tools for every specific environment. Furthermore, the reduction in the number of classes available for each classification system is expected to avoid overlaps and confusion among classes, improving overall classification results.

Results

Data description

In order to test and exemplify the performance of the methodology suggested in this study, we performed one experiment with an area located in Cape Town, Western Cape Province, South Africa. The images are geometrically and radiometrically/atmospherically corrected. The low spatial resolution data is a Landsat 8-OLI image acquired on 3 September 2013 (Figure 2(a)). The image has 30 m spatial resolution for the spectral bands used in the experiment (1–7). The high spatial resolution image came from GeoEye-1, acquired on 31 July 2013, with 1.65 m spatial resolution (Figure 2(b)). Two images with different resolutions covering the same area were used: one low spatial resolution multispectral image covering the whole study area, and one high spatial resolution image covering at least the urban spots.

Figure 2. (a): true colour composition (4 3 2) of a Landsat 8-OLI subset covering Cape Town/South Africa, (b): the same subset imaged by GeoEye-1 in 3 2 1 true colour composition, (c): Ground truth made with information from both images simultaneously (ground truth second stage). The colour palette refers to the ground truth image. Black areas refer to very ambiguous or excessively mixed areas and were not considered in the accuracy assessment.

Validation dataset

Reliable ground truth was prepared by expert visual interpretation to allow accuracy assessment, which was made in two steps: first, based only on the low spatial resolution image to test prior identification of primary targets by CLT (rural targets and generic urban), and second, based on the low and high spatial resolution images simultaneously to assess the final classification result (rural and urban targets in fine detail). It is important to stress that the second ground truth was built by drawing only urban subclasses (roofs, roads, trees, soils, etc.) directly over the high-resolution image, while avoiding areas considered as rural according to the first ground truth data. Then, this second ground truth, related only to the detailed urban area, was merged with the first (only rural areas) in order to produce the absolute ground truth, which was finally used to assess the performance of the entire classification. Ambiguous areas (black areas) were not labelled and consequently disregarded during the accuracy assessment.

Experiment with Landsat 8-OLI combined with GeoEye-1

The selected area includes rocks surrounded by vegetation and some scattered portions of urban area. Primary samples of forest, field, rocks, and urban areas were collected directly on the image. Based on the CLT (Zhong et al., 2017), the image received the primary classification by maximum likelihood, considering Normal statistical distribution of classes, to separate the primary targets selected on the scene using OLI-Landsat 8. The initial supposition of Normal distribution of primary classes was confirmed by analyzing the histograms of Figure 3, which also overplots the estimated probability density functions (dotted lines). For this experiment, bands 4 and 7 were sufficient to separate all four primary classes.

The resulting mask $M(i)$ separating the primary classes is shown in Figure 4(a). As can be seen, considering the trade-off mentioned at the end of section 2, we have performed a morphologic dilation of a few pixels along the perimeter of the urban area to expand it (rounded features at the edges of the urban area).

Proceeding to the second stage of the method (adaptive classification), the high-resolution GeoEye-1 image was segmented by a basic region growing technique only over the identified urban regions shown in Figure 4(b). It is important to stress that this initial map aimed to test only the ability to separate between rural (forest, grass, and rocks) and urban area in terms of overall and average accuracies. This result retrieved an overall accuracy of 82.9% and an average accuracy of 84.0%. Qualitatively, it is also possible to notice spatial correspondence between Figure 4(a,b). Most importantly, the pre-identification of the urban area resulted in a high classification confidence, which was greatly aided by the post-classification morphological dilation process. It is worth noting that, to avoid further classification errors from the first to the second stage, it is preferable for the first stage to overestimate rather than underestimate the urban area. The final classification map was then obtained by refining the classification of the urban area through the high-resolution image (GeoEye-1). The end result was validated using the full ground truth presented in Figure 2(c).
The performance of the proposed metho- (grey colour in Figure 4(a)), and then classified using dology was compared with the traditional classifica- RF, the technique selected in this experiment to oper- tion, i.e., using only a single image and one ate on the high-resolution image. The areas identified classification technique. To allow comparison, we as rural (forest, grass, and rocks) were kept with the selected the same RF classification technique used to class resulting from the first stage. test the proposed technique, as well as an identical set Therefore, representative samples of roofs (clay, con- of class samples. The results obtained in terms of crete, and fibrocement), forests and paved roads were overall and average accuracies are presented in Table collected over the high-resolution GeoEye-1 image and 1. We see that for overall accuracy the scores achieved used for training purposes. The RF classifier was trained for the proposed method were significantly higher by using the C4.5 algorithm (Jensen & Lulla, 1987)with than those found by the traditional approach. Tables the available spectral image features. Finally, the classi- 2 and 3 present the confusion matrices computed for fication maps were merged to produce one final image both tested techniques. Figure 5 shows details in the for each classifier RF (Figure 4(c)). urban area of the classified images compared to the same parts in the Geoeye image. We can also visually verify that the changes pro- Analysis vided by the proposed approach caused punctual increases in the classification performance over the The classification map resulting from the first stage entire image and for all classes. However, due to the (Figure 4(a)) was validated using the reference data large difference between the number of pixels for each built by vectorization from the Landsat 8-OLI image, (a) (b) Figure 3. Primary classes histograms and corresponding probability density functions (dotted lines) for channels 4 and 7. 
The result shows urban class can be adequately separated in the first stage from the remaining rural classes by means of both combined channels. GEOLOGY, ECOLOGY, AND LANDSCAPES 223 (a) (b) (c) (d) Figure 4. (a): Primary classification mask resulting from the first stage, (b): Reference map for primary classes based on the low spatial resolution image by visual interpretation (ground truth first stage), (c): Final map using proposed adaptive classification and RF method for urban area, (d): Traditional object-based classification using only the Geoeye-1 image and RF method. class, the most important measure in this scenario is Table 1. Average and overall classification accuracies (%) measured using RF for the traditional classification paradigm the average accuracy, which was also higher for the and the proposed simultaneous methodology. proposed method. We can also see through the maps Method RF Traditional RF Simultaneous of Figure 4 that much of this result was due to the Overall 56.8 81.0 generalization caused by the classification at the first Average 54.1 67.7 stage, when providing fine separation between forest Table 2. Confusion matrix regarding experiment applying the proposed hybrid method. Reference Classified Grass Forest Soil Asphalt Clay roof Concrete roof Fibrocement roof ∑ (row total) Grass 46,141 3092 2884 787 2 15 0 52,921 Forest 14,504 58,435 204 3262 103 403 87 76,998 Soil 2568 129 27,516 578 0 0 0 30,791 Asphalt 1005 2789 0 557 44 367 138 4900 Clay roof 0 4 0 0 462 0 0 466 Concrete roof 3 134 0 91 280 2551 126 3185 Fibrocement roof 0 6 0 0 0 278 103 387 ∑ (column total) 64,221 64,589 30,604 5275 891 3614 454 169,648 224 L. SARTORIO AND D. ZANOTTA Table 3. Confusion matrix regarding experiment applying the traditional classification method (object-based). 
Reference Classified Grass Forest Soil Asphalt Clay roof Concrete roof Fibrocement roof ∑ (row total) Grass 4037 1058 381 441 8 22 0 5947 Forest 38,938 59,350 2346 1548 60 282 33 102,557 Soil 20,180 2295 27,829 480 134 231 74 51,223 Asphalt 786 362 11 2096 27 206 79 3567 Clay roof 0 76 0 21 480 2 0 579 Concrete roof 236 1363 37 682 180 2439 85 5022 Fibrocement roof 44 85 0 7 2 432 183 753 ∑ (column total) 64,221 64,589 30,604 5275 891 3614 454 169,648 and grass, which was not that efficient using image and quantitative results, especially in regions where data with limited spectral range (traditional pixel-based classifiers tend to fail: urban areas or areas approach). with high radiometric variability and areas where the object-based approach is more suitable. Conversely, rural areas including fields, soils, minerals, rocks, and Discussion trees could be correctly classified due to the sufficient We can certainly expect that using only the low number of spectral bands available in the low spatial spatial resolution image to classify the entire area, resolution image. the results over the urban area would be very poor Using the predictions of CLT, the elements corre- and inaccurate due to the presence of small parcels sponding to primary targets could be effectively sepa- and many kinds of materials in the cities. Conversely, rated using the probabilistic Gaussian maximum by using only the high spatial resolution image over likelihood classifier. The union of partially produced the entire region would certainly prevent the rural maps for each environment achieved an optimized area to be optimally classified. This is due because result, matching the advantages of both methods in rural/natural areas do need as many spectral bands as a single scene containing heterogeneous targets. 
As it is possible to obtain the best representation of the canbeseenin Figure 5, the urban area present spectral signature of the targets, which is the key to a similar classification provided by the traditional achieve the best classification result. approach. However, the absence of rural classes in The comparison of results produced by the pro- the problem could allow classification with some posed method with results generated using only the improvements, not showing rural targets in these object-based classification by RF has shown that the areas. On the other hand, the rural area, previously proposed approach presented encouraging qualitative (a) (b) (c) Figure 5. (a): GeoEye-1 in 3 2 1 true colour composition, (b) Final map using proposed adaptive classification and RF for urban area, (c): Traditional object-based classification using only the Geoeye-1 image and RF. All figures have details depicting parts of interest located on same areas. GEOLOGY, ECOLOGY, AND LANDSCAPES 225 classified by using the multispectral image and para- Blaschke, T., Kelly, M., & Merschdorf, H. (2015). Object- based image analysis: Evolution, history, state of the art, metric (maximum likelihood classifier) was more and future vision. In P. Thenkabail (Ed.), In remotely stable, showing less noisy like pixels, when compared sensed data characterization, classification, and accuracies to the traditional object-based approach. (pp. 277–294). Boca Raton, Florida: CRC Press. Maybe one of the major drawbacks of the proposed Blaschke, T., Lang, S., & Hay, G. (2008). Object-based image method is the first stage, when prior separation between analysis: Spatial concepts for knowledge-driven remote sensing applications. Springer Science & Business Media, rural and urban targets is proceeded. This crucial point Heidelberg, Berlin. can affect greatly affect the final results, once some urban Chuvieco, E. (2016). Fundamentals of satellite remote sen- areas can be wrongly confused with rural. 
This issue has sing: An environmental approach. Boca Raton, Florida: encouraged us to investigate alternative procedures to CRC Press. find the urban region with even more precision. Dinis, J., Navarro, A., Soares, F., Santos, T., Freire, S., Nighttime images as auxiliary data is one of the possibi- Fonseca, A., . . . Tenedó, J. (2010). Hierarchical object-based classification of dense urban areas by integrat- lities. Nighttime data register artificial light coming from ing high spatial resolution satellite images and Lidar eleva- the surface of the Earth, potentially indicating areas tion data. The International Archives of the Photogrammetry, covered by urban spots. Other possibilities are fixed Remote Sensing and Spatial Information Sciences, 38(4), 6. maps and Synthetic Aperture Radar (SAR) images. Duro, D. C., Franklin, S. E., & Dublé, M. G. (2012). The proposed method is a practical way for map- A comparison of pixel-based and object-based image analysis with selected machine learning algorithms for ping heterogeneous areas reaching soundness classifi- the classification of agricultural landscapes using spot-5 cation results, overcoming sensor limitations Hrg imagery. Remote Sensing of Environment, 118, regarding spatial and spectral resolution. The main 259–272. advantages achieved by the proposed method include: Ferreira, M. P., Zortea, M., Zanotta, D. C., Shimabukuro, Y. E., & Souza Filho, C. R. (2016). (1) The optimization of the classification process Mapping tree species in tropical seasonal semi-deciduous forests with hyperspectral and multispec- by automatically selecting the most appropriate tral data. Remote Sensing of Environment, 179,66–78. base image and classification technique to dif- Fisher, J. R. B., Eileen, A., James Dennedy-Frank, P., ferent areas in the same problem (high spatial Kroeger, T., & Boucher, T. M. (2017). 
Impact of satellite resolution for urban and high-spectral resolu- imagery spatial resolution on land use classification accu- tion for rural). racy and modeled water quality. Remote Sensing in Ecology and Conservation. doi:10.1002/rse2.61 (2) The independence of available classes for dif- Gholoobi, M., & Kumar, L. (2015). Using object-based hier- ferent primary targets when the classification archical classification to extract land use land cover process is made in separate. As the rural areas classes from high-resolution satellite imagery in do not present the diversity of classes found in a complex urban area. Journal of Applied Remote urban sites, the classification tends to produce Sensing, 9(1), 096052—096052. more consistent results, since each problem is Hastie, T., Tibshirani, R., & Friedman, J. (2009). The ele- ments of statistical learning. New York, NY: Springer. posed with a separate set of samples. Herold, M., Roberts, D. A., Gardner, M. E., & Dennison, P. E. (2004). Spectrometry for urban area remote sensing—Development and analysis of a spectral Disclosure statement library from 350 to 2400 Nm. Remote Sensing of Environment, 91(3–4), 304–319. No potential conflict of interest was reported by the authors. Jensen, J. (2009). Remote sensing of the environment: An earth resource perspective (2th ed.). New Delhi, INDIA: Pearson Education India. ORCID Jensen, J. R., & Lulla, K. (1987). Introductory digital image processing: A remote sensing perspective. Milton Park, UK, Letícia Sartorio http://orcid.org/0000-0001-6936-9939 Taylor & Francis. Daniel Zanotta http://orcid.org/0000-0003-2959-6525 Jiang,L.,Wang,W., Yang, X.,Xie,N.,&Cheng,Y.(2010). Classification methods of remote sensing image based on deci- sion tree technologies, Paper presented at the International References Conference on Computer and Computing Technologies in Agriculture (353–358), Nanchang, China. Aguirre-Gutiérrez, J., Seijmonsbergen, A. C., & Lillesand, T., Kiefer, R. W., & Chipman, J. 
(2014). Remote Duivenvoorden, J. F. (2012). Optimizing land cover classifi- sensing and image interpretation. Hoboken, NJ: John cation accuracy for change detection, a combined Wiley & Sons. pixel-based and object-based approach in a mountainous Lu, D., Hetrick, S., & Moran, E. (2010). Land cover classifi- area in Mexico. Applied Geography, 34,29–37. cation in a complex urban-rural landscape with Bhaskaran, S., Paramananda, S., & Ramnarayan, M. (2010). Quickbird imagery. Photogrammetric Engineering\& Per-pixel and object-oriented classification methods for Remote Sensing, 76(10), 1159–1168. doi:0099-1112/ mapping urban features using Ikonos satellite Data. 10/7610–1159. Applied Geography, 30(4), 650–665. Ma, L., Li, M., Ma, X., Cheng, L., Du, P., & Liu, Y. (2017). Billingsley, P. (1995). Probability and measure. Wiley Series A review of supervised object-based land-cover image in Probability and Mathematical Statistics, Hoboken, NJ. 226 L. SARTORIO AND D. ZANOTTA classification. ISPRS Journal of Photogrammetry and Zanotta, D. C., Haertel, V., Shimabukuro, Y. E., & Remote Sensing, 130, 277–293. Renno, C. D. (2014). Linear spectral mixing model for Mather, P., & Tso, B. (2016). Classification methods for identifying potential missing endmembers in spectral remotely sensed data. Boca Raton, Florida: CRC press. mixture analysis. IEEE Transactions on Geoscience and Moosavi, V., Talebi, A., & Shirmohammadi, B. (2014). Remote Sensing, 52(5), 3005–3012. Producing a landslide inventory map using pixel-based Zhao, W.,Du,S.,&Emery, W. J.(2017). Object-based con- and object-oriented approaches optimized by Taguchi volutional neural network for high-resolution imagery method. Geomorphology, 204, 646–656. classification. IEEE Journal of Selected Topics in Applied Myint, S., Gober, P., Brazel, A., Grossman-Clarke, S., & Earth Observations and Remote Sensing, 10(7), 3386–3396. Weng,Q.(2011). Per-pixel Vs. 
object-based classification of Zhong, Y., Fei, F., Liu, Y., Zhao, B., Jiao, H., & Zhang, L. urban land cover extraction using high spatial resolution (2017). Satcnn: Satellite image dataset classification using imagery. Remote Sensing of Environment, 115(5), 1145–1161. Agile convolutional neural networks. Remote Sensing Weih, R. C., & Riggan, N. D. (2010). Object-based classifica- Letters, 8(2), 136–145. tion Vs. Pixel-based classification: Comparative impor- Zhong,Y.,Ma,A.,Ong,Y.,Zhu,Z., &Zhang,L. tance of multi-resolution imagery. The International (2017). Computational intelligence in optical remote Archives of the Photogrammetry, Remote Sensing and sensing image processing. Applied Soft Computing, Spatial Information Sciences, 38(4). 64,75–93. Whiteside, T. G., Boggs, G. S., & Maier, S. W. (2011). Zhou, W., Troy, A., & Grove, M. (2008). Object-based land Comparing object-based and pixel-based classifications cover classification and change analysis in the Baltimore for mapping savannas. International Journal of Applied metropolitan area using multitemporal high resolution Earth Observation and Geoinformation, 13(6), 884–893. remote sensing data. Sensors, 8(3), 1613–1636.
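As a supplementary note to the Analysis section, the accuracy measures discussed there can be derived from a confusion matrix as in the minimal Python sketch below. It uses the counts of Table 3 (traditional object-based classification), with rows as classified labels and columns as reference labels. The function names are illustrative and not part of the original study. The text does not state how "average accuracy" was computed; the sketch assumes the mean of per-class accuracies relative to the reference (column) totals, so that value should not be expected to match Table 1. The overall accuracy, in contrast, evaluates to 56.8%, in agreement with the value reported for the traditional method.

```python
# Confusion matrix from Table 3 (traditional object-based RF classification).
# Rows are classified labels, columns are reference labels, in the order:
# grass, forest, soil, asphalt, clay roof, concrete roof, fibrocement roof.
TABLE_3 = [
    [4037, 1058, 381, 441, 8, 22, 0],
    [38938, 59350, 2346, 1548, 60, 282, 33],
    [20180, 2295, 27829, 480, 134, 231, 74],
    [786, 362, 11, 2096, 27, 206, 79],
    [0, 76, 0, 21, 480, 2, 0],
    [236, 1363, 37, 682, 180, 2439, 85],
    [44, 85, 0, 7, 2, 432, 183],
]


def overall_accuracy(matrix):
    """Fraction of all samples that lie on the diagonal (correctly classified)."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def mean_class_accuracy(matrix):
    """Mean over classes of the diagonal count divided by the reference total.

    Assumption: this definition of 'average accuracy' is one common choice;
    the original text does not specify which definition was used.
    """
    n = len(matrix)
    per_class = []
    for k in range(n):
        column_total = sum(matrix[r][k] for r in range(n))
        per_class.append(matrix[k][k] / column_total)
    return sum(per_class) / n


if __name__ == "__main__":
    print(f"Overall accuracy: {100 * overall_accuracy(TABLE_3):.1f}%")
    print(f"Mean per-class accuracy: {100 * mean_class_accuracy(TABLE_3):.1f}%")
```

Because the class sizes are strongly unbalanced (grass and forest dominate the reference totals), the two measures can diverge considerably, which is why the per-class average is treated as the more informative figure in the Analysis.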

Journal

Geology, Ecology, and Landscapes, Taylor & Francis

Published: Jul 3, 2021

Keywords: Classification; remote sensing; image segmentation; pattern recognition; landscape

References