Robustness assessment of hyperspectral image CNNs using metamorphic testing

Abstract

Remote sensing has proven its utility in many critical domains, such as medicine, military, and ecology. Recently, we have been witnessing a surge in the adoption of deep learning (DL) techniques by the remote sensing community. DL-based classifiers, such as convolutional neural networks (CNNs), have been reported to achieve impressive predictive performance, reaching 99% accuracy when applied to hyperspectral images (HSIs), a high-dimensional type of remote sensing data. However, these deep learners are known to be highly sensitive to even slight perturbations of their high-dimensional raw inputs. In real-world contexts, concerns can be raised about how robust they really are against corner-case scenarios. When HSI classifiers are applied in safety-critical applications, ensuring an adequate level of robustness is crucial to prevent unexpected system behaviors. Yet few studies address their robustness, and testing methods designed for RGB images do not cover the HSI-specific challenges. This led us to propose a systematic testing method to assess the robustness of CNNs trained to classify HSIs. First, we design domain-specific metamorphic transformations that simulate naturally occurring distortions of remote sensing HSIs. Then, we leverage metaheuristic search algorithms to optimize the fitness of synthetically distorted inputs so that they stress the weaknesses of the CNN under test, while remaining in compliance with domain expert requirements in order to preserve the semantics of the generated inputs. Relying on our metamorphic testing method, we assess the robustness of established and novel CNNs for HSI classification, and show that they fail, on average, in 25% of the generated test cases. Furthermore, we fine-tuned the tested CNNs on training data augmented with these failure-revealing metamorphic transformations. Results show that the fine-tuning successfully fixed at least 90% of the CNN weaknesses, with less than 1% degradation in the original predictive performance, outperforming the common iterative gradient-based adversarial attack, namely Projected Gradient Descent (PGD).
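
To make the testing idea in the abstract concrete, the following is a minimal sketch only, not the method from the paper. It assumes a single hypothetical metamorphic transformation (`illumination_shift`, a global brightness change), a toy threshold rule (`toy_classifier`) standing in for the CNN under test, and plain random search in place of the metaheuristic search algorithms. The parameter bounds play the role of the domain expert requirements: any transformed input inside them should keep its original label, so a changed prediction counts as a failure.

```python
import random
from typing import Callable, List, Optional, Tuple

Spectrum = List[float]  # one pixel: reflectance in [0, 1] per spectral band


def illumination_shift(spectrum: Spectrum, gain: float, offset: float) -> Spectrum:
    """Hypothetical metamorphic transformation: a global brightness change."""
    return [min(1.0, max(0.0, gain * value + offset)) for value in spectrum]


def toy_classifier(spectrum: Spectrum) -> int:
    """Stand-in for the CNN under test: thresholds the mean reflectance."""
    return 1 if sum(spectrum) / len(spectrum) > 0.5 else 0


def search_failure(
    model: Callable[[Spectrum], int],
    spectrum: Spectrum,
    gain_bounds: Tuple[float, float],
    offset_bounds: Tuple[float, float],
    budget: int = 200,
    seed: int = 0,
) -> Optional[Tuple[float, float]]:
    """Random search (assumed here instead of a metaheuristic) for in-bounds
    transformation parameters that change the model's prediction."""
    rng = random.Random(seed)
    expected = model(spectrum)  # metamorphic relation: the label must not change
    for _ in range(budget):
        gain = rng.uniform(*gain_bounds)
        offset = rng.uniform(*offset_bounds)
        if model(illumination_shift(spectrum, gain, offset)) != expected:
            return gain, offset  # failure-revealing parameters
    return None  # no failure found within the budget


if __name__ == "__main__":
    near_boundary = [0.52] * 10  # mean reflectance close to the threshold
    print(search_failure(toy_classifier, near_boundary, (0.8, 1.2), (-0.05, 0.05)))
```

In this sketch, the returned `(gain, offset)` pairs are the kind of failure-revealing transformations that could then be applied to training data for the fine-tuning step described in the abstract.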

Publication
In Information and Software Technology
Houssem Ben Braiek
Ph.D., M.Sc., Eng.

I am an ML Tech Lead with a background in software engineering, holding M.Sc. and Ph.D. degrees from Polytechnique Montreal with distinction. My role involves supervising and guiding the development of machine learning solutions for intelligent automation systems. As an active SEMLA member, I contribute to research projects in trustworthy AI, teach advanced technical courses in SE4ML and MLOps, and organize workshops.