With deep learning (DL) modeling, aircraft performance system simulators can be derived statistically without the extensive system knowledge required by physics-based modeling. Yet the rapid and economical simulations that DL provides face serious trustworthiness challenges. Because aircraft flight tests are costly, it is difficult to set aside held-out test datasets that cover the full flight envelope for estimating the prediction errors of the learned DL model. Given this risk of test data underrepresentation, even low-error models raise credibility concerns when they are used to simulate and analyze system behavior under all foreseeable operating conditions. Domain-aware DL testing methods therefore become crucial for validating the properties and requirements of a DL-based system model beyond conventional performance testing. Crafting such testing methods requires an understanding of the simulated system design and a creative approach to incorporating domain expertise without compromising the data-driven nature of DL and the advantages it brings.
To this end, we present a physics-guided adversarial testing method that combines known physics-grounded sensitivity rules associated with the system with metaheuristic search algorithms to expose regions of the input space where the DL model violates system properties distilled from physics domain knowledge. We also demonstrate how the revealed adversarial inputs can be used for model regularization, penalizing these physics inconsistencies and improving model reliability. Furthermore, verifying DL model sensitivity at different data points against physics-grounded rules poses scalability challenges for sequential models that predict dynamical system behavior over successive time steps. In response, we encode the expected direction of variation of the predicted quantities using a simple physics model that is stochastically calibrated on the experimental dataset. Thus, in addition to assessing the error rate of the sequential DL model on genuine flight tests, we examine the extent to which the trajectory of its time series predictions matches the expected dynamics of variation under carefully designed synthetic flight scenarios.
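As an illustration of the adversarial testing idea, the following minimal sketch searches the input space of a surrogate model for points that break a sensitivity rule. Everything in it is an assumption made for illustration: `toy_model` is a hypothetical stand-in for a trained DL model, the rule "predicted thrust must not decrease when throttle increases" is an example property, and the simple hill-climbing evolutionary search stands in for whichever metaheuristic is actually used.

```python
import math
import random


def toy_model(throttle, altitude):
    """Hypothetical stand-in for a trained DL surrogate (normalized thrust).

    The sine term deliberately makes the output non-monotonic in throttle
    at high altitude, so that there is something for the search to find.
    """
    return throttle * (1.0 - 0.5 * altitude) + 0.3 * altitude * math.sin(6.0 * throttle)


def violation(x, delta=0.01):
    """Score of the assumed rule 'more throttle never lowers thrust'.

    A positive value means that raising throttle by `delta` lowered the
    prediction, i.e. the rule is violated at input x.
    """
    throttle, altitude = x
    return toy_model(throttle, altitude) - toy_model(throttle + delta, altitude)


def clip(x):
    """Keep a candidate inside the normalized input domain [0, 1]^2."""
    return tuple(min(1.0, max(0.0, v)) for v in x)


def search_violations(n_generations=200, n_offspring=8, sigma=0.1, seed=0):
    """Hill-climbing evolutionary search that maximizes the violation score."""
    rng = random.Random(seed)
    best = (rng.random(), rng.random())
    best_score = violation(best)
    found = []  # every rule-violating input met during the search
    for _ in range(n_generations):
        for _ in range(n_offspring):
            # Mutate the current best point with Gaussian noise.
            cand = clip(tuple(v + rng.gauss(0.0, sigma) for v in best))
            score = violation(cand)
            if score > 0.0:
                found.append((cand, score))
            if score > best_score:
                best, best_score = cand, score
    return best, best_score, found


if __name__ == "__main__":
    best, best_score, found = search_violations()
    print("worst-case input (throttle, altitude):", best)
    print("violation score:", best_score)
    print("number of violating inputs found:", len(found))
```

In this sketch, the inputs collected in `found` play the role of the revealed adversarial inputs: they could, for instance, be added to a penalty term during retraining so that the model is discouraged from reproducing the inconsistency.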