new

Get trending papers in your email inbox!

Subscribe

Daily Papers

byAK and the research community

Jun 10

GIF: A Conditional Multimodal Generative Framework for IR Drop Imaging in Chip Layouts

IR drop analysis is essential in physical chip design to ensure the power integrity of on-chip power delivery networks. Traditional Electronic Design Automation (EDA) tools have become slow and expensive as transistor density scales. Recent works have introduced machine learning (ML)-based methods that formulate IR drop analysis as an image prediction problem. These existing ML approaches fail to capture both local and long-range dependencies and ignore crucial geometrical and topological information from physical layouts and logical connectivity. To address these limitations, we propose GIF, a Generative IR drop Framework that uses both geometrical and topological information to generate IR drop images. GIF fuses image and graph features to guide a conditional diffusion process, producing high-quality IR drop images. For instance, On the CircuitNet-N28 dataset, GIF achieves 0.78 SSIM, 0.95 Pearson correlation, 21.77 PSNR, and 0.026 NMAE, outperforming prior methods. These results demonstrate that our framework, using diffusion based multimodal conditioning, reliably generates high quality IR drop images. This shows that IR drop analysis can effectively leverage recent advances in generative modeling when geometric layout features and logical circuit topology are jointly modeled. By combining geometry aware spatial features with logical graph representations, GIF enables IR drop analysis to benefit from recent advances in generative modeling for structured image generation.

  • 6 authors
·
Apr 10

Interferometer response characterization algorithm for multi-aperture Fabry-Perot imaging spectrometers

In recent years, the demand for hyperspectral imaging devices has grown significantly, driven by their ability of capturing high-resolution spectral information. Among the several possible optical designs for acquiring hyperspectral images, there is a growing interest in interferometric spectral imaging systems based on division of aperture. These systems have the advantage of capturing snapshot acquisitions while maintaining a compact design. However, they require a careful calibration to operate properly. In this work, we present the interferometer response characterization algorithm (IRCA), a robust three-step procedure designed to characterize the transmittance response of multi-aperture imaging spectrometers based on the interferometry of Fabry-Perot. Additionally, we propose a formulation of the image formation model for such devices suitable to estimate the parameters of interest by considering the model under various regimes of finesse. The proposed algorithm processes the image output obtained from a set of monochromatic light sources and refines the results using nonlinear regression after an ad-hoc initialization. Through experimental analysis conducted on four different prototypes from the Image SPectrometer On Chip (ImSPOC) family, we validate the performance of our approach for characterization. The associated source code for this paper is available at https://github.com/danaroth83/irca.

  • 5 authors
·
Mar 24, 2023

UniRGB-IR: A Unified Framework for RGB-Infrared Semantic Tasks via Adapter Tuning

Semantic analysis on visible (RGB) and infrared (IR) images has gained attention for its ability to be more accurate and robust under low-illumination and complex weather conditions. Due to the lack of pre-trained foundation models on the large-scale infrared image datasets, existing methods prefer to design task-specific frameworks and directly fine-tune them with pre-trained foundation models on their RGB-IR semantic relevance datasets, which results in poor scalability and limited generalization. In this work, we propose a general and efficient framework called UniRGB-IR to unify RGB-IR semantic tasks, in which a novel adapter is developed to efficiently introduce richer RGB-IR features into the pre-trained RGB-based foundation model. Specifically, our framework consists of a RGB-based foundation model, a Multi-modal Feature Pool (MFP) module and a Supplementary Feature Injector (SFI) module. The MFP and SFI modules cooperate with each other as an adapter to effectively complement the RGB-based features with the rich RGB-IR features. During training process, we freeze the entire foundation model to inherit prior knowledge and only optimize the proposed adapter. Furthermore, to verify the effectiveness of our framework, we utilize the vanilla vision transformer (ViT-Base) as the pre-trained foundation model to perform extensive experiments. Experimental results on various RGB-IR downstream tasks demonstrate that our method can achieve state-of-the-art performance. The source code and results are available at https://github.com/PoTsui99/UniRGB-IR.git.

  • 6 authors
·
Apr 26, 2024

DifIISR: A Diffusion Model with Gradient Guidance for Infrared Image Super-Resolution

Infrared imaging is essential for autonomous driving and robotic operations as a supportive modality due to its reliable performance in challenging environments. Despite its popularity, the limitations of infrared cameras, such as low spatial resolution and complex degradations, consistently challenge imaging quality and subsequent visual tasks. Hence, infrared image super-resolution (IISR) has been developed to address this challenge. While recent developments in diffusion models have greatly advanced this field, current methods to solve it either ignore the unique modal characteristics of infrared imaging or overlook the machine perception requirements. To bridge these gaps, we propose DifIISR, an infrared image super-resolution diffusion model optimized for visual quality and perceptual performance. Our approach achieves task-based guidance for diffusion by injecting gradients derived from visual and perceptual priors into the noise during the reverse process. Specifically, we introduce an infrared thermal spectrum distribution regulation to preserve visual fidelity, ensuring that the reconstructed infrared images closely align with high-resolution images by matching their frequency components. Subsequently, we incorporate various visual foundational models as the perceptual guidance for downstream visual tasks, infusing generalizable perceptual features beneficial for detection and segmentation. As a result, our approach gains superior visual results while attaining State-Of-The-Art downstream task performance. Code is available at https://github.com/zirui0625/DifIISR

  • 8 authors
·
Mar 3, 2025

A UAV-Based VNIR Hyperspectral Benchmark Dataset for Landmine and UXO Detection

This paper introduces a novel benchmark dataset of Visible and Near-Infrared (VNIR) hyperspectral imagery acquired via an unmanned aerial vehicle (UAV) platform for landmine and unexploded ordnance (UXO) detection research. The dataset was collected over a controlled test field seeded with 143 realistic surrogate landmine and UXO targets, including surface, partially buried, and fully buried configurations. Data acquisition was performed using a Headwall Nano-Hyperspec sensor mounted on a multi-sensor drone platform, flown at an altitude of approximately 20.6 m, capturing 270 contiguous spectral bands spanning 398-1002 nm. Radiometric calibration, orthorectification, and mosaicking were performed followed by reflectance retrieval using a two-point Empirical Line Method (ELM), with reference spectra acquired using an SVC spectroradiometer. Cross-validation against six reference objects yielded RMSE values below 1.0 and SAM values between 1 and 6 degrees in the 400-900 nm range, demonstrating high spectral fidelity. The dataset is released alongside raw radiance cubes, GCP/AeroPoint data, and reference spectra to support reproducible research. This contribution fills a critical gap in open-access UAV-based hyperspectral data for landmine detection and offers a multi-sensor benchmark when combined with previously published drone-based electromagnetic induction (EMI) data from the same test field.

  • 4 authors
·
Oct 2, 2025

Multi-Spectroscopic Method to Quantify Rapid Decomposition of an Organophosphate Simulant Using Reactive Materials as a Function of Metal Powder Chemistry and Temperature

The development of advanced diagnostic systems to measure and optimize emerging energetic material performance is critical for the defeat of Chemical Warfare Agents (CWA). This study presents an integrated multi-spectroscopic approach to monitor the interaction between a CWA simulant, Diisopropyl Methyl Phosphonate (DIMP), and combusting composite metal particles. A custom benchtop Polygonal Rotating Mirror Infrared Spectrometer (PRiMIRS), equipped with a customizable experimental chamber, is employed to observe DIMP decomposition. Tunable Diode Laser Absorption Spectroscopy (TDLAS) is used to measure path-averaged gas temperature profiles during combustion. In the experiment, the chamber is preheated to evaporate liquid DIMP. Various composite metal powders (Al-8Mg):3Zr, (Al-8Mg):Zr, 2(Al-8Mg):Zr, and 4(Al-8Mg):Zr are placed on a stainless steel mount and ignited using 3Al-2Ni sputter-deposited nanolayered foils. The combusting metal particles mix with the DIMP vapor, initiating chemical and thermal interactions. PRiMIRS captures DIMP spectral evolution, while TDLAS simultaneously monitors gas temperature. A spectral defeat parameter was developed to enable quantitative real-time assessment of the DIMP destruction. It uses infrared light absorption by both from DIMP and its immediate decomposition products Isopropyl Methyl Phosphonate (IMP) and Isopropyl Alcohol (IPA). Fourier Transform Infrared Spectroscopy (FTIR) serves as a secondary verification tool quantifying the decomposition products over extended timeframes, and Transmission Electron Microscopy (TEM) confirms the expected metal oxide dispersion within the reaction space. This study reports variability in DIMP defeat as a function of metal powder stoichiometry, metal powder loading, and path-averaged gas temperature profiles, offering critical insights into optimizing reactive materials for effective CWA neutralization.

  • 6 authors
·
Sep 4, 2025

SAGA: Semantic-Aware Gray color Augmentation for Visible-to-Thermal Domain Adaptation across Multi-View Drone and Ground-Based Vision Systems

Domain-adaptive thermal object detection plays a key role in facilitating visible (RGB)-to-thermal (IR) adaptation by reducing the need for co-registered image pairs and minimizing reliance on large annotated IR datasets. However, inherent limitations of IR images, such as the lack of color and texture cues, pose challenges for RGB-trained models, leading to increased false positives and poor-quality pseudo-labels. To address this, we propose Semantic-Aware Gray color Augmentation (SAGA), a novel strategy for mitigating color bias and bridging the domain gap by extracting object-level features relevant to IR images. Additionally, to validate the proposed SAGA for drone imagery, we introduce the IndraEye, a multi-sensor (RGB-IR) dataset designed for diverse applications. The dataset contains 5,612 images with 145,666 instances, captured from diverse angles, altitudes, backgrounds, and times of day, offering valuable opportunities for multimodal learning, domain adaptation for object detection and segmentation, and exploration of sensor-specific strengths and weaknesses. IndraEye aims to enhance the development of more robust and accurate aerial perception systems, especially in challenging environments. Experimental results show that SAGA significantly improves RGB-to-IR adaptation for autonomous driving and IndraEye dataset, achieving consistent performance gains of +0.4% to +7.6% (mAP) when integrated with state-of-the-art domain adaptation techniques. The dataset and codes are available at https://github.com/airliisc/IndraEye.

  • 5 authors
·
Apr 22, 2025

Characterising the Atmosphere of 55 Cancri e: 1D Forward Model Grid for Current and Future JWST Observations

Recent JWST observations with NIRCam and MIRI of the ultra-short-period super-Earth 55 Cancri e indicate a possible volatile atmosphere surrounding the planet. Previous analysis of the NIRCam spectra suggested potential absorption features from CO2 or CO and significant sub-weekly variability. The MIRI low-resolution spectrum does not contain substantial features but was found to be consistent with effective heat redistribution models. In this work, we computed a grid of over 25000 self-consistent 1D forward models incorporating H-N-O-C-S-P-Si-Ti equilibrium chemistry and assessed plausible atmospheric compositions based on the current JWST data. Despite exhaustive analysis, the composition and properties of the atmosphere remain elusive. While our results statistically favour a global, hydrogen-free, nitrogen-dominated atmosphere enriched in PO and CO2, various alternative compositions, including H2O-,CO-, PH3-, or Si-bearing remain viable explanations. Unconstrained heat redistribution efficiency and absolute NIRCam flux are among the largest sources of uncertainty in our analysis. We also find that the heat redistribution factor and surface pressure are highly degenerate with atmospheric composition, and that these parameters cannot be independently constrained using current JWST observations. Furthermore, we show that the observed variability may arise from dynamic interactions between the atmosphere and an underlying magma ocean, driving rapid shifts in atmospheric chemistry and thermal emission. Our results highlight the importance of using self-consistent forward models when analysing novel JWST spectra with limited signal-to-noise ratios -- such as those of 55 Cancri e -- as it allows for a more comprehensive evaluation of potential atmospheric scenarios while also being less sensitive to subtle spectral differences than retrievals...

  • 12 authors
·
Mar 20, 2025

Toward Real-world Infrared Image Super-Resolution: A Unified Autoregressive Framework and Benchmark Dataset

Infrared image super-resolution (IISR) under real-world conditions is a practically significant yet rarely addressed task. Pioneering works are often trained and evaluated on simulated datasets or neglect the intrinsic differences between infrared and visible imaging. In practice, however, real infrared images are affected by coupled optical and sensing degradations that jointly deteriorate both structural sharpness and thermal fidelity. To address these challenges, we propose Real-IISR, a unified autoregressive framework for real-world IISR that progressively reconstructs fine-grained thermal structures and clear backgrounds in a scale-by-scale manner via thermal-structural guided visual autoregression. Specifically, a Thermal-Structural Guidance module encodes thermal priors to mitigate the mismatch between thermal radiation and structural edges. Since non-uniform degradations typically induce quantization bias, Real-IISR adopts a Condition-Adaptive Codebook that dynamically modulates discrete representations based on degradation-aware thermal priors. Also, a Thermal Order Consistency Loss enforces a monotonic relation between temperature and pixel intensity, ensuring relative brightness order rather than absolute values to maintain physical consistency under spatial misalignment and thermal drift. We build FLIR-IISR, a real-world IISR dataset with paired LR-HR infrared images acquired via automated focus variation and motion-induced blur. Extensive experiments demonstrate the promising performance of Real-IISR, providing a unified foundation for real-world IISR and benchmarking. The dataset and code are available at: https://github.com/JZD151/Real-IISR.

  • 6 authors
·
Mar 4

I-GLIDE: Input Groups for Latent Health Indicators in Degradation Estimation

Accurate remaining useful life (RUL) prediction hinges on the quality of health indicators (HIs), yet existing methods often fail to disentangle complex degradation mechanisms in multi-sensor systems or quantify uncertainty in HI reliability. This paper introduces a novel framework for HI construction, advancing three key contributions. First, we adapt Reconstruction along Projected Pathways (RaPP) as a health indicator (HI) for RUL prediction for the first time, showing that it outperforms traditional reconstruction error metrics. Second, we show that augmenting RaPP-derived HIs with aleatoric and epistemic uncertainty quantification (UQ) via Monte Carlo dropout and probabilistic latent spaces- significantly improves RUL-prediction robustness. Third, and most critically, we propose indicator groups, a paradigm that isolates sensor subsets to model system-specific degradations, giving rise to our novel method, I-GLIDE which enables interpretable, mechanism-specific diagnostics. Evaluated on data sourced from aerospace and manufacturing systems, our approach achieves marked improvements in accuracy and generalizability compared to state-of-the-art HI methods while providing actionable insights into system failure pathways. This work bridges the gap between anomaly detection and prognostics, offering a principled framework for uncertainty-aware degradation modeling in complex systems.

orailix Orailix
·
Nov 26, 2025 2

Elastic and structural anisotropy in silica thin films for gravitational-wave detectors

The thermal noise of mirror coatings for gravitational-wave detectors critically depends on the elastic properties of the constituent materials. Data analyses and theoretical models typically assume each material is homogeneous and isotropic, but isotropy has never been explicitly verified. Using Brillouin light scattering (BLS), we demonstrate for the first time that ion-beam-sputtered SiO2 -- a material still viable for future mirror coatings -- exhibits cylindrical elastic symmetry, with in-plane isotropy but a notable 6% compressive anisotropy along the film normal. This anisotropy remains unchanged after the post-deposition heat treatment currently used in ground-based detectors (500 ^circC, 10 h) but is nearly eliminated at 900 ^circC. Infrared reflectivity experiments support these findings by directly revealing heterogeneities in the distribution of bridging and non-bridging oxygen structures along the growth axis. While BLS measures the real part of the elastic constants at GHz frequencies, the data reveal negligible contributions from mechanical relaxations in the kHz-GHz range, making BLS a valid substitute for low-frequency properties obtained from standard anisotropy-insensitive techniques. Our results highlight that restoring isotropy through heat treatment -- by softening the material, enabling more than 7% out-of-plane expansion, and smoothing out structural heterogeneities -- may play a key role in reducing thermal noise. This proof-of-concept study extends beyond silica, providing critical insights for the design of future coatings.

  • 14 authors
·
May 6

UNIP: Rethinking Pre-trained Attention Patterns for Infrared Semantic Segmentation

Pre-training techniques significantly enhance the performance of semantic segmentation tasks with limited training data. However, the efficacy under a large domain gap between pre-training (e.g. RGB) and fine-tuning (e.g. infrared) remains underexplored. In this study, we first benchmark the infrared semantic segmentation performance of various pre-training methods and reveal several phenomena distinct from the RGB domain. Next, our layerwise analysis of pre-trained attention maps uncovers that: (1) There are three typical attention patterns (local, hybrid, and global); (2) Pre-training tasks notably influence the pattern distribution across layers; (3) The hybrid pattern is crucial for semantic segmentation as it attends to both nearby and foreground elements; (4) The texture bias impedes model generalization in infrared tasks. Building on these insights, we propose UNIP, a UNified Infrared Pre-training framework, to enhance the pre-trained model performance. This framework uses the hybrid-attention distillation NMI-HAD as the pre-training target, a large-scale mixed dataset InfMix for pre-training, and a last-layer feature pyramid network LL-FPN for fine-tuning. Experimental results show that UNIP outperforms various pre-training methods by up to 13.5\% in average mIoU on three infrared segmentation tasks, evaluated using fine-tuning and linear probing metrics. UNIP-S achieves performance on par with MAE-L while requiring only 1/10 of the computational cost. Furthermore, UNIP significantly surpasses state-of-the-art (SOTA) infrared or RGB segmentation methods and demonstrates broad potential for application in other modalities, such as RGB and depth. Our code is available at https://github.com/casiatao/UNIP.

  • 6 authors
·
Feb 4, 2025

HATIR: Heat-Aware Diffusion for Turbulent Infrared Video Super-Resolution

Infrared video has been of great interest in visual tasks under challenging environments, but often suffers from severe atmospheric turbulence and compression degradation. Existing video super-resolution (VSR) methods either neglect the inherent modality gap between infrared and visible images or fail to restore turbulence-induced distortions. Directly cascading turbulence mitigation (TM) algorithms with VSR methods leads to error propagation and accumulation due to the decoupled modeling of degradation between turbulence and resolution. We introduce HATIR, a Heat-Aware Diffusion for Turbulent InfraRed Video Super-Resolution, which injects heat-aware deformation priors into the diffusion sampling path to jointly model the inverse process of turbulent degradation and structural detail loss. Specifically, HATIR constructs a Phasor-Guided Flow Estimator, rooted in the physical principle that thermally active regions exhibit consistent phasor responses over time, enabling reliable turbulence-aware flow to guide the reverse diffusion process. To ensure the fidelity of structural recovery under nonuniform distortions, a Turbulence-Aware Decoder is proposed to selectively suppress unstable temporal cues and enhance edge-aware feature aggregation via turbulence gating and structure-aware attention. We built FLIR-IVSR, the first dataset for turbulent infrared VSR, comprising paired LR-HR sequences from a FLIR T1050sc camera (1024 X 768) spanning 640 diverse scenes with varying camera and object motion conditions. This encourages future research in infrared VSR. Project page: https://github.com/JZ0606/HATIR

  • 7 authors
·
Jan 8

A Machine Learning-based Framework for Predictive Maintenance of Semiconductor Laser for Optical Communication

Semiconductor lasers, one of the key components for optical communication systems, have been rapidly evolving to meet the requirements of next generation optical networks with respect to high speed, low power consumption, small form factor etc. However, these demands have brought severe challenges to the semiconductor laser reliability. Therefore, a great deal of attention has been devoted to improving it and thereby ensuring reliable transmission. In this paper, a predictive maintenance framework using machine learning techniques is proposed for real-time heath monitoring and prognosis of semiconductor laser and thus enhancing its reliability. The proposed approach is composed of three stages: i) real-time performance degradation prediction, ii) degradation detection, and iii) remaining useful life (RUL) prediction. First of all, an attention based gated recurrent unit (GRU) model is adopted for real-time prediction of performance degradation. Then, a convolutional autoencoder is used to detect the degradation or abnormal behavior of a laser, given the predicted degradation performance values. Once an abnormal state is detected, a RUL prediction model based on attention-based deep learning is utilized. Afterwards, the estimated RUL is input for decision making and maintenance planning. The proposed framework is validated using experimental data derived from accelerated aging tests conducted for semiconductor tunable lasers. The proposed approach achieves a very good degradation performance prediction capability with a small root mean square error (RMSE) of 0.01, a good anomaly detection accuracy of 94.24% and a better RUL estimation capability compared to the existing ML-based laser RUL prediction models.

  • 3 authors
·
Nov 5, 2022

TIRAuxCloud: A Thermal Infrared Dataset for Day and Night Cloud Detection

Clouds are a major obstacle in Earth observation, limiting the usability and reliability of critical remote sensing applications such as fire disaster response, urban heat island monitoring, and snow and ice cover mapping. Therefore, the ability to detect clouds 24/7 is of paramount importance. While visible and near-infrared bands are effective for daytime cloud detection, their dependence on solar illumination makes them unsuitable for nighttime monitoring. In contrast, thermal infrared (TIR) imagery plays a crucial role in detecting clouds at night, when sunlight is absent. Due to their generally lower temperatures, clouds emit distinct thermal signatures that are detectable in TIR bands. Despite this, accurate nighttime cloud detection remains challenging due to limited spectral information and the typically lower spatial resolution of TIR imagery. To address these challenges, we present TIRAuxCloud, a multi-modal dataset centered around thermal spectral data to facilitate cloud segmentation under both daytime and nighttime conditions. The dataset comprises a unique combination of multispectral data (TIR, optical, and near-infrared bands) from Landsat and VIIRS, aligned with auxiliary information layers. Elevation, land cover, meteorological variables, and cloud-free reference images are included to help reduce surface-cloud ambiguity and cloud formation uncertainty. To overcome the scarcity of manual cloud labels, we include a large set of samples with automated cloud masks and a smaller manually annotated subset to further evaluate and improve models. Comprehensive benchmarks are presented to establish performance baselines through supervised and transfer learning, demonstrating the dataset's value in advancing the development of innovative methods for day and night time cloud detection.

  • 7 authors
·
Feb 25

Real-World Remote Sensing Image Dehazing: Benchmark and Baseline

Remote Sensing Image Dehazing (RSID) poses significant challenges in real-world scenarios due to the complex atmospheric conditions and severe color distortions that degrade image quality. The scarcity of real-world remote sensing hazy image pairs has compelled existing methods to rely primarily on synthetic datasets. However, these methods struggle with real-world applications due to the inherent domain gap between synthetic and real data. To address this, we introduce Real-World Remote Sensing Hazy Image Dataset (RRSHID), the first large-scale dataset featuring real-world hazy and dehazed image pairs across diverse atmospheric conditions. Based on this, we propose MCAF-Net, a novel framework tailored for real-world RSID. Its effectiveness arises from three innovative components: Multi-branch Feature Integration Block Aggregator (MFIBA), which enables robust feature extraction through cascaded integration blocks and parallel multi-branch processing; Color-Calibrated Self-Supervised Attention Module (CSAM), which mitigates complex color distortions via self-supervised learning and attention-guided refinement; and Multi-Scale Feature Adaptive Fusion Module (MFAFM), which integrates features effectively while preserving local details and global context. Extensive experiments validate that MCAF-Net demonstrates state-of-the-art performance in real-world RSID, while maintaining competitive performance on synthetic datasets. The introduction of RRSHID and MCAF-Net sets new benchmarks for real-world RSID research, advancing practical solutions for this complex task. The code and dataset are publicly available at https://github.com/lwCVer/RRSHID.

  • 6 authors
·
Mar 23, 2025

Degradation Prediction of Semiconductor Lasers using Conditional Variational Autoencoder

Semiconductor lasers have been rapidly evolving to meet the demands of next-generation optical networks. This imposes much more stringent requirements on the laser reliability, which are dominated by degradation mechanisms (e.g., sudden degradation) limiting the semiconductor laser lifetime. Physics-based approaches are often used to characterize the degradation behavior analytically, yet explicit domain knowledge and accurate mathematical models are required. Building such models can be very challenging due to a lack of a full understanding of the complex physical processes inducing the degradation under various operating conditions. To overcome the aforementioned limitations, we propose a new data-driven approach, extracting useful insights from the operational monitored data to predict the degradation trend without requiring any specific knowledge or using any physical model. The proposed approach is based on an unsupervised technique, a conditional variational autoencoder, and validated using vertical-cavity surface-emitting laser (VCSEL) and tunable edge emitting laser reliability data. The experimental results confirm that our model (i) achieves a good degradation prediction and generalization performance by yielding an F1 score of 95.3%, (ii) outperforms several baseline ML based anomaly detection techniques, and (iii) helps to shorten the aging tests by early predicting the failed devices before the end of the test and thereby saving costs

  • 5 authors
·
Nov 5, 2022

FLAME 3 Dataset: Unleashing the Power of Radiometric Thermal UAV Imagery for Wildfire Management

The increasing accessibility of radiometric thermal imaging sensors for unmanned aerial vehicles (UAVs) offers significant potential for advancing AI-driven aerial wildfire management. Radiometric imaging provides per-pixel temperature estimates, a valuable improvement over non-radiometric data that requires irradiance measurements to be converted into visible images using RGB color palettes. Despite its benefits, this technology has been underutilized largely due to a lack of available data for researchers. This study addresses this gap by introducing methods for collecting and processing synchronized visual spectrum and radiometric thermal imagery using UAVs at prescribed fires. The included imagery processing pipeline drastically simplifies and partially automates each step from data collection to neural network input. Further, we present the FLAME 3 dataset, the first comprehensive collection of side-by-side visual spectrum and radiometric thermal imagery of wildland fires. Building on our previous FLAME 1 and FLAME 2 datasets, FLAME 3 includes radiometric thermal Tag Image File Format (TIFFs) and nadir thermal plots, providing a new data type and collection method. This dataset aims to spur a new generation of machine learning models utilizing radiometric thermal imagery, potentially trivializing tasks such as aerial wildfire detection, segmentation, and assessment. A single-burn subset of FLAME 3 for computer vision applications is available on Kaggle with the full 6 burn set available to readers upon request.

  • 9 authors
·
Dec 2, 2024

ALFA: A Dataset for UAV Fault and Anomaly Detection

We present a dataset of several fault types in control surfaces of a fixed-wing Unmanned Aerial Vehicle (UAV) for use in Fault Detection and Isolation (FDI) and Anomaly Detection (AD) research. Currently, the dataset includes processed data for 47 autonomous flights with 23 sudden full engine failure scenarios and 24 scenarios for seven other types of sudden control surface (actuator) faults, with a total of 66 minutes of flight in normal conditions and 13 minutes of post-fault flight time. It additionally includes many hours of raw data of fully-autonomous, autopilot-assisted and manual flights with tens of fault scenarios. The ground truth of the time and type of faults is provided in each scenario to enable evaluation of the methods using the dataset. We have also provided the helper tools in several programming languages to load and work with the data and to help the evaluation of a detection method using the dataset. A set of metrics is proposed to help to compare different methods using the dataset. Most of the current fault detection methods are evaluated in simulation and as far as we know, this dataset is the only one providing the real flight data with faults in such capacity. We hope it will help advance the state-of-the-art in Anomaly Detection or FDI research for Autonomous Aerial Vehicles and mobile robots to enhance the safety of autonomous and remote flight operations further. The dataset and the provided tools can be accessed from https://doi.org/10.1184/R1/12707963.

  • 3 authors
·
Jul 14, 2019

Monte Carlo Linear Clustering with Single-Point Supervision is Enough for Infrared Small Target Detection

Single-frame infrared small target (SIRST) detection aims at separating small targets from clutter backgrounds on infrared images. Recently, deep learning based methods have achieved promising performance on SIRST detection, but at the cost of a large amount of training data with expensive pixel-level annotations. To reduce the annotation burden, we propose the first method to achieve SIRST detection with single-point supervision. The core idea of this work is to recover the per-pixel mask of each target from the given single point label by using clustering approaches, which looks simple but is indeed challenging since targets are always insalient and accompanied with background clutters. To handle this issue, we introduce randomness to the clustering process by adding noise to the input images, and then obtain much more reliable pseudo masks by averaging the clustered results. Thanks to this "Monte Carlo" clustering approach, our method can accurately recover pseudo masks and thus turn arbitrary fully supervised SIRST detection networks into weakly supervised ones with only single point annotation. Experiments on four datasets demonstrate that our method can be applied to existing SIRST detection networks to achieve comparable performance with their fully supervised counterparts, which reveals that single-point supervision is strong enough for SIRST detection. Our code will be available at: https://github.com/YeRen123455/SIRST-Single-Point-Supervision.

  • 8 authors
·
Apr 10, 2023

Bayesian Deep Learning for Exoplanet Atmospheric Retrieval

Over the past decade, the study of extrasolar planets has evolved rapidly from plain detection and identification to comprehensive categorization and characterization of exoplanet systems and their atmospheres. Atmospheric retrieval, the inverse modeling technique used to determine an exoplanetary atmosphere's temperature structure and composition from an observed spectrum, is both time-consuming and compute-intensive, requiring complex algorithms that compare thousands to millions of atmospheric models to the observational data to find the most probable values and associated uncertainties for each model parameter. For rocky, terrestrial planets, the retrieved atmospheric composition can give insight into the surface fluxes of gaseous species necessary to maintain the stability of that atmosphere, which may in turn provide insight into the geological and/or biological processes active on the planet. These atmospheres contain many molecules, some of them biosignatures, spectral fingerprints indicative of biological activity, which will become observable with the next generation of telescopes. Runtimes of traditional retrieval models scale with the number of model parameters, so as more molecular species are considered, runtimes can become prohibitively long. Recent advances in machine learning (ML) and computer vision offer new ways to reduce the time to perform a retrieval by orders of magnitude, given a sufficient data set to train with. Here we present an ML-based retrieval framework called Intelligent exoplaNet Atmospheric RetrievAl (INARA) that consists of a Bayesian deep learning model for retrieval and a data set of 3,000,000 synthetic rocky exoplanetary spectra generated using the NASA Planetary Spectrum Generator. Our work represents the first ML retrieval model for rocky, terrestrial exoplanets and the first synthetic data set of terrestrial spectra generated at this scale.

  • 11 authors
·
Nov 8, 2018

TeX-1500: A Paired Real-World LWIR Hyperspectral Dataset and Benchmark for Temperature-Emissivity-Texture Decomposition

Temperature-emissivity-texture (TeX) decomposition seeks to recover object heat state, material spectral response, and visible-like geometric texture from long-wave infrared hyperspectral imaging (LWIR HSI). Existing TeX pipelines are mainly scene-specific inverse solvers, and the lack of paired LWIR HSI-TeX supervision has limited learning-based decomposition. To address this gap, we introduce TeX-1500, a large-scale paired LWIR HSI-TeX dataset and benchmark for supervised HSI-to-TeX decomposition. TeX-1500 contains 1,522 calibrated real-scene pairs from DARPA Invisible Headlights (DARPA IH) pushbroom imagery and our FTIR acquisitions, covering five locations, four seasons, diverse acquisition times, heterogeneous wavelength layouts, and two sensor families. Each sample stores a calibrated valid-band radiance cube, calibrated wavelength positions, and aligned temperature, emissivity, and texture supervision constructed through a consistent restoration and TeX-construction protocol. We further provide TeX-UNet, a simple wavelength-aware baseline that maps calibrated HSI bands and wavelength positions to TeX fields. Experiments on the held-out DARPA IH pushbroom scenes and zero-/few-shot transfer to FTIR scenes show that TeX-1500 provides usable paired supervision and a measurable benchmark for data-driven physical-property-centered thermal perception.

  • 6 authors
·
Jun 1

Effect of glass stability on the low frequency vibrations of vapor deposited glasses

Ultra-stable glasses prepared from the physical vapor deposition of organic molecules present a very low density of two-level states, the kind of glass defects that determine their peculiar low temperature thermal properties. Numerical simulations suggest that quasi-localized harmonic vibrational modes emerge in the soft regions associated with two-level states. However, the connection between the low frequency vibrational modes and the local structural instabilities of glasses remains unexplained. Here we exploit a recently developed spectrograph for nuclear resonant analysis of inelastic X-ray scattering to probe the density of vibrational states of amorphous thin films of ultra-stable and conventional glasses down to an exceptionally low frequency of sim 70 GHz. We show that the glass stability does not affect the harmonic vibrational modes at the lowest frequencies, despite a reduction of almost an order of magnitude in the density of two-level states. At the same time, the vibrational modes at higher frequencies, around the boson peak maximum, are extremely sensitive to the glass stability. Although we cannot exclude the possible existence of quasi-localized modes in glasses, we show that their presence is not strictly necessary to describe the measured density of low frequency vibrations. The experimental developments here presented pave the way to the solution to the long-standing debate on the low frequency vibrations in glasses.

  • 11 authors
·
Feb 24

HyperspectralViTs: General Hyperspectral Models for On-board Remote Sensing

On-board processing of hyperspectral data with machine learning models would enable unprecedented amount of autonomy for a wide range of tasks, for example methane detection or mineral identification. This can enable early warning system and could allow new capabilities such as automated scheduling across constellations of satellites. Classical methods suffer from high false positive rates and previous deep learning models exhibit prohibitive computational requirements. We propose fast and accurate machine learning architectures which support end-to-end training with data of high spectral dimension without relying on hand-crafted products or spectral band compression preprocessing. We evaluate our models on two tasks related to hyperspectral data processing. With our proposed general architectures, we improve the F1 score of the previous methane detection state-of-the-art models by 27% on a newly created synthetic dataset and by 13% on the previously released large benchmark dataset. We also demonstrate that training models on the synthetic dataset improves performance of models finetuned on the dataset of real events by 6.9% in F1 score in contrast with training from scratch. On a newly created dataset for mineral identification, our models provide 3.5% improvement in the F1 score in contrast to the default versions of the models. With our proposed models we improve the inference speed by 85% in contrast to previous classical and deep learning approaches by removing the dependency on classically computed features. With our architecture, one capture from the EMIT sensor can be processed within 30 seconds on realistic proxy of the ION-SCV 004 satellite.

  • 2 authors
·
Oct 22, 2024

Searching for Materials with High Refractive Index and Wide Band Gap: A First-Principles High-Throughput Study

Materials combining both a high refractive index and a wide band gap are of great interest for optoelectronic and sensor applications. However, these two properties are typically described by an inverse correlation with high refractive index appearing in small gap materials and vice-versa. Here, we conduct a first-principles high-throughput study on more than 4000 semiconductors (with a special focus on oxides). Our data confirm the general inverse trend between refractive index and band gap but interesting outliers are also identified. The data are then analyzed through a simple model involving two main descriptors: the average optical gap and the effective frequency. The former can be determined directly from the electronic structure of the compounds, but the latter cannot. This calls for further analysis in order to obtain a predictive model. Nonetheless, it turns out that the negative effect of a large band gap on the refractive index can counterbalanced in two ways: (i) by limiting the difference between the direct band gap and the average optical gap which can be realized by a narrow distribution in energy of the optical transitions and (ii) by increasing the effective frequency which can be achieved through either a high number of transitions from the top of the valence band to the bottom of the conduction or a high average probability for these transitions. Focusing on oxides, we use our data to investigate how the chemistry influences this inverse relationship and rationalize why certain classes of materials would perform better. Our findings can be used to search for new compounds in many optical applications both in the linear and non-linear regime (waveguides, optical modulators, laser, frequency converter, etc.).

  • 6 authors
·
Sep 4, 2018

AeroGen: Enhancing Remote Sensing Object Detection with Diffusion-Driven Data Generation

Remote sensing image object detection (RSIOD) aims to identify and locate specific objects within satellite or aerial imagery. However, there is a scarcity of labeled data in current RSIOD datasets, which significantly limits the performance of current detection algorithms. Although existing techniques, e.g., data augmentation and semi-supervised learning, can mitigate this scarcity issue to some extent, they are heavily dependent on high-quality labeled data and perform worse in rare object classes. To address this issue, this paper proposes a layout-controllable diffusion generative model (i.e. AeroGen) tailored for RSIOD. To our knowledge, AeroGen is the first model to simultaneously support horizontal and rotated bounding box condition generation, thus enabling the generation of high-quality synthetic images that meet specific layout and object category requirements. Additionally, we propose an end-to-end data augmentation framework that integrates a diversity-conditioned generator and a filtering mechanism to enhance both the diversity and quality of generated data. Experimental results demonstrate that the synthetic data produced by our method are of high quality and diversity. Furthermore, the synthetic RSIOD data can significantly improve the detection performance of existing RSIOD models, i.e., the mAP metrics on DIOR, DIOR-R, and HRSC datasets are improved by 3.7%, 4.3%, and 2.43%, respectively. The code is available at https://github.com/Sonettoo/AeroGen.

  • 7 authors
·
Nov 23, 2024

DiffIR: Efficient Diffusion Model for Image Restoration

Diffusion model (DM) has achieved SOTA performance by modeling the image synthesis process into a sequential application of a denoising network. However, different from image synthesis, image restoration (IR) has a strong constraint to generate results in accordance with ground-truth. Thus, for IR, traditional DMs running massive iterations on a large model to estimate whole images or feature maps is inefficient. To address this issue, we propose an efficient DM for IR (DiffIR), which consists of a compact IR prior extraction network (CPEN), dynamic IR transformer (DIRformer), and denoising network. Specifically, DiffIR has two training stages: pretraining and training DM. In pretraining, we input ground-truth images into CPEN_{S1} to capture a compact IR prior representation (IPR) to guide DIRformer. In the second stage, we train the DM to directly estimate the same IRP as pretrained CPEN_{S1} only using LQ images. We observe that since the IPR is only a compact vector, DiffIR can use fewer iterations than traditional DM to obtain accurate estimations and generate more stable and realistic results. Since the iterations are few, our DiffIR can adopt a joint optimization of CPEN_{S2}, DIRformer, and denoising network, which can further reduce the estimation error influence. We conduct extensive experiments on several IR tasks and achieve SOTA performance while consuming less computational costs. Code is available at https://github.com/Zj-BinXia/DiffIR.

  • 8 authors
·
Mar 16, 2023

Removal then Selection: A Coarse-to-Fine Fusion Perspective for RGB-Infrared Object Detection

In recent years, object detection utilizing both visible (RGB) and thermal infrared (IR) imagery has garnered extensive attention and has been widely implemented across a diverse array of fields. By leveraging the complementary properties between RGB and IR images, the object detection task can achieve reliable and robust object localization across a variety of lighting conditions, from daytime to nighttime environments. Most existing multi-modal object detection methods directly input the RGB and IR images into deep neural networks, resulting in inferior detection performance. We believe that this issue arises not only from the challenges associated with effectively integrating multimodal information but also from the presence of redundant features in both the RGB and IR modalities. The redundant information of each modality will exacerbates the fusion imprecision problems during propagation. To address this issue, we draw inspiration from the human brain's mechanism for processing multimodal information and propose a novel coarse-to-fine perspective to purify and fuse features from both modalities. Specifically, following this perspective, we design a Redundant Spectrum Removal module to remove interfering information within each modality coarsely and a Dynamic Feature Selection module to finely select the desired features for feature fusion. To verify the effectiveness of the coarse-to-fine fusion strategy, we construct a new object detector called the Removal then Selection Detector (RSDet). Extensive experiments on three RGB-IR object detection datasets verify the superior performance of our method.

  • 5 authors
·
Jan 19, 2024

Accurate Machine Learning Atmospheric Retrieval via a Neural Network Surrogate Model for Radiative Transfer

Atmospheric retrieval determines the properties of an atmosphere based on its measured spectrum. The low signal-to-noise ratio of exoplanet observations require a Bayesian approach to determine posterior probability distributions of each model parameter, given observed spectra. This inference is computationally expensive, as it requires many executions of a costly radiative transfer (RT) simulation for each set of sampled model parameters. Machine learning (ML) has recently been shown to provide a significant reduction in runtime for retrievals, mainly by training inverse ML models that predict parameter distributions, given observed spectra, albeit with reduced posterior accuracy. Here we present a novel approach to retrieval by training a forward ML surrogate model that predicts spectra given model parameters, providing a fast approximate RT simulation that can be used in a conventional Bayesian retrieval framework without significant loss of accuracy. We demonstrate our method on the emission spectrum of HD 189733 b and find good agreement with a traditional retrieval from the Bayesian Atmospheric Radiative Transfer (BART) code (Bhattacharyya coefficients of 0.9843--0.9972, with a mean of 0.9925, between 1D marginalized posteriors). This accuracy comes while still offering significant speed enhancements over traditional RT, albeit not as much as ML methods with lower posterior accuracy. Our method is ~9x faster per parallel chain than BART when run on an AMD EPYC 7402P central processing unit (CPU). Neural-network computation using an NVIDIA Titan Xp graphics processing unit is 90--180x faster per chain than BART on that CPU.

  • 11 authors
·
Mar 4, 2020