SeaAlert: Critical Information Extraction From Maritime Distress Communications with Large Language Models
Maritime distress communications transmitted over very high frequency (VHF) radio are safety-critical voice messages used to report emergencies at sea. Under the Global Maritime Distress and Safety System (GMDSS), such messages follow standardized procedures and are expected to convey essential details, including vessel identity, position, nature of the distress, and required assistance. In practice, however, automatic analysis remains difficult: distress messages are often brief and noisy, are produced under stress, and may deviate from the prescribed format. Channel noise and stressed speech also cause automatic speech recognition (ASR) errors that degrade the transcripts further.
This paper presents SeaAlert, an LLM-based framework for robust analysis of maritime distress communications. To address the scarcity of labeled real-world data, we develop a synthetic data generation pipeline in which an LLM produces realistic and diverse maritime messages, including challenging variants in which standard distress codewords are omitted or replaced with less explicit expressions. The generated utterances are synthesized into speech, degraded with simulated VHF noise, and transcribed by an ASR system to obtain realistic noisy transcripts.
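The degradation step of this pipeline can be illustrated with a minimal sketch. The abstract does not say how the VHF noise is simulated, so everything below is an assumption made for illustration: the function names are hypothetical, the channel is approximated by a 300 to 3000 Hz voice-band filter, and the noise is additive white Gaussian noise scaled to a target signal-to-noise ratio (SNR). It is not the authors' implementation.

```python
import numpy as np


def bandlimit(signal, sample_rate, low_hz=300.0, high_hz=3000.0):
    """Keep only frequencies in [low_hz, high_hz] using an FFT mask.

    The 300-3000 Hz band is an assumed voice-band approximation,
    not a value taken from the paper.
    """
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    mask = (freqs >= low_hz) & (freqs <= high_hz)
    return np.fft.irfft(spectrum * mask, n=len(signal))


def add_noise_at_snr(signal, snr_db, rng):
    """Add white Gaussian noise so that the result has roughly snr_db SNR."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def degrade_vhf(signal, sample_rate, snr_db, seed=0):
    """Hypothetical VHF-style degradation: band-limit, then add noise."""
    rng = np.random.default_rng(seed)
    narrow = bandlimit(signal, sample_rate)
    return add_noise_at_snr(narrow, snr_db, rng)
```

In a full pipeline of the kind described, the input waveform would come from a text-to-speech system and the degraded output would be passed to an ASR system; both of those stages are omitted here because the abstract does not name the tools used.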
The paper is 12 pages long with 8 figures and is listed under Computation and Language (cs.CL) and Artificial Intelligence (cs.AI). It can be cited as arXiv:2604.14163 [cs.CL], or as arXiv:2604.14163v1 [cs.CL] for this version. Its digital object identifier (DOI), issued by DataCite, is https://doi.org/10.48550/arXiv.2604.14163. Yehudit Aperstein submitted this version on Monday, 23 Mar 2026 at 18:21:13 UTC (863 KB).