Detection of Hate and Threat in Digital Forensics: A Case-Driven Multimodal Approach
Abstract: Digital forensic investigations increasingly rely on heterogeneous evidence such as images, scanned documents, and contextual reports. These artifacts may contain explicit or implicit expressions of harm, hate, threat, violence, or intimidation, yet existing automated approaches often assume clean text input or apply vision models without forensic justification. This paper presents a case-driven multimodal approach to hate and threat detection in forensic analysis. The proposed framework explicitly determines the presence and source of textual evidence, distinguishing between embedded text, associated contextual text, and image-only evidence. Based on the identified evidence configuration, the framework selectively applies text analysis, multimodal fusion, or image-only semantic reasoning using vision-language models with Vision Transformer (ViT) backbones. By conditioning inference on evidence availability, the approach mirrors forensic decision-making, improves evidentiary traceability, and avoids unjustified modality assumptions. Experimental evaluation on forensic-style image evidence demonstrates consistent and interpretable behavior across heterogeneous evidence scenarios.

Comments: 8 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as: arXiv:2604.08609 [cs.CV] (or arXiv:2604.08609v1 [cs.CV] for this version)
DOI: https://doi.org/10.48550/arXiv.2604.08609
Submission history: From Ponkoj Shill [v1] Wed, 8 Apr 2026 21:50:02 UTC (1,116 KB)
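The abstract describes routing each evidence item to one of three analysis paths depending on which text sources are present. The following is a minimal sketch of that evidence-conditioned routing; the `Evidence` dataclass, the path names, and the mapping from text source to analysis path are illustrative assumptions, not the paper's actual implementation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Evidence:
    """A forensic evidence item: an image plus any recovered text.

    The image itself is assumed present in every item; only the
    text sources vary across evidence configurations.
    """
    embedded_text: Optional[str] = None    # text found inside the image (e.g. via OCR)
    contextual_text: Optional[str] = None  # associated report, caption, or case notes

def select_analysis_path(item: Evidence) -> str:
    """Choose an analysis path from the evidence configuration.

    Assumed mapping (hypothetical, for illustration):
      - embedded text present      -> multimodal fusion of image and in-image text
      - contextual text only       -> text analysis of the accompanying report
      - no text at all             -> image-only semantic reasoning (ViT backbone)
    """
    if item.embedded_text:
        return "multimodal_fusion"
    if item.contextual_text:
        return "text_analysis"
    return "image_only"
```

Conditioning on evidence availability first, rather than always invoking every model, is what the abstract argues keeps the pipeline traceable: each inference step can be justified by the evidence configuration that triggered it.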