HiL-Bench (Human-in-Loop Benchmark): Do Agents Know When to Ask for Help? - Databubble