Revisiting the Reliability of Language Models in Instruction-Following - Databubble