Knowledge vs Reasoning in Clinical NLI

Do larger language models naturally learn to reason, or do they mostly get better at recalling facts and mimicking patterns? A new study introduces a Clinical Trial Natural Language Inference benchmar...

Tags: Benchmark, Causal Attribution, Chain of Thought, Clinical NLI, Compositional Grounding, Epistemic Verification, GKMRV, Knowledge, Large Language Models, Medical AI, Model Evaluation, Neuro-symbolic AI, Reasoning, Risk State Abstraction