
Data Contamination Through the Lens of Time

Recent claims about the impressive abilities of large language models (LLMs) are often supported by evaluating on publicly available benchmarks. Since LLMs train on wide swaths of the internet, this practice raises concerns of data contamination, i.e., …

MedNLI Is Not Immune: Natural Language Inference Artifacts in the Clinical Domain

Crowdworker-constructed natural language inference (NLI) datasets have been found to contain statistical artifacts associated with the annotation process that allow hypothesis-only classifiers to achieve better-than-random performance (Poliak et al., …

An open-source NLP framework for clinical phenotyping.