About Me

I’m an Assistant Professor in the Department of Biomedical Informatics and Co-Director of the Center for Health Artificial Intelligence at the University of Colorado Anschutz Medical Campus. I obtained my Ph.D. in Computer Science and Engineering, with a focus on Natural Language Processing (NLP) and AI, in the PSU NLP Lab, led by Dr. Rebecca J. Passonneau, at Pennsylvania State University. After my Ph.D., I joined the ICU Data Science Lab at the University of Wisconsin-Madison as a postdoctoral researcher to expand my research at the intersection of NLP/AI and healthcare. My name in Chinese is 高艳珺.

At CU, I lead an NLP research group named the LARK (Language, Reasoning and Knowledge) Lab, dedicated to the development and evaluation of novel NLP methods and their transformative applications in both healthcare and broader domains. I am also affiliate faculty with CU Boulder Computational Linguistics (CLASIC) and the BoulderNLP group.

My favorite sentence as a computational linguist: “Doc, Note: I Dissent. A Fast Never Prevents A Fatness. I Diet On Cod.” – Garfield.

Besides research and professional life, I am a lifelong fan of YOASOBI (JPOP Band, “小説を音楽にするユニット”) and Sailor Moon🌙.

Prospective postdocs and students are welcome to email me at yanjun dot gao at cuanschutz dot edu.

News

🆕📣🎙 (10/01/2025) Our LARK Lab recently had several papers accepted! 2 papers (1 Main, 1 Findings) were accepted to EMNLP 2025, and 1 paper was accepted to the NeurIPS GenAI4Health workshop! See you soon in Suzhou and San Diego!

🆕📣🎙 (06/01/2025) We’re building the foundations for trustworthy clinical NLP through a series of studies, published and accepted in major journals, that develop validated evaluation instruments and LLM-as-a-judge frameworks to ensure AI-generated clinical summaries are accurate, safe, and aligned with physician reasoning. This is joint work between our LARK Lab, UWisc ICU Data Science Lab, UW Health, and Epic Systems. Read about our work: Link

🆕📣🎙 (05/01/2025) Our paper “Anchored Answers: Unravelling Positional Bias in GPT-2’s Multiple-Choice Questions” has been accepted to ACL 2024 as Findings! We present a focused case study that advances our understanding of LLMs’ internal mechanisms and demystifies their “black box” nature, making them more transparent and trustworthy. Big shoutout to my collaborator Dr. Ruizhe Li for leading the work!

🆕📣🎙 (10/29/2024) One paper is accepted to EMNLP 2024 as Findings! “When Raw Data Prevails: Are Large Language Model Embeddings Effective in Numerical Data Representation for Medical Machine Learning Applications?”, preprint available: Link

🆕📣🎙 (08/15/2024) I was invited to join the panel discussion on Prompt Engineering at Epic UGM! According to the company, hospitals that use its software held medical records of 78% of patients in the United States and over 3% of patients worldwide in 2022.

🆕📣🎙 (08/15/2024) Work in collaboration with Intel Corp. and UCSD has been accepted to and appeared at the Association for Computational Linguistics! “Learning to Maximize Mutual Information for Chain-of-Thoughts Distillation” Paper.

🆕📣🎙 (12/15/2023) Invited talk at Northwestern Medicine Healthcare AI Forum, by Institute for Artificial Intelligence in Medicine! Link to the recording

🆕📣🎙 (12/01/2023) Invited talk at Informatics Institute Powertalk Seminar Series, University of Alabama at Birmingham! Link to the recording

🆕📣🎉🎉🎉 (10/09/2023) Excited to share that I received the National Institutes of Health (NIH), National Library of Medicine (NLM) “Pathway to Independence Award” (K99/R00)! Awarded Amount: $874,800.

🆕📣🎉🎉🎉 (06/05/2023) I have won the UW-Madison Department of Medicine Trainee Outstanding Research Awards! Three papers are accepted to BioNLP Workshop and Clinical NLP Workshop (ACL Workshop)!

🆕📣 (12/19/2022) Podium abstract accepted to AMIA Informatics Summit! See you in Seattle, March 2023!

🆕📣 (12/14/2022) Invited talk at OHDSI Monthly NLP Working Group!

🆕📣 (Updated on November 15th, 2022) Invited talk at Computation and Informatics in Biology and Medicine (CIBM) Seminar at University of Wisconsin-Madison!

🆕📣🎉🎉🎉 (Updated on November 15th, 2022) I have won the Award for Clinical Research in Department of Medicine (DOM) Annual Research Day! More details: DOM Research Day

🆕📣🎉🎉🎉 (Updated on September 6th, 2022) I’m serving as a guest editor for the Journal of Biomedical Informatics. We are organizing a Special Issue on Clinical NLP on Secondary Use: Link!

🆕📣🎉🎉🎉 (Updated on August 22nd, 2022) One paper is accepted to COLING 2022, “Summarizing patients problems from hospital progress notes”! See preprint version Link!

🆕📣🎉🎉🎉 (Updated on July 7th, 2022) One paper is accepted to the Journal of the American Medical Informatics Association (JAMIA, impact factor 7.942), “A Scoping Review of Publicly Available Language Tasks in Clinical Natural Language Processing”! The official version will be released later!

🆕📣🎉🎉🎉 One paper is accepted to LREC 2022 as oral presentation! We propose the first annotation framework and suite of NLP tasks for Clinical Diagnostic Reasoning, a critical cognitive process for building clinical NLP/AI models! “Hierarchical Annotation for Building A Suite of Clinical Natural Language Processing Tasks: Progress Note Understanding”! Preprint

🎙 (Updated on March 31, 2022) Invited to give a talk at my alma mater, Penn State SEDTAPP!

🎙 (Updated on March 3, 2022) I’m presenting my work to the Machine Learning Community at UW-Madison, “ML+X”!

🆕📣🎉🎉🎉 (Updated on Nov 15, 2021) I’m on the organizing committee of the National NLP Clinical Challenge (n2c2), housed by Harvard Medical School. We are hosting the Progress Note Understanding: Assessment and Plan Reasoning Shared Task! Now calling for participation!

🆕📣 (Updated on Oct 13, 2021) One journal paper published in IEEE Transactions on Education, “Analytical Techniques for Developing Argumentative Writing in STEM: A Pilot Study”, see link here!

🆕📣 (Updated on August 1, 2021) As of July 26th, 2021, I have officially joined the Churpek/Afshar Lab in the Department of Medicine, UW-Madison. I’ll explore and dedicate myself to the world of clinical NLP and developing AI systems for physicians in the hospital!

🆕📣 (Updated on August 3, 2021) I presented the paper, “ABCD: A Graph Framework to Convert Complex Sentences to a Covering Set of Simple Sentences”, at the ACL conference. It is available online in the ACL Anthology! Check out the source code in our GitHub repository: ABCD-ACL2021-Github!

🆕 (Updated on May 24, 2021) I have successfully defended my PhD thesis. Now I’m officially Dr. Gao!!

🆕 (Updated on May 6, 2021) Our paper, “ABCD: A Graph Framework to Convert Complex Sentences to a Covering Set of Simple Sentences”, is accepted to the ACL 2021 Main Conference as an oral presentation (Acceptance Rate: 21%)!

🆕 (Updated on April 16, 2021) My doctoral thesis gets accepted to AIED Doctoral Consortium 2021!

🆕 (Updated on April 16, 2021) One paper gets accepted to TextGraphs 2021!

🆕 (Updated on April 15, 2021) One paper gets accepted to Visually Grounded Interaction and Language (ViGIL 2021) workshop!

🆕 Our journal paper, “Analytical Techniques for Developing Argumentative Writing in STEM”, has been released in preprint! Check it out!

🆕 Our paper, “Automated Pyramid Summarization Evaluation” is accepted by CoNLL 2019 (acceptance rate 22.66%)! See you in HK!

🆕 Our paper, “Rubric Reliability and Annotation of Content and Argument in Source-Based Argument Essays”, is accepted and published in Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications 2019 (BEA)!

Yanjun Gao
