Jialin Wu

Ph.D. student

The University of Texas at Austin


I am a fifth-year Ph.D. student at UTCS advised by Raymond J. Mooney. Before coming to UT Austin, I received my BEng degree in 2017 from the Department of Automation at Tsinghua University, where I was supervised by Prof. Xiangyang Ji.

My research focuses on commonsense reasoning and knowledge-based models for vision-language tasks. I also work on explaining a model’s decisions both visually and textually. In particular, we explore using textual resources (and potentially resources in other modalities) to answer various types of visual questions, including commonsense and knowledge-based questions. We have also explored generating and using captions and explanations to improve VQA performance.

I am also the organizer of the UT XAI reading group.

I am actively looking for full-time opportunities starting in summer/fall 2022, and my CV is here.

  • Language and Vision
  • Explainable AI

  • PhD in Artificial Intelligence, 2017 - present, UT Austin
  • BEng in Automation, 2013 - 2017, Tsinghua University


(2021). Multi-Modal Answer Validation for Knowledge-Based VQA. arXiv.


(2020). Improving VQA and its Explanations by Comparing Competing Explanations. ACL 2020 ALVR Workshop.


(2019). Self-Critical Reasoning for Robust Visual Question Answering. NeurIPS 2019.


(2019). Hidden State Guidance: Improving Image Captioning using An Image Conditioned Autoencoder. NeurIPS 2019 ViGIL Workshop.


(2019). Generating Question Relevant Captions to Aid Visual Question Answering. ACL 2019.


(2019). Faithful Multimodal Explanation for Visual Question Answering. ACL 2019 BlackboxNLP Workshop.


(2018). Dynamic Filtering with Large Sampling Field for Convnets. ECCV 2018.



Research Intern
May 2020 – Aug 2020 Seattle
Research Intern
Google Inc.
May 2019 – Aug 2019 New York City