Curriculum Vitae

Qualifications

[2021] Graduate Certificate in University Teaching
- The University of Melbourne
[2013] Ph.D., Engineering Science (Computer)
- The University of Melbourne
- Supervisors: Tim Baldwin and David Newman
[2008] B.E., Software Engineering
- The University of Melbourne
- First Class Honours

Employment History

[2026—present] Associate Professor
- The University of Melbourne, School of Computing and Information Systems
[2022—2025] Senior Lecturer
- The University of Melbourne, School of Computing and Information Systems
[2019—2022] Lecturer
- The University of Melbourne, School of Computing and Information Systems
[2015—2019] Research Staff Member
- IBM Research Australia
[2013—2015] Research Associate
- King’s College London
- Supervisors: Shalom Lappin and Alexander Clark
[2013—2013] Research Fellow
- The University of Melbourne
- Supervisors: Timothy Baldwin and David Newman
[2008—2011] Software Developer
- Transtech Consulting Service

Awards and Honours

[2025] FEIT’s Excellence Awards in Engagement for Public Value and Social Inclusion, University of Melbourne
[2023] Best Paper Award, ALTA
[2023] Outstanding Paper Award, EACL
[2022] Best Paper Award Runner Up, AACL
[2022] Best Paper Award, CSRR
[2021] Best Paper Award, ALTA
[2021] Best Paper Award, MRL
[2020] Outstanding Reviewer, EMNLP
[2019] Research Accomplishment Prize, IBM
[2009—2013] Endeavour International Postgraduate Research Scholarship, University of Melbourne
[2009—2013] Melbourne International Research Scholarship, University of Melbourne
[2005, 2006, 2008] Dean’s Honour List, University of Melbourne
[2005, 2006] Wyvern Medal for Academic Excellence, Queen’s College

Grants

[2025] Harnessing Generative AI to Enhance Student Learning Support in Subject Discussion Forums (AUD$20K)
- FEIT Learning and Teaching Initiatives Grant
- Chief Investigator, joint with Tawfiq Islam, Antonette Mendoza and Alex Zable
[2025] Prototyping AI-Assisted Legal Issue Identification to Enhance Access to Justice Services (AUD$5K)
- Australian Academy of Law
- Chief Investigator, joint with P. Burgess and E. Shareghi
[2025—2027] Verifying Authenticity of Information via Automated Detection and Sourcing (AUD$3.0M)
- Department of Defence (Advanced Strategic Capabilities Accelerator)
- Co-lead Investigator, joint with M. Dras, X. Zhang, M. Rizoiu, A. McIver, Q. Xu, U. Naseem, T. Drummond, L. Frermann and H. Yu
[2024] Developing a Pilot Artificial Intelligence-driven Virtual Patient Architecture for Communication Skill Development of Healthcare Students (AUD$10K)
- MDHS Learning and Teaching Initiative Seed Funding
- Chief Investigator, joint with P. Bowers, T. Ryan, K. Graydon and D. Tomlin
[2024—2026] Empowering Next-Generation Spatial Digital Twins with Linked Spatial Data (AUD$434K)
- Australian Research Council (ARC) Discovery Project
- Chief Investigator, joint with J. Qi and W. Wang
[2023—2026] AI for Legal Problem Diagnosis in the Diverse Language of Australians (AUD$343K)
- Australian Research Council (ARC) Linkage Project
- Chief Investigator, joint with T. Baldwin, T. O’Doherty and B. Merrifield
[2022—2024] Robust Natural Language Processing for Healthcare (AUD$20K)
- Manchester-Melbourne-Toronto Research Fund
- Partner Investigator, joint with D. Beck, T. Cohn, Y. Otmakhova, V. Schlegel, R. Batista-Navarro, Y. Sun and Y. Wu
[2022—2023] Towards Robust NLP: Out-of-distribution Benchmarks and Multi-task Fine-tuning (USD$90K)
- Oracle Gift
- Lead Investigator, joint with D. Beck
[2021] Mirrored Social Media Platform (AUD$75K)
- Defence Science Institute Collaborative Research Grant and Leidos
- Chief Investigator, joint with C. Leckie and S. Karunasekera
[2021] Multi-Party Collaborative Project for Three Case Studies of Mass Influencing Organisations (AUD$419K)
- Department of Defence
- Partner Investigator, joint with L. Sciacca, E. Ebbott, C. Leckie, S. Karunasekera, A. Ahmad, L. Ruppanner, T. van Gelder, A. Perfors, Y. Kashima and R. de Rozario
[2020—2021] Development of Natural Language Processing for Knowledge Base Population (AUD$107K)
- Mitsubishi Heavy Industries
- Chief Investigator, joint with T. Baldwin
[2020] Document Corpus Analysis (AUD$20K)
- Defence Science and Technology: AI for Decision Making Initiative
- Chief Investigator, joint with M. Mistica and T. Baldwin
[2018—2024] Industrial Transformation Training Centre on Cognitive Computing for Medical Technologies (AUD$4.1M)
- Australian Research Council (ARC)
- Chief Investigator, joint with T. Baldwin, D. Freestone, D. Grayden, C. Masters, K. Verspoor, M. Cook, A. Burkitt, T. Cohn, J. Bailey, I. Mareels, T. Kalincik, A. van Schaik; M. McDonnell, L. Cavedon, J. Batstone, S. Harrer, N. Faux, A. Jimeno Yepes, C. Butler, B. Goudey, U. Asif, J. Tang, B. Mashford, P. Maruff
[2014] Learning Word Sense Distributions for Wordnets of the World’s Languages (USD$10K)
- Google Cloud Credits Award
- Partner Investigator, joint with T. Baldwin, F. Bond and P. Cook
[2013—2015] Personalised Topic Modelling and Sentiment Analysis for Enhanced Information Discovery over Document Streams (AUD$195K)
- Australian Research Council (ARC) Linkage Project
- Partner Investigator, joint with T. Baldwin, J. Wells and D. Johnson

Patents

[2021] Natural language processor for cognitively generating contextual filler designed to keep users engaged
- Tran, K.N.D., Pervin, S., Li, J.J., Mohania, M.K., Lau, J.H. and Dubyak, W., International Business Machines Corp
- U.S. Patent Application 16/186,920

Invited Keynotes, Talks and Tutorials

[2025] Teaching NLP/Using NLP for Teaching (Panelist), ALTA
[2024] Generative AI and Ethics, Mentone Girls’ Grammar
[2023] Few-shot Detection of State-Sponsored Agents in Disinformation Campaigns, Understanding and Countering Mis/Disinformation Workshop 2023
[2023] Careers Night in Data Science, AI & NLP (Panelist), The University of Melbourne
[2023] Rumour and Disinformation Detection in Online Conversations, Monash University, VinAI
[2022] Combating Misinformation on Social Media: From Detection to Mitigation (Tutorial), ALTA
[2022] Generating Product Descriptions and Answers for Customer Queries on E-commerce Platforms (Keynote), Endeavour Group, Open Data Science Conference Asia Pacific
[2022] Responsible Data Science (Panelist), DERC 2022 seminar series, University of Cambridge and RMIT
[2021] Misinformation Detection and Analysis, The University of Melbourne, Information and Influence Seminar Series, Telstra
[2019] Creativity, Machine and Poetry (Keynote), Language: interdisciplinary public forum
[2018] Teaching Machines to Write Better, The University of Melbourne
[2018] Deep-speare: A Joint Neural Model of Poetic Language, Meter and Rhyme, Monash University
[2017] Topically Driven Neural Language Model, Monash University
[2016] Do We Need Grammar?, IBM Research Australia
[2015] Unsupervised Sense Learning and Its Applications, University of Gothenburg
[2015] Application of Probabilistic Models to Acceptability Prediction, University of Amsterdam
[2014] A Probabilistic Approach to Grammaticality, University of Sheffield, University of Edinburgh, The University of Melbourne
[2013] Learning Predominant Sense with Non-parametric Topic Model, South England NLP Meetup
[2013] Automating the Computation of Topic Coherence and Generation of Topic Labels, King’s College London
[2013] Fun with Topic Modelling, National Institute of Informatics (Tokyo)

Teaching

[2024] BUSA90543 Text Analytics for Business, Melbourne Business School.
[2024] COMP90042 Natural Language Processing, Computing and Information Systems.
[2023] BUSA90543 Text Analytics for Business, Melbourne Business School.
[2023] COMP90042 Natural Language Processing, Computing and Information Systems.
[2022] BUSA90501 Text and Web Analytics, Melbourne Business School.
[2022] COMP90042 Natural Language Processing, Computing and Information Systems.
[2021] BUSA90501 Text and Web Analytics, Melbourne Business School.
[2021] COMP90042 Natural Language Processing, Computing and Information Systems.
[2020] COMP90042 Natural Language Processing, Computing and Information Systems.
[2019] BUSA90501 Text and Web Analytics, Melbourne Business School.

Supervision

PhD

[2025—present] Mehreen Mubashar, with Lars Kulik, The University of Melbourne.
[2025—present] Tao Shi, with Qiongkai Xu, The University of Melbourne.
[2025—present] Dingyang Lyu, with Jianzhong Qi, The University of Melbourne.
[2025—present] Reza Ghasemi Madani, with Caren Han, The University of Melbourne.
[2025—present] Shuhe Wang, with Ed Hovy, The University of Melbourne.
[2025—present] Anudeex Shetty, with Qiongkai Xu and Olya Ohrimenko, The University of Melbourne.
[2025—present] Shengxiang Gao, with Jianzhong Qi, The University of Melbourne.
[2024—present] Yanbei Jiang, with Kris Ehinger, The University of Melbourne.
[2023—present] Patrick Bowers, with Dani Tomlin (Audiology and Speech Pathology), Kelley Graydon (Audiology and Speech Pathology) and Tracii Ryan (Education), The University of Melbourne.
[2023—present] Ming-bin (Bryan) Chen, with Lea Frermann, The University of Melbourne.
[2022—present] Rui Xing, with Tim Baldwin, The University of Melbourne.
[2021—present] Rena Gao, with Carsten Roever (Languages and Linguistics), The University of Melbourne.
[2020—2025] Miao Li, with Ed Hovy, The University of Melbourne.
[2020—2025] Rongxin Zhu, with Jianzhong Qi, The University of Melbourne.
[2020—2024] Lin Tian, with Jenny Zhang (RMIT), RMIT University.
[2020—2024] Takashi Wada, with Tim Baldwin, The University of Melbourne.
[2020—2024] Yulia Otmakhova, with Karin Verspoor and Tim Baldwin, The University of Melbourne.
[2020—2024] Zhuohan Xie, with Trevor Cohn, The University of Melbourne.
[2019—2023] Shiquan Yang, with Sarah Erfani, The University of Melbourne.
[2019—2023] Han Liu, with Richard Sinnott, The University of Melbourne.
[2018—2022] Fajri Koto, with Tim Baldwin, The University of Melbourne.
[2017—2024] Shraey Bhatia, with Tim Baldwin, The University of Melbourne.
[2016—2021] Shiwei Zhang, with Jenny Zhang (RMIT), Jeffrey Chan (RMIT) and Cecile Paris (CSIRO), RMIT University.
[2016—2018] Adel Foda, with Tim Baldwin, The University of Melbourne.

Masters

[2025—2025] David Setiawan, with Raphaël Merx, The University of Melbourne.
[2025—present] Phuong Le, with Kemal Kurniawan, The University of Melbourne.
[2025—2025] Annisa Yusup, with Kemal Kurniawan, The University of Melbourne.
[2024—present] Wenzheng Du, The University of Melbourne.
[2024—2024] Anudeex Shetty, with Qiongkai Xu, The University of Melbourne.
[2024—2024] Zhou Peng, with Qiongkai Xu, The University of Melbourne.
[2023—2024] Carolyn Hicks, with Kathryn Davidson (Architecture, Building and Planning), The University of Melbourne.
[2023—2024] Archit Aggarwal, with Christine de Kock, The University of Melbourne.
[2023—2023] Chao Sun, with Bowei Zou (A*STAR Singapore), The University of Melbourne.
[2023—2023] Kaijin Zhang, The University of Melbourne.
[2023—2023] Yanbei Jiang, with Kris Ehinger, The University of Melbourne.
[2022—2023] Andrew Naughton, with Lea Frermann, The University of Melbourne.
[2021—2022] Zixun Wu, with Kris Ehinger, The University of Melbourne.
[2021—2022] Erya Wen, The University of Melbourne.
[2021—2021] Mulin Shi, with Lea Frermann, The University of Melbourne.
[2020—2020] Jianing Yang, The University of Melbourne.
[2020—2021] Yi Li, with Mel Mistica, The University of Melbourne.
[2020—2021] An Nguyen, The University of Melbourne.
[2020—2021] Chaoxian Zhou, with Daniel Beck, The University of Melbourne.
[2020—2021] Hualong Deng, with Daniel Beck, The University of Melbourne.
[2019—2020] Huidu Lu, with Lachlan Andrew, The University of Melbourne.
[2019—2019] Rongxiao Liu, The University of Melbourne.
[2019—2019] Zhuohan Xie, with Trevor Cohn, The University of Melbourne.
[2017—2017] Steven Xu, with Tim Baldwin, The University of Melbourne.
[2016—2017] Ionut-Teodor Sorodoc, with Tim Baldwin, The University of Melbourne.
[2015—2017] Shraey Bhatia, with Tim Baldwin, The University of Melbourne.
[2014—2016] Andrew Bennett, with Tim Baldwin, The University of Melbourne.

Bachelor

[2024—2024] Toby Simonds, with Kemal Kurniawan, The University of Melbourne.
[2021—2021] Andrew Shen, with Fajri Koto and Tim Baldwin, The University of Melbourne.

Service

[2025—2027] President of the Australasian Language Technology Association (ALTA)
[2025] Senior Area Chair for ACL ARR July, October
[2024] Senior Area Chair for ACL ARR April, August, December
[2024—2026] Action Editor for Computational Linguistics
[2023] Area Chair for EMNLP
[2023] Senior Area Chair for IJCNLP-AACL
[2023] Senior Area Chair for ACL
[2023] Program Chair for ALTA
[2022—present] Master of Computer Science Project Coordinator, University of Melbourne
[2022] Area Chair for EMNLP
[2022] Area Chair for Natural Language Processing and Chinese Computing (NLPCC)
[2021—2022] Master of Information Technology (AI) Coordinator (interim), University of Melbourne
[2021] Area Chair for EMNLP
[2019—present] Review Committee for TACL
[2018] Handbook Chair for ACL
[2015] Program Co-chair for Topic Models: Post-Processing and Applications Workshop (TM 2015)
[2012—present] Programme Committee for ACL, NAACL, EACL, EMNLP, COLING, IJCNLP, AAAI, CoNLL, *SEM, and ALTA

Media Coverage

[2025] Interviewed by Yahoo Finance News about Generative AI and risks.
[2024] Interviewed by SBS Mandarin News about Generative AI and workforce.
[2023] My art+science chatbot project was featured by The Sydney Morning Herald.
[2021] Research on understanding mass influence activities online was covered by The Guardian.
[2020] Interviewed about AI and Creativity on the radio show PassW0rd.
[2018] Shakespearean sonnet generator was covered by New Scientist, Times UK, Daily Mail, and others.

Collaborations Outside Academia

[2020—2023] Gee. A project with artist Georgia Banks to develop a chatbot that incorporates her personality to explore self/ego. It was presented at ANAT Spectra 2022 and exhibited at the National Gallery of Victoria in 2023.
[2021—2022] Exploring AI for Interpreting Artwork. A project with the Ian Potter Museum of Art to develop computational models to generate descriptions for artworks.
[2021—present] AI for Legal Problem Diagnosis. Our collaboration with Justice Connect has led to our partner winning the Not-For-Profit Technology Innovator of the Year Award at Infoxchange’s Australian Technology Awards 2024.

Non-scientific Publications

[2021] Creativity, Machine and Poetry, Language
[2020] The Poet AI, IEEE Spectrum

Publications

Journal

[2025] Training and Evaluating with Human Label Variation: An Empirical Study. In Computational Linguistics.
[2025] Implications of Declaration of Climate Emergency on Australian Local Government Policy in the State of Victoria: Policy Analysis Utilising an LLM-based Retriever-Reader Pipeline. In Climatic Change.
[2025] Healthcare Educators' Perspectives on Artificial Intelligence Driven Virtual Patients for Teaching Communication Skills. In Interactive Technology and Smart Education.
[2025] Discovering Unusual Word Usages with Masked Language Model via Pseudo-label Training. In Journal of Natural Language Processing.
[2024] Artificial Intelligence Driven Virtual Patients For Communication Skill Development In Healthcare Students: A Scoping Review. In Australasian Journal of Educational Technology.
[2023] Open Domain Response Generation Guided by Retrieved Conversations. In IEEE Access.
[2023] Automating Quality Assessment of Medical Evidence in Systematic Reviews: Model Development and Validation Study. In Journal of Medical Internet Research.
[2022] FFCI: A Framework for Interpretable Automatic Evaluation of Summarization. In Journal of Artificial Intelligence Research.
[2020] How Furiously Can Colorless Green Ideas Sleep? Sentence Acceptability in Context. In Transactions of the Association for Computational Linguistics.
[2018] Duplicate Detection in Programming Question Answering Communities. In ACM Transactions on Internet Technology.
[2017] Grammaticality, Acceptability, and Probability: A Probabilistic View of Linguistic Knowledge. In Cognitive Science.
[2017] Evaluating Topic Representations for Exploring Document Collections. In Journal of the Association for Information Science and Technology.
[2014] Automatic Detection and Language Identification of Multilingual Documents. In Transactions of the Association for Computational Linguistics.
[2013] On Collocations and Topic Models. In ACM Transactions on Speech and Language Processing.

Conference

[2025] Beyond Seen Data: Improving KBQA Generalization Through Schema-Guided Logical Form Generation. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP 2025).
[2025] Reasoning Like Experts: Leveraging Multimodal Large Language Models for Drawing-based Psychoanalysis. In Proceedings of the 33rd ACM International Conference on Multimedia (MM 2025).
[2025] Can LLMs Simulate L2-English Dialogue? An Information-Theoretic Analysis of L1-Dependent Biases. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025).
[2025] WET: Overcoming Paraphrasing Vulnerabilities in Embeddings-as-a-Service with Linear Transformation Watermarks. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025).
[2025] Beyond Perception: Evaluating Abstract Visual Reasoning through Multi-Stage Task. In Findings of the Association for Computational Linguistics: ACL 2025.
[2025] Moderation Matters: Measuring Conversational Moderation Impact in English as a Second Language Group Discussion. In Findings of the Association for Computational Linguistics: ACL 2025.
[2025] Decomposed Opinion Summarization with Verified Aspect-Aware Modules. In Findings of the Association for Computational Linguistics: ACL 2025.
[2025] WHoW: A Cross-domain Approach for Analysing Conversation Moderation. In Proceedings of the 2025 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2025).
[2025] An Interpretable and Crosslingual Method for Evaluating Second-Language Dialogues. In Proceedings of the 2025 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2025).
[2025] Evaluating Evidence Attribution in Generated Fact Checking Explanations. In Proceedings of the 2025 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2025).
[2025] Factual Dialogue Summarization via Learning from Large Language Models. In Proceedings of the 31st International Conference on Computational Linguistics (COLING 2025).
[2025] Interaction Matters: An Evaluation Framework for Interactive Dialogue Assessment on English Second Language Conversations. In Proceedings of the 31st International Conference on Computational Linguistics (COLING 2025).
[2024] KALE: An Artwork Image Captioning System Augmented with Heterogeneous Graph. In Proceedings of the 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024).
[2024] A Sentiment Consolidation Framework for Meta-Review Generation. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024).
[2024] CMA-R: Causal Mediation Analysis for Explaining Rumour Detection. In Findings of the Association for Computational Linguistics: EACL 2024.
[2023] DeltaScore: Story Evaluation with Perturbations. In Findings of the Association for Computational Linguistics: EMNLP 2023.
[2023] Unsupervised Lexical Simplification with Context Augmentation. In Findings of the Association for Computational Linguistics: EMNLP 2023.
[2023] Summarizing Multiple Documents with Conversational Structure for Meta-Review Generation. In Findings of the Association for Computational Linguistics: EMNLP 2023.
[2023] Annotating and Detecting Fine-grained Factual Inconsistencies for Dialogue Summarization. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023).
[2023] Unsupervised Paraphrasing of Multiword Expressions. In Findings of the Association for Computational Linguistics: ACL 2023.
[2023] Compressed Heterogeneous Graph for Abstractive Multi-document Summarization. In Proceedings of the 37th AAAI Conference on Artificial Intelligence (AAAI 2023).
[2023] MetaTroll: Few-shot Detection of State-Sponsored Trolls with Transformer Adapters. In Proceedings of the Web Conference 2023 (WWW 2023).
[2023] NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023).
[2023] Improving Visual-Semantic Embedding with Adaptive Pooling and Optimization Objective. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023).
[2022] Robust Task-Oriented Dialogue Generation with Contrastive Pre-training and Adversarial Filtering. In Findings of the Association for Computational Linguistics: EMNLP 2022.
[2022] M3: Multi-level dataset for Multi-document summarisation of Medical studies. In Findings of the Association for Computational Linguistics: EMNLP 2022.
[2022] Not another Negation Benchmark: The NaN-NLI Test Suite for Sub-clausal Negation. In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (AACL-IJCNLP 2022).
[2022] LipKey: A Large-Scale News Dataset for Absent Keyphrases Generation and Abstractive Summarization. In Proceedings of the 29th International Conference on Computational Linguistics (COLING 2022).
[2022] Unsupervised Lexical Substitution with Decontextualised Embeddings. In Proceedings of the 29th International Conference on Computational Linguistics (COLING 2022).
[2022] DUCK: Rumour Detection on Social Media by Modelling User and Comment Propagation Networks. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2022).
[2022] One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022).
[2022] The Patient Is More Dead Than Alive: Exploring The Current State of The Multi-document Summarisation of The Biomedical Literature. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022).
[2022] An Interpretable Neuro-Symbolic Reasoning Framework for Task-Oriented Dialogue Generation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022).
[2021] IndoBERTweet: A Pretrained Language Model for Indonesian Twitter with Effective Domain-Specific Vocabulary Initialization. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021).
[2021] Rumour Detection via Zero-shot Cross-lingual Transfer Learning. In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2021).
[2021] Evaluating the Efficacy of Summarization Evaluation across Languages. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021.
[2021] A Unified Framework to Incorporate Multimodal Knowledge Bases into End-to-End Task-Oriented Dialogue Systems. In Proceedings of the 30th International Joint Conference on Artificial Intelligence (IJCAI 2021).
[2021] Grey-box Adversarial Attack And Defence For Sentiment Classification. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2021).
[2021] Discourse Probing of Pretrained Language Models. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2021).
[2021] Automatic Classification of Neutralization Techniques in the Narrative of Climate Change Scepticism. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2021).
[2021] Top-down Discourse Parsing via Sequence Labelling. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume (EACL 2021).
[2021] Brief description of COVID-SEE: The Scientific Evidence Explorer for COVID-19 Related Research. In Proceedings of the 43rd European Conference on Information Retrieval (ECIR 2021).
[2020] Less is More: Rejecting Unreliable Reviews for Product Question Answering. In Proceedings of the 2020 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2020).
[2020] IndoLEM and IndoBERT: A Benchmark Dataset and Pre-trained Language Model for Indonesian NLP. In Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020).
[2020] Liputan6: A Large-scale Indonesian Dataset for Text Summarisation. In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (AACL-IJCNLP 2020).
[2020] Forget Me Not: Reducing Catastrophic Forgetting for Domain Adaptation in Reading Comprehension. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN 2020).
[2020] Give Me Convenience and Give Her Death: Who Should Decide What Uses of NLP are Appropriate, and on What Basis? In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020).
[2019] Discovering Relevant Reviews for Answering Product-related Queries. In Proceedings of the 19th IEEE International Conference on Data Mining (ICDM 2019).
[2019] Early Rumour Detection. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2019).
[2018] Document Chunking and Learning Objective Generation for Instruction Design. In Proceedings of the 11th International Conference on Educational Data Mining.
[2018] Deep-speare: A Joint Neural Model of Poetic Language, Meter and Rhyme. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018).
[2018] The Influence of Context on Sentence Acceptability Judgements. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018).
[2018] Topic Intrusion for Automatic Topic Model Evaluation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018).
[2017] End-to-end Network for Twitter Geolocation Prediction and Hashing. In Proceedings of the 8th International Joint Conference on Natural Language Processing (IJCNLP 2017).
[2017] An Automatic Approach for Document-level Topic Model Evaluation. In Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL-2017).
[2017] Topically Driven Neural Language Model. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017).
[2017] Detecting Duplicate Posts in Programming QA Communities via Latent Semantics and Association Rules. In Proceedings of the 26th International Conference on World Wide Web (WWW 2017).
[2017] Multimodal Topic Labelling. In Proceedings of the 15th Conference of the EACL (EACL 2017).
[2016] Automatic Labelling of Topics with Neural Embeddings. In Proceedings of the 26th International Conference on Computational Linguistics (COLING 2016).
[2016] LexSemTm: A Semantic Dataset Based on All-words Unsupervised Sense Distribution Learning. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL 2016).
[2016] The Sensitivity of Topic Coherence Evaluation to Topic Cardinality. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2016).
[2015] Unsupervised Prediction of Acceptability Judgements. In Proceedings of the Joint conference of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing (ACL-IJCNLP 2015).
[2014] Representing Topics Labels for Exploring Digital Libraries. In Proceedings of Digital Libraries 2014.
[2014] Applying a Word-sense Induction System to the Automatic Extraction of Diverse Dictionary Examples. In Proceedings of the XVI EURALEX International Congress (EURALEX 2014).
[2014] Novel Word-sense Identification. In Proceedings of the 25th International Conference on Computational Linguistics (COLING 2014).
[2014] Measuring Gradience in Speakers' Grammaticality Judgements. In Proceedings of the 36th Annual Meeting of the Cognitive Science Society (CogSci 2014).
[2014] Learning Word Sense Distributions, Detecting Unattested Senses and Identifying Novel Senses Using Topic Models. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014).
[2014] Machine Reading Tea Leaves: Automatically Evaluating Topic Coherence and Topic Model Quality. In Proceedings of the 14th Conference of the EACL (EACL 2014).
[2013] Unsupervised Word Class Induction for Under-resourced Languages: A Case Study on Indonesian. In Proceedings of the 6th International Joint Conference on Natural Language Processing (IJCNLP 2013).
[2013] A Lexicographic Appraisal of an Automatic Approach for Detecting New Word Senses. In Proceedings of eLex 2013.
[2012] Word Sense Induction for Novel Sense Detection. In Proceedings of the 13th Conference of the EACL (EACL 2012).
[2012] On-line Trend Analysis with Topic Models: #twitter trends detection topic model online. In Proceedings of the 24th International Conference on Computational Linguistics (COLING 2012).
[2012] Bayesian Text Segmentation for Index Term Identification and Keyphrase Extraction. In Proceedings of the 24th International Conference on Computational Linguistics (COLING 2012).
[2011] Automatic Labelling of Topics. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL HLT 2011).
[2010] Best Topic Word Selection for Topic Labelling. In Proceedings of the 23rd International Conference on Computational Linguistics (COLING 2010), Posters Volume.
[2010] Automatic Evaluation of Topic Coherence. In Proceedings of Human Language Technologies: The 11th Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL HLT 2010).

Workshop

[2025] An Analytical Emotion Framework of Rumour Threads on Social Media. In Proceedings of the 1st Workshop on Misinformation Detection in the Era of LLMs (MisD 2025).
[2024] MoDEM: Mixture of Domain Expert Models. In Proceedings of the 22nd Annual Workshop of the Australasian Language Technology Association (ALTA 2024).
[2024] To Aggregate or Not to Aggregate. That is the Question: A Case Study on Annotation Subjectivity in Span Prediction. In Proceedings of the 14th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis (WASSA 2024).
[2023] The Uncivil Empathy: Investigating the Relation Between Empathy and Toxicity in Online Mental Health Support Forums. In Proceedings of the 21st Annual Workshop of the Australasian Language Technology Association (ALTA 2023).
[2023] The Next Chapter: A Study of Large Language Models in Storytelling. In Proceedings of the 16th International Natural Language Generation Conference (INLG 2023).
[2022] Automatic Explanation Generation For Climate Science Claims. In Proceedings of the 20th Annual Workshop of the Australasian Language Technology Association (ALTA 2022).
[2022] LED Down the Rabbit Hole: Exploring the Potential of Global Attention for Biomedical Multi-document Summarisation. In Proceedings of the Third Workshop on Scholarly Document Processing.
[2022] Easy-First Bottom-Up Discourse Parsing via Sequence Labelling. In Proceedings of the 3rd Workshop on Computational Approaches to Discourse (CODI 2022).
[2022] Cross-linguistic Comparison of Linguistic Feature Encoding in BERT Models for Typologically Different Languages. In Proceedings of the 4th Workshop on Research in Computational Typology and Multilingual NLP (SIGTYP 2022).
[2022] Can Pretrained Language Models Generate Persuasive, Faithful, and Informative Ads Text for Product Descriptions? In Proceedings of the Fifth Workshop on e-Commerce and NLP (ECNLP 2022).
[2022] Cloze Evaluation for Deeper Understanding of Commonsense Stories in Indonesian. In Proceedings of the Commonsense Representation and Reasoning Workshop 2022 (CSRR 2022).
[2021] Exploring Story Generation with Multi-task Objectives in Variational Autoencoders. In Proceedings of the 19th Annual Workshop of the Australasian Language Technology Association (ALTA 2021).
[2021] Findings on Conversation Disentanglement. In Proceedings of the 19th Annual Workshop of the Australasian Language Technology Association (ALTA 2021).
[2021] Semi-automatic Triage of Requests for Free Legal Assistance. In Proceedings of the Natural Legal Language Processing Workshop 2021.
[2021] Learning Contextualised Cross-lingual Word Embeddings and Alignments for Extremely Low-Resource Languages Using Parallel Corpora. In Proceedings of the 1st Workshop on Multilingual Representation Learning (MRL 2021).
[2021] Impact of Detecting Clinical Trial Elements in Exploration of COVID-19 Literature. In Proceedings of the Fourth International Workshop on Health Natural Language Processing (HealthNLP 2021).
[2020] #DemocratsAreDestroyingAmerica: Rumour Analysis on Twitter During COVID-19. In Proceedings of the 5th International Workshop on Mining Actionable Insights from Social Networks: Special Edition on Dis/Misinformation Mining from Social Media (MAISoN 2020).
[2019] From Shakespeare to Li-Bai: Adapting a Sonnet Model to Chinese Poetry. In Proceedings of the 17th Annual Workshop of the Australasian Language Technology Association (ALTA 2019).
[2019] Improved Document Modelling with a Neural Discourse Parser. In Proceedings of the 17th Annual Workshop of the Australasian Language Technology Association (ALTA 2019).
[2018] Preferred Answer Selection in Stack Overflow: Better Text Representations ... and Metadata, Metadata, Metadata. In Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text.
[2017] Decoupling Encoder and Decoder Networks for Abstractive Document Summarization. In Proceedings of the MultiLing 2017 Workshop on Summarization and Summary Evaluation Across Source Types and Genres.
[2016] An Empirical Evaluation of doc2vec with Practical Insights into Document Embedding Generation. In Proceedings of the 1st Workshop on Representation Learning for NLP.
[2013] unimelb: Topic Modelling-based Word Sense Induction. In Proceedings of the 7th International Workshop on Semantic Evaluation (SemEval 2013).
[2013] unimelb: Topic Modelling-based Word Sense Induction for Web Snippet Clustering. In Proceedings of the 7th International Workshop on Semantic Evaluation (SemEval 2013).

Jey Han Lau
