Curriculum Vitae


Employment History

Awards and Honours



Invited Keynotes, Talks and Tutorials







Media Coverage

Collaborations Outside Academia

Non-scientific Publications



  1. [2024]  Artificial Intelligence Driven Virtual Patients For Communication Skill Development In Healthcare Students: A Scoping Review. In Australasian Journal of Educational Technology.
  2. [2023]  Open Domain Response Generation Guided by Retrieved Conversations. In IEEE Access.
  3. [2023]  Automating Quality Assessment of Medical Evidence in Systematic Reviews: Model Development and Validation Study. In Journal of Medical Internet Research.
  4. [2022]  FFCI: A Framework for Interpretable Automatic Evaluation of Summarization. In Journal of Artificial Intelligence Research.
  5. [2020]  How Furiously Can Colorless Green Ideas Sleep? Sentence Acceptability in Context. In Transactions of the Association for Computational Linguistics.
  6. [2018]  Duplicate Detection in Programming Question Answering Communities. In ACM Transactions on Internet Technology.
  7. [2017]  Grammaticality, Acceptability, and Probability: A Probabilistic View of Linguistic Knowledge. In Cognitive Science.
  8. [2017]  Evaluating Topic Representations for Exploring Document Collections. In Journal of the Association for Information Science and Technology.
  9. [2014]  Automatic Detection and Language Identification of Multilingual Documents. In Transactions of the Association for Computational Linguistics.
  10. [2013]  On Collocations and Topic Models. In ACM Transactions on Speech and Language Processing.


  1. [2024]  KALE: An Artwork Image Captioning System Augmented with Heterogeneous Graph. In Proceedings of the 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024).
  2. [2024]  Exploring Multi-Document Information Consolidation for Scientific Sentiment Summarization. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024).
  3. [2024]  CMA-R: Causal Mediation Analysis for Explaining Rumour Detection. In Findings of the Association for Computational Linguistics: EACL 2024.
  4. [2023]  DeltaScore: Story Evaluation with Perturbations. In Findings of the Association for Computational Linguistics: EMNLP 2023.
  5. [2023]  Unsupervised Lexical Simplification with Context Augmentation. In Findings of the Association for Computational Linguistics: EMNLP 2023.
  6. [2023]  Summarizing Multiple Documents with Conversational Structure for Meta-Review Generation. In Findings of the Association for Computational Linguistics: EMNLP 2023.
  7. [2023]  Annotating and Detecting Fine-grained Factual Inconsistencies for Dialogue Summarization. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023).
  8. [2023]  Unsupervised Paraphrasing of Multiword Expressions. In Findings of the Association for Computational Linguistics: ACL 2023.
  9. [2023]  Compressed Heterogeneous Graph for Abstractive Multi-document Summarization. In Proceedings of the 37th AAAI Conference on Artificial Intelligence (AAAI 2023).
  10. [2023]  MetaTroll: Few-shot Detection of State-Sponsored Trolls with Transformer Adapters. In Proceedings of the Web Conference 2023 (WWW 2023).
  11. [2023]  NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023).
  12. [2023]  Improving Visual-Semantic Embedding with Adaptive Pooling and Optimization Objective. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023).
  13. [2022]  Robust Task-Oriented Dialogue Generation with Contrastive Pre-training and Adversarial Filtering. In Findings of the Association for Computational Linguistics: EMNLP 2022.
  14. [2022]  M3: Multi-level dataset for Multi-document summarisation of Medical studies. In Findings of the Association for Computational Linguistics: EMNLP 2022.
  15. [2022]  Not another Negation Benchmark: The NaN-NLI Test Suite for Sub-clausal Negation. In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (AACL-IJCNLP 2022).
  16. [2022]  LipKey: A Large-Scale News Dataset for Absent Keyphrases Generation and Abstractive Summarization. In Proceedings of the 29th International Conference on Computational Linguistics (COLING 2022).
  17. [2022]  Unsupervised Lexical Substitution with Decontextualised Embeddings. In Proceedings of the 29th International Conference on Computational Linguistics (COLING 2022).
  18. [2022]  DUCK: Rumour Detection on Social Media by Modelling User and Comment Propagation Networks. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2022).
  19. [2022]  One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022).
  20. [2022]  The Patient Is More Dead Than Alive: Exploring The Current State of The Multi-document Summarisation of The Biomedical Literature. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022).
  21. [2022]  An Interpretable Neuro-Symbolic Reasoning Framework for Task-Oriented Dialogue Generation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022).
  22. [2021]  IndoBERTweet: A Pretrained Language Model for Indonesian Twitter with Effective Domain-Specific Vocabulary Initialization. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021).
  23. [2021]  Rumour Detection via Zero-shot Cross-lingual Transfer Learning. In Proceedings of The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2021).
  24. [2021]  Evaluating the Efficacy of Summarization Evaluation across Languages. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021.
  25. [2021]  A Unified Framework to Incorporate Multimodal Knowledge Bases into End-to-End Task-Oriented Dialogue Systems. In Proceedings of the 30th International Joint Conference on Artificial Intelligence (IJCAI 2021).
  26. [2021]  Grey-box Adversarial Attack And Defence For Sentiment Classification. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics — Human Language Technologies (NAACL HLT 2021).
  27. [2021]  Discourse Probing of Pretrained Language Models. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics — Human Language Technologies (NAACL HLT 2021).
  28. [2021]  Automatic Classification of Neutralization Techniques in the Narrative of Climate Change Scepticism. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics — Human Language Technologies (NAACL HLT 2021).
  29. [2021]  Top-down Discourse Parsing via Sequence Labelling. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume (EACL 2021).
  30. [2021]  Brief description of COVID-SEE: The Scientific Evidence Explorer for COVID-19 Related Research. In Proceedings of the 43rd European Conference on Information Retrieval (ECIR 2021).
  31. [2020]  Less is More: Rejecting Unreliable Reviews for Product Question Answering. In Proceedings of the 2020 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2020).
  32. [2020]  IndoLEM and IndoBERT: A Benchmark Dataset and Pre-trained Language Model for Indonesian NLP. In Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020).
  33. [2020]  Liputan6: A Large-scale Indonesian Dataset for Text Summarisation. In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (AACL-IJCNLP 2020).
  34. [2020]  Forget Me Not: Reducing Catastrophic Forgetting for Domain Adaptation in Reading Comprehension. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN 2020).
  35. [2020]  Give Me Convenience and Give Her Death: Who Should Decide What Uses of NLP are Appropriate, and on What Basis? In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020).
  36. [2019]  Discovering Relevant Reviews for Answering Product-related Queries. In Proceedings of the 19th IEEE International Conference on Data Mining (ICDM 2019).
  37. [2019]  Early Rumour Detection. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics — Human Language Technologies (NAACL HLT 2019).
  38. [2018]  Document Chunking and Learning Objective Generation for Instruction Design. In Proceedings of the 11th International Conference on Educational Data Mining.
  39. [2018]  Deep-speare: A Joint Neural Model of Poetic Language, Meter and Rhyme. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018).
  40. [2018]  The Influence of Context on Sentence Acceptability Judgements. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018).
  41. [2018]  Topic Intrusion for Automatic Topic Model Evaluation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018).
  42. [2017]  End-to-end Network for Twitter Geolocation Prediction and Hashing. In Proceedings of the 8th International Joint Conference on Natural Language Processing (IJCNLP 2017).
  43. [2017]  An Automatic Approach for Document-level Topic Model Evaluation. In Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL-2017).
  44. [2017]  Topically Driven Neural Language Model. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017).
  45. [2017]  Detecting Duplicate Posts in Programming QA Communities via Latent Semantics and Association Rules. In Proceedings of the 26th International Conference on World Wide Web (WWW 2017).
  46. [2017]  Multimodal Topic Labelling. In Proceedings of the 15th Conference of the EACL (EACL 2017).
  47. [2016]  Automatic Labelling of Topics with Neural Embeddings. In Proceedings of the 26th International Conference on Computational Linguistics (COLING 2016).
  48. [2016]  LexSemTm: A Semantic Dataset Based on All-words Unsupervised Sense Distribution Learning. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL 2016).
  49. [2016]  The Sensitivity of Topic Coherence Evaluation to Topic Cardinality. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics — Human Language Technologies (NAACL HLT 2016).
  50. [2015]  Unsupervised Prediction of Acceptability Judgements. In Proceedings of the Joint conference of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing (ACL-IJCNLP 2015).
  51. [2014]  Representing Topics Labels for Exploring Digital Libraries. In Proceedings of Digital Libraries 2014.
  52. [2014]  Applying a Word-sense Induction System to the Automatic Extraction of Diverse Dictionary Examples. In Proceedings of the XVI EURALEX International Congress (EURALEX 2014).
  53. [2014]  Novel Word-sense Identification. In Proceedings of the 25th International Conference on Computational Linguistics (COLING 2014).
  54. [2014]  Measuring Gradience in Speakers' Grammaticality Judgements. In Proceedings of the 36th Annual Meeting of the Cognitive Science Society (CogSci 2014).
  55. [2014]  Learning Word Sense Distributions, Detecting Unattested Senses and Identifying Novel Senses Using Topic Models. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014).
  56. [2014]  Machine Reading Tea Leaves: Automatically Evaluating Topic Coherence and Topic Model Quality. In Proceedings of the 14th Conference of the EACL (EACL 2014).
  57. [2013]  Unsupervised Word Class Induction for Under-resourced Languages: A Case Study on Indonesian. In Proceedings of the 6th International Joint Conference on Natural Language Processing (IJCNLP 2013).
  58. [2013]  A Lexicographic Appraisal of an Automatic Approach for Detecting New Word Senses. In Proceedings of eLex 2013.
  59. [2012]  Word Sense Induction for Novel Sense Detection. In Proceedings of the 13th Conference of the EACL (EACL 2012).
  60. [2012]  On-line Trend Analysis with Topic Models: #twitter trends detection topic model online. In Proceedings of the 24th International Conference on Computational Linguistics (COLING 2012).
  61. [2012]  Bayesian Text Segmentation for Index Term Identification and Keyphrase Extraction. In Proceedings of the 24th International Conference on Computational Linguistics (COLING 2012).
  62. [2011]  Automatic Labelling of Topics. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL HLT 2011).
  63. [2010]  Best Topic Word Selection for Topic Labelling. In Proceedings of the 23rd International Conference on Computational Linguistics (COLING 2010), Posters Volume.
  64. [2010]  Automatic Evaluation of Topic Coherence. In Proceedings of Human Language Technologies: The 11th Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL HLT 2010).


  1. [2024]  To Aggregate or Not to Aggregate. That is the Question: A Case Study on Annotation Subjectivity in Span Prediction. In Proceedings of the 14th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis.
  2. [2023]  The Uncivil Empathy: Investigating the Relation Between Empathy and Toxicity in Online Mental Health Support Forums. In Proceedings of the 21st Annual Workshop of the Australasian Language Technology Association.
  3. [2023]  The Next Chapter: A Study of Large Language Models in Storytelling. In Proceedings of The 16th International Natural Language Generation Conference (INLG 2023).
  4. [2022]  Automatic Explanation Generation For Climate Science Claims. In Proceedings of The 20th Annual Workshop of the Australasian Language Technology Association (ALTA 2022).
  5. [2022]  LED Down the Rabbit Hole: Exploring the Potential of Global Attention for Biomedical Multi-document Summarisation. In Proceedings of the Third Workshop on Scholarly Document Processing.
  6. [2022]  Easy-First Bottom-Up Discourse Parsing via Sequence Labelling. In Proceedings of the 3rd Workshop on Computational Approaches to Discourse (CODI 2022).
  7. [2022]  Cross-linguistic Comparison of Linguistic Feature Encoding in BERT Models for Typologically Different Languages. In Proceedings of the 4th Workshop on Research in Computational Typology and Multilingual NLP (SIGTYP 2022).
  8. [2022]  Can Pretrained Language Models Generate Persuasive, Faithful, and Informative Ads Text for Product Descriptions? In Proceedings of the Fifth Workshop on e-Commerce and NLP (ECNLP 2022).
  9. [2022]  Cloze Evaluation for Deeper Understanding of Commonsense Stories in Indonesian. In Proceedings of the Commonsense Representation and Reasoning Workshop 2022 (CSRR 2022).
  10. [2021]  Exploring Story Generation with Multi-task Objectives in Variational Autoencoders. In Proceedings of The 19th Annual Workshop of the Australasian Language Technology Association (ALTA 2021).
  11. [2021]  Findings on Conversation Disentanglement. In Proceedings of The 19th Annual Workshop of the Australasian Language Technology Association (ALTA 2021).
  12. [2021]  Semi-automatic Triage of Requests for Free Legal Assistance. In Proceedings of the Natural Legal Language Processing Workshop 2021.
  13. [2021]  Learning Contextualised Cross-lingual Word Embeddings and Alignments for Extremely Low-Resource Languages Using Parallel Corpora. In Proceedings of the 1st Workshop on Multilingual Representation Learning (MRL 2021).
  14. [2021]  Impact of Detecting Clinical Trial Elements in Exploration of COVID-19 Literature. In Proceedings of The Fourth International Workshop on Health Natural Language Processing (HealthNLP 2021).
  15. [2020]  #DemocratsAreDestroyingAmerica: Rumour Analysis on Twitter During COVID-19. In Proceedings of the 5th International Workshop on Mining Actionable Insights from Social Networks: Special Edition on Dis/Misinformation Mining from Social Media (MAISoN 2020).
  16. [2019]  From Shakespeare to Li-Bai: Adapting a Sonnet Model to Chinese Poetry. In Proceedings of the 17th Annual Workshop of the Australasian Language Technology Association (ALTA 2019).
  17. [2019]  Improved Document Modelling with a Neural Discourse Parser. In Proceedings of the 17th Annual Workshop of the Australasian Language Technology Association (ALTA 2019).
  18. [2018]  Preferred Answer Selection in Stack Overflow: Better Text Representations ... and Metadata, Metadata, Metadata. In Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text.
  19. [2017]  Decoupling Encoder and Decoder Networks for Abstractive Document Summarization. In Proceedings of the MultiLing 2017 Workshop on Summarization and Summary Evaluation Across Source Types and Genres.
  20. [2016]  An Empirical Evaluation of doc2vec with Practical Insights into Document Embedding Generation. In Proceedings of the 1st Workshop on Representation Learning for NLP.
  21. [2013]  unimelb: Topic Modelling-based Word Sense Induction. In Proceedings of the 7th International Workshop on Semantic Evaluation (SemEval 2013).
  22. [2013]  unimelb: Topic Modelling-based Word Sense Induction for Web Snippet Clustering. In Proceedings of the 7th International Workshop on Semantic Evaluation (SemEval 2013).