highlights
Selected, curated publications with short, scannable takeaways. For the full list, see /publications/.
2026
- PoeTone: A Framework for Constrained Generation of Structured Chinese Songci with LLMs
  In AAAI, Singapore, 2026
  For insiders: We build a framework for generating classical Chinese Songci that satisfies strict tone and rhyme constraints.
  For everyone: Songci poetry has rules like sheet music, and we teach an AI to write it while staying on the right tones and rhymes, with a strict checker that keeps it honest.
2025
- Paths to Causality: Finding Informative Subgraphs within Knowledge Graphs for Knowledge-Based Causal Discovery
  In KDD, 2025
  For insiders: We propose a neurosymbolic method for knowledge-based causal discovery that selects relevant knowledge graph subgraphs to ground LLM prompting.
  For everyone: We help AI reason about cause and effect by highlighting the most informative routes through a knowledge map, like handing a detective the best trail of clues.
2024
- Embedded Named Entity Recognition using Probing Classifiers
  In EMNLP, Miami, FL, USA, 2024
  For insiders: EMBER enables fast NER in decoder-only language models via probing, adding minimal overhead and avoiding destructive fine-tuning.
  For everyone: We add “live labels” to a chatbot as it writes, like sticky notes appearing while you type instead of only after the text is finished.
- GNNavi: Navigating the Information Flow in Large Language Models by Graph Neural Network
  In ACL, Bangkok, Thailand, 2024
  For insiders: GNNavi guides information flow in prompt-based fine-tuning via a graph neural layer, improving few-shot learning while updating only a small fraction of parameters.
  For everyone: GNNavi steers how information flows during prompting, like a traffic controller that routes signals to the right places so the model learns better.
2023
- SemOpenAlex: The Scientific Landscape in 26 Billion RDF Triples
  In ISWC, 2023
  For insiders: We release SemOpenAlex, a scholarly knowledge graph with 26 billion triples, dumps, SPARQL access, and embeddings, enabling large-scale semantic science analytics and search.
  For everyone: SemOpenAlex is an open “Google Maps for science”, built as a connected map of papers and authors so others can navigate research at web scale.
2022
- Few-Shot Document-Level Relation Extraction
  In NAACL, Seattle, WA, USA, 2022
  For insiders: We introduce a few-shot benchmark for document-level relation extraction and reveal challenges beyond sentence-level settings, including realistic NOTA behavior.
  For everyone: We create a benchmark for extracting relationships from whole documents with only a few examples, like testing whether a student understood the full story and not just one line.
2020
- Citation Recommendation: Approaches and Datasets
  Int. J. Digit. Libr., 2020
  For insiders: We survey citation recommendation methods and datasets and highlight evaluation pitfalls and open challenges for assisting scientific writing.
  For everyone: This survey explains citation recommendation, like a GPS for references, and summarizes which datasets and tests are needed before such tools deserve trust.
2018
- Linked Data Quality of DBpedia, Freebase, OpenCyc, Wikidata, and YAGO
  Semantic Web, 2018
  For insiders: We compare major knowledge graphs using a systematic data-quality framework and help practitioners choose the right graph for their application needs.
  For everyone: We create a consumer-style test report for major knowledge graphs, comparing data quality so developers can choose the right “map of the world” for their use case.