Selected Publications
Below is a curated selection of 11 publications in reverse chronological order, each with a short summary of its main contribution.
A full list of publications is available here.
The Hidden Bias: A Study on Explicit and Implicit Political Stereotypes in Large Language Models
K. Löhr, S. Yuan, M. Färber
EACL 2026
Paper
This paper studies political bias and stereotype propagation in major LLMs. Beyond explicit persona prompting, it shows that multilingual language variation can reveal even stronger implicit stereotypes. The work highlights the societal relevance of trustworthy AI and the need for more careful evaluation of LLM behavior.
Paths to Causality: Finding Informative Subgraphs Within Knowledge Graphs for Knowledge-Based Causal Discovery
Y. Susanti, M. Färber
KDD 2025
Paper
Traditional causal discovery relies on observational data. Knowledge-based causal discovery instead uses metadata, such as variable names or context, to infer causality. This is promising, but currently unreliable when LLMs are used alone. To improve stability and accuracy, we propose a method that combines LLMs with knowledge graphs. By identifying informative metapath-based subgraphs and ranking them with a learning-to-rank model, our approach improves zero-shot LLM prompts for causal inference.
SQuAI: Scientific Question-Answering with Multi-Agent Retrieval-Augmented Generation
I. Besrour, J. He, T. Schreieder, M. Färber
CIKM 2025
Paper
SQuAI is a multi-agent retrieval-augmented generation framework for scientific question answering. It combines decomposition, retrieval, filtering, and citation-grounded answer generation to provide more faithful and transparent answers over large scholarly corpora. The system reflects my current work on trustworthy LLMs and AI for science.
ComplexTempQA: A 100m Dataset for Complex Temporal Question Answering
R. Gruber, A. Abdallah, M. Färber, A. Jatowt
EMNLP 2025
Paper
ComplexTempQA introduces a large-scale benchmark for temporal question answering with over 100 million question-answer pairs. It substantially expands the scale and scope of temporal QA resources and supports research on more realistic, time-aware AI systems.
Embedded Named Entity Recognition using Probing Classifiers
N. Popovič, M. Färber
EMNLP 2024
Paper
Streaming text generation makes language model applications such as chat assistants more responsive, and real-time semantic extraction, such as named entity recognition (NER), is valuable for tasks like fact-checking and retrieval-augmented generation. To make such extraction possible while text is still being generated, we introduce EMBER, a method for streaming NER in decoder-only language models that avoids fine-tuning and adds minimal inference overhead.
Knowledge Graph Structure as Prompt: Improving Small Language Models Capabilities for Knowledge-Based Causal Discovery
Y. Susanti, M. Färber
ISWC 2024
Paper
We introduce a method for extracting causal relationships from text by integrating small language models (under 1B parameters) with knowledge graphs via prompt-based learning. The approach outperforms larger models while being more efficient and less costly, highlighting the power of combining knowledge graphs with compact language models.
GNNAVI: Navigating the Information Flow in Large Language Models by Graph Neural Network
S. Yuan, E. Nie, M. Färber, H. Schmid, H. Schütze
Findings of ACL 2024
Paper
This paper proposes a parameter-efficient fine-tuning technique for large language models using Graph Neural Networks (GNNs) to improve information flow. GNNAVI achieves state-of-the-art accuracy in few-shot tasks while updating less than 0.5% of model parameters, demonstrating high efficiency and scalability.
SemOpenAlex: The Scientific Landscape in 26 Billion RDF Triples
M. Färber, D. Lamprecht, J. Krause, L. Aung, P. Haase
ISWC 2023 · Best Paper Award
Paper
SemOpenAlex is one of the largest scholarly knowledge graphs, with 26 billion RDF triples. It provides open access to global research metadata and supports scientific discovery, interdisciplinary research, and large-scale knowledge integration. The paper received the ISWC Best Paper Award.
Few-Shot Document-Level Relation Extraction
N. Popovič, M. Färber
NAACL 2022
Paper
This paper defines a new benchmark for few-shot learning in document-level relation extraction. It addresses the lack of scalable, domain-specific datasets and introduces a strategy for building more realistic NLP benchmarks in low-resource settings.
The Microsoft Academic Knowledge Graph: A Linked Data Source with 8 Billion Triples of Scholarly Data
M. Färber
ISWC 2019
Paper
As the sole author, I introduced the largest open scholarly knowledge graph at the time, with over 8 billion triples. This dataset has since supported numerous research and industry applications.
Linked Data Quality of DBpedia, Freebase, OpenCyc, Wikidata, and YAGO
M. Färber, F. Bartscherer, C. Menne, A. Rettinger
Semantic Web Journal, 2018 · Outstanding Paper Award
Paper
This highly cited journal article presents a framework for evaluating the quality of major knowledge graphs. It remains a foundational reference for semantic web research and linked data quality assessment, and it received the Outstanding Paper Award of the Semantic Web Journal.