Selected Publications | Michael Färber's Research Group

Below is a curated selection of 11 publications in reverse chronological order, each with a short summary of its main contribution.

A full list of publications is available here.

2026

The Hidden Bias: A Study on Explicit and Implicit Political Stereotypes in Large Language Models
K. Löhr, S. Yuan, M. Färber
EACL 2026
Paper

This paper studies political bias and stereotype propagation in major LLMs. Beyond explicit persona prompting, it shows that varying the prompt language can reveal even stronger implicit stereotypes. The work highlights the societal relevance of trustworthy AI and the need for more careful evaluation of LLM behavior.

2025

Paths to Causality: Finding Informative Subgraphs Within Knowledge Graphs for Knowledge-Based Causal Discovery
Y. Susanti, M. Färber
KDD 2025
Paper

While traditional methods rely on observational data, knowledge-based causal discovery uses metadata (like variable names or context) to infer causality – a promising but currently unreliable approach when using LLMs alone. To improve stability and accuracy, we propose a method that combines LLMs with Knowledge Graphs. By identifying informative metapath-based subgraphs and ranking them using a Learning-to-Rank model, our approach improves zero-shot LLM prompts for causal inference.

SQuAI: Scientific Question-Answering with Multi-Agent Retrieval-Augmented Generation
I. Besrour, J. He, T. Schreieder, M. Färber
CIKM 2025
Paper

SQuAI is a multi-agent retrieval-augmented generation framework for scientific question answering. It combines decomposition, retrieval, filtering, and citation-grounded answer generation to provide more faithful and transparent answers over large scholarly corpora. The system reflects my current work on trustworthy LLMs and AI for science.

ComplexTempQA: A 100m Dataset for Complex Temporal Question Answering
R. Gruber, A. Abdallah, M. Färber, A. Jatowt
EMNLP 2025
Paper

ComplexTempQA introduces a large-scale benchmark for temporal question answering with over 100 million question-answer pairs. It substantially expands the scale and scope of temporal QA resources and supports research on more realistic, time-aware AI systems.

2024

Embedded Named Entity Recognition using Probing Classifiers
N. Popovič, M. Färber
EMNLP 2024
Paper

Streaming text generation enhances the responsiveness of language model applications like chat assistants, and real-time semantic extraction – such as named entity recognition (NER) – is valuable for tasks like fact-checking and retrieval-augmented generation. To make such extraction possible during generation, we introduce EMBER, a method for streaming NER in decoder-only language models that avoids fine-tuning and adds minimal inference overhead.

Knowledge Graph Structure as Prompt: Improving Small Language Models Capabilities for Knowledge-Based Causal Discovery
Y. Susanti, M. Färber
ISWC 2024
Paper

We introduce a method for extracting causal relationships from text by integrating small language models (under 1B parameters) with knowledge graphs via prompt-based learning. The approach outperforms larger models while being more efficient and less costly, highlighting the power of combining knowledge graphs with compact AI models.

GNNAVI: Navigating the Information Flow in Large Language Models by Graph Neural Network
S. Yuan, E. Nie, M. Färber, H. Schmid, H. Schütze
Findings of ACL 2024
Paper

This paper proposes a parameter-efficient fine-tuning technique for large language models using Graph Neural Networks (GNNs) to improve information flow. GNNAVI achieves state-of-the-art accuracy in few-shot tasks while updating less than 0.5% of model parameters, demonstrating high efficiency and scalability.

2023

SemOpenAlex: The Scientific Landscape in 26 Billion RDF Triples
M. Färber, D. Lamprecht, J. Krause, L. Aung, P. Haase
ISWC 2023 · Best Paper Award
Paper

SemOpenAlex is one of the largest scholarly knowledge graphs, with 26 billion RDF triples. It provides open access to global research metadata and supports scientific discovery, interdisciplinary research, and large-scale knowledge integration. The paper received the ISWC Best Paper Award.

2022

Few-Shot Document-Level Relation Extraction
N. Popovič, M. Färber
NAACL 2022
Paper

This paper defines a new benchmark for few-shot learning in document-level relation extraction. It addresses the lack of scalable, domain-specific datasets and introduces a strategy for building more realistic NLP benchmarks in low-resource settings.

2019

The Microsoft Academic Knowledge Graph: A Linked Data Source with 8 Billion Triples of Scholarly Data
M. Färber
ISWC 2019
Paper

As the sole author, I introduced the largest open scholarly knowledge graph at the time, with over 8 billion triples. This dataset has since supported numerous research and industry applications.

2018

Linked Data Quality of DBpedia, Freebase, OpenCyc, Wikidata, and YAGO
M. Färber, F. Bartscherer, C. Menne, A. Rettinger
Semantic Web Journal, 2018 · Outstanding Paper Award
Paper

This highly cited journal article presents a framework for evaluating the quality of major knowledge graphs. It remains a foundational reference for semantic web research and linked data quality assessment, and it received the Outstanding Paper Award of the Semantic Web Journal.