Selected Publications | Michael Färber's Research Group

Below is a curated list of twelve of my publications (listed in reverse chronological order), each with a short summary highlighting its impact.

A full list of publications is available here.

Paths to Causality: Finding Informative Subgraphs Within Knowledge Graphs for Knowledge-Based Causal Discovery

Y. Susanti, M. Färber
KDD 2025
📄 Read the paper

While traditional methods rely on observational data, knowledge-based causal discovery uses metadata (like variable names or context) to infer causality—a promising but currently unreliable approach when using LLMs alone. To improve stability and accuracy, we propose a novel method that combines LLMs with Knowledge Graphs. By identifying informative metapath-based subgraphs and ranking them using a Learning-to-Rank model, our approach enhances zero-shot LLM prompts for more accurate causal inference.

Embedded Named Entity Recognition using Probing Classifiers

N. Popovič, M. Färber
EMNLP 2024
📄 Read the paper

Streaming text generation enhances the responsiveness of language model applications like chat assistants, while real-time semantic extraction—such as named entity recognition (NER)—is valuable for tasks like fact-checking and retrieval-augmented generation. However, current methods often require costly additional models or fine-tuning. To address this, we introduce EMBER, a method for streaming NER in decoder-only language models that avoids fine-tuning and adds minimal inference overhead.

Knowledge Graph Structure as Prompt: Improving Small Language Models Capabilities for Knowledge-Based Causal Discovery

Y. Susanti, M. Färber
ISWC 2024
📄 Read the paper

We introduce a method for extracting causal relationships from text by integrating small language models (under 1B parameters) with knowledge graphs via prompt-based learning. The approach outperforms larger models while being more efficient and less costly, highlighting the power of combining knowledge graphs with compact AI models.

GNNAVI: Navigating the Information Flow in Large Language Models by Graph Neural Network

S. Yuan, E. Nie, M. Färber, H. Schmid, H. Schütze
Findings of ACL 2024
📄 Read the paper

This paper proposes a fine-tuning technique for large language models that uses Graph Neural Networks (GNNs) to improve information flow. GNNAVI achieves state-of-the-art accuracy in few-shot tasks while updating less than 0.5% of model parameters, demonstrating high efficiency and scalability.

SemOpenAlex: The Scientific Landscape in 26 Billion RDF Triples

M. Färber, D. Lamprecht, J. Krause, L. Aung, P. Haase
ISWC 2023 – Best Paper Award
📄 Read the paper

SemOpenAlex is the largest existing scholarly knowledge graph, with 26 billion triples. It provides open access to global research metadata and sets a new benchmark for scientific discovery, interdisciplinary research, and large-scale knowledge integration.

Biases in Scholarly Recommender Systems: Impact, Prevalence, and Mitigation

M. Färber, S. Yuan, M. Coutinho
Scientometrics, 2023
📄 Read the paper

This paper identifies and categorizes biases in academic recommender systems and presents strategies to mitigate them. It offers a framework that advances fairness and transparency in research discovery and information access.

Few-Shot Document-Level Relation Extraction

N. Popovič, M. Färber
NAACL 2022
📄 Read the paper

This paper defines a new benchmark for few-shot learning in document-level relation extraction. It addresses the lack of scalable, domain-specific datasets and introduces a novel strategy for building more realistic NLP benchmarks.

Recommending Datasets for Scientific Problem Descriptions

M. Färber, A.-K. Leisinger
CIKM 2021
📄 Read the paper

This work introduces a system that recommends datasets based on research problem descriptions. It was validated through a large-scale evaluation and user study, and paves the way for AI-driven research data discovery tools.

Quantifying Explanations of Neural Networks in E-Commerce Based on LRP

A. Nguyen, F. Krause, D. Hagenmayer, M. Färber
ECML-PKDD 2021
📄 Read the paper

We propose a framework to explain neural network behavior using Layer-wise Relevance Propagation (LRP) in real-world e-commerce scenarios. Our metrics help ensure transparency, fairness, and accountability in AI systems.

Citation Recommendation: Approaches and Datasets

M. Färber, A. Jatowt
International Journal on Digital Libraries, 2020
📄 Read the paper

This article provides a comprehensive survey of citation recommendation methods and datasets. It is a cornerstone in the field of citation-based AI and part of a broader series of contributions to bibliometrics and NLP.

The Microsoft Academic Knowledge Graph: A Linked Data Source with 8 Billion Triples of Scholarly Data

M. Färber
ISWC 2019
📄 Read the paper

As the sole author, I introduced the largest open scholarly knowledge graph at the time, with over 8 billion triples. This dataset has since supported numerous research and industry applications.

Linked Data Quality of DBpedia, Freebase, OpenCyc, Wikidata, and YAGO

M. Färber, F. Bartscherer, C. Menne, A. Rettinger
Semantic Web Journal, 2018 – Outstanding Paper Award
📄 Read the paper

This highly cited journal article presents a framework for evaluating the quality of major knowledge graphs. It remains a foundational reference for semantic web researchers and linked data quality assessment.