Selected Publications
Below is a curated list of ten of my publications, listed in reverse chronological order, each with a short summary highlighting its impact.
A full list of publications is available here.
1. Knowledge Graph Structure as Prompt: Improving Small Language Models Capabilities for Knowledge-Based Causal Discovery
Y. Susanti, M. Färber
ISWC 2024
📄 Read the paper
We introduce a method for extracting causal relationships from text by integrating small language models (under 1B parameters) with knowledge graphs via prompt-based learning. The approach outperforms larger models while being more efficient and less costly, which highlights the value of combining knowledge graphs with compact language models.
2. GNNAVI: Navigating the Information Flow in Large Language Models by Graph Neural Network
S. Yuan, E. Nie, M. Färber, H. Schmid, H. Schütze
Findings of ACL 2024
📄 Read the paper
This paper proposes a fine-tuning technique for large language models that uses Graph Neural Networks (GNNs) to improve information flow. GNNAVI achieves state-of-the-art accuracy in few-shot tasks while updating less than 0.5% of model parameters, demonstrating high efficiency and scalability.
3. SemOpenAlex: The Scientific Landscape in 26 Billion RDF Triples
M. Färber, D. Lamprecht, J. Krause, L. Aung, P. Haase
ISWC 2023 – Best Paper Award
📄 Read the paper
SemOpenAlex is the largest existing scholarly knowledge graph, with 26 billion triples. It provides open access to global research metadata and sets a new benchmark for scientific discovery, interdisciplinary research, and large-scale knowledge integration.
4. Biases in Scholarly Recommender Systems: Impact, Prevalence, and Mitigation
M. Färber, S. Yuan, M. Coutinho
Scientometrics, 2023
📄 Read the paper
This paper identifies and categorizes biases in academic recommender systems and presents strategies to mitigate them. It offers a framework that advances fairness and transparency in research discovery and information access.
5. Few-Shot Document-Level Relation Extraction
N. Popovic, M. Färber
NAACL 2022
📄 Read the paper
This paper defines a new benchmark for few-shot learning in document-level relation extraction. It addresses the lack of scalable, domain-specific datasets and introduces a novel strategy for building more realistic NLP benchmarks.
6. Recommending Datasets for Scientific Problem Descriptions
M. Färber, A.-K. Leisinger
CIKM 2021
📄 Read the paper
This work introduces a system that recommends datasets based on research problem descriptions. It was validated through a large-scale evaluation and user study, and paves the way for AI-driven research data discovery tools.
7. Quantifying Explanations of Neural Networks in E-Commerce Based on LRP
A. Nguyen, F. Krause, D. Hagenmayer, M. Färber
ECML-PKDD 2021
📄 Read the paper
We propose a framework to explain neural network behavior using Layer-wise Relevance Propagation (LRP) in real-world e-commerce scenarios. Our metrics help ensure transparency, fairness, and accountability in AI systems.
8. Citation Recommendation: Approaches and Datasets
M. Färber, A. Jatowt
International Journal on Digital Libraries, 2020
📄 Read the paper
This article provides a comprehensive survey of citation recommendation methods and datasets. It is a cornerstone in the field of citation-based AI and part of a broader series of contributions to bibliometrics and NLP.
9. The Microsoft Academic Knowledge Graph: A Linked Data Source with 8 Billion Triples of Scholarly Data
M. Färber
ISWC 2019
📄 Read the paper
As the sole author, I introduced the largest open scholarly knowledge graph at the time, with over 8 billion triples. This dataset has since supported numerous research and industry applications.
10. Linked Data Quality of DBpedia, Freebase, OpenCyc, Wikidata, and YAGO
M. Färber, F. Bartscherer, C. Menne, A. Rettinger
Semantic Web Journal, 2018 – Outstanding Paper Award
📄 Read the paper
This highly cited journal article presents a framework for evaluating the quality of major knowledge graphs. It remains a foundational reference for semantic web researchers and linked data quality assessment.