Student outcomes
Selected student-led papers and open artifacts from our group.
Student-led papers (selected):
- SQuAI: Scientific Question-Answering with Multi-Agent Retrieval-Augmented Generation — CIKM 2025 (demo) · students: Ines Besrour (M.Sc.), Jingbo He (M.Sc.) — Multi-agent RAG framework for scientific QA with inline citations and supporting evidence.
- ComplexTempQA: A 100m Dataset for Complex Temporal Question Answering — EMNLP 2025 · students: Raphael Gruber (M.Sc.) — 100M-scale benchmark for complex temporal question answering (Wikipedia/Wikidata).
- In-Context Learning for Information Extraction using Fully Synthetic Demonstrations — XLLM@ACL 2025 · students: Ashish Kangen (M.Sc.) — Synthetic demonstration generation and retrieval-based in-context learning for document-level information extraction.
- SimplifyMyText: An LLM-Based System for Inclusive Plain Language Text Simplification — ECIR 2025 · students: Kyuri Im (M.Sc.) — LLM-based system for plain-language simplification with configurable audiences and input formats.
- CoDy: Counterfactual Explainers for Dynamic Graphs — ICML 2025 · students: Daniel Gomm (M.Sc.) — Counterfactual explanations for temporal/dynamic graph neural networks.
- Hateful Person or Hateful Model? Investigating the Role of Personas in Hate Speech Detection by Large Language Models — PALS@EMNLP 2025 · students: Mario Tawfelis (M.Sc.) — How persona prompts affect LLM fairness in hate-speech detection.
- Optimizing Small Transformer-Based Language Models for Multi-Label Sentiment Analysis in Short Texts — 2025 (workshop/preprint) · students: Julius Neumann (M.Sc.), Robert Lange (M.Sc.) — Study of how to optimize small Transformers for multi-label sentiment analysis of short texts with limited context.
- Future Timelines: Extraction and Visualization of Future-related Content From News Articles — WSDM 2024 · students: Juwal Regev (M.Sc.) — System that extracts and summarizes future-related statements from news and presents them on an interactive timeline.
- Benefits of international collaboration in computer science: a case study of China, the European Union, and the United States — Scientometrics 2024 · students: Alberto Gómez-Espés (M.Sc.) — Scientometric study of collaboration patterns and citation impact across regions.
- The Effects of Hallucinations in Synthetic Training Data for Relation Extraction — KBC-LM@ISWC 2024 · students: Steven Rogulsky (M.Sc.) — Empirical analysis of how hallucinated synthetic data affects relation extraction performance.
- AutoRDF2GML: Facilitating RDF Integration in Graph Machine Learning — ISWC 2024 · students: David Lamprecht (M.Sc.) — Framework to transform RDF into ready-to-use heterogeneous graphs for machine learning.
- Biases in Scholarly Recommender Systems: Impact, Prevalence, and Mitigation — Scientometrics 2023 · students: Melissa Coutinho (B.Sc.) — Survey and framework for understanding and mitigating bias in scholarly recommendation.
- Linked Papers With Code: The Latest in Machine Learning as an RDF Knowledge Graph — ISWC 2023 · students: David Lamprecht (M.Sc.) — RDF knowledge graph modeling Papers With Code at scale.
- Recommending Datasets for Scientific Problem Descriptions — CIKM 2021 · students: Ann-Kathrin Leisinger (B.Sc.) — Dataset recommendation from problem descriptions using academic text evidence.
Open artifacts:
- SQuAI (code) — Code · students: Ines Besrour (M.Sc.), Jingbo He (M.Sc.) — Multi-agent RAG system for scientific QA with citations.
- SQuAI (demo) — Demo · students: Ines Besrour (M.Sc.), Jingbo He (M.Sc.) — Interactive demo of scientific QA with traceable sources.
- ComplexTempQA (dataset + code) — Dataset · students: Raphael Gruber (M.Sc.) — Repository for the 100M-scale temporal QA benchmark.
- CoDy (code) — Code · students: Daniel Gomm (M.Sc.) — Reference implementation for counterfactual explanations on dynamic graphs.
- SimplifyMyText (system) — Demo · students: Kyuri Im (M.Sc.) — Plain-language text simplification system.
- Short-text sentiment Transformers (code) — Code · students: Julius Neumann (M.Sc.), Robert Lange (M.Sc.) — Code for optimizing small Transformers for multi-label sentiment analysis of short texts.
- AutoRDF2GML (code) — Code · students: David Lamprecht (M.Sc.) — Code for transforming RDF into graph ML representations.
- Linked Papers With Code (code) — Code · students: David Lamprecht (M.Sc.) — Code pipeline for the LPWC knowledge graph.
- Linked Papers With Code (data snapshot) — Dataset · students: David Lamprecht (M.Sc.) — Archived dataset snapshot for LPWC.