A Library Perspective on Supervised Text Processing in Digital Libraries: An Investigation in the Biomedical Domain

November 06, 2024 · Declared Dead · 🏛 ACM/IEEE Joint Conference on Digital Libraries

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Hermann Kroll, Pascal Sackhoff, Bill Matthias Thang, Maha Ksouri, Wolf-Tilo Balke
arXiv ID: 2411.12752
Category: cs.DL (Digital Libraries)
Cross-listed: cs.CL
Citations: 0
Venue: ACM/IEEE Joint Conference on Digital Libraries
Last Checked: 3 months ago

Abstract

Digital libraries that maintain extensive textual collections may want to further enrich their content for certain downstream applications, e.g., building knowledge graphs, semantic enrichment of documents, or implementing novel access paths. All of these applications require some text processing, whether to identify relevant entities, extract semantic relationships between them, or classify documents into categories. However, implementing reliable, supervised workflows can become quite challenging for a digital library because suitable training data must be crafted and reliable models must be trained. While many works focus on achieving the highest accuracy on some benchmarks, we tackle the problem from the perspective of a digital library practitioner. In other words, we also consider trade-offs between accuracy and application costs, dive into training data generation through distant supervision and large language models such as ChatGPT, LLama, and Olmo, and discuss how to design final pipelines. Therefore, we focus on relation extraction and text classification, using the showcase of eight biomedical benchmarks.

