Multi-Scale Heterogeneous Text-Attributed Graph Datasets From Diverse Domains
December 12, 2024 ยท Declared Dead ยท ๐ The Web Conference
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Yunhui Liu, Qizhuo Xie, Jinwei Shi, Jiaxu Shen, Tieke He
arXiv ID
2412.08937
Category
cs.LG: Machine Learning
Cross-listed
cs.CL
Citations
3
Venue
The Web Conference
Last Checked
4 months ago
Abstract
Heterogeneous Text-Attributed Graphs (HTAGs), where different types of entities are not only associated with texts but also connected by diverse relationships, have gained widespread popularity and application across various domains. However, current research on text-attributed graph learning predominantly focuses on homogeneous graphs, which feature a single node and edge type, thus leaving a gap in understanding how methods perform on HTAGs. One crucial reason is the lack of comprehensive HTAG datasets that offer original textual content and span multiple domains of varying sizes. To this end, we introduce a collection of challenging and diverse benchmark datasets for realistic and reproducible evaluation of machine learning models on HTAGs. Our HTAG datasets are multi-scale, span years in duration, and cover a wide range of domains, including movie, community question answering, academic, literature, and patent networks. We further conduct benchmark experiments on these datasets with various graph neural networks. All source data, dataset construction codes, processed HTAGs, data loaders, benchmark codes, and evaluation setup are publicly available at GitHub and Hugging Face.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Machine Learning
๐ฎ
๐ฎ
The Ethereal
๐ฎ
๐ฎ
The Ethereal
Continuous control with deep reinforcement learning
๐
๐
Old Age
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
๐
๐
Old Age
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
๐
๐
Old Age
SGDR: Stochastic Gradient Descent with Warm Restarts
๐ฎ
๐ฎ
The Ethereal
Asynchronous Methods for Deep Reinforcement Learning
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
๐ป
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
๐ป
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
๐ป
Ghosted