Links
- pop2vec: Main code of the project
- A parallel random walk algorithm for heterogeneous graph edges
  - We are currently packaging this code; a new repository is coming soon.
Description
In this project, we train vector representations on registry data using two approaches. The sequential approach uses a BERT-type sequence model for life events. The graph approach trains embeddings with DeepWalk on population-scale network data.
We run model training and inference on the GPUs of the national cluster computer.
We evaluate model performance on prediction tasks from the social sciences.
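
As a rough illustration of the graph approach described above, the sketch below generates DeepWalk-style uniform random walks on a tiny toy graph. It is not the project's implementation: the function name `random_walks`, its parameters, and the toy adjacency list are hypothetical, and the sketch is single-threaded and ignores edge types, unlike the parallel heterogeneous algorithm listed under Links.

```python
import random


def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Generate uniform random walks starting from every node.

    adjacency: dict mapping each node to a list of its neighbors
               (hypothetical toy format, not the project's data layout).
    """
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        # Visit start nodes in a fresh random order on each pass.
        nodes = list(adjacency)
        rng.shuffle(nodes)
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Toy undirected graph with four nodes (illustrative data only).
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}

walks = random_walks(toy_graph, walk_length=5, walks_per_node=2)
```

In DeepWalk, walks like these are treated as sentences and passed to a skip-gram model, so that nodes appearing in similar walk contexts receive similar embeddings.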