Description

In this project, we train vector representations on registry data using two approaches. The sequential approach uses a BERT-type sequence model for life events. The graph approach trains embeddings with DeepWalk on population-scale network data. We run model training and inference on the GPUs of the national computing cluster and evaluate model performance on prediction tasks from the social sciences.
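
To illustrate the graph approach, the following is a minimal sketch of the first stage of DeepWalk: generating truncated uniform random walks over a network. It is not the project's implementation. The toy graph, the function name `random_walks`, and the parameters `walks_per_node`, `walk_length`, and `seed` are assumptions made for illustration. In DeepWalk, the resulting walks are treated as sentences and passed to a skip-gram model to learn node embeddings; that second stage is omitted here.

```python
import random


def random_walks(adjacency, walks_per_node=2, walk_length=5, seed=0):
    """Generate truncated uniform random walks over a graph.

    `adjacency` maps each node to a list of its neighbours.
    All names and default values here are illustrative assumptions.
    """
    rng = random.Random(seed)
    nodes = sorted(adjacency)
    walks = []
    for _ in range(walks_per_node):
        # Visit the start nodes in a fresh random order on each pass.
        rng.shuffle(nodes)
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adjacency[walk[-1]]
                if not neighbours:
                    break  # dead end: stop this walk early
                # Step to a neighbour chosen uniformly at random.
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


# Hypothetical toy network standing in for population-scale registry links.
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}

walks = random_walks(toy_graph)
for walk in walks:
    print(" ".join(walk))
```

Each printed line is one walk. Nodes that frequently co-occur within walks end up with similar embeddings once a skip-gram model is trained on this corpus.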