EXCERPT (BLOG ITEM + OG, TWITTER)
In this notebook, we compare getML against existing approaches from the relational learning literature on the CORA dataset, which is often used for benchmarking. We demonstrate that getML outperforms the state of the art reported in that literature on this dataset. Beyond the benchmarking aspects, this notebook showcases getML's excellent capabilities in dealing with categorical data.
Summary:
Author: Dr. Patrick Urbanke
CORA is a well-known benchmarking dataset in the academic literature on relational learning. The dataset contains 2708 scientific publications on machine learning. The papers are divided into 7 categories. The challenge is to predict the category of a paper based on the papers it cites, the papers that cite it, and the keywords it contains.
The dataset was downloaded from the CTU Prague relational learning repository (Motl and Schulte, 2015).
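To make the prediction task more concrete, here is a minimal sketch in plain Python of the three kinds of information available for each paper. It does not use the getML API. The toy tables, the category labels, and the helper `relational_inputs` are hypothetical and only illustrate the shape of the problem, not the actual CORA schema or contents.

```python
# Toy stand-ins for the relational data (all names and values are hypothetical).
papers = {1: "category_a", 2: "category_a", 3: "category_b", 4: "category_b"}

# Each pair is (citing_paper_id, cited_paper_id).
cites = [(1, 2), (3, 4), (4, 1)]

# Each pair is (paper_id, keyword).
keywords = [(1, "word_a"), (2, "word_a"), (3, "word_b"), (4, "word_b"), (4, "word_a")]


def relational_inputs(paper_id):
    """Collect the papers cited, the citing papers, and the keywords for one paper."""
    cited = sorted(dst for src, dst in cites if src == paper_id)
    cited_by = sorted(src for src, dst in cites if dst == paper_id)
    words = sorted(word for pid, word in keywords if pid == paper_id)
    return {"cites": cited, "cited_by": cited_by, "keywords": words}


# The target for paper 4 is papers[4]; the inputs are the related rows.
inputs = relational_inputs(4)
print(papers[4], inputs)
```

A relational learning approach has to turn these variable-length, mostly categorical inputs into something a classifier can use, which is why the dataset is a useful test of how well a tool handles categorical data.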
Notebook:
Open in nbviewer
Open in mybinder