Department of Statistics
Carnegie Mellon University
Title: Co-authorship and Citation Networks of Statisticians
Abstract: We have collected a data set for the networks of statisticians, consisting of titles,
authors, abstracts, MSC numbers, keywords, and citation counts of papers published
in representative journals in statistics and related fields. In Phase I of our study,
the data set covers all published papers from 2003 to 2012 in Annals of Statistics,
Biometrika, JASA, and JRSS-B. In Phase II of our study, the data set covers all
published papers in 36 journals in statistics and related fields, spanning 40 years.
The data sets motivate an array of interesting problems in social networks, topic
learning, and knowledge discovery.
In the first part of the talk, I will discuss the problem of network membership
estimation. We propose a new spectral approach called Mixed-SCORE, and reveal
a surprising simplex structure underlying the networks. We explain why Mixed-
SCORE is the right approach and use it to investigate two networks constructed
from the Phase I data.
In the second part of the talk, I will report some Exploratory Data Analysis (EDA)
results on productivity, journal ranking, topic learning, and citation patterns. These
results are based on the Phase II data.