[statnet_help] Degree calculation discrepancy from edgelist

Michael Siciliano michael_d_siciliano at yahoo.com
Mon Dec 7 20:42:42 PST 2020


Hi All,
I found a very strange issue when calculating degree scores from a network object built from an edgelist. I'm not sure this constitutes a bug, as I discuss below, but I could see it happening to others, so I thought I would bring it to the listserv's attention. Basically, calling 'degree' on the network object and on the sociomatrix of that same network object gives different results.
Here is a quick reproducible example of what is happening. Start by making an edgelist and turning it into a network object. 
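library(network)  # provides network(), %v%, as.sociomatrix()
library(sna)      # provides degree(), rgraph()
edge.dat = data.frame(Source = c("a", "b", "c", "d", "a", "b"),
                      Target = c("b", "a", "d", "a", "c", "d"))
net = network(edge.dat, matrix.type = "edgelist", directed = FALSE)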
net
 Network attributes:
  vertices = 4
  directed = FALSE
  hyper = FALSE
  loops = FALSE
  multiple = FALSE
  bipartite = FALSE
  total edges= 6
    missing edges= 0
    non-missing edges= 6

 Vertex attribute names:
    vertex.names

No edge attributes

The network object reports 6 edges in an undirected network, but really there are only 5. This creates issues when calculating degree centrality, as it double counts the tie between a and b: once for the a->b tie and once for the b->a tie, even though the network is stored as undirected and centrality is being calculated with gmode = "graph". This is best seen by comparing the degree scores from the following calculations. When calculated from the network object, actors a and b each have a degree score that is 1 larger than their degree score based on the sociomatrix of that same network object.
data.frame(names = net %v% "vertex.names",
           degree = degree(net, gmode = "graph"),
           degree.mat = degree(as.sociomatrix(net), gmode = "graph"))
  names   degree   degree.mat
      a        4            3
      b        3            2
      c        2            2
      d        3            3
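
The two columns above can be reproduced by hand in base R, which may help show where the extra count comes from. This is only a sketch of the arithmetic, not a description of how degree() works internally: the first count tallies every row of the edgelist as its own tie, while the second collapses the rows into a binary symmetric adjacency matrix first. The names raw.degree, adj and mat.degree are made up for illustration.

```r
edge.dat <- data.frame(Source = c("a", "b", "c", "d", "a", "b"),
                       Target = c("b", "a", "d", "a", "c", "d"),
                       stringsAsFactors = FALSE)

# Count each endpoint once per edgelist row; the a-b tie appears in two rows
# (a->b and b->a), so it contributes twice to both a and b.
raw.degree <- table(c(edge.dat$Source, edge.dat$Target))

# Build a binary symmetric adjacency matrix; writing the same cell twice
# still leaves a single 1, so the repeated a-b tie collapses to one tie.
v <- sort(unique(c(edge.dat$Source, edge.dat$Target)))
adj <- matrix(0, length(v), length(v), dimnames = list(v, v))
adj[cbind(edge.dat$Source, edge.dat$Target)] <- 1
adj[cbind(edge.dat$Target, edge.dat$Source)] <- 1
mat.degree <- rowSums(adj)

# Side-by-side comparison, in vertex order a, b, c, d
data.frame(names = v,
           raw.degree = as.integer(raw.degree),
           mat.degree = as.integer(mat.degree))
```

The raw.degree column matches the degree column above, and mat.degree matches degree.mat, which is consistent with the a-b tie being stored as two separate edges in the network object.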

It is not clear this is actually a bug, as I am feeding it an edgelist that is technically directed and telling it to treat it as undirected. But I assume this could happen in applications where symmetric ties are being pulled from large archival sources and the researcher may not know there is both an a->b tie and a b->a tie in the resulting edgelist. I stumbled upon this issue in networks developed from text analysis, when a colleague and I were getting different degree scores from one another. There may also be something simple I am just not seeing, and I would be grateful if someone could point it out.
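
One possible workaround, sketched here in base R under the assumption that reciprocated rows should count as a single undirected tie, is to put the endpoints of each row in a fixed order and drop the duplicates before building the network. The names lo, hi and edge.undirected are hypothetical and chosen only for this example.

```r
edge.dat <- data.frame(Source = c("a", "b", "c", "d", "a", "b"),
                       Target = c("b", "a", "d", "a", "c", "d"),
                       stringsAsFactors = FALSE)

# Order the endpoints of each row alphabetically so that a->b and b->a
# become the same row (a, b).
lo <- pmin(edge.dat$Source, edge.dat$Target)
hi <- pmax(edge.dat$Source, edge.dat$Target)

# Keep one row per unordered pair.
edge.undirected <- unique(data.frame(Source = lo, Target = hi,
                                     stringsAsFactors = FALSE))

nrow(edge.undirected)                                      # number of distinct undirected ties
table(c(edge.undirected$Source, edge.undirected$Target))   # degree per vertex
```

The deduplicated data frame could then be passed to network() with matrix.type = "edgelist" and directed = FALSE in the same way as the original edge.dat. Comparing nrow(edge.dat) with nrow(edge.undirected) is also a quick way to check whether an edgelist from an archival source contains reciprocated rows at all.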
Thanks. Best,
Michael

ps. This issue does not happen with adjacency matrices.
set.seed(1813)
amat = rgraph(n = 10, tprob = .1)
amat
net2 = network(amat, directed = FALSE)
data.frame(names = net2 %v% "vertex.names",
           degree = degree(net2, gmode = "graph"),
           degree.mat = degree(as.sociomatrix(net2), gmode = "graph"))
Here there is both a 2->7 and a 7->2 tie, and it is not double counted.
