[statnet_help] Degree calculation discrepancy from edgelist
michael_d_siciliano at yahoo.com
Mon Dec 7 20:42:42 PST 2020
I found a very strange issue when calculating degree scores from a network object built from an edgelist. I am not sure this constitutes a bug, as I discuss below, but I could see it happening to others, so I thought I would bring it to the listserv's attention. Basically, calling 'degree' on the network object and on the sociomatrix of that same network object gives different results.
Here is a quick reproducible example of what is happening. Start by making an edgelist and turning it into a network object.
edge.dat = data.frame(Source = c("a", "b", "c", "d", "a", "b"), Target = c("b", "a", "d", "a", "c", "d"))
net = network(edge.dat, matrix.type = "edgelist", directed = FALSE)
net
 Network attributes:
  vertices = 4
  directed = FALSE
  hyper = FALSE
  loops = FALSE
  multiple = FALSE
  bipartite = FALSE
  total edges= 6
    missing edges= 0
    non-missing edges= 6
Vertex attribute names: vertex.names
No edge attributes
The network object indicates 6 edges in an undirected network, but really there are only 5. This creates issues when calculating degree centrality, because it double counts the tie between a and b: once for the a->b tie and once for the b->a tie, even though the network is stored as undirected and centrality is being calculated with gmode = "graph". This is best seen by comparing the degree scores from the following calculations. When calculating degree from the network object, actors a and b each have a degree score that is 1 larger than their degree score based on the sociomatrix of that same network object.
data.frame(names = net %v% "vertex.names", degree = degree(net, gmode = "graph") , degree.mat = degree(as.sociomatrix(net), gmode = "graph"))
names degree degree.mat
a     4      3
b     3      2
c     2      2
d     3      3
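The two columns above can be reproduced without any statnet functions, which may help show where the extra count comes from. The following is a minimal base-R sketch, not a description of how network or sna compute degree internally. It tallies each vertex once per edgelist row, then again after collapsing a->b and b->a into one unordered pair. The names lo, hi, key, degree.raw and degree.unique are introduced here for illustration only.

```r
# Same edgelist as in the example above
edge.dat <- data.frame(Source = c("a", "b", "c", "d", "a", "b"),
                       Target = c("b", "a", "d", "a", "c", "d"),
                       stringsAsFactors = FALSE)

# Put each pair in a fixed order so that a->b and b->a get the same key
lo  <- pmin(edge.dat$Source, edge.dat$Target)
hi  <- pmax(edge.dat$Source, edge.dat$Target)
key <- paste(lo, hi, sep = "--")

# Rows that repeat an unordered pair already seen earlier in the edgelist
dup <- duplicated(key)

# Degree when every row counts as its own edge (6 edges)
degree.raw <- table(c(edge.dat$Source, edge.dat$Target))

# Degree when each unordered pair counts only once (5 edges)
degree.unique <- table(c(lo[!dup], hi[!dup]))

data.frame(names         = names(degree.raw),
           degree.raw    = as.vector(degree.raw),
           degree.unique = as.vector(degree.unique))
```

Under these assumptions, degree.raw matches the degree column and degree.unique matches the degree.mat column, which is consistent with the a-b tie being stored as two separate edges in the network object.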
It is not clear that this is actually a bug, since I am feeding it an edgelist that is technically directed and telling it to treat it as undirected. But I assume this could happen in applications where symmetric ties are being pulled from large archival sources and the researcher may not know there is both an a->b tie and a b->a tie in the resulting edgelist. I stumbled upon this issue with networks developed from text analysis, when a colleague and I were getting different degree scores from one another. There may also be something simple I am just not seeing, and I would be grateful if someone could point it out.
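For edgelists pulled from large sources, one possible precaution is to check for repeated undirected pairs before building the network object. The sketch below is base R only. The helper repeated_pairs is hypothetical (it is not part of network or sna) and assumes the first two columns of the edgelist hold the sender and receiver.

```r
# Hypothetical helper: report unordered pairs that occur more than once.
# Assumes columns 1 and 2 of `el` are the two endpoints of each tie.
repeated_pairs <- function(el) {
  src <- as.character(el[[1]])
  tgt <- as.character(el[[2]])
  # Order the endpoints so that a->b and b->a produce the same label
  key <- paste(pmin(src, tgt), pmax(src, tgt), sep = " -- ")
  counts <- table(key)
  counts[counts > 1]
}

edge.dat <- data.frame(Source = c("a", "b", "c", "d", "a", "b"),
                       Target = c("b", "a", "d", "a", "c", "d"),
                       stringsAsFactors = FALSE)

# Pairs listed here would be counted more than once if each row became an edge
res <- repeated_pairs(edge.dat)
res
```

An empty result would suggest the edgelist has no reciprocated or repeated rows. A non-empty one, as here for the a-b pair, points to the rows to collapse before treating the data as undirected.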
ps. This issue does not happen with adjacency matrices.
set.seed(1813)
amat = rgraph(n=10, tprob = .1)
amat
net2 = network(amat, directed = FALSE)
data.frame(names = net2 %v% "vertex.names", degree = degree(net2, gmode = "graph") , degree.mat = degree(as.sociomatrix(net2), gmode = "graph"))
There is a 2->7 and 7->2 tie that is not double counted.
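One plausible reason the matrix route is immune, offered as an illustration and not as a statement about what network() does internally: a binary adjacency matrix has a single cell per ordered pair, so once it is symmetrized a reciprocated tie and a one-way tie look identical. A minimal base-R sketch with a small hand-made matrix (toy.mat, toy.sym and degree.sym are illustrative names, not the amat above):

```r
# Toy directed adjacency matrix on 4 vertices
toy.mat <- matrix(0, nrow = 4, ncol = 4)
toy.mat[1, 2] <- 1   # 1->2
toy.mat[2, 1] <- 1   # 2->1 (reciprocates the tie above)
toy.mat[1, 3] <- 1   # 1->3 only
toy.mat[3, 4] <- 1   # 3->4 only

# Symmetrize: a tie exists between i and j if either direction is present
toy.sym <- (toy.mat + t(toy.mat)) > 0

# Undirected degree is the row sum of the symmetrized matrix
degree.sym <- rowSums(toy.sym)
degree.sym
```

Here the reciprocated 1-2 tie contributes 1 to each endpoint's degree, the same as the one-way ties, because the symmetrized cell can only be TRUE or FALSE.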