[statnet_help] Constraint blocks of dyads via 'constraints = ~ Dyads(vary, fix)'

Hunter, David Russell dhunter at stat.psu.edu
Sat Dec 12 15:45:01 PST 2020


Hi, Karim. I think I know what’s going on with your new code. Coincidentally, in a Zoom conversation with Pavel K a few days ago about this very topic (i.e., the Dyads operator), I believe he said that the current implementation of Dyads doesn’t scale well because it keeps track of literally all of the dyads in the full network. This is inefficient from a storage perspective of course, and so the warning message you got indicates how Pavel has handled the issue of networks that are too large for now: He’s traded the exact calculations that are possible for smaller networks for the approximations one has to accept when Monte Carlo methods are used. (BTW, this sort of tradeoff is made all the time by the ergm package; it’s how maximum likelihood estimation for general networks is even possible.)

Bottom line, I’m not sure you need to “fix” anything, and if you truly need the network sizes you indicate then for the time being you may have to live with the approximations. That said, I’m not nearly as familiar with the code base these days as folks like Pavel and others, so there may be an alternative for implementing block-diagonal constraints. That’s originally what you’d wanted to do, correct? Let me know if I’ve got any of this badly wrong.

Best,
Dave

From: Karim Khader <Karim.Khader at hsc.utah.edu>
Date: Saturday, December 12, 2020 at 1:30 PM
To: "Hunter, David Russell" <dhunter at stat.psu.edu>, "statnet_help at u.washington.edu" <statnet_help at u.washington.edu>
Subject: RE: [statnet_help] Constraint blocks of dyads via 'constraints = ~ Dyads(vary, fix)'

Dear Dave,

Thank you very much for your detailed and clear response to my question. It really helped me to better understand the calculations, and also pointed me to an error in my example. The error was in the creation of the node list (“vertex.attr”) that did not account appropriately for the nodes that were not part of an edge. I also inadvertently made “Flomarriage” a directed network which was not my intention.

I have fixed my example (new code below if you are interested), and now including two copies of ‘flomarriage’ into a larger network ‘Flomarriage’ does result in e2 and e3 giving the exact same estimate, which is what I was expecting to see.
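As a quick sanity check on why e2 and e3 should agree once the network is undirected, here is a minimal sketch of the arithmetic. It assumes each copy contributes 16 nodes and 20 edges (the figures quoted for flomarriage elsewhere in this thread), that no edges cross between copies, and that an edges-only estimate equals the logit of the density over the dyads that are allowed to vary. The variable names are illustrative only.

```r
# Log-odds helper (illustrative)
logit <- function(p) log(p / (1 - p))

N <- 2                 # number of copies of flomarriage
nodes_per_copy <- 16   # assumed size of each copy
edges_per_copy <- 20   # assumed edge count of each copy

# Undirected dyads and edges that stay within a single copy
within_dyads <- N * choose(nodes_per_copy, 2)
within_edges <- N * edges_per_copy

# Density over within-copy dyads vs. density of a single copy
est_combined <- logit(within_edges / within_dyads)
est_single   <- logit(edges_per_copy / choose(nodes_per_copy, 2))

print(c(combined = est_combined, single = est_single))
```

Because the edge count and the within-copy dyad count both scale with N, the two ratios coincide for any N under these assumptions.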

But the problem that motivated the original example and question remains. I have 95 networks that I am trying to combine into a single network, and I want to prevent edges from crossing between the 95 networks. I suspect that it is actually the size/scope of the model I am working with that is causing the problem. In fact, if you run the code below with N > 88, you will see that e2 and e3 are no longer the same, and there is a warning message “In ergm.pl(nw, fd, model, verbose = verbose, control = control, : Too many unique dyads. MPLE is approximate, and MPLE standard errors are suspect.”
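To get a feel for the sizes involved, the following sketch counts dyads for a few values of N. It assumes every copy has 16 nodes, as in the flomarriage example, and that the combined network is undirected. It only does arithmetic and does not call ergm. Whether the total dyad count is what actually triggers the warning is an assumption that this thread does not confirm.

```r
nodes_per_copy <- 16            # assumed size of each copy
N <- c(2, 88, 89, 95)           # numbers of copies to compare

n_nodes     <- N * nodes_per_copy
total_dyads <- choose(n_nodes, 2)              # all undirected dyads
free_dyads  <- N * choose(nodes_per_copy, 2)   # dyads within a single copy

counts <- data.frame(N, n_nodes, total_dyads, free_dyads)
print(counts)
```

Under these assumptions the total number of dyads passes one million between N = 88 and N = 89, while the number of dyads allowed to vary stays near ten thousand. That is at least suggestive that the full dyad count, rather than the constrained one, is what grows too large.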

I am new to statnet, so I am not sure at the moment how to fix this issue. If you have any advice, I would be very grateful to hear your thoughts.

Thanks again for your help!

Best,

Karim

require(statnet)
require(dplyr)
require(boot)

data(florentine)
### Create N independent copies of 'flomarriage'
N <- 2
Florentine <- list()
for (i in 1:N){
  Florentine[[i]] <- flomarriage
}

### Create new edgelist that incorporates the N independent copies,
### and assigns each node a 'group' id depending on which group they belong to
EL <- list()
for (i in 1:N){
  EL[[i]] <- as.edgelist(Florentine[[i]]) + (i-1)*16
  EL[[i]] <- as.data.frame(EL[[i]])%>%
    mutate(group = i)%>%
    rename(from = V1, to = V2)
}
EL <- do.call("rbind", EL)
NL <- data.frame(id = seq(from = 1, to = N*16, by = 1),
                 group = unlist(lapply(c(1:N), FUN = function(x){rep(x, times = 16)})))

### Create new Network that represents N independent copies of flomarriage
Flomarriage <- network(select(EL, from, to), vertex.attr = NL, directed = FALSE)

e1 <- ergm(Flomarriage ~ edges)
e2 <- ergm(flomarriage ~ edges)
e3 <- ergm(Flomarriage ~ edges, constraints = ~ Dyads(vary = ~nodematch("group"), fix = NULL))


From: Hunter, David Russell <drh20 at psu.edu>
Sent: Friday, December 11, 2020 8:01 AM
To: Karim Khader <Karim.Khader at hsc.utah.edu>; statnet_help at u.washington.edu
Subject: Re: [statnet_help] Constraint blocks of dyads via 'constraints = ~ Dyads(vary, fix)'

Dear Karim,

Thanks for including such an easy-to-reproduce example! Let me know if this helps explain what’s going on with your code:


> table(Flomarriage %v% "group")


1 2
15 15

Your “Flomarriage” network is directed, so the output above reveals that there are (15x14) plus (15x14), or 420, different dyads that have nonzero change statistics for the nodematch(“group”) ergm model term. This is out of 30*29, or 870, total dyads in the network, leaving 450 that have zero change stats for the nodematch term.
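The dyad counts above can be reproduced with a few lines of base R. This is only a sketch of the counting argument: it takes the group sizes from the table() output as given and assumes a directed network, so every ordered pair of distinct nodes is a dyad. The variable names are illustrative.

```r
group_sizes <- c(15, 15)        # from table(Flomarriage %v% "group")
n_nodes <- sum(group_sizes)     # 30

# Directed network: each ordered pair of distinct nodes is one dyad
total_dyads   <- n_nodes * (n_nodes - 1)                # 870
within_dyads  <- sum(group_sizes * (group_sizes - 1))   # 420, same-group pairs
between_dyads <- total_dyads - within_dyads             # 450, cross-group pairs

print(c(total = total_dyads, within = within_dyads, between = between_dyads))
```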


> summary(Flomarriage ~ edges + nodematch("group"))

edges nodematch.group
40 19

The output above reveals that 19 of the 420 possible dyads that DO have nonzero change stats are edges, whereas 40 - 19, or 21, of the 450 possible dyads that DO NOT have nonzero change stats are edges. So we can calculate by hand the log-odds, or logit, of both 19/420 and 21/450, and these give the values of parameter estimates for the “edges” term in networks that hold fixed the dyads in one category or the other, as follows:

The “Dyads” operator with “fix” will fix the dyads with nonzero change stats, leading to the logit(21/450) answer:

> logit <- function(p) log(p/(1-p))

> logit(21/450)

[1] -3.016934

> ergm(Flomarriage~edges, constraints = ~ Dyads(fix = ~nodematch("group"), vary = NULL))$coef

edges
-3.016934

Similarly, if we use the “vary” option with the “Dyads” operator, we’ll fix the zero-change-stat dyads, which yields the logit(19/420) answer:

> logit(19/420)

[1] -3.049522

> ergm(Flomarriage~edges, constraints = ~ Dyads(vary = ~nodematch("group"), fix = NULL))$coef

edges
-3.049522

On the other hand, the “flomarriage” network is undirected with 20 edges and 16 nodes and thus (16-choose-2), or 120, dyads. This leads to an “edges” coefficient of logit(20/120):

> logit(20/120)

[1] -1.609438

> ergm(flomarriage~edges)$coef

edges
-1.609438

…and “Flomarriage” without any constraints has 40 edges out of a possible 870:

> logit(40/870)

[1] -3.032546

> ergm(Flomarriage~edges)$coef

edges
-3.032546

Hope this helps,
Dave


On Dec 10, 2020, at 6:46 PM, Karim Khader <Karim.Khader at hsc.utah.edu> wrote:

All,

Does anyone have experience using the Dyads constraint (i.e. constraints = ~ Dyads(vary = ~nodematch("group"), fix = NULL)) within ‘ergm’ to impose a block diagonal structure on the network adjacency matrix? I have recently tried implementing it but am finding that the model estimates do not make sense when the constraint is used.

Code for a simple example that illustrates my concern is included below. In the example, I understand that e1 should be different from e2 and e3, but I would have thought that e2 and e3 should return the same estimate (since the # of possible edges and the total # of edges represented in e3 is a multiple of those represented in e2). Am I simply misunderstanding how constraints = ~ Dyads() is supposed to work?

Best,

Karim Khader
Assistant Professor
Division of Epidemiology
University of Utah
Salt Lake City, Utah


require(statnet)
require(dplyr)

data(florentine)
### Create N independent copies of 'flomarriage'
N <- 2
Florentine <- list()
for (i in 1:N){
  Florentine[[i]] <- flomarriage
}

### Create new edgelist that incorporates the N independent copies,
### and assigns each node a 'group' id depending on which group they belong to
EL <- list()
for (i in 1:N){
  EL[[i]] <- as.edgelist(Florentine[[i]]) + (i-1)*16
  EL[[i]] <- as.data.frame(EL[[i]])%>%
    mutate(group = i)%>%
    rename(from = V1, to = V2)
}
EL <- do.call("rbind", EL)
NL <- rbind(EL%>%select(from, group)%>%rename(id=from),
            EL%>%select(to, group)%>%rename(id=to))%>%
  arrange(group, id)%>%
  distinct(.keep_all = T)

### Create new Network that represents N independent copies of flomarriage
Flomarriage <- network(select(EL, from, to), vertex.attr = NL, directed = TRUE)

e1 <- ergm(Flomarriage ~ edges)
e2 <- ergm(flomarriage ~ edges)
e3 <- ergm(Flomarriage ~ edges, constraints = ~ Dyads(vary = ~nodematch("group"), fix = NULL))