[statnet_help] Basis for simulate and GOF

Carter T. Butts buttsc at uci.edu
Sun Aug 9 19:02:33 PDT 2020


Hi, Adam -

In addition to Martina's excellent advice, I would also note that the
number of steps required for effective MCMC convergence depends on
graph size.  The gof() defaults are set up to be (1) fast, and (2)
reasonably effective for "typical" models and graph sizes encountered
in common social network settings.  If you want higher-quality draws,
or if your graph is larger (or, as Martina noted, your model mixes
slowly), you will need to increase the number of draws, the burn-in,
or the thinning interval via the control() functions (see also the
ergm tutorials on this point).
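
For example, a minimal sketch (control.simulate.ergm() and
control.gof.ergm() are the relevant control functions; the specific
values below are placeholders to tune, not recommendations, and
model.i stands for your fitted ergm object, as in your code quoted
below):

library(ergm)

## Higher-quality draws from a fitted model, via a longer burn-in and
## a wider thinning interval than the defaults:
sims <- simulate(model.i, nsim = 500,
                 control = control.simulate.ergm(MCMC.burnin   = 1e5,
                                                 MCMC.interval = 1e4))

## The analogous controls for gof():
model.gof <- gof(model.i,
                 control = control.gof.ergm(nsim = 200,
                                            MCMC.burnin   = 1e5,
                                            MCMC.interval = 1e4))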

My own heuristic for getting high-quality mixing is to scale the
thinning parameter as approximately k*nv^2, where nv is the number of
vertices.  This provides enough draws between each graph realization
for every edge variable to be toggled an average of k times.  (Note
that toggles will not necessarily be distributed evenly, because our
standard proposals use schemes more efficient than selecting dyads
uniformly at random, but this is still a helpful mnemonic
approximation.)  If your model is very "sticky," you may need k to be
on the order of 100 or more to get good results.  For models with a
much lower level of dependence, you may get perfectly fine results
with k<10.  You can make some guesses about what k may need to be by
looking at your model and thinking about the conditional probability
of converting an edge to a null or vice versa under typical structural
conditions (which you can work out from the change score): how many
"tries" would you expect it to take for a toggle to be successful?
This, too, is a heuristic (e.g., it does not take into account the
phenomenon in which "easy" toggles on other edge variables "soften up"
an initially hard-to-accept toggle), but it can be helpful.  In the
end, though, the most effective thing to do is often to perform some
short pilot runs and work out how long it takes to get decent mixing
by looking at the diagnostics that Martina suggested.
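
To make the scaling concrete, a quick sketch along those lines (the
choice of k is the judgment call discussed above; model.i is again
your fitted model):

library(ergm)  # also loads the network package

nv <- network.size(model.i$network)  # vertices in the observed network
k  <- 20   # mean toggles per edge variable; model-dependent, see above
sims <- simulate(model.i, nsim = 100,
                 control = control.simulate.ergm(MCMC.burnin   = k * nv^2,
                                                 MCMC.interval = k * nv^2))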

(FWIW, all of this also depends on your objectives.  If you just need
convergence on the first few moments of a particular set of graph
statistics, you may be able to do very well with many fewer than nv^2
toggles per draw (especially if the graph is very sparse).  If your
application is very sensitive to the identities of specific edges, and
you need very high-quality randomization over the space of graphs,
then you may need much longer Markov chains.  As usual, there is no
one-size-fits-all solution, because different models and different
applications impose different requirements.  One must inspect the
results to ensure that what one is getting is adequate for one's
particular needs.)

Finally, if you want a deeper understanding of these issues, I would
suggest taking a look at the more general literature on MCMC
convergence diagnostics.  (These days, any text on computational or
Bayesian statistics is likely to review the basics, and frankly one
gets into diminishing returns very quickly beyond that.)  Simulation
of ERGM draws by MCMC is not fundamentally different from other MCMC
simulation problems, and even a quick perusal of "MCMC basics" will
provide very useful insights into why the ergm tools behave as they
do.
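
As a concrete starting point, you can apply standard coda diagnostics
directly to simulated model statistics.  A minimal sketch, assuming a
reasonably recent ergm in which simulate() accepts output="stats":

library(ergm)
library(coda)

## Draw model statistics (rather than whole networks) so that mixing
## can be diagnosed cheaply; output="stats" gives one row per draw:
stats <- as.mcmc(simulate(model.i, nsim = 1000, output = "stats"))

plot(stats)            # trace plots: look for the "fuzzy caterpillar"
autocorr.diag(stats)   # serial correlation at increasing lags
effectiveSize(stats)   # effective sample size per statistic

(You can also run mcmc.diagnostics(model.i) on the fit itself, per
Martina's suggestion below.)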

Hope that helps,

-Carter

On 8/9/20 12:56 PM, Martina Morris wrote:

> Hi Adam,
>
>> Some background: I’ve fitted a series of models of increasing
>> complexity using ergm, and estimated the “mean connectivity” (the
>> marginal of each of the edges in the network) under each of them:
>>
>> model.i <- ergm(g ~ ...)
>> samples <- lapply(simulate(model.i, nsim=500), FUN=as.matrix)
>> mean_samp <- as.matrix(Reduce("+", samples) / length(samples))
>
> it would actually help to see what your model terms are.
>
>> I also conducted a similar GOF comparison using the gof function.
>> For some models, the mean of the samples (as well as the
>> distribution of different statistics in the gof function) was
>> suspiciously similar to the original network.  Specifically, the
>> models seemed to capture different “inhomogeneous”/“symmetry
>> breaking” properties of the original network that they shouldn’t be
>> able to, according to their terms/covariates.  My current
>> understanding is that this resulted from the default behaviour of
>> “simulate”, which started from the original data matrix, and simply
>> “didn’t get far enough”.
>
> there are a couple of possible explanations for this:
>
> 1. one is how strongly the model constrains the tie distribution.  in
> some cases, these constraints can severely reduce the sample space of
> networks defined by the model.  in that case, it becomes very
> difficult to take a "step" in the MCMC process, because the
> probability of a change (to the sufficient stats in the model) is so
> low for most proposals.
>
> 2. the other is that once you "pin down" some key lower order
> properties (like the degree distribution and mixing by nodal
> attributes), many of the higher order graph properties (like
> component size and geodesic distributions) are also often
> constrained.  you may change the individual node-id in a specific
> position, but the positions are fairly stable.  we see this a lot in
> our infectious disease modeling simulations.  there's a good example
> in figure 2 of this paper:
>
> Krivitsky, P. N. and M. Morris (2017). "Inference for social network
> models from egocentrically sampled data, with application to
> understanding persistent racial disparities in HIV prevalence in the
> US."  Annals of Applied Statistics 11(1): 427-455.
> doi:10.1214/16-AOAS1010
> https://projecteuclid.org/euclid.aoas/1491616887
>
>> I’d really appreciate your answers for the following two questions:
>>
>>  *  Is there any quantitative way to measure whether the samples
>>     generated by “simulate” (or “gof”) are “independent enough” for
>>     reliable estimation of single-edge marginals or other statistics
>>     of interest?
>
> have you looked at your MCMC diagnostics from the fits?  i'd suggest
> starting there.  if there is visible serial correlation, that
> suggests #1 above, and you might want to increase your MCMC.interval.
> what you want to see is a fuzzy caterpillar.  if you already have
> that for all of your models, that suggests the issue is #2 above.
>
>>  *  Are there any obvious drawbacks to supplying a “naive” basis
>>     (like a Bernoulli graph with the same density) for simulate
>>     and/or gof?
>
> just time.  keep in mind the MCMC burnin time will need to be
> increased so that you can reach the target statistics.  but you can
> try this and see whether, once you do reach the targets, you find the
> same lack of variation in the higher order stats.  if so, that again
> points to #2 above.
>
> best,
> mm
>
>> Respectfully,
>> Adam Haber
>
> ****************************************************************
>  Professor Emerita of Sociology and Statistics
>  Box 354322
>  University of Washington
>  Seattle, WA 98195-4322
>
>  Office:        (206) 685-3402
>  Dept Office:   (206) 543-5882, 543-7237
>  Fax:           (206) 685-7419
>
> morrism at u.washington.edu
> http://faculty.washington.edu/morrism/

