[statnet_help] Convergence assessment in small networks/with limited data

David Kretschmer dkretsch at mail.uni-mannheim.de
Wed Oct 7 08:49:32 PDT 2020


Hi Martina,

Thank you again! Now that I have dug deeper into the convergence and GOF analysis, one more question has come up based on your suggestions.

For the networks in which the distribution plots from the convergence analysis show bimodal patterns, I now produce comparable distribution plots based on GOF simulations. If these distributions are bimodal as well, I conclude that the model does not adequately reproduce the characteristics of the actual network, and I exclude that network from the subsequent meta-analysis.
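A minimal sketch of the screening step described above, assuming the simulated values of one network statistic are already available as a plain list of integers. The function names, the `min_share` threshold, and the example data are hypothetical and only illustrate the idea; the check is crude and does not replace looking at the plots.

```python
from collections import Counter


def frequency_table(values):
    """Return {value: count}, sorted by value, for an integer-valued statistic."""
    return dict(sorted(Counter(values).items()))


def has_interior_dip(values, min_share=0.05):
    """Crude bimodality flag (an assumption, not an ergm diagnostic).

    Returns True if some value strictly between the smallest and largest
    observed value is less frequent than a peak on each side of it, where
    both side peaks hold at least `min_share` of all draws. The threshold
    keeps isolated draws in a sparse tail from triggering the flag.
    """
    counts = Counter(values)
    lo, hi = min(counts), max(counts)
    freqs = [counts.get(v, 0) for v in range(lo, hi + 1)]
    floor = min_share * len(values)
    for i in range(1, len(freqs) - 1):
        left_peak = max(freqs[:i])
        right_peak = max(freqs[i + 1:])
        if min(left_peak, right_peak) >= floor and freqs[i] < min(left_peak, right_peak):
            return True
    return False


# Hypothetical draws: deviations piling up at -1 and 1 (u-shaped) versus
# a right-skewed statistic that is bounded below by 0.
u_shaped = [-1] * 45 + [0] * 10 + [1] * 45
right_skewed = [0] * 60 + [1] * 25 + [2] * 10 + [3] * 5

print(frequency_table(u_shaped), has_interior_dip(u_shaped))
print(frequency_table(right_skewed), has_interior_dip(right_skewed))
```

The flag separates the u-shaped pattern from a merely asymmetric one, which is the distinction the next paragraph asks about.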

However, how do I judge asymmetric distribution plots for the GOF simulations? For example, when my observed network statistic has the value 1, the GOF simulations frequently produce a distribution in which a large number of networks have a statistic of 0 and far fewer have values of 1, 2, 3, 4, etc.

I guess that this is “by design”, because it keeps the mean of the simulations very close to the observed network statistic of 1. However, it also means that in most simulated networks the statistic is zero and thus somewhat off the observed value. Is this a problem? And, for the broader picture, am I still in the domain of “convergence analysis” with this assessment, or in the domain of “goodness of fit analysis”?
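The arithmetic behind this pattern can be sketched with a toy distribution. The geometric distribution below is purely an assumption chosen for illustration (it is not what an ERGM simulation produces); it only shows that a count bounded below by 0 can have a mean of 1 while its single most frequent value is 0.

```python
def geometric_pmf(p, k_max):
    """P(X = k) = p * (1 - p)**k for k = 0..k_max (failures before first success)."""
    return [p * (1 - p) ** k for k in range(k_max + 1)]


# With p = 0.5 the mean (1 - p) / p equals 1; truncating at k = 60 leaves
# a negligible amount of probability mass in the tail.
pmf = geometric_pmf(0.5, 60)

mean = sum(k * pk for k, pk in enumerate(pmf))
mode = max(range(len(pmf)), key=pmf.__getitem__)
share_at_zero = pmf[0]
share_at_one = pmf[1]

print(f"mean={mean:.6f} mode={mode} P(0)={share_at_zero} P(1)={share_at_one}")
```

In this toy case half of the draws are 0 and a quarter are exactly 1, yet the mean still equals 1, so a mean that matches the observed value does not imply that the observed value is the most common outcome.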

Thank you for your help.

Best,
David

--
David Kretschmer
Universität Mannheim
Mannheimer Zentrum für Europäische Sozialforschung (MZES)
A5, 6
68159 Mannheim
Tel.: +49-621-181-2024


> On 2. Oct 2020, at 20:31, martina morris <morrism at uw.edu> wrote:
>
> On Fri, 2 Oct 2020, David Kretschmer wrote:
>
>> thank you very much for the extremely helpful reply! I wasn’t aware of the fact that the distribution/density plots do not bear significance for convergence but only for inference, but what you described makes perfect sense. Given that parameter estimates (and standard errors) in my analysis are fed into a meta-regression model rather than used for network-level inference, I suppose that this is less of a problem for my setup.
>
> the one caveat for this is that if the MCMC netstat distribution plots are bimodal, which should also be seen in the traceplots as jumps between high and low values of the stats, this is not ideal. more below at your q2
>
>> I do have two brief follow-up questions and would be very grateful if you could provide some insight on these too:
>> 1. Regarding the “fuzzy caterpillar”: Given the few observations for the network statistics, the Markov Chain frequently only jumps between a very small number of discrete values (say deviations from the observed value of -1, 0, 1, and 2). Therefore, the chain frequently does not look like a fuzzy caterpillar. However, I suppose that this also is a direct consequence of the low observed number for the network statistic and not an indicator of bad convergence? I suppose that the criterion for MCMC convergence that the chain explores the parameter space then is still fulfilled, but that the parameter space just is much smaller in these situations?
>
> yes. it just means the fuzz doesn't extend far from the deviation = 0 reference line. you can increase the MCMC.interval and MCMC.samplesize to 1e5 and see if the correlation is reduced, but the range probably won't change.
>
>> 2. In some cases, the chains only alternate between deviations from the observed statistic of -1 and 1, and either never or only infrequently take on a deviation of 0. (In the density/distribution plot, this is reflected in a u-shaped distribution, with low density at a value of 0 and high density at -1 and 1). Do you have any idea for why this pattern occurs and what this means for convergence?
>
> bimodal distributions are one dx for model degeneracy -- and they imply model misspecification. the model is reproducing the mean values of the sufficient statistics (the netstats) by averaging over values that are too high, and too low (see https://csss.uw.edu/research/working-papers/assessing-degeneracy-statistical-models-social-networks)
>
> if the deviation values bounce back and forth between 1 and -1, you're not in the classic degeneracy situation where the counts are way off in both directions, so the question is really what happens when you simulate from the fitted model. normally, that should be assessed with the GOF plots, not the MCMC plots, and in particular gof ~ model (by default, this is one of the plots output from that function). but the boxplots in that routine would obscure a persistent bimodal pattern. with your small network, i'd suggest visual inspection of the networks produced by the simulation, to see if they look markedly different than your observed net.
>
>> Thank you,
>> Best
>> David
>>
>> On 2. Oct 2020, at 19:22, martina morris <morrism at uw.edu> wrote:
>>
>> Hi David,
>>
>> There are two dx issues here: convergence and statistical inference. The MCMC plots give some insight into both.
>>
>> Statistical inference:
>>
>> The principles here are similar to traditional statistical inference -- if you have a small number of observations (esp. for subgroups), the sampling distributions of your statistics may not approximate a symmetric or normal distribution. This is exacerbated by the lower bound at 0.
>>
>> The MCMC distribution plots (on the right side of the MCMC dx plot layout) are essentially showing the sampling distribution of the stats under the model. Asymmetries here suggest the statistical inference may be compromised.
>>
>> For small nets, you'll often see sawtooth shaped MCMC dx plots. That is also a function of discrete, integer valued stats that only vary over a small range. Not inherently a problem for statistical inference.
>>
>> Convergence:
>>
>> Convergence is best assessed using the MCMC traceplots on the left hand side of the plot layout. There, you're looking for a "fuzzy caterpillar". What you don't want to see is a plot that trends up or down, or one that has strong serial correlation in the estimates (some modest correlation is not a problem).
>>
>> HTH,
>> Martina
>>
>> On Fri, 2 Oct 2020, David Kretschmer wrote:
>>
>> Dear all,
>>
>> I estimate models on relatively small networks (about 20-30 nodes), including dyadic covariates with limited information, e.g. with only one or two actual tie observations for the dyadic covariate in some of the networks.
>>
>> I wonder about convergence analysis in this setup, in particular when considering density plots.
>>
>> The values for the network statistics related to the dyadic covariate produced during ERGM estimation have to be zero or positive; they cannot be negative. Because the number of tie observations in the empirical network is so low (only one or two instances), the deviations from these observations shown in the convergence analysis very frequently cannot produce symmetrical density plots: Because the number predicted in the ERGM estimation is constrained to be zero or higher, the deviations are also constrained in one direction. This is also what I observe when looking at density plots for these networks.
>>
>> What I would like to know is what this implies *substantively* for the analysis of these networks: Should criteria for convergence be relaxed in such settings, i.e., is it also fine for the density plots to be asymmetric? Or does this simply mean that results from these networks should not be interpreted at all?
>>
>> Any help would be greatly appreciated.
>>
>> Best,
>> David
>> ****************************************************************
>> Professor Emerita of Sociology and Statistics
>> Box 354322
>> University of Washington
>> Seattle, WA 98195-4322
>> Office: (206) 685-3402
>> Dept Office: (206) 543-5882, 543-7237
>> Fax: (206) 685-7419
>> morrism at u.washington.edu
>> http://faculty.washington.edu/morrism/
