[statnet_help] ERGM and statistical power (and effect size)

Carter T. Butts buttsc at uci.edu
Sat May 16 16:19:40 PDT 2020

Hi, Barnaba -

This depends on what you mean by "effect size" (and, for that matter,
the notion of power that you want to use).  Your coefficients may be
directly interpreted in terms of their effect on the conditional
log-odds of a tie being present, and this is the most immediate and
natural notion of "effect size" in a TERGM.  One can also talk about
"effect sizes" in terms of quantities such as the difference in
conditional expectations of selected network properties with a
coefficient at its estimated value versus the coefficient set to zero (a
"knock-out" study, if you will), with these quantities being
approximated by simulation.  This can be a very useful and insightful
approach, but it requires you to be specific about what, precisely, you
want to know about the relationship between the coefficient of interest
and model behavior.  There is no "one size fits all" approach to this
family of questions; often, however, your substantive problem will
strongly motivate a particular choice of outcome, and looking at how
this varies as a function of your parameter of interest can give you a
lot of insight.
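
To make the log-odds reading concrete, here is a minimal sketch in plain
Python (not statnet code). The coefficient values and term names are
hypothetical, chosen only for illustration; the one thing it relies on is
that the conditional log-odds of a tie is the sum of each coefficient times
its change statistic.

```python
import math

def conditional_tie_probability(coefs, change_stats):
    """Conditional probability of a tie, given the rest of the network.

    The conditional log-odds is the sum over terms of
    (coefficient * change statistic for toggling the tie on).
    """
    log_odds = sum(coefs[name] * change_stats[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical coefficients, for illustration only.
coefs = {"edges": -2.0, "nodematch.A": 1.0}

# Same dyad, with and without both endpoints belonging to group A.
p_without = conditional_tie_probability(coefs, {"edges": 1, "nodematch.A": 0})
p_with = conditional_tie_probability(coefs, {"edges": 1, "nodematch.A": 1})

# A coefficient of 1.0 multiplies the conditional odds by exp(1.0).
odds_ratio = math.exp(coefs["nodematch.A"])
print(p_without, p_with, odds_ratio)
```

The shift in probability depends on the baseline, while the shift in
log-odds does not; that is why the coefficient itself is the natural effect
size here.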

(For example, let's say that you are interested in the amount of mixing
between two groups, as assessed e.g. by a nodemix statistic.  You fit a
TERGM with a nodemix effect (among other things), and obtain an
associated parameter estimate.  What is the marginal impact of that
parameter on between-group mixing?  One way to probe that question is to
simulate draws from the fitted model, and from the same model with the
nodemix parameter set to zero, comparing the mean nodemix statistic in
the two cases.  The difference tells you how much more or less mixing
you would expect to obtain, with all other social forces held constant,
if the specific force parameterized by your nodemix term were not
active.  This is only one of many comparisons that one can perform, but
it illustrates the concept.)
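
A rough sketch of such a knock-out comparison follows, in plain Python
rather than statnet. It assumes a toy dyad-independent model with only an
edges term and a single between-group mixing term, so that ties can be drawn
independently; a real TERGM would need the package's own simulation tools.
All coefficient values, group sizes, and function names are hypothetical.

```python
import math
import random

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def mean_between_group_ties(groups, theta_edges, theta_mix, n_sims, rng):
    """Mean number of between-group ties over n_sims simulated networks.

    Toy model: each undirected dyad is an independent Bernoulli draw with
    log-odds theta_edges, plus theta_mix if the endpoints differ in group.
    """
    n = len(groups)
    total = 0
    for _ in range(n_sims):
        for i in range(n):
            for j in range(i + 1, n):
                between = groups[i] != groups[j]
                p = logistic(theta_edges + (theta_mix if between else 0.0))
                tie = rng.random() < p
                if tie and between:
                    total += 1
    return total / n_sims

rng = random.Random(2020)          # fixed seed for reproducibility
groups = ["A"] * 10 + ["B"] * 10   # 100 between-group dyads
theta_edges, theta_mix = -1.0, -0.8  # hypothetical "fitted" values

fitted = mean_between_group_ties(groups, theta_edges, theta_mix, 500, rng)
knockout = mean_between_group_ties(groups, theta_edges, 0.0, 500, rng)

# Expected change in between-group ties attributable to the mixing term.
print(fitted, knockout, fitted - knockout)
```

With a negative mixing coefficient the fitted model should show fewer
between-group ties than the knock-out model, and the difference is the
"effect size" on the scale of the statistic itself.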

With respect to power, this always becomes complex as soon as one leaves
the world of simple null hypothesis tests.  As a purely practical
matter, you may get more mileage out of answering a closely related but
distinct question: what hypotheses regarding your parameters can you
/not/ reject, given the data?  This amounts to looking at your
confidence intervals.  E.g., if you have a nodematch effect for
membership in group A with a point estimate of 1 and a 95% CI of 0.5 to
1.5, then an associated null hypothesis test would reject the hypothesis
(at the 0.05 level) that the conditional log-odds of an i,j edge are
increased by less than 0.5 when i and j both belong to A, or likewise
that the conditional log-odds of an i,j edge are increased by more than
1.5 in the same circumstance.  You can thus /exclude/ (in a Frequentist
sense) effect sizes (on log-odds scale) smaller than 0.5, or larger than
1.5.  If you have very little power for estimating an effect, you will
generally find that the range of values that cannot be excluded is very
large (i.e., the CIs are wide); conversely, if your CIs are narrow, then
this is telling you that you have enough precision to be able to exclude
all values (at least in terms of rejection of the associated null
hypothesis test) outside of a very narrow range.  If your reviewer is
not asking about power merely for the satisfaction of asking about it,
then you may well be able to get at their substantive concern by a
closer interpretation of your confidence intervals.  (This is especially
true for non-significant results, where you may not be able to determine
the sign of an effect, but may still be able to reject the hypothesis
that the effect is large.  From a substantive standpoint, this is often
enough to falsify a theory.)
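
The interval reasoning above can be sketched in a few lines of plain Python.
This assumes ordinary Wald-type intervals (estimate plus or minus a normal
quantile times the standard error); the standard errors and the second,
non-significant estimate are hypothetical values picked for illustration.

```python
from statistics import NormalDist

def wald_ci(estimate, std_error, level=0.95):
    """Wald-type confidence interval on the log-odds scale."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * std_error, estimate + z * std_error

def is_excluded(value, ci):
    """True if a hypothesized effect size lies outside the interval."""
    lo, hi = ci
    return value < lo or value > hi

z95 = NormalDist().inv_cdf(0.975)

# Standard error chosen so the 95% CI runs from 0.5 to 1.5, as in the text.
ci_a = wald_ci(1.0, 0.5 / z95)

# Hypothetical non-significant effect: the sign is undetermined,
# but a large effect (e.g. 1.0 on the log-odds scale) is still ruled out.
ci_ns = wald_ci(0.1, 0.15)

print(ci_a, is_excluded(0.4, ci_a), is_excluded(1.2, ci_a))
print(ci_ns, is_excluded(0.0, ci_ns), is_excluded(1.0, ci_ns))
```

In the second case zero is inside the interval, so the effect is not
significant, yet an effect as large as 1.0 can still be rejected.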

Finally, it is possible to define R^2-like measures for TERGMs et al.,
but to date I don't know that anyone has found such things to be very
helpful.  In particular, you can use any of the many deviance reduction
indices (aka "deviance R^2" measures) for an ERGM or a TERGM, as one
would with any other model with a well-defined deviance.  The reduction
in deviance from the null deviance is certainly associated with improved
fit (and is of course the basis for the AIC et al.), but my sense is
that measures like 1-(residual deviance)/(null deviance) are only weakly
and heuristically related to how well the model works in practice. 
Still, if your reviewer wants a deviance R^2, it's an easy and probably
harmless thing to provide.
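
For completeness, the index mentioned above is a one-line calculation once
you have the two deviances from your model summary. The sketch below is
plain Python, and the deviance values in it are made up for illustration.

```python
def deviance_r2(null_deviance, residual_deviance):
    """Proportional reduction in deviance relative to the null model."""
    if null_deviance <= 0:
        raise ValueError("null deviance must be positive")
    return 1.0 - residual_deviance / null_deviance

# Hypothetical deviances, for illustration only.
r2 = deviance_r2(null_deviance=1000.0, residual_deviance=750.0)
print(r2)
```

Which model counts as the "null" is itself a choice, so it is worth stating
that choice alongside the number.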

These are complex questions, but hopefully the above is useful!


On 5/16/20 9:42 AM, Barnaba Danieluk wrote:

> Dear Statnet,
>
> One of the reviewers asked me about statistical power of my TERGM
> model. Is there any parameter in ERGM (TERGM) that I can use as a
> effect size indicator? Is it something similar to R-square in ERGM?
> And how can I count the statistical power of my analysis?
>
> I would be very grateful for every advice. Thank you in advance
>
> Best regards,
> Barnaba

