[statnet_help] Modeling group formation with ERGMs

Hoffman Marion
Thu Aug 20 02:59:52 PDT 2020

Hi Flavien,

It just happens that I've been working on a model specifically for such data (together with Per Block and Tom Snijders), since I believe classic ERGMs are not ideally suited to them. The idea is to define exponential family distributions for partitions of nodes (sets of mutually exclusive groups of nodes) instead of networks, in order to capture the particular constraints and dependencies of this type of group structure.
Much of the theory rests on the same foundations as the ERGM, but model specifications and interpretations differ, and sufficient statistics/effects are defined at the level of groups rather than dyads. For example, homophily at the group level can be modelled in different ways (you might say that a group is homophilous if everyone is the same, or if everyone has at least one person similar to them in the group, etc.).
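
To make the two readings of group-level homophily concrete, here is a minimal Python sketch that counts homophilous groups in a partition under each definition. It is only an illustration: the function names, the toy attribute data, and the choice of a simple count as the statistic are assumptions, not the specification used in the model mentioned above.

```python
from collections import Counter


def all_same(group, attr):
    """Strict definition: every member of the group has the same attribute value."""
    return len({attr[i] for i in group}) == 1


def everyone_has_match(group, attr):
    """Looser definition: each member shares their value with at least one other member."""
    counts = Counter(attr[i] for i in group)
    return all(counts[attr[i]] >= 2 for i in group)


def count_groups(partition, attr, predicate):
    """Group-level statistic: number of groups in the partition that satisfy the predicate."""
    return sum(1 for group in partition if predicate(group, attr))


# Toy data (hypothetical): eight participants with a binary attribute, in three groups.
attr = {"a": "x", "b": "x", "c": "x", "d": "x", "e": "y", "f": "y", "g": "x", "h": "y"}
partition = [{"a", "b"}, {"c", "d", "e", "f"}, {"g", "h"}]

strict_count = count_groups(partition, attr, all_same)
loose_count = count_groups(partition, attr, everyone_has_match)
print(strict_count, loose_count)
```

The second group (two "x" and two "y") is not homophilous under the strict definition but is under the looser one, which is why the two statistics can lead to different interpretations.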

So far I have used it to model the composition of teams formed during hackathons and physical group gatherings, so it seems to fit your problem. Unfortunately the paper explaining all the math and the strategies to specify/estimate such models is in the submission process right now... But if you're interested, I can share a conference presentation and maybe we can have a chat.

Let me know if you're interested. I hope this helps you figure out how to analyze your data.

Best regards!

Marion Hoffman
PhD Student
Chair of Social Networks
ETH Zürich
Weinbergstrasse 109, H17
8006 Zürich, Switzerland
+33 6 31 23 69 24
From: statnet_help <statnet_help-bounces at mailman13.u.washington.edu> on behalf of Flavien Ganter <fg2465 at columbia.edu>
Sent: 19 August 2020 18:55:09
To: statnet_help at u.washington.edu
Subject: [statnet_help] Modeling group formation with ERGMs

Hi all,

I am working on data from a series of experiments in which participants cluster into groups. Every participant should be in one, and only one, group; a group is formed by at least two participants, and the overall number of groups that participants can form is bounded ([4;12]). I'm interested in studying homophily in the group formation process.

I wanted to model this data with a series of ERGMs (one per experiment), by considering edges as representations of group membership; that is, participants in the same group would all be tied to each other, and participants in different groups would never be tied to each other.
So, I would like to specify an ERGM with two constraints:

1. Each node has to be in one, and only one, complete and isolated community, that is, tied to every other node in the community and not tied to any node outside the community. My understanding is that this would be equivalent to constraining simulated networks to not have any stars / open triads: if there is an indirect path between two nodes, there has to be an edge between these two nodes that closes the triangle.
2. The number of groups is between 4 and 12.
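
To spell out what these two constraints mean for a given network, here is a minimal sketch in plain Python (not statnet code; all function names and the toy data are illustrative assumptions). It builds the edge set implied by a partition, then checks that a network has no open two-paths, no isolates (a group needs at least two participants), and between 4 and 12 groups.

```python
from itertools import combinations


def partition_to_edges(partition):
    """Tie every pair of participants who share a group; no ties across groups."""
    edges = set()
    for group in partition:
        for i, j in combinations(sorted(group), 2):
            edges.add((i, j))
    return edges


def groups_if_valid(nodes, edges):
    """Return the set of groups if the network is a union of disjoint cliques
    of size >= 2, otherwise None."""
    nbrs = {v: set() for v in nodes}
    for i, j in edges:
        nbrs[i].add(j)
        nbrs[j].add(i)
    for j in nodes:
        if not nbrs[j]:
            return None  # isolate: a group needs at least two members
        for i, k in combinations(sorted(nbrs[j]), 2):
            if k not in nbrs[i]:
                return None  # open two-path i - j - k (constraint 1 violated)
    # With no open two-paths, each node's group is itself plus its neighbours.
    return {frozenset(nbrs[v] | {v}) for v in nodes}


def satisfies_constraints(nodes, edges, min_groups=4, max_groups=12):
    """Constraint 1 (disjoint cliques) plus constraint 2 (bounded number of groups)."""
    groups = groups_if_valid(nodes, edges)
    return groups is not None and min_groups <= len(groups) <= max_groups


# Toy example: eight participants in four pairs.
nodes = list(range(8))
four_pairs = partition_to_edges([{0, 1}, {2, 3}, {4, 5}, {6, 7}])
three_groups = partition_to_edges([{0, 1, 2}, {3, 4, 5}, {6, 7}])

print(satisfies_constraints(nodes, four_pairs))             # valid partition, 4 groups
print(satisfies_constraints(nodes, four_pairs | {(1, 2)}))  # tie across groups opens a two-path
print(satisfies_constraints(nodes, three_groups))           # disjoint cliques, but only 3 groups
```

A check like this only says whether a given network is admissible; it does not by itself provide a way to sample or estimate under the constraints.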

And I would have homophily terms, and a term for the actual number of groups.

Including the homophily terms is straightforward, but I could not find how to implement both constraints and the term for the total number of working groups. Can you see a way to implement these things with statnet::ergm? Or with any other program you can think of? If not, I'd be curious if anyone has ideas on how to deal with this data structure in a different way.

Thanks so much in advance for your insights,

Flavien Ganter
PhD Student & Paul F. Lazarsfeld Fellow
Department of Sociology | Columbia University
501 Knox Hall, 606 West 122nd Street, New York, NY 10027

