From marc.sarazin at ed.ac.uk Tue Jan 5 11:51:06 2021
From: marc.sarazin at ed.ac.uk (SARAZIN Marc)
Date: Tue Aug 3 21:58:16 2021
Subject: [statnet_help] "RTP" version of dgwesp / dgwnsp / dgwdsp
Message-ID:
Dear Statnetters,
Happy New Year! I hope you are all keeping well and safe.
Upon looking through the source code of the ERGM package, I noticed a "type" for dgwesp / dgwnsp / dgwdsp that is not documented in the help files: "RTP" (Recursive Two-Path). For dgwesp, this effect captures cases where there is both a direct tie between i and j (e.g. i --> j) and one or more symmetric two-paths between i and j (i.e. i <--> k <--> j).
The dgwesp effect seems to work just fine, and indeed I have been using it in my models. However, there seems to be an issue with dgwnsp (i.e. cases where i <--> k <--> j but there is no direct tie between i and j). Specifically, the dgwnsp RTP effect always calculates to 0, which also means that the value of dgwdsp RTP is always equal to the value of dgwesp RTP. I have found this both in an empirical network and in a test network, specifically designed to test this bug.
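To make the relationship between the three RTP statistics concrete, here is a minimal base-R sketch. It uses a hypothetical four-node network (not the test network mentioned above), counts ordered pairs (i, j) that have at least one reciprocated two-path i <--> k <--> j, and does not call ergm at all; ergm's own counting convention and its geometric weighting may well differ.

```r
# Hypothetical directed test network on 4 nodes (illustration only).
A <- matrix(0, 4, 4)
A[1, 3] <- A[3, 1] <- 1   # 1 <--> 3
A[2, 3] <- A[3, 2] <- 1   # 2 <--> 3
A[4, 3] <- A[3, 4] <- 1   # 4 <--> 3
A[1, 2] <- 1              # 1 --> 2 (asymmetric direct tie)

M  <- A * t(A)            # 1 where the tie is reciprocated
SP <- M %*% M             # SP[i, j] = number of k with i <--> k <--> j
diag(SP) <- 0             # ignore i == j

off_diag <- row(A) != col(A)
esp <- sum(SP > 0 & A == 1)              # pairs with a direct tie i --> j
nsp <- sum(SP > 0 & A == 0 & off_diag)   # pairs without a direct tie
dsp <- sum(SP > 0 & off_diag)            # all pairs, tie or no tie

c(esp = esp, nsp = nsp, dsp = dsp)
```

In this sketch nsp is nonzero and dsp equals esp + nsp, so an NSP statistic that is always 0 would force DSP to equal ESP, which matches the symptom described above.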
I'm e-mailing 1) to see if this is a known bug of the dgwnsp/dgwdsp RTP terms; and 2) if there were any plans to fix these terms in future (I would gladly fix them myself, but this is currently beyond my skill set). Likewise, and secondarily, 3) if there were any known bugs with the dgwesp RTP term (the one that works), which might explain why it isn't mentioned in the help documentation.
Thanks very much!
All the best,
Marc
--
Dr Marc Sarazin
Post-Doctoral Fellow in Education
Moray House School of Education and Sport, University of Edinburgh
Academic Visitor, Department of Education, University of Oxford
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
From morrism at uw.edu Mon Jan 11 12:15:28 2021
From: morrism at uw.edu (martina morris)
Date: Tue Aug 3 21:58:16 2021
Subject: [statnet_help] "RTP" version of dgwesp / dgwnsp / dgwdsp
In-Reply-To:
References:
Message-ID:
Hi Marc,
Thx for reporting this. This isn't a known bug, so give us a bit to check
and see if it's indeed a bug.
I believe the RTP option is documented, as I see it under ?ergm::gwdsp.
So you might want to make sure you have the most recent version of ergm
installed.
FYI (and for the list's information) -- possible bug reports are best
submitted via our GitHub repositories. For ergm, that is here:
https://github.com/statnet/ergm/
If you don't have a GitHub account, though, feel free to continue using
the statnet_help list.
And the statnet_help list is also the right place for all questions that
aren't source code related.
best,
mm
****************************************************************
Professor Emerita of Sociology and Statistics
Box 354322
University of Washington
Seattle, WA 98195-4322
Office: (206) 685-3402
Dept Office: (206) 543-5882, 543-7237
Fax: (206) 685-7419
morrism@u.washington.edu
http://faculty.washington.edu/morrism/
From morrism at uw.edu Tue Jan 12 16:55:39 2021
From: morrism at uw.edu (martina morris)
Date: Tue Aug 3 21:58:16 2021
Subject: [statnet_help] "RTP" version of dgwesp / dgwnsp / dgwdsp
In-Reply-To:
References:
Message-ID:
Marc -- would you please:
1. open an issue in the ergm GitHub repo for this, and
2. post your test code for the RTP term
many thx,
mm
From e0570515 at u.nus.edu Thu Jan 14 18:55:55 2021
From: e0570515 at u.nus.edu (Cui Qinghua)
Date: Tue Aug 3 21:58:16 2021
Subject: [statnet_help] An inquiry about where to find the code
Message-ID: <84E45CAF-2931-49A1-B8D3-3CC97423C426@contoso.com>
Dear statnet_help,
I am looking for the code that Professor Marijtje A.J. van Duijn says in her paper she made available on the statnet website. The paper is "Comparison of Maximum Pseudo Likelihood and Maximum Likelihood Estimation of Exponential Family Random Graph Models".
[Image: screenshot from the paper]
Could you please tell me where I can find it?
Thank you very much.
Kind regards,
Qinghua
From morrism at uw.edu Mon Jan 18 13:57:36 2021
From: morrism at uw.edu (martina morris)
Date: Tue Aug 3 21:58:16 2021
Subject: [statnet_help] An inquiry about where to find the code
In-Reply-To: <84E45CAF-2931-49A1-B8D3-3CC97423C426@contoso.com>
References: <84E45CAF-2931-49A1-B8D3-3CC97423C426@contoso.com>
Message-ID:
Hi Qinghua,
We moved/updated the website last year, and in the process, we restored
items selectively. But I've looked in our archive, and I don't see it
there either.
Have reached out to Marijtje to see if she still has the code.
best,
mm
From handcock at stat.ucla.edu Thu Jan 28 00:24:14 2021
From: handcock at stat.ucla.edu (Mark S. Handcock)
Date: Tue Aug 3 21:58:16 2021
Subject: [statnet_help] An inquiry about where to find the code
In-Reply-To:
References: <84E45CAF-2931-49A1-B8D3-3CC97423C426@contoso.com>
Message-ID:
Hi Qinghua,
I have placed the code on github at the address:
https://github.com/handcock/ERGM_MPLE_MLE
Best,
Mark
From morrism at uw.edu Wed Feb 3 16:02:21 2021
From: morrism at uw.edu (martina morris)
Date: Tue Aug 3 21:58:16 2021
Subject: [statnet_help] "RTP" version of dgwesp / dgwnsp / dgwdsp
In-Reply-To:
References:
Message-ID:
Hi Marc,
Pavel has implemented a fix for this. If you'd like to give it a try,
you just need to install ergm from GitHub (master branch).
It would be great if you could give it a spin and see if it works as you
think it should.
best,
Martina
From Timothee.Chabot at eui.eu Sat Feb 6 00:51:37 2021
From: Timothee.Chabot at eui.eu (Chabot, Timothée Pierre Jules)
Date: Tue Aug 3 21:58:16 2021
Subject: [statnet_help] using "timecov( )" and "memory( )" terms in R BTERGM
Message-ID:
Dear Statnet Members,
I am currently trying to fit a bootstrapped TERGM on a directed friendship network with 4 observations, about 250 nodes per wave and a density between 0.065 and 0.069 (roughly 4000 ties per observation; at each time transition, ~60% of ties remain and ~40% are dissolved and replaced by new ties). I am having trouble getting my models to fit without degeneracy, and therefore have a few questions.
1) I suspect one of the reasons the models I specify are degenerate is unmodeled time heterogeneity in the parameters (also, the network density changes quite a bit between waves, with a curved pattern: 0.069, 0.065, 0.066 and 0.068). My initial idea was to use the "edges" term in combination with "timecov" to have different baseline density parameters for each transition (either as a continuous time trend with a single "timecov", or with one coefficient for each transition using the "maximum" and "minimum" arguments of timecov). However, I would also like to have a "memory" term to account for tie inertia across waves; but it seems that btergm cannot handle "memory" and "timecov" at the same time, even for very simple specifications. The model does not converge and I get the following error message: "Algorithm did not converge. There might be a collinearity between predictors and/or dependent networks at one or more time steps."
Yet the "mtergm" function seems to do that just fine (though I will not be able to use it for more complex models, as they'd take too much computing power with mtergm). Below I paste an example with built-in btergm data, where the issue is apparent (note that I have the same issue on my networks and regardless of whether I do a simple specification, or a more complex one with various attribute-related or endogenous effects).
### Example R Code 1 ###
library(statnet)
library(btergm)
data("knecht", package = "xergm.common") # load data
for (i in seq_along(friendship)) {
  rownames(friendship[[i]]) <- 1:nrow(friendship[[i]])
  colnames(friendship[[i]]) <- 1:ncol(friendship[[i]])
}
rownames(primary) <- rownames(friendship[[1]])
colnames(primary) <- colnames(friendship[[1]])
friendship <- handleMissings(friendship, na = 10, method = "remove")
friendship <- handleMissings(friendship, na = NA, method = "fillmode") # handle missings
# Now estimate the same, simple model, one with btergm and the other with mtergm.
m1 <- btergm(friendship ~ edges + memory("autoregression") + timecov(), R = 100)
m2 <- mtergm(friendship ~ edges + memory("autoregression") + timecov())
summary(m2)
# btergm does not converge whereas mtergm seems to do fine.
### End ###
Do you know why that is the case? And is there a way for me to model different densities at each wave, while also accounting for inertia over time? I saw in the btergm vignette that both terms can indeed be combined in certain models (the simple example with the network of international alliances), so why does it not work here? Is it a matter of the number of observations available?
2) I have been trying to offset certain terms before launching the estimation (as a way to play around with the models and try to understand what was going on). However, whenever I try to pass coefficients through offset.coef, the "mtergm" function gives me an error message saying: "formal argument 'offset.coef' corresponds to several arguments provided". It seems as if there is a default value of offset.coef that the function uses, and my passing my own values explicitly does not override it, such that the function ends up with 2 different values for offset.coef. I guess I am not using the offset option correctly, but I could not find what I am doing wrong.
Again, see an example with btergm data (run the above piece of code first to load the data):
### Example R Code 2 ###
# Find some reasonable offset values.
m3 <- mtergm(friendship ~ edges + mutual + delrecip)
summary(m3) # ok, so we take values close to these coefficients
myoffset <- c(2, 1.5)
# Now try to re-estimate with only the "edges" term free to change:
m4 <- mtergm(friendship ~ edges + offset(mutual) + offset(delrecip), offset.coef = myoffset) # estimation won't start
### End ###
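One general way R can end up seeing the same argument twice is sketched below in base R. The functions fit_inner and fit_wrapper are hypothetical stand-ins, not btergm or ergm code, and whether mtergm does exactly this internally is an assumption; the sketch only shows the mechanism of a wrapper that already sets offset.coef and also forwards the caller's arguments.

```r
# Hypothetical stand-ins: 'fit_inner' plays the role of the underlying fitting
# function, 'fit_wrapper' the role of a wrapper that already sets offset.coef.
fit_inner <- function(formula, offset.coef = NULL) offset.coef

fit_wrapper <- function(formula, ...) {
  # The wrapper supplies its own offset.coef and also forwards '...'.
  fit_inner(formula, offset.coef = -Inf, ...)
}

ok <- fit_wrapper(y ~ x)   # works: only the wrapper's value is passed

# Passing offset.coef again through '...' makes R see the argument twice.
msg <- tryCatch(
  fit_wrapper(y ~ x, offset.coef = c(2, 1.5)),
  error = function(e) conditionMessage(e)
)
msg
```

The error is raised by R's argument matching before any estimation starts, which would be consistent with the estimation never launching.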
If you have any suggestions regarding either point, that would help me a lot. Many thanks in advance!
All the best,
Timothée Chabot
PhD researcher
European University Institute - SPS department
The information transmitted is intended only for the person or entity to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination, distribution, forwarding, or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited without the express permission of the sender. If you received this communication in error, please contact the sender and delete the material from any computer.
From philip.leifeld at essex.ac.uk Sat Feb 6 12:11:07 2021
From: philip.leifeld at essex.ac.uk (Leifeld, Philip)
Date: Tue Aug 3 21:58:16 2021
Subject: [statnet_help] using "timecov( )" and "memory( )" terms in R BTERGM
In-Reply-To:
References:
Message-ID:
Hi Timothée,
With only four time steps, the bootstrap is probably not very effective
because it samples across conditionally independent time points, not
nodes or dyads. That should also explain why it fails if you combine the
inclusion of a time trend with a term for autoregression of ties: First
the inclusion of that autoregressive term through the memory statistic
cuts off the first time point from the list of networks to be modeled
because there is no prior network to include in the statistic. That
leaves you with only three remaining networks to model, which is
probably too few for effective bootstrapping (though four wouldn't
have been much more sensible to begin with). Second, the time trend
using the timecov model term introduces dyadic covariates full of values
1, 2, or 3 to indicate where you are on the timeline (or even a discrete
function of time with some event time as cutoff, as you were
mentioning). That leaves you with a situation in which you are creating
bootstrap samples of three networks with replacement, and it will
occasionally happen that a bootstrap sample contains the same network
three times, which would turn the model term for temporal
heterogeneity (timecov) into a constant, and that can be expected to
make the model unidentified. So I would not use the bootstrap method
here; it was made for long time series of networks, such as
international conflict with annual measurements since 1816 or something
like that. With such data, it is effective and asymptotically unbiased.
With small time series like this one, use MCMC-MLE, which is unbiased
either way, but would take prohibitively long to estimate with longer
time series such as the one I mentioned.
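The resampling argument above can be checked with a few lines of base R. This is a rough sketch that assumes each bootstrap sample draws three time points uniformly with replacement; btergm's actual resampling scheme is not reproduced here, and R = 100 is simply the value from the example code in the original question.

```r
# Time indices of the three networks left after 'memory' drops the first wave.
time_index <- 1:3

# Every equally likely way of drawing 3 networks with replacement (3^3 = 27).
draws <- expand.grid(first = time_index, second = time_index, third = time_index)

# A draw is problematic when all three picks are the same time point,
# because a time covariate then takes a single value in that sample.
constant <- apply(draws, 1, function(d) length(unique(d)) == 1)
p_constant <- mean(constant)          # 3 / 27

# Chance that at least one of R = 100 bootstrap samples is of this kind.
R <- 100
p_any <- 1 - (1 - p_constant)^R

c(p_constant = p_constant, p_any = p_any)
```

Under these assumptions about one draw in nine has a constant time covariate, so with 100 replications it is close to certain that at least one such sample occurs.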
You can use the mtergm function in the btergm package for this purpose,
which simply acts as a wrapper for the ergm function in statnet's ergm
package by putting the networks and covariates into a block-diagonal
shape and adding structural zeros for the off-diagonal elements since
ties across different time points are not allowed. These structural
zeros are added by using an offset matrix. That's why your attempt to
add another offset matrix fails; the argument has simply already been
used internally. It *ought to be* possible for the function to catch
such a situation and combine the ties in the structural zero matrix with
the offset matrix you are providing, but we never really needed this
functionality ourselves and thus never implemented it. Please do feel
free to start a pull request at github.com/leifeld/btergm/ should you
feel like adding it to the code base. If you require this functionality
but don't want to start a pull request, you could try using the
tergmprepare function in the btergm package for getting the data into
the right shape in the usual way and then calling the ergm function
yourself after merging the offset matrix with your own offset matrix. I
am not sure off the top of my head if the tergmprepare function is
exported, so you may need to source it from GitHub for now. Or you could
simply set up the data as a block-diagonal structure yourself and do it
without the help of the btergm package (though that can quickly become
nasty with all the changes in node composition, of which tergmprepare
takes care).
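For readers who want to see what the block-diagonal shape looks like, here is a minimal base-R sketch with two hypothetical 3-node networks. It only builds the stacked adjacency matrix and a matching structural-zero matrix; it does not call mtergm, tergmprepare, or ergm, and how such a matrix is passed to ergm as an offset is left out because the details are an assumption on my part.

```r
# Two hypothetical 3-node directed networks observed at consecutive time points.
nets <- list(
  matrix(c(0, 1, 0,
           0, 0, 1,
           1, 0, 0), nrow = 3, byrow = TRUE),
  matrix(c(0, 1, 1,
           0, 0, 0,
           1, 0, 0), nrow = 3, byrow = TRUE)
)

n      <- vapply(nets, nrow, integer(1))
total  <- sum(n)
starts <- cumsum(c(0, n))

stacked     <- matrix(0, total, total)  # block-diagonal adjacency matrix
struct_zero <- matrix(1, total, total)  # 1 = dyad that must stay empty

for (tp in seq_along(nets)) {
  idx <- (starts[tp] + 1):starts[tp + 1]
  stacked[idx, idx]     <- nets[[tp]]   # ties only within a time point
  struct_zero[idx, idx] <- 0            # within-time dyads remain free
}

stacked
struct_zero
```

Every cell flagged in struct_zero is empty in stacked, which is the property the structural zeros are meant to enforce. A user-supplied offset would have to be combined with this matrix rather than passed alongside it.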
I would think having three time points with about 250 nodes each (so
essentially like a single network with 750 nodes) should be feasible
with MCMC-MLE as long as your theory fits the data well (which may not
be the case here from what you describe).
Perhaps it is useful to know that I recently added another function to
the btergm package, which acts as a TERGM wrapper for the Bergm package
for Bayesian estimation of ERGMs. It's not on CRAN yet as I haven't done
a lot of testing and there may still be some glitches, but you could
check out the version on GitHub if you'd like to give it a try. The
function is unconfusingly called tbergm. It's a wrapper for bergm, which
is itself, I believe, a wrapper for ergm that takes away the MLE
component and wraps the MCMC sampler into a Bayesian MCMC estimation
routine. I don't know if it solves your estimation worries, but perhaps
worth a try.
Note that 40 per cent of node composition change across time points is
quite a bit. While it is not per se a problem for estimating the model,
it may have consequences for what parts of the data are being modeled.
See the JStatSoft paper for how this is dealt with in the btergm package.
If you have any follow-up questions, may I ask you to open an issue on
the GitHub issue tracker for btergm? btergm is not really part of
statnet, just building on it.
Take care,
Philip
--
Philip Leifeld
Professor, Department of Government
University of Essex
http://www.philipleifeld.com
From p.krivitsky at unsw.edu.au Tue Mar 16 22:24:53 2021
From: p.krivitsky at unsw.edu.au (Pavel Krivitsky)
Date: Tue Aug 3 21:58:16 2021
Subject: [statnet_help] ergm 4.0, tergm 4.0, and others have been published to GitHub
Message-ID:
Dear All,
We have published a major update of the following packages to GitHub:
* ergm
* tergm
* ergm.count
* ergm.rank
More than 3 years in the making, this update brings major improvements
to Statnet's computational and modelling capabilities.
You can try them out by running:
install.packages("remotes")
library(remotes)
install_github("statnet/ergm")
install_github("statnet/tergm")
install_github("statnet/ergm.count")
install_github("statnet/ergm.rank")
The user interfaces of functions are largely the same, though some
control parameters may behave somewhat differently.
We hope to release these updates to CRAN in the next few months.
Enjoy!
Pavel Krivitsky
on behalf of the Statnet Team
P.S. If your package depends on ergm, there are some API changes, but the
R and C APIs are now cleaner and better documented. And, in particular,
the user terms API is backwards compatible: old implementations will
still work, though new, more efficient implementations may now be
possible using the new APIs.
From mkatz at isenberg.umass.edu Tue Mar 23 11:15:43 2021
From: mkatz at isenberg.umass.edu (Matthew Katz)
Date: Tue Aug 3 21:58:16 2021
Subject: [statnet_help] statnet error: "Error in .Call(getEdgeAttribute_R, el, attrname, na.omit, null.na, deleted.edges.omit): NULL value passed as symbol address"
Message-ID:
Hi Statnet Group,
I am new to using Statnet, and am receiving the same error code over and over again. The IT department at my college was unable to identify the issue. We spent a solid two hours trying to identify the problem yesterday, but were unable to do so.
Sometimes the below code works. Sometimes it freezes and forces RStudio to abort. But 95% of the time it returns the same error code.
Error in .Call(getEdgeAttribute_R, el, attrname, na.omit, null.na, deleted.edges.omit) :
NULL value passed as symbol address
The only statnet command (I've probably tried 10 different ones) that seems to always work is network.size. As soon as I run another command, the error message returns. I had already installed and loaded statnet and UserNetR before running the commands below.
> network.size(Moreno)
[1] 33
> gden(Moreno)
Error in .Call(getEdgeAttribute_R, el, attrname, na.omit, null.na, deleted.edges.omit) :
NULL value passed as symbol address
The dataset in the above examples is one from UserNetR package. I tried to create my own basic matrix, and ran into the same problem.
> mattmat1 <- rbind(c(0,1,1,0,0),
+ c(0,0,1,1,1),
+ c(1,1,0,0,0),
+ c(0,0,0,1,1),
+ c(1,1,1,1,0))
> rownames(mattmat1) <- c("A", "B", "C", "D", "E")
> colnames(mattmat1) <- c("A", "B", "C", "D", "E")
> matt1 <- network(mattmat1, matrix.type="adjacency")
Error in .Call(setVertexAttribute_R, x, attrname, value, v) :
NULL value passed as symbol address
Per the mailing instructions, I also ran the two lines of code suggested for problems related to installing. Here is the output:
> sessionInfo()
R version 4.0.4 (2021-02-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] statnet_2019.6 tsna_0.3.1 sna_2.6 statnet.common_4.4.1
[5] ergm.count_3.4.0 tergm_3.7.0 networkDynamic_0.10.1 ergm_3.11.0
[9] network_1.16.1 UserNetR_2.62 igraphdata_1.0.1 igraph_1.2.6
loaded via a namespace (and not attached):
[1] magrittr_2.0.1 MASS_7.3-53.1 lattice_0.20-41 rlang_0.4.10 fansi_0.4.2
[6] tools_4.0.4 parallel_4.0.4 grid_4.0.4 nlme_3.1-152 lpSolve_5.6.15
[11] rle_0.9.2 utf8_1.2.1 coda_0.19-4 ellipsis_0.3.1 tibble_3.1.0
[16] lifecycle_1.0.0 crayon_1.4.1 Matrix_1.3-2 purrr_0.3.4 vctrs_0.3.6
[21] trust_0.1-8 robustbase_0.93-7 DEoptimR_1.0-8 compiler_4.0.4 pillar_1.5.1
[26] pkgconfig_2.0.3
>
> getOption("pkgType")
[1] "both"
As I mentioned above, sometimes everything works fine. But the majority of the time, I get the error messages. I welcome any suggestions, as I'm looking forward to learning to use statnet!
Thank you!
Matt Katz
UMass Amherst
From jmoody77 at duke.edu Wed Mar 24 10:56:34 2021
From: jmoody77 at duke.edu (James Moody)
Date: Tue Aug 3 21:58:16 2021
Subject: [statnet_help] tutorial for bipartite ERGM
Message-ID:
Hey Folks -
Hope you are well.
Does anyone have a tutorial already set up for estimating ERGMs on bipartite/two-mode networks?
Thanks in advance,
Jim
James Moody
Robert O. Keohane Professor of Sociology
Founding Director, Duke Network Analysis Center
Duke University
From tindall at mail.ubc.ca Wed Mar 24 11:00:22 2021
From: tindall at mail.ubc.ca (Tindall, David)
Date: Tue Aug 3 21:58:16 2021
Subject: [statnet_help] tutorial for bipartite ERGM
In-Reply-To:
References:
Message-ID: <54be4e1668b3484e9dc8b99d56e83527@mail.ubc.ca>
I'll be very interested in the answer to this, as well!
DBT
________________________________
David Tindall
Professor
Department of Sociology, University of British Columbia
Chair, Environment and Society Minor, Faculty of Arts, University of British Columbia
Mailing address: Department of Sociology, University of British Columbia, 6303 N.W. Marine Drive, Vancouver, British Columbia, Canada V6T 1Z1
Office Location: Anthropology and Sociology Building, Room 1317
E-mail: tindall@mail.ubc.ca
________________________________
From laura.wolbring at kit.edu Mon Mar 29 04:57:53 2021
From: laura.wolbring at kit.edu (Wolbring, Laura (IFSS))
Date: Tue Aug 3 21:58:16 2021
Subject: [statnet_help] Add odds ratios and 95% confidence interval to ergm
summary table
Message-ID:
Dear all,
I want to add odds ratios and a 95% confidence interval to my ergm summary table including dependence terms (gwdsp, gwdegree, gwesp).
To calculate odds ratios, I already used the exp() and sqrt() functions but I have read that standard errors for dependence models are adjusted due to the use of MCMC estimation. Therefore, odds ratios for dependence models should only be calculated by editing the underlying code of the summary table.
I would be very grateful for any advice on how to add odds ratios to the ergm summary table.
Thanks in advance and best regards
Laura
Karlsruhe Institute of Technology (KIT)
Institute of Sports and Sports Science (IfSS)
Laura Wolbring
Research Assistant
Engler-Bunte-Ring 15
Building 40.40
Room -110
76131 Karlsruhe
Phone : +49 721 608-46696
E-Mail: laura.wolbring@kit.edu
Web: www.sport.kit.edu
KIT - The Research University in the Helmholtz Association
KIT has been certified as a family-friendly university since 2010.
From p.krivitsky at unsw.edu.au Mon Mar 29 16:16:59 2021
From: p.krivitsky at unsw.edu.au (Pavel N. Krivitsky)
Date: Tue Aug 3 21:58:16 2021
Subject: [statnet_help] Add odds ratios and 95% confidence interval to
ergm summary table
In-Reply-To:
References:
Message-ID:
Dear Laura,
The quickest way to calculate the odds ratios is probably to use the
built-in confint() function to get the intervals for the parameters and
then call exp() on those.
You can extract the summary table by calling coef(summary(ergm_fit)).
Putting it all together, something like
cbind(coef(summary(ergm_fit)), exp(confint(ergm_fit)))
should do the trick.
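For anyone who wants to see the arithmetic, here is a minimal base-R sketch, assuming Wald-type intervals (estimate plus or minus a normal quantile times the standard error). The helper name or_table and the numbers fed to it are hypothetical and do not come from any fitted model.

```r
# Hypothetical helper: odds ratios with Wald-type confidence limits.
or_table <- function(est, se, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)   # normal quantile, about 1.96 for 95%
  cbind(OR    = exp(est),
        lower = exp(est - z * se),
        upper = exp(est + z * se))
}

# Made-up coefficients and standard errors, for illustration only.
est <- c(edges = -2.0, gwesp = 0.5)
se  <- c(edges = 0.10, gwesp = 0.05)
ors <- or_table(est, se)
print(round(ors, 3))
```

With a fitted model, exp(coef(ergm_fit)) should give the matching point estimates to sit next to exp(confint(ergm_fit)). For dependence terms these are conditional odds ratios, per unit change in the statistic with the rest of the network held fixed.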
I hope this helps,
Pavel
From goodreau at uw.edu Mon Mar 29 16:19:28 2021
From: goodreau at uw.edu (Steven M. Goodreau)
Date: Tue Aug 3 21:58:16 2021
Subject: [statnet_help] Network Modeling for Epidemics: Summer short course
at the University of Washington, 2-6 August, 2021
Message-ID:
Network Modeling for Epidemics (NME) is a 5-day short course at the
University of Washington that provides an introduction to stochastic
network models for infectious disease transmission dynamics, with a focus
on empirically based modeling of HIV transmission. It is a "hands-on"
course, using the EpiModel software package in R (www.epimodel.org).
EpiModel provides a unified framework for statistically based modeling of
dynamic networks from empirical data, and simulation of epidemic dynamics
on these networks. It has a flexible open-source platform for learning and
building several types of epidemic models: deterministic compartmental,
stochastic individual-based, and stochastic network models. Resources
include simple models that run in a browser window, built-in generic models
that provide basic control over population contact patterns, pathogen
properties and demographics, and templates for user-programmed modules that
allow EpiModel to be extended to the full range of pathogens, hosts, and
disease dynamics for advanced research. This course will touch on the
deterministic and individual-based models, but its primary focus is on the
theory, methods and application of network models.
The course uses a mix of lectures, tutorials, and labs with students
working in small groups. On the final day, students work to develop an
EpiModel prototype model (either individually or in groups based on shared
research interests), with input from the instructors, including the lead
EpiModel software developer, Dr. Samuel Jenness.
Returning students: We encourage previous attendees with active modeling
projects to apply to return for a refresher course. The EpiModel package
has been significantly enhanced over the last few years. Returning students
with active projects will have the opportunity to work with course
instructors to address key challenges in the design of their network model
code.
*Dates and location:*
The course will be taught from Monday, August 2 to Friday, August 6 using
remote learning technologies (Zoom, Slack). Hours are 8am-3pm Pacific
Daylight Time (UTC -7), with a meal break. If the COVID reopening process
allows, we may include optional in-person meetings on the University of
Washington campus in Seattle, for those who wish to participate.
*Costs:*
Course fee is $750. We offer a limited number of fee waivers for
pre-doctoral students or for attendees from low income countries.
*Application dates and decision dates:*
* May 1: Application deadline.
* June 1: Decisions will be made and announced.
* July 1: Registration deadline. Late registration is possible through
July 15 with a late fee of $250.
* A waitlist will be established along with rolling admission through
June 15 if space allows.
*Application:*
Apply online at https://catalyst.uw.edu/webq/survey/morrism/404597
Course website and more information: http://statnet.github.io/nme
Please feel free to share widely!
Yours,
Martina Morris, Steve Goodreau and Samuel Jenness
--
*****************************************************************
Steven M. Goodreau / Professor / Dept. of Anthropology
Physical address: Denny Hall M236
Mailing address: Campus Box 353100 / 4216 Memorial Way NE
Univ. of Washington / Seattle WA 98195
1-206-685-3870 (phone) /1-206-543-3285 (fax)
http://faculty.washington.edu/goodreau
*****************************************************************
From chuck.hayward at colorado.edu Fri Apr 2 10:52:06 2021
From: chuck.hayward at colorado.edu (Chuck Hayward)
Date: Tue Aug 3 21:58:16 2021
Subject: [statnet_help] Missing final isolate node
Message-ID:
Hi all,
About 5 years ago, I did a network analysis of a listserv. I'm no programmer, but I pieced together the code from examples I found online and reading lots of the available help resources. I'm now trying to rerun it on another dataset that is set up exactly the same as the original. My node list imports correctly with 153 nodes and I'm separately loading the edgelist (which is just a list of edges based on from/to node definitions, so no isolates are included in the edgelist). When I build the network object from these two, I'm only getting 152 nodes. The final node is missing and happens to be an isolate (it is defined in the node list but does not appear in the edgelist). I modified the edgelist to make the final node have an edge, and then it shows up, so I'm assuming it has something to do with the node being an isolate. Is there a way to fix this problem of a final isolate node missing when creating a network object?
Here is the relevant code:
#read edge lists to create networks.
options(stringsAsFactors = FALSE)
mEL<-read.csv("mEL.csv", as.is=TRUE, header = TRUE)
#load attributes.
mInfo=read.csv("mInfo.csv", as.is = TRUE, header = TRUE)
#create network object.
mnet<-network(mEL, matrix.type="edgelist", vertex.attr = mInfo, directed = TRUE)
Thanks for any insight you can provide,
--------------------------------------------
Chuck Hayward (he/him)
Senior Professional Research Assistant
Ethnography & Evaluation Research (E&ER)
University of Colorado Boulder
https://www.colorado.edu/eer/
Schedule a meeting with me:
https://calendly.com/chuck-hayward
From buttsc at uci.edu Fri Apr 2 20:26:58 2021
From: buttsc at uci.edu (Carter T. Butts)
Date: Tue Aug 3 21:58:16 2021
Subject: [statnet_help] Missing final isolate node
In-Reply-To:
References:
Message-ID:
Hi, Chuck -
Your problem is that your edgelist doesn't contain information on the
number of vertices; when this information is not present, network() will
make its best guess by taking the largest vertex ID referred to in the
edge set as the vertex count. In your case, that doesn't work due to
having an isolate in last position.
A simple and generic fix is to ensure that your edgelists always have
vertex size information appended. You can do that by setting the "n"
attribute of the edgelist. E.g., if your edgelist is in a matrix called
myedgelist, and you have vertnum vertices in network, this will do the
trick:
attr(myedgelist,"n") <- vertnum #Set the number of vertices to vertnum
Passing this to network() should create a network with the correct
number of vertices. (Likewise, you can use an sna edgelist, which also
contains this information.)
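A small sketch of the same fix on a toy example, assuming a numeric two-column edgelist matrix. The object names el and n_vertices are illustrative, and the network() call is guarded so the snippet still runs where the network package is not installed.

```r
# Toy edgelist on 5 vertices; vertex 5 is an isolate, so it never appears.
el <- cbind(c(1, 2, 3), c(2, 3, 4))
n_vertices <- 5                    # illustrative; the real data would use 153

guessed <- max(el)                 # largest vertex ID seen in the edge set
attr(el, "n") <- n_vertices        # record the true vertex count on the edgelist

# Build the network only if the 'network' package is available.
if (requireNamespace("network", quietly = TRUE)) {
  net <- network::network(el, matrix.type = "edgelist", directed = TRUE)
  print(network::network.size(net))
}
```

Here guessed is one short of n_vertices, which is the same off-by-one seen with the 153-node list.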
Hope that helps!
-Carter
From laura.wolbring at kit.edu Wed Apr 14 08:57:54 2021
From: laura.wolbring at kit.edu (Wolbring, Laura (IFSS))
Date: Tue Aug 3 21:58:16 2021
Subject: [statnet_help] Add odds ratios and 95% confidence interval to
ergm summary table
In-Reply-To:
References:
Message-ID: <23609ade11cd43ae9a5fd479f7707a91@kit.edu>
Dear Pavel,
Thanks a lot, that worked and helped me out!
Just to be sure: Are the odds ratios and confidence intervals calculated with the code you suggest also meaningful for the dependence terms (gwdegree, gwdsp, gwesp)?
I ask because I get the same values for odds ratios and confidence intervals with the code you suggest as when I use code from Harris (Jenine K. Harris (2014). An Introduction to Exponential Random Graph Modeling. Thousand Oaks: Sage), which she says cannot be applied to models that contain dependence terms.
Thank you very much and best regards,
Laura
From JIMI.ADAMS at UCDENVER.EDU Wed Apr 14 12:37:54 2021
From: JIMI.ADAMS at UCDENVER.EDU (Adams, Jimi)
Date: Tue Aug 3 21:58:16 2021
Subject: [statnet_help] mutuality interaction
Message-ID: <8910D8CA-CC9A-4906-A074-251F0B0CDB75@ucdenver.edu>
Hi all,
I have a potentially naive question about how to estimate an effect of interest to us. Effectively, the question is whether homophily on an attribute amplifies social closure processes; for now, let's just focus on the mutuality (dyadic) version.
-> So we want to estimate whether there's an extra effect *on mutuality* for those who are the same on an attribute compared to differing on that attribute.
Given some of the options in the mutuality term, we kluged together an approach, but I suspect I may be missing something about how those options combine, and therefore this might not be what we're after.
-> So, the question is whether either of the options we've tried actually gets at what we're after, and/or if there's a simpler way to do this correctly?
What we did was estimate a model using the "same" option in the mutuality term, combined with a general mutual term. So we have:
m3 <- ergm(net ~ edges + mutual + mutual(same="attr"))
While I was hoping that combo of effects would (a) be estimable (it was!) & (b) be essentially what we are after, diagnostics and thinking about the opportunities for those to arise led us to think we also needed the "main effects" so we added those:
m3a <- ergm(net ~ edges + nodeifactor("attr") + nodeofactor("attr") + nodematch("attr") + mutual + mutual(same="attr"))
Like I said above, I suspect there's something I'm missing here in the combinations, and that this therefore isn't really testing for what we're after. So, I'm wondering if that's the case. And if so, is there a way to actually do what we're trying to?
Thanks!
jimi
jimi adams
(on leave 2020-21)
Associate Professor, Health & Behavioral Sciences
University of Colorado Denver
o. https://ucdenver.zoom.us/my/jimiadams
p. 303.315.7177 (messages only)
e. jimi.adams@ucdenver.edu
w. jimiadams.com
From goodreau at uw.edu Fri Apr 16 15:48:21 2021
From: goodreau at uw.edu (Steven Goodreau)
Date: Tue Aug 3 21:58:16 2021
Subject: [statnet_help] mutuality interaction
In-Reply-To: <8910D8CA-CC9A-4906-A074-251F0B0CDB75@ucdenver.edu>
References: <8910D8CA-CC9A-4906-A074-251F0B0CDB75@ucdenver.edu>
Message-ID: <55ff55d2-5f98-6d48-3749-621b4140417d@uw.edu>
hi jimi, hope all is well with you.
i think the way you're approaching this makes lots of sense. many folks
would do it the first way you're doing it, but you're right that
including all of the lower-order effects involving that attribute will
provide a purer estimate of the specific phenomenon you're interested
in. i don't know of an easier way to do this with ergm terms than what
you've done, although that doesn't mean it doesn't exist!
so: was the mutual(same="attr") coefficient positive?
best,
steve
--
*****************************************************************
Steven M. Goodreau / Professor / Dept. of Anthropology
Physical address: Denny Hall M236
Mailing address: Campus Box 353100 / 4216 Memorial Way NE
Univ. of Washington / Seattle WA 98195
1-206-685-3870 (phone) /1-206-543-3285 (fax)
http://faculty.washington.edu/goodreau
*****************************************************************
From jskvoretz at usf.edu Thu Apr 22 06:18:09 2021
From: jskvoretz at usf.edu (Skvoretz, John JR.)
Date: Tue Aug 3 21:58:16 2021
Subject: [statnet_help] advice on interpretation valued graphs
Message-ID:
As an exercise I use the Lazega lawyer data to set up a valued directed graph with a count on edges from 0 to 3 based on the number of tie types between i and j, e.g., 3 if i names j as a friend, an advisor, and a collaborator. I then estimate a model with effects for sum and for matching on "rank", which has the two values P and A for partner and associate, so nothing fancy. I pick discrete uniform as the reference distribution (and would appreciate advice on what to consider in making this choice vs. the Binomial). Here are the results:
vm1r <- ergm(law.tot ~ sum + nodematch("rank"), response="nr", reference=~DiscUnif(0,3))
Monte Carlo MLE Results
                   Estimate Std. Error MCMC % z value Pr(>|z|)
sum                -1.19084    0.02886      0 -41.265   <1e-04 ***
nodematch.sum.rank  0.37575    0.03877      0   9.693   <1e-04 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 0 on 4970 degrees of freedom
Residual Deviance: -4401 on 4968 degrees of freedom
I would like to have a local interpretation of this model for an ij pair that parallels the interpretation of a model for binary data. For binary data, one way to think about a model's prediction for the probability of an ij tie is to assign a vector of 0s to the model features representing the absence of an ij tie, getting a total score of 0 for this outcome and then calculate the similar total score for vector of model features if an ij tie is present. Then the probability of the ij tie is the latter exponentiated score divided by the sum of it and 1, the value of exp(0).
Back to the valued example, I consider an ij pair that match on rank: if there is no ij tie then the profile for that pair on the features sum and nodematch.sum.rank is (0,0) and so the score for that outcome is 0, an ij tie of value 1 has a profile (1,1) and so a score of (-1.19084+0.37575) for that outcome, an ij tie of value 2 has a profile (2,2) and so a score of 2*(-1.19084+0.37575) and finally an ij tie of value 3 corresponds to a profile of (3,3) and so a score of 3*(-1.19084+0.37575).
Now exp(score) is proportional to the probability of the network in which the ij dyad has one of these four scores and the rest of the pairs are fixed at their observed outcomes. So I calculate the probability of each of the four outcomes as the exp of its score divided by the sum of the exp of the four scores yielding
Pr(y_ij=0|Y_-ij & rank_i=rank_j) = 0.57964401
Pr(y_ij=1|Y_-ij & rank_i=rank_j) = 0.25655021
Pr(y_ij=2|Y_-ij & rank_i=rank_j) = 0.11354903
Pr(y_ij=3|Y_-ij & rank_i=rank_j) = 0.05025676
Which can be compared with the values when the ij pair do not match on rank:
Pr(y_ij=0|Y_-ij & rank_i<>rank_j) = 0.70202741
Pr(y_ij=1|Y_-ij & rank_i<>rank_j) = 0.21339225
Pr(y_ij=2|Y_-ij & rank_i<>rank_j) = 0.06486393
Pr(y_ij=3|Y_-ij & rank_i<>rank_j) = 0.01971641
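The two sets of probabilities can be reproduced in a few lines of base R. This is only a sketch of the calculation described above, using the coefficients as printed in the summary table; the function name cond_prob is made up for the illustration.

```r
# Coefficients as reported in the summary table above.
theta_sum   <- -1.19084
theta_match <-  0.37575

# Conditional distribution of y_ij over 0..3 under a DiscUnif(0,3) reference:
# score = (theta_sum + theta_match * match) * value, then normalize exp(score).
cond_prob <- function(match, values = 0:3) {
  score <- (theta_sum + theta_match * match) * values
  exp(score) / sum(exp(score))
}

p_match    <- cond_prob(match = 1)   # rank_i == rank_j
p_nonmatch <- cond_prob(match = 0)   # rank_i != rank_j
print(round(rbind(match = p_match, nonmatch = p_nonmatch), 4))
```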
So my questions are: (1) is this an acceptable way of interpretation for discrete limited count data and (2) what considerations should be used to select a reference distribution for such data?
John Skvoretz, AAAS Fellow
Distinguished University Professor of Sociology
Distinguished University Professor of Computer Science and Engineering (by courtesy)
Carolina Distinguished Professor Emeritus
Editor, Journal of Social Structure
Sociology
USF
4202 E Fowler Ave CPR 107
Tampa, FL 33620
From buttsc at uci.edu Fri Apr 23 23:24:32 2021
From: buttsc at uci.edu (Carter T. Butts)
Date: Tue Aug 3 21:58:17 2021
Subject: [statnet_help] advice on interpretation valued graphs
In-Reply-To:
References:
Message-ID:
Hi, John -
On 4/22/21 6:18 AM, Skvoretz, John JR. wrote:
>
> As an exercise I use Lazega lawyer data to set up a valued directed
> graph with a count on edges from 0 to 3 based on the number of tie
> types between i and j, e.g. 3 if i names j as a friend, an advisor, and
> a collaborator. I then estimate a model with effects for sum and for
> matching on "rank" which has the two values P and A for partner and
> associate. so nothing fancy. I pick discrete uniform as the reference
> distribution (and would appreciate advice on what to consider in
> making this choice vs the Binomial). Here are the results
>
>
I would use a Binomial reference here, because you are modeling this as
the sum of indicators for three different interaction types, any of
which could be present. The reference model for the edge value would
then be the sum of three IID Bernoulli trials, which is obviously
Binomial in character.
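To make the difference between the two reference measures concrete, here is a rough base-R sketch. It assumes a single coefficient theta on the sum term, and the value -0.5 is arbitrary rather than an estimate. The names h_unif, h_binom and cond_prob are illustrative only.

```r
values <- 0:3

# Baseline weight of each edge value under the two reference measures.
h_unif  <- rep(1, length(values))   # discrete uniform: every value weighted equally
h_binom <- choose(3, values)        # Binomial with 3 trials: ways to obtain each count

# Baseline distributions with all model coefficients set to zero.
base_unif  <- h_unif / sum(h_unif)
base_binom <- h_binom / sum(h_binom)

# Conditional probabilities for a generic coefficient theta on 'sum'.
cond_prob <- function(theta, h) {
  w <- h * exp(theta * values)
  w / sum(w)
}

print(rbind(unif = base_unif, binom = base_binom))
print(round(cond_prob(-0.5, h_binom), 3))
```

Under the Binomial weights, cond_prob(theta, h_binom) coincides with dbinom(values, 3, plogis(theta)), i.e. three independent Bernoulli trials with log-odds theta.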
> vm1r <- ergm(law.tot ~ sum + nodematch("rank"), response="nr",
> reference=~DiscUnif(0,3))
>
> Monte Carlo MLE Results
>
>                    Estimate Std. Error MCMC % z value Pr(>|z|)
> sum                -1.19084    0.02886      0 -41.265   <1e-04 ***
> nodematch.sum.rank  0.37575    0.03877      0   9.693   <1e-04 ***
>
> ---
>
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Null Deviance: 0 on 4970 degrees of freedom
>
> Residual Deviance: -4401 on 4968 degrees of freedom
>
> I would like to have a local interpretation of this model for an ij
> pair that parallels the interpretation of a model for binary data.
>
The conditional probability of Yij=yij here is given by
exp(-1.19 yij + 0.38 yij Xij)/[sum_{k=0}^3 exp(-1.19 k + 0.38 k Xij)]
where Xij is 1 if rank_i=rank_j, and 0 otherwise. This is just like the
binary case, except that our "local normalizing factor" has to sum over
a larger set of edge values (0 to 3). Note that, if we were using a
different reference measure, we would need to factor that in to both the
numerator and denominator. To get a changescore-like representation, we
can divide by the numerator:
[sum_{k=0}^3 exp(-1.19 (k-yij) + 0.38 Xij (k-yij) )]^-1
which is, sadly, not as lovely as the binary case. But it's there.
Pavel has proposed some other ways to think about the local
interpretations of these terms, which could be worth looking up; it may
also be worth stealing the sorts of visualizations used to depict energy
level diagrams to visualize these effects. For instance:
[Figure not preserved in the archive: "Edge Potential by Edge Value and Nodematch"]
The "edge potential," as I am calling it here, is the log numerator of
the conditional edge value probability. This ladder representation can
be helpful in getting a feel for the relative gaps in probability
between possible states; for instance, if state A is about 0.7 units
above state B, then it is about twice as likely to be observed
(conditionally speaking). For a network of order N, drawing a line
about log(choose(N,2)) units below the top state could in some cases be
informative in eliminating states that are rare enough that you'd expect
to get one of them on average if the entire network were composed of
edges with the same potential "ladder." Doubtless there are other
creative ways to draw heuristic thresholds that would give a sense of
which states were plausible and which states were not. Anyway, here's
the code for this picture - it's thrown together, but could be useful:
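# Edge potential (log numerator) for edge values 0..3, non-matching dyads
eleva <- -1.190*(0:3)
# Matching dyads: the nodematch coefficient scales with the edge value,
# as in the formula exp(-1.19 yij + 0.38 yij Xij) above
elevm <- (-1.190+0.37575)*(0:3)
# Empty frame; the x axis only separates the two nodematch columns
plot(1,0,main="Edge Potential by Edge Value and Nodematch", type="n",
xlim=c(0,3), axes=FALSE, xlab="", ylab="Edge Potential",
ylim=c(3*-1.2,0.38), font.lab=2, font.axis=2)
axis(2,font.axis=2)
axis(1,at=c(1,2),font.axis=2,labels=c("nodematch(rank)=0","nodematch(rank)=1"))
# Ladder for non-matching dyads (default color)
segments(0.75,eleva,1.25,eleva,lwd=3)
text(0.5,eleva,labels=paste("Y_ij=",0:3,sep=""),font=2)
# Ladder for matching dyads (color 2)
segments(1.75,elevm,2.25,elevm,lwd=3,col=2)
text(2.5,elevm,labels=paste("Y_ij=",0:3,sep=""),font=2,col=2)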
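As a quick numerical companion to the ladder picture, the sketch below converts a gap in edge potential into a probability ratio and computes the heuristic threshold log(choose(N,2)). The value N = 71 is an assumption, inferred from the 4970 = 71 * 70 degrees of freedom in the output quoted above. The function name gap_to_ratio is made up.

```r
# A gap in edge potential maps to a probability ratio via exp().
gap_to_ratio <- function(gap) exp(gap)
print(round(gap_to_ratio(0.7), 2))   # a 0.7-unit gap is roughly a factor of two

# Heuristic threshold for a network of order N.
# N = 71 is an assumption, inferred from 4970 = 71 * 70 directed dyads.
N <- 71
threshold <- log(choose(N, 2))
print(round(threshold, 2))
```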
(Super pedantic aside for people who are interested in this: in an
actual energy level diagram, lower numbers would indicate more favorable
states (not less), but since we usually like to think in terms of
probabilities, I'm putting more probable values higher up. But there is
a genuine connection - what I'm plotting here is the equivalent of
-E/(kT), where k is Boltzmann's constant and T is the temperature. In
the absence of an externally meaningful scale, nothing prevents us from
declaring kT=1 by fiat, and then these are just negative energies (in
our idiosyncratic units). This actually does matter when applying ERGMs
in physical applications, but I bring it up here in case anyone gets
interested in this type of visualization, goes looking for examples, and
gets confused because the graphs they find online seem to be the wrong
way around. For whatever reason, it is conventional to speak of social
things as wanting to go up and physical things as wanting to go down. :-))
>
> So my questions are:(1) is this an acceptable way of interpretation
> for discrete limited count data
>
Yes - I get the same numbers from the above as for your calculations.
> and (2) what considerations should be used to select a reference
> distribution for such data?
>
>
The reference distribution can be thought of in more than one way. One
view is that it sets the form of the conditional distribution. Another
view is that it tells you what the "baseline" distribution looks like,
before other effects are added to the model. And yet another is that it
tells you about the entropy of each graph state - a substantive
interpretation of which is that there are hidden degrees of freedom that
give you more ways to get some graphs than others (irrespective of the
model parameters), whose collective effect is captured by the reference
measure. For instance, in this case we envision a generative process in
which there are three discrete ways of interacting, each of which
independently is present (or not) within a dyad. That means that there
are more ways for the count of interaction modes to be 1 or 2 than 0 or
3; we can't see the individual interaction modes (those are our hidden
degrees of freedom), but they still affect the network by "biasing" the
counts. A Binomial reference measure accounts for the influence of that
type of process. If we use a uniform reference, by contrast, we are
assuming that each edge value is a priori equally likely: there are no
hidden factors that give us more ways to get a 0 versus a 1 or a 2.
This logic also applies to reference measures in the binary case. Some
rather beautiful theory comes of it, in my highly biased opinion. :-)
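The counting argument behind the Binomial reference can be checked directly. This minimal sketch assumes three interaction modes, as in the example, and shows the baseline edge-value distributions implied by a Binomial versus a uniform reference when all model parameters are zero; the variable names are illustrative only.

```r
n_modes <- 3           # three discrete ways of interacting
y <- 0:n_modes         # possible edge values (number of modes present)

# Number of hidden mode configurations that produce each edge value
ways <- choose(n_modes, y)            # 1 3 3 1

# Baseline distribution under a Binomial reference with all parameters zero
binom_baseline <- ways / sum(ways)    # 0.125 0.375 0.375 0.125

# Baseline distribution under a uniform reference: every value equally likely
unif_baseline <- rep(1 / length(y), length(y))

print(rbind(y = y, ways = ways,
            binomial = binom_baseline, uniform = unif_baseline))
```

Values 1 and 2 each have three times as many underlying configurations as 0 or 3, which is the "biasing" of counts that the Binomial reference absorbs.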
Hope that helps!
-Carter
-------------- next part --------------
A non-text attachment was scrubbed...
Name: iaimjooannoblbbg.png
Type: image/png
Size: 35280 bytes
Desc: not available
URL:
From p.krivitsky at unsw.edu.au Sun Apr 25 23:45:21 2021
From: p.krivitsky at unsw.edu.au (Pavel Krivitsky)
Date: Tue Aug 3 21:58:17 2021
Subject: [statnet_help] Add odds ratios and 95% confidence interval to
ergm summary table
In-Reply-To: <23609ade11cd43ae9a5fd479f7707a91@kit.edu>
References:
<23609ade11cd43ae9a5fd479f7707a91@kit.edu>
Message-ID: <75db4a6723aa093175054cefec0f72f0abd00095.camel@unsw.edu.au>
Dear Laura,
For dyad-dependent models, they become conditional odds ratios given the rest of the network. That is, the "odds" in question are not P(Yij = 1)/P(Yij = 0) but something more like P(network with Yij = 1)/P(otherwise identical network with Yij = 0).
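To make the conditional reading concrete, here is a small sketch for a binary ERGM. The coefficient values and change statistics are invented for illustration; the only fact relied on is that the conditional log-odds of a tie, given the rest of the network, equal the coefficients multiplied by the change statistics for toggling that tie.

```r
# Hypothetical fitted coefficients (illustrative values, not from any real fit)
theta <- c(edges = -2.0, gwesp = 0.8)

# Hypothetical change statistics for toggling Yij from 0 to 1 while the
# rest of the network is held fixed
delta <- c(edges = 1, gwesp = 1.5)

# Conditional log-odds, odds and probability of the tie
cond_log_odds <- sum(theta * delta)
cond_odds <- exp(cond_log_odds)
cond_prob <- plogis(cond_log_odds)

# exp(theta) is the factor by which the conditional odds change when the
# corresponding change statistic increases by one unit
cond_odds_ratios <- exp(theta)

print(c(log_odds = cond_log_odds, odds = cond_odds, prob = cond_prob))
print(cond_odds_ratios)
```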
I hope this helps,
Pavel
On Wed, 2021-04-14 at 15:57 +0000, Wolbring, Laura (IFSS) wrote:
Dear Pavel,
Thanks a lot, that worked and helped me out!
Just to be sure: Are the odds ratios and confidence intervals calculated with the code you suggest also meaningful for the dependence terms (gwdegree, gwdsp, gwesp)?
I ask because I get the same values for odds ratios and confidence intervals with the code you suggest as when I use code from Harris (Jenine K. Harris (2014). An Introduction to Exponential Random Graph Modeling. Thousand Oaks: Sage), which she says cannot be applied to models that contain dependence terms.
Thank you very much and best regards,
Laura
From: Pavel N. Krivitsky
Sent: Tuesday, 30 March 2021 01:17
To: Wolbring, Laura (IFSS) ; statnet_help@u.washington.edu
Subject: Re: [statnet_help] Add odds ratios and 95% confidence interval to ergm summary table
Dear Laura,
The quickest way to calculate the odds ratios is probably to use the built-in confint() function to get the intervals for the parameters and then call exp() on those.
You can extract the summary table by calling coef(summary(ergm_fit)).
Putting it all together, something like
cbind(coef(summary(ergm_fit)), exp(confint(ergm_fit)))
should do the trick.
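A slightly expanded sketch of the same idea follows, adding the point-estimate odds ratios next to the exponentiated interval bounds. The helper name or_table is made up for this example, and a logistic glm on a built-in dataset stands in for the ergm fit so that the code runs without any network packages; it uses confint.default() so that the intervals are Wald intervals built from coef() and vcov().

```r
# Hypothetical helper: estimates, odds ratios and exponentiated Wald intervals
or_table <- function(fit, level = 0.95) {
  est <- coef(fit)
  ci <- confint.default(fit, level = level)  # Wald intervals from coef() and vcov()
  cbind(Estimate = est, OR = exp(est), exp(ci))
}

# Stand-in model (not an ERGM) so the sketch is self-contained
fit <- glm(am ~ wt, data = mtcars, family = binomial)
tab <- or_table(fit)
print(round(tab, 3))
```

Assuming coef() and vcov() are available for the fitted ergm object, which the confint() suggestion above also relies on, or_table(ergm_fit) should give the analogous table.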
I hope this helps,
Pavel
On Mon, 2021-03-29 at 11:57 +0000, Wolbring, Laura (IFSS) wrote:
Dear all,
I want to add odds ratios and a 95% confidence interval to my ergm summary table including dependence terms (gwdsp, gwdegree, gwesp).
To calculate odds ratios, I already used the exp() and sqrt() functions but I have read that standard errors for dependence models are adjusted due to the use of MCMC estimation. Therefore, odds ratios for dependence models should only be calculated by editing the underlying code of the summary table.
I would be very grateful for any advice on how to add odds ratios to the ergm summary table.
Thanks in advance and best regards
Laura
Karlsruhe Institute of Technology (KIT)
Institute of Sports and Sports Science (IfSS)
Laura Wolbring
Research Assistant
Engler-Bunte-Ring 15
Building 40.40
Room -110
76131 Karlsruhe
Phone : +49 721 608-46696
E-Mail:laura.wolbring@kit.edu
Web: www.sport.kit.edu
KIT - The Research University in the Helmholtz Association
KIT has been certified as a family-friendly university since 2010.
_______________________________________________
statnet_help mailing list
statnet_help@u.washington.edu
http://mailman13.u.washington.edu/mailman/listinfo/statnet_help
From ymolin2 at uic.edu Wed Apr 28 10:54:49 2021
From: ymolin2 at uic.edu (Molina, Yamile)
Date: Tue Aug 3 21:58:17 2021
Subject: [statnet_help] Question regarding exploring mediation with ERGMS
Message-ID:
Hello,
I was wondering if anyone had experience or resources regarding how to test multiple mediation pathways with ERGMs.
Here is a brief description of the analysis I am hoping to conduct. I am interested in examining potential mediators (e.g., ego's network position, network density, ego-alter relationship type, communication/message quality) that may underlie interventions' effects on breast cancer screening among alters of participants (who are the egos).
Thank you,
Yami
--
Yamilé Molina, MS, MPH, PhD
Pronouns: They/Their/Them
Assistant Professor, Division of Community Health Sciences, School of Public Health
649 SPHPI MC923
Ph: 312-355-2679
Faculty Affiliate, Center for Research on Women and Gender
507 DHSP MC980
Ph: 312-355-1791
Initiative Director for Policy/Advocacy and Partnership Development, Cancer Center
University of Illinois Chicago
From yaxincui2023 at u.northwestern.edu Tue May 4 23:40:34 2021
From: yaxincui2023 at u.northwestern.edu (Yaxin Cui)
Date: Tue Aug 3 21:58:17 2021
Subject: [statnet_help] Valued ERGM Geometric reference distribution
Message-ID:
Hi all,
I am running a valued ERGM on my unidimensional network. The link
strengths in my network are unbounded integers (with an approximate
range of [0,400]), and their distribution is highly skewed. Based
on the sample space of the link weights, I think both the Poisson
and the Geometric reference distributions could work. I have tried both,
but when I use the Geometric distribution, it always gives me the
following error after several iterations:
Error in eigen(crossprod(x1c), symmetric = TRUE) :
  infinite or missing values in 'x'
This is one of my models:
ergm(choice_network ~ sum + nodeicovar +
nodeifactor("Import") + nodeicov("Price",form = "sum") +
nodeifactor("MakeOrigin",form = "sum"),
response = "weight", reference = ~Geometric,
control = control.ergm(MCMC.interval=1024, MCMC.burnin=20480,
MCMLE.maxit=1000,
seed=214,parallel = 6))
I have tried different model terms, but the error persists. Has anyone
encountered the same issue? Any suggestions and feedback are welcome!
Thank you so much!
Best,
Yaxin
From fweiler08 at jhubc.it Thu May 20 09:51:18 2021
From: fweiler08 at jhubc.it (Florian Weiler)
Date: Tue Aug 3 21:58:17 2021
Subject: [statnet_help] Difference homophily valued network model vs Poisson
model
Message-ID:
Dear list,
Colleagues and I are running valued network models on preferential voting
data. Each voter has 5 preferential votes, a tie is registered for two
candidates selected by the same voter. If 10 voters indicate the same (two)
candidates, this edge will have the value of 10.
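The edge-value definition above can be illustrated with a toy ballot matrix. The three voters and three candidates below are invented for the example; crossprod() on a voter-by-candidate indicator matrix counts, for every pair of candidates, how many voters selected both.

```r
# Rows = voters, columns = candidates; 1 if the voter selected the candidate
ballots <- matrix(c(1, 1, 0,
                    1, 1, 1,
                    0, 1, 1),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(paste0("voter", 1:3), c("A", "B", "C")))

# t(ballots) %*% ballots: entry (i, j) is the number of voters who chose both
co_votes <- crossprod(ballots)
diag(co_votes) <- 0   # a candidate's own vote total is not a tie

print(co_votes)
```

Here candidates A and B are tied with value 2 because voters 1 and 2 both selected them.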
We want to see which role gender plays in this selection process. But we
were unsure how to model this, so we set up a little simulation to see how
the model reacts. We tried to code this in a way that some voters have a
preference for women, and others a preference for men. So we expected the
term nodematch("female", diff = TRUE) to give us positive results for
both genders compared to the base category (mixed dyads). But running this
simulation multiple times resulted in mostly insignificant results
(depending on the run) for both genders.
We checked this by transforming the network into a dyadic dataset with the
count of joint votes as dependent variable and the independent variable
coded as Diff, Female and Male. When running conventional Poisson models,
we clearly found the expected difference every time.
The code of the simulation is below and can be run in well under a minute.
I would really appreciate any help or explanation of what is going on here.
It is very possible that there are obvious mistakes in our reasoning, as we
are not the biggest network experts. In that case, I apologize in advance
for our incompetence!
Here is the code:
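####Packages used below
library(statnet)     # ergm() and network objects
library(igraph)      # graph.adjacency(), simplify(), V()
library(intergraph)  # asNetwork()
library(plyr)        # rbind.fill()
####Build candidate dataset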
cand<-c(rep("m",50),rep("f",50))
#Randomise candidate order
cand<-sample(cand)
#Assign list position to candidates
cand<-as.data.frame(cand)
names(cand)<-"gender"
cand$listpos<-rownames(cand)
####Build voter dataset with positive male and female homophily (compared to mixed gender)
c1<-data.frame(nrow=500,ncol=5)
c2<-data.frame(nrow=500,ncol=5)
#Sub-sample with "male bias"
probs<-rep(0.015,100)
probs[cand$gender=="f"]<-0.005
for (i in 1:500){
c1[i,1:5]<-sample(cand$listpos,5,replace=FALSE,prob=probs)
}
names(c1)<-c("vote1","vote2","vote3","vote4","vote5")
c1$id<-rownames(c1)
#Sub-sample "female bias"
probs<-rep(0.005,100)
probs[cand$gender=="f"]<-0.015
for (i in 1:500){
c2[i,1:5]<-sample(cand$listpos,5,replace=FALSE,prob=probs)
}
names(c2)<-c("vote1","vote2","vote3","vote4","vote5")
c2$id<-as.numeric(rownames(c2))+500
#Combined sample
c<-rbind(c1,c2)
####Function to convert to network objects
netf<-function(d,cand){
cand$female<-0
cand$female[cand$gender=="f"]<-1
##Voting data manipulation
#Creating a stacked dataset
d11<-d[c(6,1)]
d12<-d[c(6,2)]
d13<-d[c(6,3)]
d14<-d[c(6,4)]
d15<-d[c(6,5)]
names(d11)<-c("id","vote")
names(d12)<-c("id","vote")
names(d13)<-c("id","vote")
names(d14)<-c("id","vote")
names(d15)<-c("id","vote")
d1<-rbind(d11,d12,d13,d14,d15)
##Creating a network object for candidate voting data
V <- crossprod(table(d1[1:2]))#/(nrow(d1)/5)
diag(V)<-NA
V1 <<- V
g <- graph.adjacency(V, weighted=TRUE, mode ='undirected')
g <- simplify(g)
##Add other candidate characteristics to the network object
V(g)$gender<-cand$gender
V(g)$female<-cand$female
#Convert to Network object
net<- asNetwork(g)
res<-list(net)
names(res)<-"net"
return(res)
}
####Apply function to data
cnet<-netf(c,cand)
####Run valued network model (gender coefficients not significant in most simulations)
m<-ergm(cnet$net ~ sum
+nodematch("female", diff=TRUE),
response="weight",
reference = ~Poisson,
control = control.ergm(MCMC.samplesize = 1000,
MCMLE.maxit = 50))
summary(m)
####Transform network data into dyadic dataset
data<-list()
for (i in 1:100){
data[[i]]<-as.data.frame(matrix(nrow=100))
data[[i]]$cand1<-cand$listpos[i]
data[[i]]$gender1<-cand$gender[i]
data[[i]]$cand2<-NA
data[[i]]$gender2<-NA
data[[i]]$countvotes<-NA
for (j in 1:100){
data[[i]]$cand2[j]<-cand$listpos[j]
data[[i]]$gender2[j]<-cand$gender[j]
data[[i]]$countvotes[j]<-V1[rownames(V1)==cand$listpos[i],
colnames(V1)==cand$listpos[j]]
}
}
data<-rbind.fill(data)
#Get rid of some dyads (double and self)
for (i in 1:length(data[,1])){
data$cand11[i]<-min(data$cand1[i],data$cand2[i])
data$cand21[i]<-max(data$cand1[i],data$cand2[i])
}
data<-data[!duplicated(paste(data$cand11,data$cand21)),]
data<-data[data$cand11!=data$cand21,]
#Code homophily variable
data$gendermatch<-"Diff"
data$gendermatch[data$gender1=="m" & data$gender2=="m"]<-"Male"
data$gendermatch[data$gender1=="f" & data$gender2=="f"]<-"Female"
#Check results (show that both genders have pos & sign coefficient)
tapply(data$countvotes,data$gendermatch,mean)
summary(glm(countvotes~gendermatch,family="poisson",data=data))
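The dyad-level comparison at the end of the script can also be sketched more compactly in base R. The 4-by-4 co-vote matrix and the gender vector below are toy values, not output of the simulation; the approach takes each unordered dyad once from the upper triangle of the matrix, which avoids the nested loops and the separate de-duplication step.

```r
# Toy symmetric co-vote matrix for four hypothetical candidates
V <- matrix(c(0, 3, 1, 0,
              3, 0, 0, 2,
              1, 0, 0, 5,
              0, 2, 5, 0), nrow = 4, byrow = TRUE)
gender <- c("m", "m", "f", "f")   # hypothetical candidate attribute

# One row per unordered dyad: (row, col) indices of the upper triangle
idx <- which(upper.tri(V), arr.ind = TRUE)
dyads <- data.frame(cand1 = idx[, "row"],
                    cand2 = idx[, "col"],
                    countvotes = V[idx])

# Code the homophily variable with the same labels as in the script
g1 <- gender[dyads$cand1]
g2 <- gender[dyads$cand2]
dyads$gendermatch <- ifelse(g1 != g2, "Diff",
                            ifelse(g1 == "m", "Male", "Female"))

# Mean joint votes per dyad type
dyad_means <- tapply(dyads$countvotes, dyads$gendermatch, mean)
print(dyad_means)
```

The resulting data frame can be passed to glm(countvotes ~ gendermatch, family = "poisson", data = dyads) in the same way as the original dyadic dataset.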
Thanks and best,
Florian
--
Please note that the @johnshopkins.it domain is given to alumni, research
fellows and former employees of the SAIS Europe/Bologna Center campus.
Emails sent from this domain do not represent official communications of
the Johns Hopkins University or its SAIS Europe Campus.
From kraft.tom at gmail.com Thu May 20 10:12:14 2021
From: kraft.tom at gmail.com (Tom Kraft)
Date: Tue Aug 3 21:58:17 2021
Subject: [statnet_help] Difference homophily valued network model vs
Poisson model
In-Reply-To:
References:
Message-ID:
Hi Florian,
One quick clarification about your question: I notice that the network you
are simulating has 100 nodes and ~4200 edges. This is quite close to
saturation for an undirected network ( ((100)*(100-1))/2 = 4950 possible
edges). I'm not sure whether this lies at the heart of the issue, but
wanted to check and make sure this was intended (or perhaps you've tried
with lower density already?).
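The saturation arithmetic can be reproduced in a couple of lines. The edge count of 4200 is the approximate figure quoted above, not a value computed from the simulation here.

```r
n_nodes <- 100
possible_edges <- choose(n_nodes, 2)   # 100 * 99 / 2 = 4950 undirected dyads
observed_edges <- 4200                 # approximate count quoted in the message

# Share of dyads with a nonzero tie
density <- observed_edges / possible_edges
print(possible_edges)
print(round(density, 3))
```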
Tom
From fweiler08 at jhubc.it Thu May 20 11:09:14 2021
From: fweiler08 at jhubc.it (Florian Weiler)
Date: Tue Aug 3 21:58:17 2021
Subject: [statnet_help] Difference homophily valued network model vs
Poisson model
In-Reply-To:
References:
Message-ID:
Thank you Tom, I was not aware this could be an issue. I just checked by
tweaking the code to obtain a network with only 900 edges. I did this by
just using two of the five randomly generated votes, changing the line
d1<-rbind(d11,d12,d13,d14,d15) in the code I sent to d1<-rbind(d14,d15). It is true this reflects the real
data much better, which has many more zeros than in the simulation I sent
around first.
But the issue is the same in the new simulation. The ergm shows
insignificant results, the Poisson highly significant ones.
Thanks again!
Florian
From buttsc at uci.edu Thu May 20 12:23:39 2021
From: buttsc at uci.edu (Carter T. Butts)
Date: Tue Aug 3 21:58:17 2021
Subject: [statnet_help] Difference homophily valued network model vs
Poisson model
In-Reply-To:
References:
Message-ID: <8b3687c6-b496-2241-4cb7-86faca13f524@uci.edu>
Hi, Florian -
Thanks for providing an example! Unfortunately, however, your example
seems to be using statnet-incompatible libraries, so I cannot evaluate
it. Can you provide a statnet-friendly case?
-Carter
On 5/20/21 9:51 AM, Florian Weiler wrote:
> Dear list,
>
> Colleagues and I are running valued network models on
> preferential?voting data. Each voter has 5 preferential?votes, a tie
> is registered for two candidates selected by the same voter. If 10
> voters indicate the same (two) candidates, this edge will have the
> value of 10.
>
> We want to see which role gender plays in this selection process. But
> we were unsure how to model this, so we set up a little simulation to
> see how the model reacts. We tried to code this in a way that some
> voters have a preference for women, and others a preference for men.
> So we expected the term /nodematch("female", diff = TRUE) /to give us
> positive results for both genders compared to the base category (mixed
> dyads). But running this simulation multiple times resulted in
> mostly?insignificant?results?(depending on the run) for both genders.
>
> We checked this by transforming the network into a dyadic dataset with
> the count of joint votes as dependent variable and the independent
> variable coded as Diff, Female and Male. When running conventional
> Poisson models, we clearly found the expected difference every time.
>
> The code of the simulation is below and can be run in well under a
> minute. I would really appreciate any help or explanation of what is
> going on here.
>
> It is very possible that there are obvious mistakes in our reasoning,
> we are not the biggest network experts. In that case, I apologize in
> advance for?our incompetence!
>
> Here is the code:
>
> ####Build candidate data setset
> cand<-c(rep("m",50),rep("f",50))
> #Randomise candidate order
> cand<-sample(cand)
> #Assign list position to candidates
> cand<-as.data.frame(cand)
> names(cand)<-"gender"
> cand$listpos<-rownames(cand)
>
> ####Build voter dataset with positive male and female homophily
> (compared to mixed gender)
> c1<-data.frame(nrow=500,ncol=5)
> c2<-data.frame(nrow=500,ncol=5)
> #Sub-sample with "male bias"
> probs<-rep(0.015,100)
> probs[cand$gender=="f"]<-0.005
> for (i in 1:500){
> c1[i,1:5]<-sample(cand$listpos,5,replace=FALSE,prob=probs)
> }
> names(c1)<-c("vote1","vote2","vote3","vote4","vote5")
> c1$id<-rownames(c1)
> #Sub-sample "female bias"
> probs<-rep(0.005,100)
> probs[cand$gender=="f"]<-0.015
> for (i in 1:500){
> c2[i,1:5]<-sample(cand$listpos,5,replace=FALSE,prob=probs)
> }
> names(c2)<-c("vote1","vote2","vote3","vote4","vote5")
> c2$id<-as.numeric(rownames(c2))+500
> #Combined sample
> c<-rbind(c1,c2)
>
> ####Function to convert to network objects
> netf<-function(d,cand){
> ? cand$female<-0
> ? cand$female[cand$gender=="f"]<-1
>
> ? ##Voting data manipulation
> ? #Creating a stacked dataset
> ? d11<-d[c(6,1)]
> ? d12<-d[c(6,2)]
> ? d13<-d[c(6,3)]
> ? d14<-d[c(6,4)]
> ? d15<-d[c(6,5)]
> ? names(d11)<-c("id","vote")
> ? names(d12)<-c("id","vote")
> ? names(d13)<-c("id","vote")
> ? names(d14)<-c("id","vote")
> ? names(d15)<-c("id","vote")
> ? d1<-rbind(d11,d12,d13,d14,d15)
>
> ? ##Creating a network object for candidate voting data
> ? V <- crossprod(table(d1[1:2]))#/(nrow(d1)/5)
> ? diag(V)<-NA
> ? V1 <<- V
> ? g <- graph.adjacency(V, weighted=TRUE, mode ='undirected')
> ? g <- simplify(g)
> ? ##Add other candidate characteristics to the network object
> ? V(g)$gender<-cand$gender
> ? V(g)$female<-cand$female
>
> ? #Convert to Network object
> ? net<- asNetwork(g)
> ? res<-list(net)
> ? names(res)<-"net"
> ? return(res)
> }
>
> ####Apply function to data
> cnet<-netf(c,cand)
>
> ####Run valued network model (gender coefficients not significant in
> most simulations)
> m<-ergm(cnet$net ~ sum
> ? ? ? ? ?+nodematch("female", diff=TRUE),
> ? ? ? ? ?response="weight",
> ? ? ? ? ?reference = ~Poisson,
> ? ? ? ? ?control = control.ergm(MCMC.samplesize = 1000,
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? MCMLE.maxit = 50))
> summary(m)
>
> ####Transform network data into dyadic dataset
> data<-list()
> for (i in 1:100){
> ? data[[i]]<-as.data.frame(matrix(nrow=100))
> ? data[[i]]$cand1<-cand$listpos[i]
> ? data[[i]]$gender1<-cand$gender[i]
> ? data[[i]]$cand2<-NA
> ? data[[i]]$gender2<-NA
> ? data[[i]]$countvotes<-NA
> ? for (j in 1:100){
> ? ? data[[i]]$cand2[j]<-cand$listpos[j]
> ? ? data[[i]]$gender2[j]<-cand$gender[j]
> data[[i]]$countvotes[j]<-V1[rownames(V1)==cand$listpos[i],
> ?colnames(V1)==cand$listpos[j]]
> ? }
> }
> data<-rbind.fill(data)
> #Get rid of some dyads (double and self)
> for (i in 1:length(data[,1])){
> ? data$cand11[i]<-min(data$cand1[i],data$cand2[i])
> ? data$cand21[i]<-max(data$cand1[i],data$cand2[i])
> }
> data<-data[duplicated(paste(data$cand11,data$cand21))=="FALSE",]
> data<-data[data$cand11!=data$cand21,]
>
> #Code homophily variable
> data$gendermatch<-"Diff"
> data$gendermatch[data$gender1=="m" & data$gender2=="m"]<-"Male"
> data$gendermatch[data$gender1=="f" & data$gender2=="f"]<-"Female"
>
> #Check results (show that both genders have pos & sign coefficient)
> tapply(data$countvotes,data$gendermatch,mean)
> summary(glm(countvotes~gendermatch,family="poisson",data=data))
>
>
> Thanks and best,
> Florian
>
>
>
>
> Please note that the @johnshopkins.it
> domain is given to alumni, research fellows and former employees of
> the SAIS Europe/Bologna Center campus. Emails sent from this domain
> do not represent official communications of the Johns Hopkins
> University or its SAIS Europe Campus.
>
>
> _______________________________________________
> statnet_help mailing list
> statnet_help@u.washington.edu
> http://mailman13.u.washington.edu/mailman/listinfo/statnet_help
From kraft.tom at gmail.com Thu May 20 12:30:19 2021
From: kraft.tom at gmail.com (Tom Kraft)
Date: Tue Aug 3 21:58:17 2021
Subject: [statnet_help] Difference homophily valued network model vs
Poisson model
In-Reply-To: <64cc4548a2b74e71aed6b960c7f35f3e@DM5PR03MB3339.namprd03.prod.outlook.com>
References:
<64cc4548a2b74e71aed6b960c7f35f3e@DM5PR03MB3339.namprd03.prod.outlook.com>
Message-ID:
Hi again,
Carter -- I think it only requires the additional 2 libraries:
library(statnet)
library(igraph)
library(intergraph)
Florian -- Sorry for another question, but when I look at the mixing matrix
from your updated example I'm seeing numbers that would suggest the
expected trend. Have you looked at mixingmatrix(cnet$net, "female")? Based
on that I would be surprised if you saw a significant effect, unless I'm
missing something.
Tom
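
If what matters is the vote counts on the ties rather than the presence of
ties, a weight-summing analogue of a mixing matrix can be tabulated by hand.
The following is a minimal base-R sketch on an invented 4-node weight matrix;
`w`, `female` and `weighted_mixing()` are hypothetical names made up for this
illustration and are not part of statnet.

```r
# Toy symmetric weight matrix (co-vote counts) for 4 candidates; values invented.
w <- matrix(c(0, 3, 1, 0,
              3, 0, 2, 1,
              1, 2, 0, 4,
              0, 1, 4, 0), nrow = 4, byrow = TRUE)
female <- c(1, 1, 0, 0)  # hypothetical vertex attribute

# Hypothetical helper: sum edge weights over unordered dyads,
# cross-classified by the attribute values of the two endpoints.
weighted_mixing <- function(w, attr) {
  lev <- as.character(sort(unique(attr)))
  out <- matrix(0, length(lev), length(lev), dimnames = list(lev, lev))
  idx <- which(upper.tri(w), arr.ind = TRUE)  # visit each dyad once
  for (k in seq_len(nrow(idx))) {
    a <- as.character(attr[idx[k, 1]])
    b <- as.character(attr[idx[k, 2]])
    out[a, b] <- out[a, b] + w[idx[k, 1], idx[k, 2]]
    # mirror mixed dyads so the table stays symmetric
    if (a != b) out[b, a] <- out[b, a] + w[idx[k, 1], idx[k, 2]]
  }
  out
}

mm <- weighted_mixing(w, female)
print(mm)
```

Dividing each cell by the number of dyads of that type would turn the sums
into mean weights per dyad type, which is the quantity the homophily terms
are meant to pick up.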
On Thu, May 20, 2021 at 3:24 PM Carter T. Butts wrote:
> Hi, Florian -
>
> Thanks for providing an example! Unfortunately, however, your example
> seems to be using statnet-incompatible libraries, so I cannot evaluate it.
> Can you provide a statnet-friendly case?
>
> -Carter
> On 5/20/21 9:51 AM, Florian Weiler wrote:
>
> Dear list,
>
> Colleagues and I are running valued network models on preferential voting
> data. Each voter has 5 preferential votes, a tie is registered for two
> candidates selected by the same voter. If 10 voters indicate the same (two)
> candidates, this edge will have the value of 10.
>
> We want to see which role gender plays in this selection process. But we
> were unsure how to model this, so we set up a little simulation to see how
> the model reacts. We tried to code this in a way that some voters have a
> preference for women, and others a preference for men. So we expected the
> term *nodematch("female", diff = TRUE) *to give us positive results for
> both genders compared to the base category (mixed dyads). But running this
> simulation multiple times resulted in
> mostly insignificant results (depending on the run) for both genders.
>
> We checked this by transforming the network into a dyadic dataset with the
> count of joint votes as dependent variable and the independent variable
> coded as Diff, Female and Male. When running conventional Poisson models,
> we clearly found the expected difference every time.
>
> The code of the simulation is below and can be run in well under a minute.
> I would really appreciate any help or explanation of what is going on here.
>
> It is very possible that there are obvious mistakes in our reasoning, we
> are not the biggest network experts. In that case, I apologize in advance
> for our incompetence!
>
> Here is the code:
>
> ####Build candidate data set
> cand<-c(rep("m",50),rep("f",50))
> #Randomise candidate order
> cand<-sample(cand)
> #Assign list position to candidates
> cand<-as.data.frame(cand)
> names(cand)<-"gender"
> cand$listpos<-rownames(cand)
>
> ####Build voter dataset with positive male and female homophily (compared
> to mixed gender)
> c1<-data.frame(nrow=500,ncol=5)
> c2<-data.frame(nrow=500,ncol=5)
> #Sub-sample with "male bias"
> probs<-rep(0.015,100)
> probs[cand$gender=="f"]<-0.005
> for (i in 1:500){
> c1[i,1:5]<-sample(cand$listpos,5,replace=FALSE,prob=probs)
> }
> names(c1)<-c("vote1","vote2","vote3","vote4","vote5")
> c1$id<-rownames(c1)
> #Sub-sample "female bias"
> probs<-rep(0.005,100)
> probs[cand$gender=="f"]<-0.015
> for (i in 1:500){
> c2[i,1:5]<-sample(cand$listpos,5,replace=FALSE,prob=probs)
> }
> names(c2)<-c("vote1","vote2","vote3","vote4","vote5")
> c2$id<-as.numeric(rownames(c2))+500
> #Combined sample
> c<-rbind(c1,c2)
>
> ####Function to convert to network objects
> netf<-function(d,cand){
> cand$female<-0
> cand$female[cand$gender=="f"]<-1
>
> ##Voting data manipulation
> #Creating a stacked dataset
> d11<-d[c(6,1)]
> d12<-d[c(6,2)]
> d13<-d[c(6,3)]
> d14<-d[c(6,4)]
> d15<-d[c(6,5)]
> names(d11)<-c("id","vote")
> names(d12)<-c("id","vote")
> names(d13)<-c("id","vote")
> names(d14)<-c("id","vote")
> names(d15)<-c("id","vote")
> d1<-rbind(d11,d12,d13,d14,d15)
>
> ##Creating a network object for candidate voting data
> V <- crossprod(table(d1[1:2]))#/(nrow(d1)/5)
> diag(V)<-NA
> V1 <<- V
> g <- graph.adjacency(V, weighted=TRUE, mode ='undirected')
> g <- simplify(g)
> ##Add other candidate characteristics to the network object
> V(g)$gender<-cand$gender
> V(g)$female<-cand$female
>
> #Convert to Network object
> net<- asNetwork(g)
> res<-list(net)
> names(res)<-"net"
> return(res)
> }
>
> ####Apply function to data
> cnet<-netf(c,cand)
>
> ####Run valued network model (gender coefficients not significant in most
> simulations)
> m<-ergm(cnet$net ~ sum
> +nodematch("female", diff=TRUE),
> response="weight",
> reference = ~Poisson,
> control = control.ergm(MCMC.samplesize = 1000,
> MCMLE.maxit = 50))
> summary(m)
>
> ####Transform network data into dyadic dataset
> data<-list()
> for (i in 1:100){
> data[[i]]<-as.data.frame(matrix(nrow=100))
> data[[i]]$cand1<-cand$listpos[i]
> data[[i]]$gender1<-cand$gender[i]
> data[[i]]$cand2<-NA
> data[[i]]$gender2<-NA
> data[[i]]$countvotes<-NA
> for (j in 1:100){
> data[[i]]$cand2[j]<-cand$listpos[j]
> data[[i]]$gender2[j]<-cand$gender[j]
> data[[i]]$countvotes[j]<-V1[rownames(V1)==cand$listpos[i],
> colnames(V1)==cand$listpos[j]]
> }
> }
> data<-rbind.fill(data)
> #Get rid of some dyads (double and self)
> for (i in 1:length(data[,1])){
> data$cand11[i]<-min(data$cand1[i],data$cand2[i])
> data$cand21[i]<-max(data$cand1[i],data$cand2[i])
> }
> data<-data[duplicated(paste(data$cand11,data$cand21))=="FALSE",]
> data<-data[data$cand11!=data$cand21,]
>
> #Code homophily variable
> data$gendermatch<-"Diff"
> data$gendermatch[data$gender1=="m" & data$gender2=="m"]<-"Male"
> data$gendermatch[data$gender1=="f" & data$gender2=="f"]<-"Female"
>
> #Check results (show that both genders have pos & sign coefficient)
> tapply(data$countvotes,data$gendermatch,mean)
> summary(glm(countvotes~gendermatch,family="poisson",data=data))
>
>
> Thanks and best,
> Florian
From fweiler08 at jhubc.it Thu May 20 13:24:51 2021
From: fweiler08 at jhubc.it (Florian Weiler)
Date: Tue Aug 3 21:58:17 2021
Subject: [statnet_help] Difference homophily valued network model vs
Poisson model
In-Reply-To:
References:
Message-ID:
I only seem to have gotten the summary for the other two messages, so will
reply here.
Yes, Carter, the additional libraries provided by Tom should make the code
work.
Tom, I don't think I fully understand; the documentation of mixingmatrix is
rather vague ("Most of these are not to be called by the user" according to
the help page). But yes, by design the distribution of the gender
variable (the three possible combinations) is quite balanced. But should
the effect not depend on the values of the dependent variable associated
with this gender variable, i.e. the network? Here we tried to simulate
the network in such a way that the male-male and the female-female dyads
have, on average, higher count values. The simple Poisson model as well
as the cross tabulation in the last two lines of the code I
sent clearly show that this is the case. For this simple reason, we also
expected to see similar significant results in the ERGM.
If, based on the mixingmatrix posted by Tom, no effect should be expected,
clearly the problem lies on my side, in my misunderstanding of how a valued
network model (also using a Poisson distribution) works.
Best,
Florian
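
The expectation above can be made concrete. Assuming dyads are independent
and that the valued nodematch statistic sums edge values over matching dyads,
a Poisson-reference model with only sum and nodematch terms treats each dyad
value as an independent Poisson count with a log-linear mean, which is the
same structure the dyadic glm() fits. The sketch below only illustrates that
structure on simulated data (it does not call ergm); all names and
coefficient values are invented for the example.

```r
set.seed(1)
n <- 40
female <- rep(c(0, 1), each = n / 2)   # hypothetical vertex attribute
dy <- t(combn(n, 2))                   # all unordered dyads, one row each

# Dyad type: mixed, female-female or male-male
type <- ifelse(female[dy[, 1]] != female[dy[, 2]], "Diff",
               ifelse(female[dy[, 1]] == 1, "Female", "Male"))
type <- factor(type, levels = c("Diff", "Female", "Male"))

# Assumed log-means: mixed dyads at log(1), same-gender dyads at log(2)
beta <- c(Diff = 0, Female = log(2), Male = log(2))
y <- rpois(nrow(dy), lambda = exp(beta[as.character(type)]))

# Dyadic Poisson regression with mixed dyads as the base category
fit <- glm(y ~ type, family = poisson)
print(round(coef(fit), 2))
```

If the network object and the dyadic data set carry the same counts, the two
approaches would be expected to point in the same direction under these
assumptions, so a disagreement is a reason to compare the two data
representations directly.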
On Thu, May 20, 2021 at 8:09 PM Florian Weiler wrote:
> Thank you Tom, I was not aware this could be an issue. I just checked by
> tweaking the code to obtain a network with only 900 edges. I did this by
> just using two of the five randomly generated votes, changing line 49 of
> the code I sent to d1<-rbind(d14,d15). It is true this reflects the real
> data much better, which has many more zeros than in the simulation I sent
> around first.
>
> But the issue is the same in the new simulation. The ergm shows
> insignificant results, the Poisson highly significant ones.
>
> Thanks again!
> Florian
>
>
>
> On Thu, May 20, 2021 at 7:12 PM Tom Kraft wrote:
>
>> Hi Florian,
>>
>> One quick clarification about your question: I notice that the network
>> you are simulating has 100 nodes and ~4200 edges. This is quite close to
>> saturation for an undirected network ( ((100)*(100-1))/2 = 4950 possible
>> edges). I'm not sure whether this lies at the heart of the issue, but
>> wanted to check and make sure this was intended (or perhaps you've tried
>> with lower density already?).
>>
>> Tom
>>
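
The density figures discussed here are quick to check. A minimal sketch of
the arithmetic, using the approximate edge counts quoted in the thread
(~4200 in the first simulation, 900 in the reduced one):

```r
n_nodes <- 100
max_dyads <- n_nodes * (n_nodes - 1) / 2   # possible undirected edges

# Approximate edge counts as quoted in the thread
n_edges <- c(full = 4200, reduced = 900)
density <- n_edges / max_dyads

print(max_dyads)
print(round(density, 3))
```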
From kraft.tom at gmail.com Thu May 20 15:07:14 2021
From: kraft.tom at gmail.com (Tom Kraft)
Date: Tue Aug 3 21:58:17 2021
Subject: [statnet_help] Difference homophily valued network model vs
Poisson model
In-Reply-To:
References:
Message-ID:
Hi Florian,
Please note that I am worried that I might be leading you astray, and after
this I will leave it to the real experts to comment. However, as far as I
can tell, the initial network dataset you are creating (using your
modification to lower the number of edges) does not appear to be generating
the expectation that you want. I base this on the cobbled-together code
below, which converts your network object directly into an edgelist
(perhaps I made an error?). When I use this same edgelist it returns null
results in both the ERGM and Poisson cases.
Try running that code and see if you agree. My sincere apologies if I've
made an obvious mistake here.
```
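library(reshape2)  # melt(); assumed here to be reshape2's, which names the columns Var1/Var2/value
library(dplyr)

# Weighted adjacency matrix of the network built above
dd <- as.matrix.network.adjacency(cnet$net, "weight")
# Keep each undirected dyad once. Note: upper.tri() leaves the diagonal
# (self-pairs) in place; upper.tri(dd, diag = TRUE) would drop it as well.
dd[upper.tri(dd)] <- NA
dd <- melt(dd, na.rm = TRUE)
# Look up each endpoint's "female" attribute by vertex name
vnames <- cnet$net %v% "vertex.names"
dd$female1 <- (cnet$net %v% "female")[match(dd$Var1, vnames)]
dd$female2 <- (cnet$net %v% "female")[match(dd$Var2, vnames)]
# Classify dyads as mixed, female-female or male-male
dd$fm <- ifelse(dd$female1 == dd$female2, "same", "diff")
dd$fm[which(dd$fm == "same" & dd$female1 == 1)] <- "female"
dd$fm[which(dd$fm == "same" & dd$female1 == 0)] <- "male"
#dd$weight <- cnet$net %e% "weight"
dd %>% group_by(fm) %>% summarise(mu = mean(value))
summary(glm(value~fm,family="poisson",data=dd)) #no significant results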
```
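
As a reference point for checking whether the network object carries the
intended weights, the same tabulation can be done straight from the ballots,
without the igraph/network round trip. This is a minimal base-R sketch on
invented ballots; `ballots`, `female` and the candidate labels are
hypothetical and stand in for the simulated data in the thread.

```r
# Toy ballots: 4 voters each pick 2 of 4 candidates (values invented).
ballots <- data.frame(id   = rep(1:4, each = 2),
                      vote = c("A", "B", "A", "B", "C", "D", "A", "C"),
                      stringsAsFactors = FALSE)
female <- c(A = 1, B = 1, C = 0, D = 0)   # hypothetical candidate attribute

# Voter-by-candidate incidence table; crossprod() gives candidate-by-candidate
# co-vote counts (the diagonal holds total votes per candidate, unused below).
V <- crossprod(table(ballots$id, ballots$vote))

# One row per unordered dyad, taken from the strict lower triangle.
idx <- which(lower.tri(V), arr.ind = TRUE)
dd <- data.frame(cand1 = rownames(V)[idx[, 1]],
                 cand2 = colnames(V)[idx[, 2]],
                 value = V[idx],
                 stringsAsFactors = FALSE)

# Classify dyads as mixed, female-female or male-male
same <- female[dd$cand1] == female[dd$cand2]
dd$fm <- ifelse(!same, "diff",
                ifelse(female[dd$cand1] == 1, "female", "male"))

grp_means <- tapply(dd$value, dd$fm, mean)
print(grp_means)
```

Comparing these group means with the ones computed from the network's
"weight" attribute is one way to see whether the two representations agree.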
From morrism at uw.edu Wed Jun 9 23:24:02 2021
From: morrism at uw.edu (martina morris)
Date: Tue Aug 3 21:58:17 2021
Subject: [statnet_help] New versions of the statnet packages coming this week!
Message-ID:
Hi folks,
We're excited to announce the upcoming new releases, which are making
their way to CRAN this week and next. The biggest changes are in the
foundational stat packages: ergm and tergm. Both have been given major
updates that extend functionality and provide dramatic improvements in
computational efficiency, especially for large sparse networks (1M nodes).
The updates to ergm are reviewed in this preprint:
http://arxiv.org/abs/2106.04997
The updates to tergm will be reflected in the tergm workshop tutorial
(available in time for Sunbelt), and a paper in due course.
Despite the magnitude of the changes, and the development of new user
interfaces, the updates are almost entirely backwards compatible. That
said, we will be deprecating some of the older functions (like stergm())
in the future, so it's worth learning how to access this functionality
using the new syntax.
If you have questions about the new versions, this statnet_help list is
still the place to ask them.
If you need to report a bug, the GitHub repositories are the place for
that (they're listed at github.com/statnet).
If you want to thank someone, thank Pavel -- he remains our lead developer
and maintainer. It's a crazy amount of work.
Other resources can be found on our website: http://statnet.org/
Happy modeling!
The statnet Development Team
From adamhaber at gmail.com Thu Jun 24 05:31:25 2021
From: adamhaber at gmail.com (Adam Haber)
Date: Tue Aug 3 21:58:17 2021
Subject: [statnet_help] Question about egocentric networks
Message-ID:
Hello,
I have measured connectivity data in a biological network, and I want to
understand if it is reasonable to try and model it using ergm.ego.
The data is of the following form: I have a big biological network that I
can't fully observe, and whose size is unknown to me (but I can provide
an order-of-magnitude estimate).
For a single, random node in this network, I conducted an experiment in
which I measured if it is connected to a random sample of N *other* nodes.
It was connected to some, but not all. I also measured various node/edge
level covariates of the sampled "ego" and potential "alters".
Finally, I repeated this analysis K times, so I have K "egos" and K sets
(of different sizes) of "potential alters", some of which were connected to
the corresponding ego and some were not, as well as the relevant covariates.
I have never used ergm.ego, and am curious to know whether it can be used for
such an analysis. Specifically, from the tutorials I have found online it is
not clear to me how to use the information in the sampled-but-not-connected
nodes in each of the K experiments (and their covariates).
Any help would be much appreciated!
Best,
Adam
From morrism at uw.edu Thu Jun 24 14:28:59 2021
From: morrism at uw.edu (martina morris)
Date: Tue Aug 3 21:58:17 2021
Subject: [statnet_help] Question about egocentric networks
In-Reply-To:
References:
Message-ID:
Hi Adam,
Your data is a bit different from what ergm.ego is currently designed
for, in that you have information on a sample of ego's non-ties, as
well as their ties.
It would be straightforward to use ergm.ego on the sample of ego/alter tie
data (so, ignoring the non-tie info). So you might consider starting
there. You can, with this approach, use the non-tie data as validation
data: simulate from your models and see if they reproduce the tie vs.
non-tie distributions.
I'm cc'ing Pavel Krivitsky and Michal Bojanowski who may have some ideas
about how to use the non-tie egodata.
Finally, we are teaching the ergm.ego workshop next week: Mon., 6/28
20:00-23:00 ET for the Sunbelt remote workshop. More info here:
https://networks2021.net/workshops
best,
mm
****************************************************************
Professor Emerita of Sociology and Statistics
Box 354322
University of Washington
Seattle, WA 98195-4322
Office: (206) 685-3402
Dept Office: (206) 543-5882, 543-7237
Fax: (206) 685-7419
morrism@u.washington.edu
http://faculty.washington.edu/morrism/
From mj2417 at gmail.com Tue Jun 29 04:17:59 2021
From: mj2417 at gmail.com (Mohsen Jafari Songhori)
Date: Tue Aug 3 21:58:17 2021
Subject: [statnet_help] Modeling and simulation of bi-variate networks with
statnet/ergm
Message-ID:
Dear all,
I am currently working on generating/simulating a large number of
bivariate (not bipartite) networks. My observed networks have 40-60
nodes, and my ERGM modeling approach is very similar to the one taken in
the following paper: "Can Informal Communication Networks Disrupt
Coordination in New Product Development Projects?" (Organization Science,
informs.org).
I have attempted to generate bivariate graphs using XPNet, but
unfortunately, despite the description in its manual, the software does
not produce any sample bivariate graph files.
I would be very grateful for any advice on how to model/simulate
bivariate networks (i.e., a fixed set of nodes among which there can be
two types of ties) with Statnet/ERGM or other R libraries.
Thanks a lot in advance.
Kind Regards,
Mohsen
--
*Postdoc Researcher*
*Eindhoven University of Technology *
From michal2992 at gmail.com Thu Jul 1 09:56:51 2021
From: michal2992 at gmail.com (Michał Bojanowski)
Date: Tue Aug 3 21:58:17 2021
Subject: [statnet_help] Question about egocentric networks
In-Reply-To:
References:
Message-ID:
Hi all,
This is interesting! Unless I am missing something (sorry in advance
if this is the case) Adam's data is essentially a simple random sample
of *dyads* from the population network.
If only dyad-independent ERGMs are of interest and K*K is much smaller
than the number of dyads in the population network (assuming we have a
good guess about that), a simple approximation would be to estimate
dyadic GLMs for the K*K dyads, with a binary dependent variable
indicating whether there is a tie in the dyad, and with predictors
pertaining to attributes of egos, alters, dyadic attributes, etc.
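A minimal sketch of such a dyadic GLM in base R, purely for illustration. The data frame, the covariate names (ego_x, alter_x, abs_diff) and the simulated ties are all hypothetical stand-ins for the measured data; one row represents one sampled ego/alter dyad.

```r
# Hypothetical toy data: one row per sampled ego-alter dyad.
set.seed(1)
n_dyads <- 200
dyads <- data.frame(
  ego_x   = rnorm(n_dyads),   # hypothetical ego-level covariate
  alter_x = rnorm(n_dyads)    # hypothetical alter-level covariate
)
# A dyadic covariate built from the two nodal covariates.
dyads$abs_diff <- abs(dyads$ego_x - dyads$alter_x)

# Simulated tie indicator (stand-in for the measured tie / no-tie outcome).
eta <- -1 - 0.8 * dyads$abs_diff
dyads$tie <- rbinom(n_dyads, size = 1, prob = plogis(eta))

# Dyad-independent logistic regression on the sampled dyads.
fit <- glm(tie ~ ego_x + alter_x + abs_diff, family = binomial, data = dyads)
coef(fit)
```

Note that this treats all sampled dyads as independent, so it ignores the fact that several dyads share the same ego; it is only an approximation in the sense described above.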
I think ignoring the no-tie dyads and using ergm.ego directly might be
problematic. E.g., while the (halved) sum over egos of the number of
ties in an egocentric census is equal to the number of ties in the
population network, the sum of ties from K sampled dyads incident on
each ego in an egocentric census is not. However, I can easily imagine
that it should be possible to derive (some of) the sufficient
statistics from Adam's design and then use ergm() to fit a model to a
vector of such statistics...
Michal
From laura.wolbring at kit.edu Tue Jul 13 09:02:51 2021
From: laura.wolbring at kit.edu (Wolbring, Laura (IFSS))
Date: Tue Aug 3 21:58:17 2021
Subject: [statnet_help] MCMC diagnostics function to assess convergence of
ergm fit
Message-ID: <848aba16bf994281a9290674a8962061@kit.edu>
Dear all,
I would like to assess convergence of my ergm fit, which includes dyad-dependent terms.
Since I installed the new statnet update, the mcmc.diagnostics function produces two pairs of plots for each statistic (e.g. two different trace plots and two different deviation plots for the gwdsp term). Can anyone tell me what the difference between the two pairs of plots is, and which pair is the right one to check for convergence?
Thanks in advance and best regards
Laura
Karlsruhe Institute of Technology (KIT)
Institute of Sports and Sports Science (IfSS)
Laura Wolbring
Research Assistant
Engler-Bunte-Ring 15
Building 40.40
Room -110
76131 Karlsruhe
Phone : +49 721 608-46696
E-Mail:laura.wolbring@kit.edu
Web: www.sport.kit.edu
KIT - Die Forschungsuniversität in der Helmholtz-Gemeinschaft
From r2atuic at gmail.com Thu Jul 15 11:46:05 2021
From: r2atuic at gmail.com (Meng-Hao Li)
Date: Tue Aug 3 21:58:17 2021
Subject: [statnet_help] How to obtain predicted probability in stergm
Message-ID:
Dear all,
I am running stergm models. I have no problems obtaining the fitted results
but am not sure how to calculate predicted probability between two nodes.
The interpret() function in xergm looks like what I am looking for, but I
was not able to find a similar function for the stergm fitted object.
I would be very grateful for any advice on how to calculate predicted
probability between two nodes in stergm models.
Kind regards,
Meng-Hao
--
Meng-hao Li
PhD candidate in Public Policy
George Mason University
From morrism at uw.edu Tue Jul 20 17:11:43 2021
From: morrism at uw.edu (martina morris)
Date: Tue Aug 3 21:58:17 2021
Subject: [statnet_help] MCMC diagnostics function to assess convergence
of ergm fit
In-Reply-To: <848aba16bf994281a9290674a8962061@kit.edu>
References: <848aba16bf994281a9290674a8962061@kit.edu>
Message-ID:
Hi Laura,
Can you please include a small reproducible example with one of the
builtin datasets (like faux.mesa.high)?
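A reproducible example of the kind requested might look like the following sketch. The model terms and the seed are arbitrary choices for illustration, not taken from the original report; any small dyad-dependent model on a built-in dataset would do.

```r
library(ergm)

data(faux.mesa.high)

# A small dyad-dependent model, so that MCMC is actually used in the fit.
# The gwesp decay value and the seed are illustrative assumptions.
fit <- ergm(faux.mesa.high ~ edges + gwesp(0.25, fixed = TRUE),
            control = control.ergm(seed = 1))

# Produces the trace and density plots in question.
mcmc.diagnostics(fit)
```

Posting the exact script together with the output of sessionInfo() makes it easier to see which package versions produce the doubled plots.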
thx,
Martina
From adamhaber at gmail.com Fri Aug 6 13:20:19 2021
From: adamhaber at gmail.com (Adam Haber)
Date: Fri Aug 6 13:20:36 2021
Subject: [statnet_help] Convergence issue with dyadnoise constraint
Message-ID:
(reposting here in addition to the discussion on GitHub in case users
who read the mailing list but don't read the GH discussions might help)
Hi,
I'm working with noisy network data, and I'm interested in explicitly
modeling the measurement error in the ties. I have reasonable prior knowledge
regarding the error rates (estimating a non-tie as a tie or vice versa). I
was quite excited when I saw `dyadnoise` - I *think* this is exactly what I
am looking for. Before trying this functionality on my network, I wanted to
try it on the various "standard" datasets.
I ran the following code:
```
library(ergm)
data(faux.mesa.high)
mesa <- faux.mesa.high
# Baseline model without any observation-process constraint
fauxmodel.01 <- ergm(mesa ~ edges +
                       nodefactor('Grade') + nodematch('Grade', diff = TRUE) +
                       nodefactor('Race') + nodematch('Race', diff = TRUE))
# Same model, with the dyadnoise observation constraint
fauxmodel.02 <- ergm(mesa ~ edges +
                       nodefactor('Grade') + nodematch('Grade', diff = TRUE) +
                       nodefactor('Race') + nodematch('Race', diff = TRUE),
                     obs.constraints = ~dyadnoise(0.01, 0.01))
```
and was not able to get the model to fit:
1. CD ran for 60 iterations (not sure if it's supposed to stop upon
convergence or should it always run for all 60 iterations).
2. Got various "Number of informative dyads is too low. Using default
imputation density." messages.
3. After 18 iterations of MCMLE, I got "Error in h(simpleError(msg,
call)) :
error in evaluating the argument 'object' in selecting a method for
function 'isSymmetric': non-conformable arrays" and inference failed.
I would really appreciate any help on this. Am I doing something obviously
wrong? Any references to papers/tutorials (beyond Karwa et al., 2016) that
use dyadnoise would also be much appreciated. This seems like such a useful
constraint for so many real-life applications!
Best,
Adam Haber
From buttsc at uci.edu Fri Aug 6 14:02:17 2021
From: buttsc at uci.edu (Carter T. Butts)
Date: Fri Aug 6 14:02:42 2021
Subject: [statnet_help] Convergence issue with dyadnoise constraint
In-Reply-To:
References:
Message-ID:
Hi, Adam -
FWIW, if you want to /model/ measurement error, you need to use an
explicit measurement error model. Just changing the support will not do
that. Johan Koskinen's group has been doing some very nice work on
measurement models grafted to ERGMs in recent years, so I would take a
look at his stuff. If you have multiple measurements, it is possible
(albeit a little wonky) to use a two-stage scheme where you use the
network inference model in sna (bbnam) - though right now I would
suggest using the one based on graph mixtures, which is at
https://github.com/CarterButts/bbnam.mix - and then fit an ergm to the
result. One can also partially account for the resulting uncertainty by
considering the posterior predictive distribution of the ergm
coefficients, obtained by fitting the model to a sample of posterior
draws. (There are some caveats to what that procedure does and does not
do, but depending on your situation it may be a step forward. IIRC,
Johan has the "proper" version that simultaneously estimates
coefficients and network structure, which I have not implemented. There
are some special cases where one might actually prefer the two-stage
scheme, but I think that the integrated approach is more often the one
that is desired.)
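The second stage of such a two-stage scheme can be sketched in base R as follows. Everything here is a simplified stand-in: the "posterior draws" are simulated with rbinom() rather than produced by a network inference model, and the model is an edges-only (Bernoulli) ERGM, whose MLE for the edges coefficient is simply the logit of the network density. The helper names draw_network and edges_coef are hypothetical.

```r
set.seed(42)
n <- 20

# Hypothetical stand-in for one posterior draw of the true network:
# a symmetric 0/1 adjacency matrix. In practice these draws would come
# from a network inference model, not from rbinom().
draw_network <- function(n, p) {
  a <- matrix(0L, n, n)
  a[upper.tri(a)] <- rbinom(n * (n - 1) / 2, size = 1, prob = p)
  a + t(a)
}
draws <- replicate(100, draw_network(n, p = 0.15), simplify = FALSE)

# Edges-only (Bernoulli) ERGM: the MLE of the edges coefficient
# is the logit of the density of the undirected network.
edges_coef <- function(a) {
  dens <- sum(a[upper.tri(a)]) / choose(nrow(a), 2)
  qlogis(dens)
}

# Fit the model to every draw and summarise the spread of the coefficient.
coefs <- vapply(draws, edges_coef, numeric(1))
c(mean = mean(coefs), quantile(coefs, c(0.025, 0.975)))
```

With a dyad-dependent model, edges_coef would be replaced by a call to ergm() on each draw; the spread of the resulting coefficients then reflects uncertainty about the network, but not the estimation uncertainty within each fit.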
Hope that helps,
-Carter
From p.krivitsky at unsw.edu.au Sun Aug 8 15:25:51 2021
From: p.krivitsky at unsw.edu.au (Pavel N. Krivitsky)
Date: Sun Aug 8 15:26:24 2021
Subject: [statnet_help] Convergence issue with dyadnoise constraint
In-Reply-To:
References:
Message-ID:
Hi, All,
In this particular case, it's not support that's being changed but
rather the constrained (conditional on observed) MCMC sampling. The
regular missing data MLE is a special case of that. If you know the
measurement error process parameters, then you can do that. (See Karwa,
Krivitsky, Slavkovic (2017).)
As for why it's not fitting, there are several bugs that this example
appears to be exposing, and I am looking into it.
Best,
Pavel
From p.krivitsky at unsw.edu.au Sun Aug 8 15:27:13 2021
From: p.krivitsky at unsw.edu.au (Pavel N. Krivitsky)
Date: Sun Aug 8 15:27:59 2021
Subject: [statnet_help] How to obtain predicted probability in stergm
In-Reply-To:
References:
Message-ID: <47c8ba8b7cef59690aea87ecb993de9a2b7b9e9a.camel@unsw.edu.au>
Dear Meng-Hao,
Please take a look at predict.ergm.
I hope this helps,
Pavel
From jszhao at yeah.net Sat Aug 28 07:42:14 2021
From: jszhao at yeah.net (Jinsong Zhao)
Date: Sat Aug 28 07:42:50 2021
Subject: [statnet_help] asymmetric covariate in dyadcov;
using upper triangle only
Message-ID:
Hi there,
I am trying to understand the usage of dyadcov. The help page says:
dyadcov(x, attrname=NULL)
Dyadic covariate: The x argument is either a square matrix of
covariates, one for each possible edge in the network, the name of a
network attribute of covariates, or a network;
For a directed network, the edge list is:
el <- matrix(c(1,1,2,3,2,3,3,1), ncol = 2)
el.net <- network(el, directed = TRUE)
The covariate matrix for el.net is:
> z <- matrix(0, nrow = 3, ncol = 3)
> z[1,2] <- z[2,3] <- 1
> z[3,1] <- 2
> z[1,3] <- 3
> z
     [,1] [,2] [,3]
[1,]    0    1    3
[2,]    0    0    1
[3,]    2    0    0
When I try something like:
ergm(formula = el.net ~ dyadcov(z))
it gives the warning message: In term 'dyadcov' in package 'ergm':
asymmetric covariate in dyadcov; using upper triangle only
I do not understand the warning message. z is not a symmetric matrix, so
why and how is only the upper triangle used? Most likely I am doing
something wrong. However, I can't find any working example.
Any hint, comment, or suggestion would be really appreciated. Thanks in
advance.
Best,
Jinsong
--
Jinsong Zhao
Associate Professor
College of Resources and Environment, Huazhong Agricultural University
Wuhan 430070, PR China
From robert.w.krause at fu-berlin.de Mon Aug 30 00:17:17 2021
From: robert.w.krause at fu-berlin.de (Krause, Robert)
Date: Mon Aug 30 00:17:30 2021
Subject: [statnet_help] asymmetric covariate in dyadcov;
using upper triangle only
In-Reply-To:
References:
Message-ID: <7ed5db9f382041f5945cb119f35ca90a@fu-berlin.de>
Dear Jinsong,
A dyadic covariate must be the same in both directions; that is, the dyad-cov-network must be symmetric. After all, it is an attribute of the dyad and must thus work the same in both directions.
The warning tells you that the provided data is not symmetric, so only the upper triangle will be used.
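To make "using upper triangle only" concrete, here is a minimal base-R sketch with the z matrix from the original question. It assumes the covariate is symmetrized by mirroring the upper triangle into the lower one; that is an assumption based on the wording of the warning, not something checked against the ergm source. The name z_sym is introduced here for illustration.

```r
# Covariate matrix from the original question.
z <- matrix(0, nrow = 3, ncol = 3)
z[1, 2] <- z[2, 3] <- 1
z[3, 1] <- 2
z[1, 3] <- 3

isSymmetric(z)      # FALSE: z[1, 3] is 3 but z[3, 1] is 2

# Assumed behaviour: keep the upper triangle and mirror it into the
# lower triangle, discarding the original lower-triangle values.
z_sym <- z
z_sym[lower.tri(z_sym)] <- t(z_sym)[lower.tri(z_sym)]

isSymmetric(z_sym)  # TRUE
z_sym
```

Under this assumption the value z[3, 1] = 2 is dropped and replaced by z[1, 3] = 3, which is why an asymmetric matrix is a poor fit for dyadcov.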
Could it be that you do not want dyadcov but edgecov? That is, a covariate for each edge, which need not be symmetric.
What are the covariate and network in question? What is the hypothesis you want to test?
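For comparison, the edgecov statistic is documented as the sum of the covariate values over the edges present in the network, so the direction of each edge matters. A base-R sketch computing it by hand for the example from the question (the name edgecov_stat is introduced here for illustration):

```r
# Directed edge list from the original question: 1->2, 1->3, 2->3, 3->1.
el <- matrix(c(1, 1, 2, 3, 2, 3, 3, 1), ncol = 2)

# Asymmetric covariate matrix from the original question.
z <- matrix(0, nrow = 3, ncol = 3)
z[1, 2] <- z[2, 3] <- 1
z[3, 1] <- 2
z[1, 3] <- 3

# Indexing a matrix with a two-column matrix picks out z[i, j] for each
# row (i, j) of el, i.e. the covariate value on each directed edge.
edgecov_stat <- sum(z[el])
edgecov_stat  # 7 = 1 + 3 + 1 + 2
```

With ergm loaded and el.net built as in the question, summary(el.net ~ edgecov(z)) should report the same value, though that call is not run here.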
Cheers
Robert
From jszhao at yeah.net Mon Aug 30 01:45:47 2021
From: jszhao at yeah.net (Jinsong Zhao)
Date: Mon Aug 30 01:46:43 2021
Subject: [statnet_help] asymmetric covariate in dyadcov;
using upper triangle only
In-Reply-To: <7ed5db9f382041f5945cb119f35ca90a@fu-berlin.de>
References:
<7ed5db9f382041f5945cb119f35ca90a@fu-berlin.de>
Message-ID: <9b04e4f0-36e0-3da2-a8a0-23c9c67b3a13@yeah.net>
Dear Robert,
Thank you very much for the reply. It clears up some of my confusion
about dyadic covariates.
The question I asked was just a demonstration of my confusion, since I
did not find informative documentation on dyadcov or edgecov. I
searched the Internet, and the only place that gave me a demo of how to
specify edgecov or dyadcov is:
https://mjh4.blogspot.com/2012/09/ergm-edgecov-and-dyadcov-specifications.html
I think I can set up edgecov terms now.
Back to my question: I still don't know how to construct a dyadic
covariate, for example, the figure in
https://weblab.com.cityu.edu.hk/blog/chengjun/tag/ergm/
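As a generic illustration (not a reconstruction of the figure at the link above), one common way to obtain a symmetric dyadic covariate is to derive it from a nodal attribute. The attribute values and the names age and age_diff below are hypothetical.

```r
# Hypothetical nodal attribute for a 3-node network.
age <- c(12, 15, 13)

# Symmetric dyadic covariate: absolute difference in age for each pair.
age_diff <- abs(outer(age, age, "-"))
age_diff

isSymmetric(age_diff)  # TRUE, so it is suitable as a dyadic covariate
```

Because abs(a - b) equals abs(b - a), the resulting matrix has the same value in both directions, which is the property a dyadic covariate needs.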
I am new to social networks and ERGMs, and am trying to learn by running
examples. Thanks for your time and patience.
Best,
Jinsong
From morrism at uw.edu Mon Aug 30 13:08:56 2021
From: morrism at uw.edu (martina morris)
Date: Mon Aug 30 13:10:31 2021
Subject: [statnet_help] asymmetric covariate in dyadcov;
using upper triangle only
In-Reply-To: <9b04e4f0-36e0-3da2-a8a0-23c9c67b3a13@yeah.net>
References:
<7ed5db9f382041f5945cb119f35ca90a@fu-berlin.de>
<9b04e4f0-36e0-3da2-a8a0-23c9c67b3a13@yeah.net>
Message-ID:
Hi Jinsong,
First, thank you Robert for stepping in to help here.
And yes, Jinsong, we do need more documentation on this, and on many other
parts of the statnet package functionality. It's a massive task, given
the number of packages we maintain, and we just haven't had the
person-power to get to everything.
To everyone -- if you have good examples of specific analyses that might
serve as the foundation of a vignette for the statnet community, please
let us know. We'd be happy to discuss the best way to make this
available. In addition to the traditional R package vignette, there are
many other options these days. A good example is the EpiModel Gallery on
GitHub: https://github.com/EpiModel/EpiModel-Gallery. It contains scripts
you can download and run to learn how to use the package for specific
contexts. And the user community has contributed some of these.
thanks,
mm
On Mon, 30 Aug 2021, Jinsong Zhao wrote:
> Dear Robert,
>
> Thank you very much for the reply. It clears some fuzziness about dyadic
> covariate.
>
> The question I asked is just a demo of my confusion, since I did not find
> informative documents on the topic of dyadcov or edgecov. I searched the
> Internet, the only place that give me demo on how to specify edgecov or
> dyadcov is:
> https://mjh4.blogspot.com/2012/09/ergm-edgecov-and-dyadcov-specifications.html
> And I think I could set up edgecov terms now.
>
> Back to my question, I still don't how to construct a dyadic covariate, for
> example, the figure in
> https://weblab.com.cityu.edu.hk/blog/chengjun/tag/ergm/
>
> I am new to social networks and ERGM, and try to learn by running example.
> Thanks for your time and patience.
>
> Best,
> Jinsong
>
>
> On 2021/8/30 15:17, Krause, Robert wrote:
>> Dear Jinsong,
>>
>>
>> A dyadcovariate must be the same in both directions, that is, the
>> dyad-cov-network must be symmetric. After all, it is an attribute of the
>> dyad must thus work the same in both directions.
>>
>> The warning tells you that the provided data is not symmetric and will thus
>> only use the upper triangle.
>>
>>
>> Could it be that you do not want dyadcov but edgecov? A covariate for each
>> edge, which must not be symmetric?
>>
>>
>> What are?the covariate and network in question? What is the hypotheses you
>> want to test?
>>
>>
>> Cheers
>> Robert
>>
>> ------------------------------------------------------------------------
>> *Von:* statnet_help im
>> Auftrag von Jinsong Zhao
>> *Gesendet:* Samstag, 28. August 2021 16:42:14
>> *An:* statnet_help@u.washington.edu
>> *Betreff:* [statnet_help] asymmetric covariate in dyadcov; using upper
>> triangle only
>> Hi there,
>>
>> I am try to understand the usage of dyadcov. The help page said:
>>
>> dyadcov(x, attrname=NULL)
>> Dyadic covariate: The x argument is either a square matrix of
>> covariates, one for each possible edge in the network, the name of a
>> network attribute of covariates, or a network;
>>
>> For directed network, the edge list is:
>>
>> el <- matrix(c(1,1,2,3,2,3,3,1), ncol = 2)
>> el.net <- network(el, directed = TRUE)
>>
>> The covariates for the el.net is:
>>
>> ?> z <- matrix(0, nrow = 3, ncol = 3)
>> ?> z[1,2] <- z[2,3] <- 1
>> ?> z[3,1] <- 2
>> ?> z[1,3] <- 3
>> ?> z
>> ????? [,1] [,2] [,3]
>> [1,]??? 0??? 1??? 3
>> [2,]??? 0??? 0??? 1
>> [3,]??? 2??? 0??? 0
>>
>> when I try something like:
>>
>> ergm(formula = el.net ~ dyadcov(z))
>>
>> It gives the warning messages: In term ?dyadcov? in package ?ergm?:
>> asymmetric covariate in dyadcov; using upper triangle only
>>
>> I do not understand the warning message. z is not a symmetric matrix,
>> why and how to use upper triangle only? The most possible thing is I do
>> the wrong thing. However, I don't find any workable example.
>>
>> Any hint, comment or suggestion will be really appreciated. Thanks in
>> advance.
>>
>> Best,
>>
>> Jinsong
>> --
>> Jinsong Zhao
>> Associate Professor
>> College of Resources and Environment, Huazhong Agricultural University
>> Wuhan 430070, PR China
>>
>> _______________________________________________
>> statnet_help mailing list
>> statnet_help@u.washington.edu
>> http://mailman13.u.washington.edu/mailman/listinfo/statnet_help
>>
****************************************************************
Professor Emerita of Sociology and Statistics
Box 354322
University of Washington
Seattle, WA 98195-4322
From mcope at msu.edu Tue Aug 31 06:51:14 2021
From: mcope at msu.edu (Copeland, Molly)
Date: Tue Aug 31 06:51:33 2021
Subject: [statnet_help] STERGM network eids error
Message-ID:
Hi all,
My co-authors and I are returning to a project using a STERGM. We've converted to the new tergm 4 format, but we get this error when we try to run the model with either the stergm or the tergm syntax:
Error in lapply(nwl[[b]]$mel, "[[", "atl")[eids] :
  invalid subscript type 'list'
In troubleshooting we've found the following. We get the same error if we try to make our list of networks into a NetSeries manually. Each edgelist runs fine on its own as an ergm. If we make the networks a networkDynamic object, the stergm/tergm runs, but simple models that should take 10 minutes take > 2 days, which seems likely to be incorrect. We haven't changed our networks (or the underlying edgelists) from what previously worked in the STERGM, and both waves have the same nodes. When we compare our constructed networks to the Sampson data, the only difference in the printout is that the samp networks say "no edge attributes" and ours say "edge attribute names not shown", even though ours is an unweighted network.
Any clues as to what might be different, where the problem might be, and how to fix it?
Thanks!
Molly
Molly Copeland, Ph.D.
(she/her)
Assistant Professor
Department of Sociology
Michigan State University
From morrism at uw.edu Wed Sep 1 09:36:19 2021
From: morrism at uw.edu (martina morris)
Date: Wed Sep 1 09:36:46 2021
Subject: [statnet_help] STERGM network eids error
In-Reply-To:
References:
Message-ID:
Hi Molly,
Sorry to hear you're having problems with the updated tergm package. It's
hard to diagnose what is happening without a more "reproducible example".
Are you able to share your data? Or, if not, can you create a dummy
dataset with the same structure that generates the same errors, and send
it to us along with the scripts that are generating the error?
If you can do the latter, our preferred method would be to file this as an
"issue" on the tergm GitHub repository: https://github.com/statnet/tergm
This requires that you have a GitHub account (easy enough to sign up if
you don't). If all of that seems too daunting, just email us the dummy
data and script, and we'll take a look at it.
best,
Martina
****************************************************************
Professor Emerita of Sociology and Statistics
Box 354322
University of Washington
Seattle, WA 98195-4322
From njatel at limnology.ca Sun Sep 19 11:25:56 2021
From: njatel at limnology.ca (Nelson Jatel)
Date: Sun Sep 19 11:26:10 2021
Subject: [statnet_help] help - "loop error" - statnet
Message-ID: <9EAFEC27-F2AB-4245-94AD-B78528A3378E@hxcore.ol>
An HTML attachment was scrubbed...
URL:
From buttsc at uci.edu Sun Sep 19 14:30:56 2021
From: buttsc at uci.edu (Carter T. Butts)
Date: Sun Sep 19 14:31:26 2021
Subject: [statnet_help] help - "loop error" - statnet
In-Reply-To: <9EAFEC27-F2AB-4245-94AD-B78528A3378E@hxcore.ol>
References: <9EAFEC27-F2AB-4245-94AD-B78528A3378E@hxcore.ol>
Message-ID:
Hi, Nelson -
A new coercion method has been added for data frames, which regrettably
is not handling two-mode adjacency data correctly at present.
Converting your data to matrix form should restore the expected
behavior. It is also possible to explicitly call
as.network.matrix(mydat,bipartite=TRUE), which will force the use of the
venerable conversion method.
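A minimal sketch of the matrix workaround, using the small adjacency table from Example 2 in the quoted message below. It assumes that the data-frame method reads the first two columns as an edge list, which would explain why row 2 (where columns A and B are both 0) is reported as a loop; that reading is an assumption rather than something confirmed here. The object names are illustrative only.

```r
library(network)

# Rebuild the 7 x 7 adjacency table from Example 2 as a data frame.
ids <- LETTERS[1:7]
adj <- matrix(0, nrow = 7, ncol = 7, dimnames = list(ids, ids))
adj["A", "B"] <- 1
adj["B", c("E", "F")] <- 1
adj["C", "B"] <- 1
adj["D", "B"] <- 1
adj["E", c("B", "F")] <- 1
adj["F", "B"] <- 1
adj["G", "B"] <- 1
data1_weak <- as.data.frame(adj)

# Coerce to a matrix first so that the matrix method is dispatched
# instead of the data-frame method.
net_weak <- as.network(as.matrix(data1_weak),
                       matrix.type = "adjacency",
                       directed = TRUE)

network.size(net_weak)       # number of vertices
network.edgecount(net_weak)  # number of edges
```

For two-mode data the same idea applies with a rectangular matrix and the bipartite argument, as in the explicit call mentioned above.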
Hope that helps,
-Carter
On 9/19/21 11:25 AM, Nelson Jatel wrote:
>
> Hi,
>
> I'm trying to re-analyze some of my network data using ERGMs,
> something I've done many times before.
>
> When I try and download my data and convert my data.frame to a network
> frame (as.frame), I get the following "contains loops" errors:
>
> Example 1.
>
> > net_69 <- as.network(data_69, bipartite=T, directed=F, loops = F)
> Error: `loops` is `FALSE`, but `x` contains loops.
> The following values are affected:
>   - `x[2, 1:2]`
>   - `x[3, 1:2]`
>   - `x[5, 1:2]`
>   - `x[7, 1:2]`
>   - `x[12, 1:2]`
>   - `x[13, 1:2]`
>
> Example 2.
>
> > class(data1_weak) #data.frame
> [1] "data.frame"
> > data1_weak
>   A B C D E F G
> A 0 1 0 0 0 0 0
> B 0 0 0 0 1 1 0
> C 0 1 0 0 0 0 0
> D 0 1 0 0 0 0 0
> E 0 1 0 0 0 1 0
> F 0 1 0 0 0 0 0
> G 0 1 0 0 0 0 0
> > plot(data1_weak)
> > net_weak <- as.network(data1_weak, bipartite=FALSE, directed=TRUE)
> Error: `loops` is `FALSE`, but `x` contains loops.
> The following values are affected:
>   - `x[2, 1:2]`
>
> Please advise.
>
> Nelson Jatel, /doctoral candidate/
>
From adamhaber at gmail.com Thu Oct 7 13:07:59 2021
From: adamhaber at gmail.com (Adam Haber)
Date: Thu Oct 7 13:08:29 2021
Subject: [statnet_help] Computing logLik for a different network
Message-ID:
Hi,
I'm trying to compute the log-likelihood of a network G2 under a model
fitted to a different network G1; that is, I'm trying to compute logP(G2 |
theta_mle) such that theta_mle = argmax_theta{P(G1 | theta)} for some ERGM
model P. G1 and G2 have the same attributes, but with different values.
At the moment, I'm using:
```
#fit the real model
m1 <- ergm(g1 ~ edges + nodemix(~type) + mutual)
#create a "dummy" model
m2 <- ergm(g2 ~ edges + nodemix(~type) + mutual, estimate = c("MPLE"))
#change m2's properties so we use the real, MCMLE theta
m2$coefficients <- coefficients(m1)
m2$estimate <- "MLE"
logLik(m2, force.reeval = T)
```
Is there a better way to do this? Also, any references to prior work that
includes/studies similar "train/validation" splits would be much
appreciated!
Best wishes,
Adam Haber
From njatel at limnology.ca Tue Oct 12 06:22:20 2021
From: njatel at limnology.ca (Nelson Jatel)
Date: Tue Oct 12 06:22:53 2021
Subject: [statnet_help] ERGM correlation values
Message-ID:
Hi,
For my doctoral research, I'm exploring longitudinal panel data for an
organization and modelling correlation between ERGM variables within my
model. For an individual panel (1 year) of data, the statnet package
outputs correlation data (Figure 1 below).
I've reviewed the statnet manual, but didn't find answers to the following
questions:
(1) What data is being correlated in the Statnet output?
(2) My data is not normally distributed and I'm applying a Spearman
correlation to the longitudinal ERGM panel data. How is the Statnet
correlation data calculated, in my case for a single year of panel data?
Thanks,
*Nelson Jatel*, *Dr. (cand.)*
Figure 1. Correlation data for 1969 watershed organization panel data
generated by Statnet.
[image: Rplot_1969_correlation_model2.png]
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Rplot_1969_correlation_model2.png
Type: image/png
Size: 22137 bytes
Desc: not available
URL:
From morrism at uw.edu Tue Oct 12 10:00:38 2021
From: morrism at uw.edu (martina morris)
Date: Tue Oct 12 10:02:15 2021
Subject: [statnet_help] ERGM correlation values
In-Reply-To:
References:
Message-ID:
Hi Nelson,
So we understand where this plot came from, can you please include the
script that produces the plot?
thx,
Martina
****************************************************************
Professor Emerita of Sociology and Statistics
Box 354322
University of Washington
Seattle, WA 98195-4322
From p.krivitsky at unsw.edu.au Fri Oct 15 04:17:53 2021
From: p.krivitsky at unsw.edu.au (Pavel N. Krivitsky)
Date: Fri Oct 15 04:18:13 2021
Subject: [statnet_help] Computing logLik for a different network
In-Reply-To:
References:
Message-ID:
Hi, Adam,
It depends. If the two networks only differ in their edges (and not in
their attributes), the quickest approach is to take advantage of the
fact that
log P(G2 | theta) = g(G2) * theta - k(theta)
                  = (g(G2) - g(G1) + g(G1)) * theta - k(theta)
                  = (g(G2) - g(G1)) * theta + g(G1) * theta - k(theta)
                  = (g(G2) - g(G1)) * theta + log P(G1 | theta)
so
logLik(m1) + sum((summary(m1$formula, basis=g2) - summary(m1$formula, basis=g1)) * coef(m1))
should do the trick.
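The identity above can be checked by brute force on a toy model. The sketch below uses base R only and enumerates all 64 directed graphs on 3 nodes for a hypothetical edges + mutual model, so that k(theta) can be computed exactly. The coefficients and the two graphs picked as G1 and G2 are arbitrary illustrations, not output from ergm().

```r
# All directed graphs on n nodes, no self-loops, as 0/1 adjacency matrices.
n <- 3
cells <- which(row(diag(n)) != col(diag(n)))   # off-diagonal positions
graphs <- lapply(0:(2^length(cells) - 1), function(code) {
  y <- matrix(0, n, n)
  y[cells] <- as.integer(intToBits(code))[seq_along(cells)]
  y
})

# Sufficient statistics g(y): edge count and number of mutual dyads.
g <- function(y) c(edges = sum(y), mutual = sum(y * t(y)) / 2)

theta <- c(edges = -1, mutual = 0.5)           # hypothetical coefficients

# k(theta) = log of the sum of exp(g(y) . theta) over every graph y.
g_all   <- t(sapply(graphs, g))
k_theta <- log(sum(exp(g_all %*% theta)))
loglik  <- function(y) sum(g(y) * theta) - k_theta

G1 <- graphs[[10]]
G2 <- graphs[[40]]

direct   <- loglik(G2)                                  # uses k(theta) explicitly
shortcut <- loglik(G1) + sum((g(G2) - g(G1)) * theta)   # the quick approach
all.equal(direct, shortcut)
```

The shortcut never needs k(theta) for G2 separately, which is why it only applies when both networks share the same normalising constant.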
If not, take a look at the low-level function
ergm.bridge.dindstart.llk(), which is used by logLik.ergm() for most
binary ERGMs.
I hope this helps,
Pavel
From njatel at limnology.ca Fri Oct 15 07:02:27 2021
From: njatel at limnology.ca (Nelson Jatel)
Date: Fri Oct 15 07:02:59 2021
Subject: [statnet_help] gof error
Message-ID:
Hi,
When I run gof() on any of my ERG models, I get the following error:
Error in 0:nb2 : NA/NaN argument
In addition: Warning message:
In 0:nb2 : numerical expression has 88 elements: only the first used
> summary(ergm69)
Call:
ergm(formula = net_69 ~ edges + b1star(2), bipartite = T)
Monte Carlo Maximum Likelihood Results:
        Estimate Std. Error MCMC % z value Pr(>|z|)
edges   -2.12020    0.13504      0 -15.701   <1e-04 ***
b1star2  0.44054    0.04416      0   9.977   <1e-04 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Null Deviance: 985.7  on 711  degrees of freedom
Residual Deviance: 736.2  on 709  degrees of freedom
AIC: 740.2  BIC: 749.4  (Smaller is better. MC Std. Err. = 0.701)
Thanks,
*Nelson Jatel*, *Dr. (cand.), P.Ag.*
From adamhaber at gmail.com Sat Oct 16 10:39:18 2021
From: adamhaber at gmail.com (Adam Haber)
Date: Sat Oct 16 10:39:33 2021
Subject: [statnet_help] Computing logLik for a different network
In-Reply-To:
References:
Message-ID:
Thanks Pavel, that's exactly what I needed!
To clarify - when you say "not differing in their attributes", you mean
(for example) that both G1 and G2 have a node attribute called foo, but not
necessarily that G1%v%"foo" = G2%v%"foo", right? That is, they have
identical node/edge attribute *names*, while potentially having different
values within these attributes?
Best,
Adam
From p.krivitsky at unsw.edu.au Sat Oct 16 22:55:17 2021
From: p.krivitsky at unsw.edu.au (Pavel N. Krivitsky)
Date: Sat Oct 16 22:55:40 2021
Subject: [statnet_help] Computing logLik for a different network
In-Reply-To:
References:
Message-ID: <10908971e0bcd34c22270a7449089e42c49ade3f.camel@unsw.edu.au>
Hi, Adam,
No, they have to have the same size and nodal attributes (to the extent
that they are used in the model). This is necessary for k(theta) to be
the same for both.
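To illustrate why k(theta) depends on the nodal attributes, here is a small base R sketch. It computes k(theta) exactly by enumeration for a hypothetical edges + same-type model on 3 nodes, under two different assignments of a made-up "type" attribute. The helper name, the coefficients and the type vectors are assumptions chosen for illustration.

```r
# Log normalising constant k(theta) for an edges + same-"type" match model
# on a small directed network, by enumerating every possible graph.
k_theta <- function(type, theta) {
  n     <- length(type)
  cells <- which(row(diag(n)) != col(diag(n)))  # off-diagonal positions
  same  <- outer(type, type, "==")              # TRUE where types match
  total <- 0
  for (code in 0:(2^length(cells) - 1)) {
    y <- matrix(0, n, n)
    y[cells] <- as.integer(intToBits(code))[seq_along(cells)]
    stats <- c(sum(y), sum(y * same))           # edges, same-type edges
    total <- total + exp(sum(stats * theta))
  }
  log(total)
}

theta <- c(-1, 2)                       # hypothetical coefficients
k_a <- k_theta(c("a", "a", "b"), theta) # two nodes share a type
k_b <- k_theta(c("a", "b", "c"), theta) # no nodes share a type
k_a - k_b                               # non-zero: the constants differ
```

With the same theta and the same network size, the two constants differ, so the difference in log-likelihoods no longer reduces to the change in statistics alone.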
I hope this helps,
Pavel
From adamhaber at gmail.com Sat Oct 16 23:19:26 2021
From: adamhaber at gmail.com (Adam Haber)
Date: Sat Oct 16 23:20:04 2021
Subject: [statnet_help] Computing logLik for a different network
In-Reply-To: <10908971e0bcd34c22270a7449089e42c49ade3f.camel@unsw.edu.au>
References:
<10908971e0bcd34c22270a7449089e42c49ade3f.camel@unsw.edu.au>
Message-ID:
I'm sorry to reiterate; the terminology is a little confusing for me.
To make this more explicit - let's say I have a graph G with 100 nodes, and
each node has a categorical nodal attribute "type", which has 3 levels, and a
continuous nodal attribute "age".
I split the graph into 2 subnetworks - G1 (nodes 1, 3, 5, ... 99 in G) and
G2 (nodes 2, 4, 6, ... 100 in G). Now both G1 and G2 will have "type" and
"age" nodal attributes, but possibly with different values
(G1%v%"type" != G2%v%"type", same for "age"). They are of the same size
(N=50) and have different connectivity. Assuming the model includes "type"
and "age" - for example, m <- ergm(G1 ~ edges + nodemix(~type) +
nodeicov("age") + mutual) - can I use the formula? If not - why not?
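For concreteness, the split described above might be sketched as follows in base R, with simulated data standing in for the real graph. The adjacency matrix, the attribute values and the seed are all made up for illustration, and plain lists are used in place of network objects.

```r
set.seed(1)                                   # arbitrary seed, illustration only
n    <- 100
type <- sample(c("a", "b", "c"), n, replace = TRUE)
age  <- round(runif(n, 18, 65))
adj  <- matrix(rbinom(n * n, 1, 0.05), n, n)  # simulated directed ties
diag(adj) <- 0

odd  <- seq(1, n, by = 2)
even <- seq(2, n, by = 2)

# Induced subgraphs: keep only ties among the selected nodes.
G1 <- list(adj = adj[odd, odd],   type = type[odd],  age = age[odd])
G2 <- list(adj = adj[even, even], type = type[even], age = age[even])

# Same size and attribute names, but different values and composition.
table(G1$type)
table(G2$type)
identical(G1$type, G2$type)
```

Both halves have 50 nodes and the same attribute names, yet the attribute vectors themselves are different, which is the situation the question is about.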
Thanks again for your help,
Adam
From p.krivitsky at unsw.edu.au Sun Oct 17 03:39:40 2021
From: p.krivitsky at unsw.edu.au (Pavel N. Krivitsky)
Date: Sun Oct 17 03:40:03 2021
Subject: [statnet_help] Computing logLik for a different network
In-Reply-To:
References:
<10908971e0bcd34c22270a7449089e42c49ade3f.camel@unsw.edu.au>
Message-ID:
Hi, Adam,
Different values for attributes mean that you can't use the quick
approach, because k(theta) is going to be different. For that matter,
if you have networks with different composition, it's not clear whether
their parameters are comparable either, though they probably are,
depending on the social process (see Krivitsky, Handcock, Morris 2011).
How do you define the boundary of the split between G1 and G2?
Best Regards,
Pavel
From robert.w.krause at fu-berlin.de Sun Oct 17 03:46:26 2021
From: robert.w.krause at fu-berlin.de (Krause, Robert)
Date: Sun Oct 17 03:46:48 2021
Subject: [statnet_help] Computing logLik for a different network
In-Reply-To:
References:
<10908971e0bcd34c22270a7449089e42c49ade3f.camel@unsw.edu.au>,
Message-ID: <1570238d6a88436b8b2d50eaa998124e@fu-berlin.de>
Dear Adam,
I think you will not get the "correct" likelihood if you do.
The reason is that the distributions of the attributes could be quite different, so parameters related to the attributes will take different values. For example:
Imagine one group with a distribution of 80%, 10%, 10% across the three levels and the other with 33%, 33%, 34%. In the first group, many same-type edges among type 1 nodes are very likely in a random graph, without this being any sign of homophily, while mixing from type 1 to type 2 or 3 or, even more so, between types 2 and 3 is far less likely.
In the second group, by comparison, mixing from any type to any other type is about equally likely. Thus, the same number of cross-type connections corresponds to different nodemix parameter values in the two groups.
Thus, AFAIK, the likelihood of G2 under the theta estimated from G1 is not adjusted for the difference in the distribution of the nodal attribute, and might therefore be an "unfair" comparison.
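The composition effect in this example can be put into numbers with a few lines of base R. This is only a back-of-the-envelope sketch that assumes ties form independently of type (random mixing); the variable names are illustrative.

```r
p_group1 <- c(0.80, 0.10, 0.10)   # type shares in the first group
p_group2 <- c(0.33, 0.33, 0.34)   # type shares in the second group

# Chance that two nodes drawn independently have the same type.
same_type <- function(p) sum(p^2)

same_type(p_group1)   # 0.66
same_type(p_group2)   # 0.3334

# Chance that a tie runs from type i (row) to type j (column)
# under random mixing.
mix1 <- outer(p_group1, p_group1)
mix2 <- outer(p_group2, p_group2)
round(mix1, 3)
round(mix2, 3)
```

Under random mixing about two thirds of ties in the first group are same-type purely because of composition, against about one third in the second, and a tie between types 2 and 3 has probability 0.01 in the first group.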
Cheers
Robert