8  Population Graphs

Population Graphs are a graph-theoretic approach to understanding among-population genetic connectivity (Dyer & Nason 2004). Rather than assuming a specific model of gene flow (e.g., island or stepping-stone), population graphs estimate the conditional genetic covariance structure among populations, retaining only statistically significant edges.

library(gstudio)
library(igraph)

Attaching package: 'igraph'
The following objects are masked from 'package:stats':

    decompose, spectrum
The following object is masked from 'package:base':

    union
data(arapat)

8.1 Constructing a Population Graph

8.1.1 From a Data Frame

The easiest way to create a population graph is with population_graph(), which handles the conversion from genotypes to multivariate data internally:

graph <- population_graph(arapat, stratum = "Species")
Warning in popgraph(data, groups, ...): 1 variables are collinear and being
dropped from the discriminant rotation.
graph
IGRAPH 45b720f UNW- 3 3 -- 
+ attr: name (v/c), size (v/n), weight (e/n)
+ edges from 45b720f (vertex names):
[1] Cape    --Mainland  Cape    --Peninsula Mainland--Peninsula

8.1.2 From a Multivariate Matrix

For more control, convert genotypes to multivariate format first, then call popgraph() directly:

mv <- to_mv(arapat)
groups <- factor(arapat$Species)
graph <- popgraph(mv, groups)
Warning in popgraph(mv, groups): 1 variables are collinear and being dropped
from the discriminant rotation.
graph
IGRAPH b02f179 UNW- 3 3 -- 
+ attr: name (v/c), size (v/n), weight (e/n)
+ edges from b02f179 (vertex names):
[1] Cape    --Mainland  Cape    --Peninsula Mainland--Peninsula

The alpha parameter controls edge retention significance (default 0.05):

graph_strict <- popgraph(mv, groups, alpha = 0.01)
Warning in popgraph(mv, groups, alpha = 0.01): 1 variables are collinear and
being dropped from the discriminant rotation.
graph_strict
IGRAPH 35c848d UNW- 3 3 -- 
+ attr: name (v/c), size (v/n), weight (e/n)
+ edges from 35c848d (vertex names):
[1] Cape    --Mainland  Cape    --Peninsula Mainland--Peninsula

8.1.3 From an Adjacency Matrix

Use as.popgraph() to convert a pre-computed adjacency matrix:

A <- matrix(c(0, 1, 0, 1,
              1, 0, 1, 0,
              0, 1, 0, 1,
              1, 0, 1, 0), nrow = 4)
rownames(A) <- colnames(A) <- LETTERS[1:4]
g <- as.popgraph(A)
g
IGRAPH ec84e43 UNW- 4 4 -- 
+ attr: name (v/c), weight (e/n)
+ edges from ec84e43 (vertex names):
[1] A--B A--D B--C C--D

8.2 Bundled Population Graphs

The package includes two pre-computed population graphs:

data(lopho)
lopho
This graph was created by an old(er) igraph version.
ℹ Call `igraph::upgrade_graph()` on it to use with the current igraph version.
For now we convert it on the fly...
IGRAPH 6af7beb UNW- 21 52 -- 
+ attr: name (v/c), size (v/n), color (v/c), Region (v/c), weight (e/n)
+ edges from 6af7beb (vertex names):
 [1] BaC--LaV    BaC--Lig    BaC--PtC    BaC--PtP    BaC--SnE    BaC--SnI   
 [7] BaC--StR    Ctv--PtP    Ctv--SLG    Ctv--SnF    Ctv--SenBas LaV--Lig   
[13] LaV--PtC    LaV--SnE    LaV--SnF    LaV--TsS    Lig--PtC    Lig--SnI   
[19] Lig--StR    Lig--TsS    PtC--SnE    PtC--StR    PtC--TsS    PtC--SenBas
[25] PtP--SnF    PtP--SnI    PtP--SenBas SLG--SnF    SLG--SnI    SnE--StR   
[31] SnE--TsS    SnF--SnI    SnI--StR    StR--TsS    StR--SenBas CP --Seri  
[37] CP --SG     CP --SN     CP --TS     LF --PL     LF --SG     LF --SI    
[43] PL --SenBas PL --SG     PL --SI     PL --SN    
+ ... omitted several edges
data(upiga)
upiga
This graph was created by an old(er) igraph version.
ℹ Call `igraph::upgrade_graph()` on it to use with the current igraph version.
For now we convert it on the fly...
IGRAPH 3157d53 UNW- 18 58 -- 
+ attr: name (v/c), size (v/n), color (v/c), weight (e/n)
+ edges from 3157d53 (vertex names):
 [1] SI  --PL     SI  --LF     SI  --CP     SI  --Ctv    SI  --TrV   
 [6] SI  --SenBas SG  --CP     SG  --Seri   SG  --EH     SG  --LaV   
[11] PL  --LF     PL  --SN     PL  --Seri   PL  --EH     PL  --SenBas
[16] TS  --CP     TS  --SN     TS  --Seri   TS  --EH     TS  --Ctv   
[21] TS  --LaV    TS  --SnF    LF  --CP     LF  --SN     LF  --Ctv   
[26] LF  --SLG    LF  --SnF    LF  --SnE    CP  --SN     CP  --EH    
[31] CP  --LaV    CP  --SenBas SN  --Seri   SN  --SenBas Seri--SLG   
[36] EH  --SLG    EH  --SenBas Ctv --SLG    Ctv --TrV    Ctv --BaC   
+ ... omitted several edges

8.3 Decorating a Graph

The decorate_graph() function merges external data into graph vertex attributes. This is essential for spatial visualization:

data(baja)
head(baja)
  Region Population Latitude Longitude
1   Baja        BaC    26.59   -111.79
2   Baja        Ctv    29.73   -114.72
3   Baja        LaV    24.04   -109.99
4   Baja        Lig    25.73   -111.27
5   Baja        PtC    24.19   -111.15
6   Baja        PtP    29.03   -113.90
lopho_dec <- decorate_graph(lopho, baja, stratum = "Population")
vertex_attr_names(lopho_dec)
[1] "name"      "size"      "color"     "Region"    "Latitude"  "Longitude"

Now the graph nodes carry coordinate and metadata attributes.

8.4 Graph Metrics

8.4.1 Node Properties

# Vertex names
V(lopho)$name
 [1] "BaC"    "Ctv"    "LaV"    "Lig"    "PtC"    "PtP"    "SLG"    "SnE"   
 [9] "SnF"    "SnI"    "StR"    "TsS"    "CP"     "LF"     "PL"     "SenBas"
[17] "Seri"   "SG"     "SI"     "SN"     "TS"    
# Degree (number of edges)
degree(lopho)
   BaC    Ctv    LaV    Lig    PtC    PtP    SLG    SnE    SnF    SnI    StR 
     7      4      6      6      7      5      3      5      5      6      7 
   TsS     CP     LF     PL SenBas   Seri     SG     SI     SN     TS 
     5      4      3      5      5      3      5      5      5      3 
# Betweenness centrality
betweenness(lopho)
   BaC    Ctv    LaV    Lig    PtC    PtP    SLG    SnE    SnF    SnI    StR 
     9     31      4      0      0     25      3      0     16      8     30 
   TsS     CP     LF     PL SenBas   Seri     SG     SI     SN     TS 
     0      2      0     92     98      1     32     15      0      0 

8.4.2 Centroid Distance

The multivariate centroid distance between population centroids in discriminant space:

mv <- to_mv(arapat)
groups <- factor(arapat$Species)
cd <- centroid_distance(mv, groups)
cd
               [,1]      [,2]      [,3]      [,4]       [,5]       [,6]
Cape      0.0800000 0.9200000 0.1333333 0.8533333 0.00000000 0.00000000
Mainland  0.2638889 0.7361111 0.8194444 0.0000000 0.01388889 0.00000000
Peninsula 0.6924603 0.3075397 0.3432540 0.0000000 0.59920635 0.03769841
                 [,7]      [,8]       [,9]      [,10]       [,11]       [,12]
Cape      0.000000000 0.3000000 0.70000000 0.00000000 0.000000000 0.000000000
Mainland  0.000000000 0.2222222 0.00000000 0.63888889 0.111111111 0.000000000
Peninsula 0.003968254 0.9007937 0.04960317 0.02380952 0.009920635 0.007936508
              [,13]      [,14]      [,15]     [,16]       [,17]       [,18]
Cape      0.9666667 0.03333333 0.00000000 0.9333333 0.000000000 0.000000000
Mainland  0.7916667 0.15277778 0.02777778 0.5277778 0.000000000 0.000000000
Peninsula 0.5714286 0.42857143 0.27976190 0.6726190 0.001984127 0.001984127
               [,19]     [,20]       [,21]      [,22]     [,23]      [,24]
Cape      0.08666667 0.5733333 0.286666667 0.00000000 0.0000000 0.00000000
Mainland  0.00000000 0.0000000 0.000000000 0.00000000 0.0000000 0.05555556
Peninsula 0.00000000 0.0000000 0.009920635 0.09722222 0.3174603 0.33730159
              [,25]      [,26]       [,27]     [,28]       [,29]      [,30]
Cape      0.0000000 0.00000000 0.000000000 0.0000000 0.000000000 0.01333333
Mainland  0.0000000 0.01388889 0.361111111 0.1527778 0.000000000 0.00000000
Peninsula 0.1964286 0.01785714 0.001984127 0.0000000 0.001984127 0.00000000
                [,31]      [,32]      [,33] [,34] [,35]      [,36]      [,37]
Cape      0.013333333 0.94000000 0.00000000 0.000  0.02 0.00000000 0.00000000
Mainland  0.861111111 0.01388889 0.05555556 0.000  0.00 0.00000000 0.00000000
Peninsula 0.003968254 0.00000000 0.00000000 0.625  0.00 0.03968254 0.03571429
               [,38]       [,39]      [,40]      [,41]      [,42]      [,43]
Cape      0.01333333 0.000000000 0.00000000 0.00000000 0.00000000 0.00000000
Mainland  0.06944444 0.000000000 0.01388889 0.08333333 0.08333333 0.00000000
Peninsula 0.28968254 0.005952381 0.00000000 0.00000000 0.00000000 0.01190476
               [,44]     [,45]     [,46]      [,47]      [,48]      [,49]
Cape      0.00000000 0.0000000 0.0000000 0.00000000 0.00000000 0.00000000
Mainland  0.01388889 0.0000000 0.0000000 0.00000000 0.02777778 0.04166667
Peninsula 0.38293651 0.1150794 0.3730159 0.03571429 0.00000000 0.01785714
               [,50]     [,51]     [,52]      [,53]       [,54]       [,55]
Cape      0.00000000 0.0000000 0.0000000 0.00000000 0.000000000 0.006666667
Mainland  0.09722222 0.2916667 0.2222222 0.04166667 0.000000000 0.000000000
Peninsula 0.05357143 0.0000000 0.0000000 0.00000000 0.001984127 0.000000000
          [,56]     [,57] [,58]
Cape       0.18 0.7733333  0.04
Mainland   0.00 0.0000000  0.00
Peninsula  0.00 0.0000000  0.00

8.4.3 Centroid Variance

Multivariate variance around each population centroid:

cv <- centroid_variance(mv, groups)
cv
     Cape  Mainland Peninsula 
 1.519820  2.169841  2.886842 

8.4.4 Shannon Entropy

Shannon entropy quantifies the uncertainty in a probability distribution:

# Entropy of the degree distribution
deg <- degree(lopho)
p <- deg / sum(deg)
shannon_entropy(p)
[1] 4.344766

8.4.5 Edge Contortion

A measure of how edges deviate from geographic expectations (requires decorated graph with coordinates):

edge_contortion(lopho_dec)

8.5 Congruence Testing

8.5.1 Topology Congruence

Compare the topology of two population graphs:

data(lopho)
data(upiga)
congruence_topology(lopho, upiga)
This graph was created by an old(er) igraph version.
ℹ Call `igraph::upgrade_graph()` on it to use with the current igraph version.
For now we convert it on the fly...
This graph was created by an old(er) igraph version.
ℹ Call `igraph::upgrade_graph()` on it to use with the current igraph version.
For now we convert it on the fly...
Warning in congruence_topology(lopho, upiga): These two topologies have
non-overlapping node sets!  Careful on interpretation.
IGRAPH 93b08a9 UN-- 16 19 -- 
+ attr: name (v/c)
+ edges from 93b08a9 (vertex names):
 [1] BaC --LaV    BaC --SnE    BaC --StR    Ctv --SLG    Ctv --SnF   
 [6] LaV --SnE    SLG --SnF    StR --SenBas CP  --SG     CP  --SN    
[11] CP  --TS     LF  --PL     LF  --SI     PL  --SenBas PL  --SI    
[16] PL  --SN     Seri--SG     Seri--SN     SN  --TS    

8.5.2 Statistical Testing

Use test_congruence() to test whether two graphs are more similar in topology than expected by chance:

test_congruence(lopho, upiga, method = "combinatorial")

8.5.3 Permutation

The permute_popgraph() function creates a null distribution of graph topologies by permuting the data:

null_graphs <- permute_popgraph(mv, groups, nperm = 99)

8.6 Subgraph Operations

The - operator computes the difference between two popgraphs (removes edges in common). You can also extract subgraphs using igraph functions:

smaller <- delete_vertices(lopho, "SenBas")
V(smaller)$name
 [1] "BaC"  "Ctv"  "LaV"  "Lig"  "PtC"  "PtP"  "SLG"  "SnE"  "SnF"  "SnI" 
[11] "StR"  "TsS"  "CP"   "LF"   "PL"   "Seri" "SG"   "SI"   "SN"   "TS"  

8.7 Export Formats

8.7.1 To Adjacency Matrix

M <- to_matrix(lopho)
M[1:5, 1:5]
    BaC Ctv LaV Lig PtC
BaC   0   0   1   1   1
Ctv   0   0   0   0   0
LaV   1   0   0   1   1
Lig   1   0   1   0   1
PtC   1   0   1   1   0

8.7.2 To Data Frame

df <- to_df(lopho)
head(df)
  name      size   color Region
1  BaC 12.158810 #FDAE61   Baja
2  Ctv  3.880886 #FDAE61   Baja
3  LaV  3.533757 #FDAE61   Baja
4  Lig  4.731355 #FDAE61   Baja
5  PtC  4.684652 #FDAE61   Baja
6  PtP 10.925375 #FDAE61   Baja

8.7.3 To JSON

json <- to_json(lopho)
cat(json)

8.7.4 To Spatial Formats

# As sf object
sf_obj <- to_sf(lopho_dec)

# As KML
to_kml(lopho_dec, file = "lopho.kml")

8.7.5 To pgraph Format

The .pgraph format is a simple text representation:

write_popgraph(lopho_dec, file = "lopho.pgraph")

8.7.6 Reading a Population Graph

path <- system.file("extdata", "lopho.pgraph", package = "gstudio")
g <- read_popgraph(path)
g
IGRAPH 10edd5e UNW- 21 50 -- 
+ attr: name (v/c), size (v/n), color (v/c), weight (e/n)
+ edges from 10edd5e (vertex names):
 [1] BaC --LaV    BaC --Lig    BaC --PtP    BaC --SnE    BaC --SnI   
 [6] BaC --StR    BaC --SenBas Ctv --PtP    Ctv --SLG    Ctv --SnF   
[11] Ctv --SenBas LaV --Lig    LaV --SnE    LaV --SnF    LaV --TsS   
[16] Lig --PtC    Lig --SnE    Lig --SnI    Lig --StR    PtC --SnE   
[21] PtC --StR    PtC --TsS    PtP --SnF    PtP --SnI    PtP --SenBas
[26] SLG --SnF    SLG --SnI    SnE --StR    SnE --TsS    SnF --SnI   
[31] SnI --StR    StR --TsS    StR --SenBas CP  --LF     CP  --Seri  
[36] CP  --SG     CP  --SN     CP  --TS     LF  --PL     LF  --SG    
+ ... omitted several edges

8.8 Randomization

Generate a random graph with the same number of nodes and a specified number of edges:

rg <- randomize_graph(lopho)
rg
IGRAPH ac565aa UN-- 21 52 -- 
+ attr: name (v/c)
+ edges from ac565aa (vertex names):
 [1] BaC   --SnF    BaC   --SnE    BaC   --SenBas BaC   --SLG    BaC   --PL    
 [6] BaC   --SI     BaC   --TsS    CP    --TS     CP    --SnI    TsS   --CP    
[11] CP    --PtC    SnF   --Ctv    SenBas--Ctv    PtC   --Ctv    TS    --Ctv   
[16] TS    --LaV    TsS   --LaV    LaV   --Lig    SenBas--LaV    LaV   --SN    
[21] SnE   --LF     LF    --SG     SnI   --LF     Lig   --StR    TsS   --Lig   
[26] Lig   --PtP    Lig   --SN     PL    --PtC    SnF   --PL     PL    --SI    
[31] PL    --SN     PtC   --Lig    PtC   --SN     PtC   --SG     PtC   --PtP   
[36] StR   --PtP    PtP   --Seri   SnI   --PtP    SenBas--Seri   SnE   --Seri  
+ ... omitted several edges