gstudio: Spatial Analysis of Genetic Marker Data

Author

Rodney J. Dyer

Published

February 16, 2026

Preface

The gstudio package provides a comprehensive set of tools for the spatial analysis of genetic marker data in R. Originally developed to support research in landscape genetics and population genomics, it offers a unified framework for estimating genetic diversity, population structure, inter-individual and inter-population distances, relatedness, and population graph topology.

At the core of the package is the locus S3 class, a flexible representation of genotypes that supports codominant markers, SNPs, AFLPs, and allozymes. Four main analytical entry points are built around this data type: genetic_diversity(), genetic_structure(), genetic_distance(), and genetic_relatedness(). Each dispatches to a range of estimators. Population Graphs, a graph-theoretic approach to understanding among-population connectivity (Dyer & Nason 2004), are fully integrated, with tools for construction, analysis, and visualization.
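
As a rough sketch of how these entry points are typically called, the block below runs three of them on the arapat example data. The specific mode values ("He", "Gst", "Euclidean") and the stratum argument are assumptions on my part and should be checked against each function's help page. The code is guarded so that it does nothing when gstudio is not installed.

```r
# Sketch only: the mode and stratum values below are assumptions, not
# taken from this preface; confirm them with ?genetic_diversity,
# ?genetic_structure, and ?genetic_distance before relying on them.
if (requireNamespace("gstudio", quietly = TRUE)) {
  library(gstudio)
  data(arapat)

  # Expected heterozygosity for each locus across all samples.
  diversity <- genetic_diversity(arapat, mode = "He")

  # Among-population structure, partitioned by the Population column.
  gst <- genetic_structure(arapat, stratum = "Population", mode = "Gst")

  # Pairwise genetic distances among populations.
  distance <- genetic_distance(arapat, stratum = "Population",
                               mode = "Euclidean")

  # genetic_relatedness() is the fourth entry point; see its help page
  # for the estimators it offers.
}
```

In each call the mode argument selects the estimator, which is what lets one function front several related statistics.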

Who This Book Is For

This book is intended for population geneticists, landscape ecologists, and conservation biologists who want to analyze genetic marker data in R. It assumes basic familiarity with R and introductory population genetics concepts.

How to Use This Book

Each chapter can be read on its own, although later chapters build on concepts introduced in earlier ones. The code examples were executed in R when the book was built, so you can reproduce the output shown by running the same code in your own R session.

Quick Start

library(gstudio)
data(arapat)
str(arapat, max.level = 1)
'data.frame':   363 obs. of  14 variables:
 $ Species   : Factor w/ 3 levels "Cape","Mainland",..: 3 3 3 3 3 3 3 3 3 3 ...
 $ Cluster   : Factor w/ 5 levels "CBP-C","NBP-C",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ Population: Factor w/ 39 levels "101","102","12",..: 30 30 30 30 30 30 30 30 30 30 ...
 $ ID        : Factor w/ 363 levels "101_10A","101_1A",..: 296 297 298 299 300 301 302 303 304 305 ...
 $ Latitude  : num  29.3 29.3 29.3 29.3 29.3 ...
 $ Longitude : num  -114 -114 -114 -114 -114 ...
 $ LTRS      : 'locus' chr  "01:01" "01:01" "01:01" "01:01" ...
  ..- attr(*, "locus_type")= chr [1:363] "codom" "codom" "codom" "codom" ...
 $ WNT       : 'locus' chr  "" "01:03" "01:03" "01:01" ...
  ..- attr(*, "locus_type")= chr [1:363] "codom" "codom" "codom" "codom" ...
 $ EN        : 'locus' chr  "02:04" "01:01" "01:01" "01:02" ...
  ..- attr(*, "locus_type")= chr [1:363] "codom" "codom" "codom" "codom" ...
 $ EF        : 'locus' chr  "01:02" "01:01" "01:01" "01:01" ...
  ..- attr(*, "locus_type")= chr [1:363] "codom" "codom" "codom" "codom" ...
 $ ZMP       : 'locus' chr  "" "01:01" "01:01" "01:01" ...
  ..- attr(*, "locus_type")= chr [1:363] "codom" "codom" "codom" "codom" ...
 $ AML       : 'locus' chr  "" "08:09" "08:09" "08:09" ...
  ..- attr(*, "locus_type")= chr [1:363] "codom" "codom" "codom" "codom" ...
 $ ATPS      : 'locus' chr  "09:09" "09:09" "09:09" "09:09" ...
  ..- attr(*, "locus_type")= chr [1:363] "codom" "codom" "codom" "codom" ...
 $ MP20      : 'locus' chr  "07:07" "07:07" "05:07" "07:07" ...
  ..- attr(*, "locus_type")= chr [1:363] "codom" "codom" "codom" "codom" ...

The arapat dataset contains multilocus microsatellite genotypes from the cactus beetle Araptus attenuatus and will serve as the primary example dataset throughout this book.

head(arapat)
    Species Cluster Population     ID Latitude Longitude  LTRS   WNT    EN
1 Peninsula   NBP-C         88 88_11A 29.32541 -114.2935 01:01       02:04
2 Peninsula   NBP-C         88 88_12A 29.32541 -114.2935 01:01 01:03 01:01
3 Peninsula   NBP-C         88 88_13A 29.32541 -114.2935 01:01 01:03 01:01
4 Peninsula   NBP-C         88 88_14A 29.32541 -114.2935 01:01 01:01 01:02
5 Peninsula   NBP-C         88 88_15A 29.32541 -114.2935 01:01 01:03 01:02
6 Peninsula   NBP-C         88 88_16A 29.32541 -114.2935 01:01 01:01 01:02
     EF   ZMP   AML  ATPS  MP20
1 01:02             09:09 07:07
2 01:01 01:01 08:09 09:09 07:07
3 01:01 01:01 08:09 09:09 05:07
4 01:01 01:01 08:09 09:09 07:07
5 01:01 01:01 08:09 09:09 05:05
6 01:01 01:01 09:09 09:09 07:07
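
In the output above, each genotype is printed as a colon-separated pair of alleles, and an empty string marks a missing genotype. To make that encoding concrete, here is a minimal base R sketch that scores heterozygotes from such strings. The helper is_het() is hypothetical and is not part of gstudio, nor is it how the package works internally. The six genotypes per locus are copied from the head(arapat) output.

```r
# Hypothetical helper (not a gstudio function): TRUE when the two alleles
# of a colon-separated genotype differ, NA when the genotype is missing.
is_het <- function(genotypes) {
  alleles <- strsplit(genotypes, ":", fixed = TRUE)
  vapply(alleles, function(a) {
    # A missing genotype ("") yields fewer than two alleles.
    if (length(a) < 2) NA else a[1] != a[2]
  }, logical(1))
}

# First six genotypes at the EN and WNT loci, copied from head(arapat).
en  <- c("02:04", "01:01", "01:01", "01:02", "01:02", "01:02")
wnt <- c("", "01:03", "01:03", "01:01", "01:03", "01:01")

het_en  <- is_het(en)
het_wnt <- is_het(wnt)

# Proportion of heterozygotes among the non-missing genotypes.
ho_en  <- mean(het_en, na.rm = TRUE)   # 4 of 6 genotypes
ho_wnt <- mean(het_wnt, na.rm = TRUE)  # 3 of 5 non-missing genotypes
```

Six individuals are far too few for a meaningful estimate. The point is only to show what the strings encode, and for real analyses genetic_diversity() is the appropriate entry point.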

Citation

If you use gstudio in your research, please cite:

Dyer, R.J. 2009. GeneticStudio: a suite of programs for spatial analysis of genetic-marker data. Molecular Ecology Resources 9(1): 110-113.

Source Code

The source code for gstudio is available on GitHub at https://github.com/dyerlab/gstudio.