Web PopGen


This help page answers some questions about the mechanics of Web PopGen. It describes what the buttons do and some of the algorithms used to generate the graphs. Evolutionary theory is not covered. Click on the links below to get information on a particular aspect of the program.



Display, Stochastic/Infinite populations, Population Size, Number of Populations, Number of Generations, Initial Allele Frequency, Selection/Fitness, Graphing Average Fitness, Mutation, Migration, Inbreeding, Bottle Neck Effect, Graph effects - Zoom and Pan
How to copy and paste graphs into a Word Processor Document (Macintosh, Windows).

Please send any bug reports, comments or suggestions to Bob Sheehy.

Introduction:


This program is a web-based version of Joe Felsenstein's Simul8 and PopG programs. Some of its variations make this program more attractive, and some may make it more painful to use.

Variations of this program from PopG/Simul8
 
Future Changes

Return to top

Display:

[Figure: Web PopGen settings menu]

The upper part of the display contains the input boxes for setting variables (population size, fitness, migration rates, etc.). The lower portion contains two graphs. If 2 or more populations are modeled, the upper graph plots the frequency of the A1 allele for each population and the lower graph mirrors it by plotting the frequency of the A2 allele. If a single population is modeled, the frequencies of both the A1 and A2 alleles are plotted on the upper graph, while the genotype frequencies of the A1A1 homozygote, the A2A2 homozygote and the A1A2 heterozygote are plotted on the lower graph.

The simulation is started by clicking on the "Go" button.  If the effect you are looking for (for example allele fixation) has not occurred by the completion of the run you could extend the number of generations.

Return to top



Finite (Stochastic)/Infinite populations:

Simulations may be run either with a defined population size or with a theoretically infinite population. Finite populations are used to examine the effects of population size on genetic drift. Some other factors that affect allele frequency may become more apparent when using infinite populations.

When running simulations with a finite population size there will be a control population (Infinite Population) that allows the user to compare the stochastic populations to an infinite population.

The pull-down menu allows the user to toggle between finite and infinite populations. In infinite population mode the user may give each population its own starting allele frequency. The number of populations being simulated determines the number of allele frequency input boxes that are available. Below (center) is what appears for 5 populations. If the user wishes the initial allele frequency to be the same among all populations, she may set the allele frequency for the first population and then click the "Set Equal" button.

Note that when the population is in Infinite mode a Reset Freq button appears.  This allows the user to reset her starting allele frequency(ies) between runs.

[Figures: finite/infinite population settings; variable allele frequencies]
 
Return to top


Population Size

[Figure: Population Size settings]

Population sizes between 5 and 10,000 individuals work well. The larger the population, the slower the program; populations larger than 10,000 result in a much slower response time. When running the program in stochastic mode, all populations start with the same initial allele frequency. Allele frequency will then drift independently in each population.
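
The independent drift described above can be illustrated with a minimal sketch. It assumes a diploid Wright-Fisher model in which each of the 2N allele copies in the next generation is drawn at random from the current allele frequency; this page does not state which sampling scheme Web PopGen itself uses, and the function name drift_trajectory is hypothetical.

```python
import random


def drift_trajectory(pop_size, p0, generations, seed=None):
    """Return the A1 allele frequency for each generation under random drift.

    Assumption: diploid Wright-Fisher sampling (not confirmed as the
    method used by Web PopGen).
    """
    rng = random.Random(seed)
    n_alleles = 2 * pop_size  # two allele copies per diploid individual
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Each allele copy in the next generation is A1 with probability p.
        a1_count = sum(1 for _ in range(n_alleles) if rng.random() < p)
        p = a1_count / n_alleles
        trajectory.append(p)
    return trajectory


# Five populations share the same starting frequency but drift independently.
runs = [drift_trajectory(pop_size=50, p0=0.5, generations=100, seed=s)
        for s in range(5)]
```

Each list in runs holds 101 frequencies (the starting value plus one per generation). Once a population reaches 0 or 1 it stays there, which corresponds to loss or fixation of the A1 allele.
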
Return to top


Number of Populations

[Figure: Number of Populations settings]

From 1 to 10 populations may be simulated at a time. If 1 population is chosen, both the A1 and A2 allele frequencies are plotted on the upper graph and the three genotype frequencies (A1A1, A2A2 and A1A2) are plotted on the lower graph. If 2 or more populations are simulated, the frequency of the A1 allele for each population is plotted on the upper graph, while the frequency of the A2 allele for each population is plotted on the lower graph. Each population is given a different color and, in the absence of migration, each population is independent of the others.

Return to top


Number of Generations


The number of generations may be set to any desired value. The greater the number of generations, the longer the simulation takes to run (the time between clicking the "Go" button and the results appearing in the graphs below). Population size, number of populations, and number of generations are the variables with the most influence on the speed of the program.

[Figure: Number of Generations settings]

Return to top

Initial Allele Frequency

Web PopGen simulates evolution at a single locus with two alleles (A1 and A2).  Since allele frequencies at a locus must sum to 1.0, knowing the frequency of one allele (the A1 allele for example) allows you to calculate the frequency of the alternative allele. 

freq(A1) + freq(A2) = 1.0
Given the frequency of A1 it is not difficult to calculate the frequency of the A2 allele.
freq(A2) = 1-freq(A1)

Hence, the user need only enter initial allele frequencies for the A1 allele. When simulating stochastic populations, all populations start with the same user-determined allele frequency.


[Figure: Initial Allele Frequency setting for the A1 allele]

Return to top


Selection/Fitness

[Figure: Fitness settings]
Web PopGen may be used to model natural selection of alleles and genotypes. The fitness of a particular genotype may range from 0 to 1 inclusive. Fitness gives the relative survival rate of a particular genotype, with the most fit genotype(s) set to 1.0. For example, suppose the fitnesses for the three genotypes are A1A1 = 1, A1A2 = 1 and A2A2 = 0.8. This indicates that the A1 allele behaves as a dominant with respect to fitness: the A1A1 homozygote and the A1A2 heterozygote have equal fitness (they leave, on average, the same number of viable offspring), and the A2A2 homozygote leaves, on average, 80% as many viable offspring as either the A1A1 or the A1A2 genotype.

The general formula for selection is:

  • Average Fitness (w̄):
    • w̄ = p^2 * w11 + 2pq * w12 + q^2 * w22
      • where p = the frequency of the A1 allele, q = the frequency of the A2 allele, w11 = the relative fitness of the A1A1 homozygote, w12 = the relative fitness of the A1A2 heterozygote and w22 = the relative fitness of the A2A2 homozygote.
  • The frequencies of the various genotypes after selection:
    • Frequency of the A1A1 homozygote = (p^2 * w11) / w̄
    • Frequency of the A1A2 heterozygote = (2pq * w12) / w̄
    • Frequency of the A2A2 homozygote = (q^2 * w22) / w̄
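
The formulas above can be sketched in a few lines of Python. This is an illustration only, not Web PopGen's own code; the function name select is hypothetical, and the last step (the A1 frequency among survivors is the A1A1 frequency plus half the heterozygote frequency) is the standard follow-on calculation rather than something stated on this page.

```python
def select(p, w11, w12, w22):
    """Apply one round of selection to a population with A1 frequency p.

    Returns (average fitness, genotype frequencies after selection,
    A1 allele frequency after selection).
    """
    q = 1.0 - p
    # Average fitness: each genotype's frequency weighted by its fitness.
    w_bar = p * p * w11 + 2 * p * q * w12 + q * q * w22
    if w_bar == 0:
        raise ValueError("average fitness is zero; no individuals survive")
    f11 = p * p * w11 / w_bar        # A1A1 homozygote
    f12 = 2 * p * q * w12 / w_bar    # A1A2 heterozygote
    f22 = q * q * w22 / w_bar        # A2A2 homozygote
    # Every A1A1 carries two A1 copies, every A1A2 carries one.
    p_next = f11 + 0.5 * f12
    return w_bar, (f11, f12, f22), p_next


# The example from the text: A1A1 = 1, A1A2 = 1, A2A2 = 0.8, starting at p = 0.5.
w_bar, genotypes, p_next = select(0.5, 1.0, 1.0, 0.8)
```

With these values the average fitness is about 0.95 and the A1 frequency rises from 0.5 to about 0.526 in one generation.
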
Selecting Infinite population will cause a check box to appear in the Fitness box. Initially, all populations share the fitness values entered in the boxes.

Clicking on the check box reveals a drop-down menu that allows the user to enter different genotypic fitnesses for each population.

 

After entering the desired values, click the "OK" button to close the dialogue box. Only the fitness values for population 1 will be visible. Note: Fitnesses for population 1 will not change until the simulation is started.

When the "Go" button is clicked, starting the simulation, a key to the various populations will appear to the right of the upper graph.

 
Clicking on the fitness check box again returns all populations to sharing the same fitness values.

Return to top

Graphing Average Fitness (w̄) or Δp

When simulating infinite populations with varied fitness, the frequency of the A1 allele is depicted in the upper graph. The user may choose which data are depicted in the lower graph. By default the frequency of the A2 allele is shown, but the drop-down menu in the Set fitnesses window lets you graph wBar (w̄) or the absolute value of the rate of change in p (abs(Δp)) instead.
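
As a rough illustration of what these two optional series look like, the sketch below iterates the selection formula for an infinite population and records average fitness and abs(Δp) each generation. It assumes Δp is the change in the A1 frequency from one generation to the next; the function name fitness_series is hypothetical, and this is not taken from Web PopGen's source.

```python
def fitness_series(p0, w11, w12, w22, generations):
    """Return (w_bar per generation, abs(delta p) per generation)."""
    p = p0
    w_bars = []
    abs_dps = []
    for _ in range(generations):
        q = 1.0 - p
        w_bar = p * p * w11 + 2 * p * q * w12 + q * q * w22
        # A1 frequency after selection: all of A1A1 plus half of A1A2.
        p_next = (p * p * w11 + p * q * w12) / w_bar
        w_bars.append(w_bar)
        abs_dps.append(abs(p_next - p))  # assumed definition of abs(delta p)
        p = p_next
    return w_bars, abs_dps


# Selection against A2A2 (fitness 0.8), starting from p = 0.5.
w_bars, abs_dps = fitness_series(0.5, 1.0, 1.0, 0.8, generations=50)
```

Plotting w_bars against generation shows average fitness climbing toward 1 as the A2 allele becomes rare, while abs_dps shrinks as the rate of change slows.
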




Migration

Currently two models of migration are possible.
Island model
Source/Sink model

 
 
 

The migration option is chosen from the Migration pull-down menu. If you choose the Island model you will see an input box asking for the rate of migration. This is a number between 0 and 1 inclusive (0 = no migration, 1 = every individual is a migrant). If you choose the Source/Sink model you will see an input box for the rate of migration and another box for the A1 allele frequency in the source population. The frequency of the A1 allele in the source population may vary between 0 and 1 inclusive.

The general formula for migration is:
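
A common textbook form of this update, offered here only as a sketch and not necessarily what Web PopGen implements, mixes the resident allele frequency with the migrant allele frequency in proportion to the migration rate m: p' = (1 - m) * p + m * p_migrant. In the sketch below the function names are hypothetical, and the island version assumes that migrants come from a pool with the mean allele frequency of all populations.

```python
def migrate_source_sink(p, m, p_source):
    """One generation of one-way migration from a fixed source population."""
    # A fraction m of the population is replaced by migrants from the source.
    return (1.0 - m) * p + m * p_source


def migrate_island(freqs, m):
    """One generation of island-model migration among several populations.

    Assumption: migrants are drawn from a pool whose allele frequency is
    the mean across all populations.
    """
    p_mean = sum(freqs) / len(freqs)
    return [(1.0 - m) * p + m * p_mean for p in freqs]


# Source/Sink: a population at 0.2 receiving 10% migrants from a source at 0.9.
sink_next = migrate_source_sink(0.2, 0.1, 0.9)

# Island: two populations at opposite extremes exchanging 50% migrants.
island_next = migrate_island([0.0, 1.0], 0.5)
```

With m = 0 the frequencies are unchanged, and with m = 1 every population takes on the migrant frequency, matching the meaning of the rate described above.
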


Mutation


Inbreeding

Inbreeding occurs when there is mating between two relatives. This can be by mate choice (assortative mating) or by chance. The degree of inbreeding in a population is quantified by the inbreeding coefficient, F, which can range from 0 (no inbreeding) to 1 (complete inbreeding).

In population genetics there are several F statistics, also called fixation statistics, depending on the comparisons being made (FIS - individual vs subpopulation, FST - subpopulation vs total population, FIT - individual vs total population). In this simulator the F statistic is equivalent to Wright's FIS and is the average kinship coefficient between mating pairs of individuals in the previous generation.

Inbreeding does not change allele frequency within a population and therefore, by itself, does not lead to evolution.  It does however alter the genotype frequencies expected under the assumptions of Hardy-Weinberg.  Coupled with selection, inbreeding may affect the rate at which allele frequencies change relative to a population without inbreeding.


Note that inbreeding will result, on average, in an increase in homozygotes and a decrease in heterozygotes. The F statistic is calculated by comparing the frequency of heterozygotes within a population to the frequency expected under the assumption of Hardy-Weinberg (2pq).
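
The relationship between F and the genotype frequencies can be sketched as follows. The expressions are the standard textbook ones (heterozygotes reduced to 2pq(1 - F), with the deficit split equally between the two homozygotes, and F = 1 - observed/expected heterozygosity); this page does not confirm that Web PopGen computes them in exactly this way, and the function names are hypothetical.

```python
def genotype_freqs_with_inbreeding(p, f):
    """Return (A1A1, A1A2, A2A2) frequencies for A1 frequency p and coefficient f."""
    q = 1.0 - p
    return (p * p + f * p * q,        # homozygotes gain f*p*q each
            2 * p * q * (1.0 - f),    # heterozygotes are reduced by the factor (1 - f)
            q * q + f * p * q)


def inbreeding_coefficient(h_observed, p):
    """Estimate F from the observed heterozygote frequency and A1 frequency p."""
    h_expected = 2 * p * (1.0 - p)    # Hardy-Weinberg expectation, 2pq
    if h_expected == 0:
        raise ValueError("F is undefined when one allele is fixed")
    return 1.0 - h_observed / h_expected


# p = 0.5 with F = 0.5: heterozygotes drop from 0.5 to 0.25.
a1a1, a1a2, a2a2 = genotype_freqs_with_inbreeding(0.5, 0.5)
f_estimate = inbreeding_coefficient(a1a2, 0.5)
```

The A1 allele frequency recovered from these genotypes (A1A1 plus half of A1A2) is still 0.5, which illustrates that inbreeding alone rearranges genotypes without changing allele frequency.
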

Return to top



Zoom and Pan

To zoom in on a particular region of the graph click and drag within the graph itself.  Double clicking in the graph will zoom out to the original size.

[Figure: zoom animation]

The graph functions allow you to see exact values for multiple data points or for the data point closest to the cursor. This is controlled by clicking on the icons at the top of the graph. These functions may come in handy when you wish to know exact values in a particular region of the graph.







If you are zoomed in to a region of the graph you may use the pan tool to explore regions outside your view.







© 2020 by Bob Sheehy
Radford University
Radford Virginia

Last updated July 4th 2020