Tools for genetics and genomics: Model systems

Author:
Robert D Blank, MD, PhD
Section Editor:
Cheryl L Ackert-Bicknell, PhD
Deputy Editor:
Jennifer S Tirnauer, MD
Literature review current through: Apr 2025. | This topic last updated: Mar 27, 2025.

INTRODUCTION

Even having the complete sequence of the human genome is insufficient to understand the genetic basis of disease. The sequence data are far more informative if they can be correlated with functional information. This correlation is often accomplished through the use of model systems in which aspects of human disease or human physiology are recapitulated in an animal or other organism. Similarities with human genes span the entire range of life down to unicellular organisms.

This topic discusses model systems used to study the relationship between genotype and functional consequences.

Separate topics discuss other means of evaluating genomic information:

Next-generation sequencing – (See "Next-generation DNA sequencing (NGS): Principles and clinical applications".)

Gene editing – (See "Overview of gene therapy, gene editing, and gene silencing".)

Genome-wide association studies (GWAS) – (See "Genetic association and GWAS studies: Principles and applications".)

Epigenetics – (See "Principles of epigenetics".)

Mendelian randomization – (See "Mendelian randomization".)

Transcriptomics and expression profiling – (See "Tools for genetics and genomics: Gene expression profiling".)

Basic genetics concepts are also discussed separately:

Terminology – (See "Genetics: Glossary of terms".)

DNA regulation – (See "Basic genetics concepts: DNA regulation and gene expression".)

Cell division – (See "Basic genetics concepts: Chromosomes and cell division".)

HISTORICAL BACKGROUND

Human genome — The genome is the full complement of genetic information encoded on a complete haploid set of chromosomes. The human DNA (deoxyribonucleic acid) sequence, as defined by the human genome project, was first published in 2001; this represents a composite compiled using DNA obtained from many individuals [1]. The original sequence assembly had important limitations, including reliance on a small set of samples and lack of functional annotation. Subsequently, much additional information regarding sequence variation among individuals has been collected, addressing in part the preponderance of early data obtained from persons of European descent. Even today, interindividual variation remains incompletely cataloged.

Efforts to define variation between individuals and to relate such variation to disease risk remain central tasks for human geneticists. (See "Basic genetics concepts: DNA regulation and gene expression", section on 'Genetic variation'.)

Much of this investigation is focused on single nucleotide polymorphisms (SNPs); these are sequence variations that occur roughly once in several hundred bases [2]. Other, less ubiquitous types of variations, such as insertion-deletions (indels) and copy number variation, have also received considerable study [3,4].

Two complementary approaches are being pursued to accomplish the goal of relating gene composition to function, or genotype to phenotype:

Epidemiologic studies seek to establish associations between genes and traits.

Functional studies seek to elucidate the mechanisms by which sequence variation leads to phenotypic differences.

Genomes in other organisms — Structural and functional conservation of the genome over the course of evolution makes genomics a fruitful approach to studying human physiology and pathology.

Genes and their protein products are highly conserved. As an example, approximately 30 percent of human genes have an equivalent homolog in yeast.

The percentage of homologous genes increases as the evolutionary distance between humans and the model organism decreases. The genome sequence of the chimpanzee shares the highest degree of sequence conservation with the human genome, with only approximately 1.2 percent divergence [5,6].

Another type of evolutionary relationship is evident in the organization of the genome: groups of genes physically clustered together on one chromosome in one organism are often clustered together in other, divergent species (so-called syntenic conservation) [7-10].

Additionally, transcriptional regulatory mechanisms are evolutionarily conserved across organisms and specific pathways [11].

HOW EXPERIMENTAL SYSTEMS ARE CHOSEN

Types of questions — The most appropriate model system to use depends on the experimental question of interest.

Simple studies of a signaling pathway or the effect on a cell of environmental exposure (such as a drug or nutrient) may best be modeled in cell lines or single-celled organisms such as yeast.

Many questions of early development are most easily studied in organisms where all postfertilization maturation happens external to the mother, such as flies and zebrafish.

More complicated studies such as investigations of cross-talk between organ systems or studies of behavior, learning, and memory require a higher vertebrate species such as rodents or a larger animal such as sheep or pigs.

Model systems versus human DNA-based methods — Some questions require use of an intact organism, whereas others can be answered directly from gene sequence.

Model systems – Model systems allow investigators to relate DNA sequence data with functional information to a degree that may not be possible with human studies. Models are useful insofar as they offer the technical means to accomplish studies that could not be undertaken in humans.

Model systems are particularly valuable when essential tissues must be obtained or the questions being studied necessarily threaten life or health. Such investigations could not be performed in human research participants, and even in model systems, there are important ethical limitations. (See 'Ethical considerations and oversight' below.)

Beyond the ethical rationale, model systems often offer additional features that enhance their value. Short generation times, low husbandry expenses, volume and importance of prior work, and scientific community support all factor into model choice.

The value of model systems depends upon establishing a proper balance between the simplification they offer and the extent to which they mirror human physiology. These factors usually are inversely related, as summarized in the table comparing the different systems (table 1).

Human DNA-based methods

Examples of DNA-based methods for studying human gene function and physiology include:

Next-generation sequencing (NGS) – NGS, also called deep sequencing, uses parallel processing of short DNA fragment reads in an automated machine to determine DNA sequence. It allows rapid and relatively low-cost determination of the sequences of a single gene, a panel of genes, a whole exome (protein-coding sequences), or a whole genome. (See "Next-generation DNA sequencing (NGS): Principles and clinical applications".)

Messenger ribonucleic acid (mRNA) can also be sequenced, referred to as RNA-seq [12]. This has become a preferred approach for gene expression profiling. (See "Tools for genetics and genomics: Gene expression profiling", section on 'Transcriptome sequencing (RNA-seq)'.)

Genome-wide association studies (GWAS) – GWAS is a type of study that assesses associations between genetic variants and heritable traits across the genome. Some genotyping platforms allow 1 million genotypes to be assessed in a single experiment. Typical studies consist of genotyping hundreds of thousands of common variants, using DNA microarrays in large case-control populations, with the goal of identifying specific risk alleles that are more prevalent in cases than in controls. (See "Genetic association and GWAS studies: Principles and applications".)

Several databases of GWAS investigations are available. (See 'Online sources of genetic information' below.)

Epigenetics studies – An individual's genome is essentially constant across tissues, but the pattern of gene expression among tissues varies greatly. Transcriptional activity is determined in part by epigenetic "marks" on the genome (epi refers to "above"). These include methylation of DNA bases and of histones, which controls how the DNA is packaged into chromatin and the degree to which individual genes are accessible by the transcriptional machinery. In addition to altering gene expression, these epigenetic modifications control X-chromosome inactivation and imprinting. While they are modifiable (by removing methyl groups), they can also be transmitted to offspring. Epigenetic changes in different tissues or individuals can be measured genome-wide or for a specific set of genes. Epigenetic modifications of histones can be determined following chromatin immunoprecipitation (ChIP). (See "Principles of epigenetics" and "Genetics: Glossary of terms", section on 'X-inactivation' and "Inheritance patterns of monogenic disorders (Mendelian and non-Mendelian)", section on 'Parent-of-origin effects (imprinting)'.)

Mendelian randomization – This is a method for analyzing epidemiologic data to find causal associations between gene variants and clinical phenotypes. It takes advantage of the independent assortment of alleles at a given locus that randomly sorts individuals who do or do not inherit a specific allele that includes a specific gene variant or collection of variants. This mimics the process that occurs in a clinical trial, although it relies on several assumptions that can invalidate inference of causality. (See "Mendelian randomization".)

Transcriptomics and expression profiling – There are several methods that take advantage of the ability to measure RNA as a means of determining and quantifying the level of gene expression. Comparisons can be made between organisms, time-points, body parts, and other variables. (See "Tools for genetics and genomics: Gene expression profiling".)

Examples include:

-RNA-seq – Whole-transcriptome profiling (expression of all genes) may be conducted on whole tissues (bulk RNA-seq) or on tissues that have been dissociated into single cells. Transcriptomic profiling done on a cell-by-cell basis is referred to as single-cell RNA-seq. Details are presented separately. (See "Tools for genetics and genomics: Gene expression profiling", section on 'Transcriptome sequencing (RNA-seq)'.)

-eQTL – Expression quantitative trait locus analysis (eQTL) treats the pattern of gene expression within a tissue as a series of quantitative traits. The expression of many genes can be measured simultaneously using microarrays. Either linkage mapping (expression quantitative trait loci mapping, eQTL mapping) or GWAS approaches can be used to map and identify sequence variants that account for differences in the levels of expression of specific genes [13]. Results from such studies can be used to prioritize variants for possible functional studies. Variants that confer changes in gene expression are more likely to lead to phenotypic variation.

-Spatial transcriptomics – Spatial transcriptomics combines histologic information with gene expression data [14]. (See "Tools for genetics and genomics: Gene expression profiling", section on 'Single-cell RNA sequencing'.)

Chromatin immunoprecipitation (ChIP) – Immunoprecipitation uses an antibody to purify proteins or DNA, allowing study of which other proteins are associated. This was initially used to identify protein-protein interactions and subsequently adapted to study DNA-protein interactions, referred to as ChIP. This can include genome-wide studies of transcription factor binding using microarrays ("gene chips," hence the term "ChIP on Chip") [15]. The resulting data constitute a map of the DNA binding sites of the antigen. ChIP-seq is a related method in which the immunoprecipitated DNA is sequenced using NGS instead of being hybridized to a microarray. Newer methods allow such studies to be conducted at a single-cell level (eg, single-cell ChIP-seq). Many such annotations are available through the UCSC Human Genome Browser. (See 'Online sources of genetic information' below.)
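
As a rough illustration of the case-control comparison that underlies a GWAS, the following Python sketch tests a single variant for a difference in allele frequency between cases and controls, using a Pearson chi-square test on a 2x2 table of allele counts. The function name and the counts are hypothetical and are not taken from any study. Actual analyses use dedicated software, adjust for population structure, and apply a stringent genome-wide significance threshold because of the very large number of variants tested.

```python
import math

def allelic_chi_square(case_risk, case_other, control_risk, control_other):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Inputs are allele counts (two alleles per genotyped person).
    Returns (odds_ratio, chi_square, p_value). Illustrative only.
    """
    a, b, c, d = case_risk, case_other, control_risk, control_other
    n = a + b + c + d
    # Shortcut formula for the Pearson statistic of a 2x2 table
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    odds_ratio = (a * d) / (b * c)
    # Upper tail of a chi-square distribution with 1 df: erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value

# Hypothetical counts: 1000 cases and 1000 controls (2000 alleles per group)
odds_ratio, chi2, p_value = allelic_chi_square(600, 1400, 500, 1500)
print(f"OR={odds_ratio:.2f}, chi2={chi2:.2f}, p={p_value:.2e}")
```

A single test of this kind is repeated for each genotyped variant, which is why a nominally small p-value for one variant is not sufficient evidence of association on its own.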

Impact of gene editing — The value of model organisms has been enhanced by the development and widespread use of gene editing technologies that allow investigators to make targeted changes to specific DNA sequences; this is especially true for multicellular organisms, where isolating and maintaining germ cells is challenging [16-20]. These are the same methods used to create human therapeutics to overcome monogenic disorders.

Details of available methods and clinical uses are discussed separately. (See "Overview of gene therapy, gene editing, and gene silencing".)

ETHICAL CONSIDERATIONS AND OVERSIGHT

Ethical considerations drive the use of model organisms. The critical assumptions are that model organism lives have value but that animal lives are less valuable than human lives.

Furthermore, regulations distinguish "higher" animals, such as nonhuman primates, sheep, pigs, dogs, and cats, from "lower" animals such as rats, mice, and zebrafish, with stricter regulations for the former. As invertebrates, both Caenorhabditis elegans (worm) and Drosophila melanogaster (fruit fly) are exempt from research animal regulations.

United States regulations regarding use of animals in research are limited to vertebrates and are summarized in a freely available online book [21].

When research is conducted, vertebrate animal use ethics are overseen by a local Institutional Animal Care and Use Committee (IACUC), which is the analog to the human Institutional Review Board (IRB). IACUC protocols are comparable in depth and detail to IRB protocols, and animal investigators are required to limit their studies to items specified in approved protocols. Animal use is guided by the "3 R's":

Replace – Replace animals with in vitro or computer models when possible and use the "lowest" species possible to address the scientific question.

Reduce – Reduce the number of animals to the smallest number allowing the question to be answered.

Refine – Refine animal use to minimize pain, discomfort, and stress experienced by research animals. Social housing and environmental enrichment fall within the refinement mandate.

To be ethical, animal research must address a legitimate scientific question and not unnecessarily duplicate prior work. IACUC protocols require justification of the species, age, sex, and numbers of the animals to be used, supported by recent structured literature review and power calculations. They also require descriptions of all procedures and interventions that will be performed. For those that have the potential to cause discomfort, details for monitoring and minimizing pain are required. Breeding, husbandry, veterinary care, and euthanasia must all be described and justified as necessary to accomplish the protocol's scientific objectives.
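
The power calculation used to justify animal numbers can be sketched for the simplest design, a two-group comparison of means. The Python example below uses the standard normal-approximation formula, n per group = 2 x ((z for 1 - alpha/2) + (z for power))^2 / (effect size)^2, where the effect size is the expected difference divided by the standard deviation. The function name, the default alpha and power, and the effect sizes in the usage lines are assumptions chosen for illustration. The normal approximation slightly underestimates the number required by a t-test, and an actual protocol would use a method matched to its design.

```python
import math
from statistics import NormalDist

def animals_per_group(effect_size_sd, alpha=0.05, power=0.80):
    """Approximate animals per group for a two-sided, two-group comparison
    of means (normal approximation). effect_size_sd is the expected
    difference between groups divided by the standard deviation.
    Illustrative only; not a substitute for a design-specific calculation."""
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = z.inv_cdf(power)           # quantile corresponding to the power
    n = 2 * ((z_alpha + z_beta) / effect_size_sd) ** 2
    return math.ceil(n)

# Hypothetical effect sizes: halving the detectable effect roughly
# quadruples the number of animals needed.
print(animals_per_group(1.0))
print(animals_per_group(0.5))
```

The inverse-square dependence on effect size is the reason refinements that reduce measurement variability can also reduce the number of animals required.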

Most IACUCs mandate consultation with a veterinarian during protocol preparation. As is done for IRBs, IACUCs also include lay members who are charged with representing the general public's interests.

VIRUSES, PROKARYOTES, AND YEAST

Viruses, prokaryotic (bacterial), and yeast models derive much of their value from the ability to identify rare individuals within large experimental populations. (See 'Bacteriophage lambda' below and 'Yeast' below.)

Standardized protocols can be used for selecting and/or screening for specific mutations. Selection can ensure that only individual organisms possessing a specific metabolic property survive to be examined.

Screening employs identification of a specific metabolic property in some organisms that results in an easily scored difference in phenotype, allowing such organisms to be reliably identified, isolated, and recovered for further study.

These organisms also have very rapid cell division, allowing many generations of progeny to be created in a matter of days.

Bacteriophage lambda — Bacteriophages (phages) are a type of virus that infect bacteria. The bacteriophage lambda model has been used to study transcriptional changes in response to environmental stimuli.

Lambda is a temperate phage that can grow in either a lytic or lysogenic life cycle. This is manifested morphologically by production of turbid plaques, in which surviving infected bacteria are present.

There are various lambda mutants that produce clear plaques in which survival of infected bacteria is much reduced or that behave aberrantly with regard to superinfection. These properties are understood at a detailed molecular level.

Yeast — Yeast are single-celled eukaryotes that contain many well-conserved genes, many of which have human homologs (approximately one-third of human genes have yeast homologs). Eukaryotes contain a distinct cell nucleus separated from the cytoplasm by a membrane, in contrast to prokaryotes, in which genetic material is not separated from the rest of the cell by a nuclear membrane. (See "Basic genetics concepts: Chromosomes and cell division".)

The genome sequence for the baker's yeast Saccharomyces cerevisiae has been available since 1996.

S. cerevisiae is a budding yeast that can be manipulated to create targeted gene deletion or replacement and subjected to an array of selection and screening schemes. The ability to grow as either a haploid or a diploid organism facilitates genetic manipulation and selection. An example of the application of these features was the 1999 report of the effects of systematic deletion of over 2000 yeast genes under a variety of growth conditions [22]. A significant insight arising from yeast studies is the discovery of the sirtuin family of proteins, now recognized as critical determinants of longevity [23].

Protein-protein interactions are important features of virtually every cellular process. The yeast two-hybrid system allows a systematic search for interactions among proteins [24]. This strategy was applied in a study in which approximately 1000 interacting protein pairs were detected [25]. Interacting proteins from any species can be studied in yeast [26].

Protein function depends on each protein assuming its proper conformation during synthesis; misfolded proteins lose function. So-called "chaperone" proteins interact with nascent peptides to assist in their correct folding during protein synthesis. While nascent protein-chaperone interactions can be studied individually and in other models, the yeast model system is particularly well suited to global functional analysis of these interactions [27].

This experimental strategy has been adapted to seek interactions between proteins from other organisms. As approximately one-third of human genes have yeast homologs, understanding of human biology has been greatly advanced by study of these in the more easily manipulated yeast system.

The Saccharomyces genome database (www.yeastgenome.org) collates information about yeast genetics [28].

COMPLEX MULTICELLULAR NON-MAMMALIAN MODEL SYSTEMS

Nematode (C. elegans) — Caenorhabditis elegans is a nematode (worm) with a limited and countable number of cells that is used extensively as a model system [29-37]. Use in developmental biology derives from the ability to trace the cell lineages of every cell [38,39].

C. elegans is a popular model for neurologic disorders because its complete neural circuitry is known. It has provided important insights regarding the role of molecular chaperones in protecting against neurodegenerative diseases [40].

It is also used in studies of aging, as many of the aging-associated genes are conserved in humans and a number of lifespan mutants have been discovered [29-37].

C. elegans data are available through WormBase and the Sanger Centre's Worm Genome page [41,42].

Fruit fly (Drosophila) — The fruit fly Drosophila melanogaster was the first organism in which detailed linkage maps were constructed.

It has been particularly informative in studies of how spatial information is established during development to create the body plan. Multiple major pathways driving development, conserved widely over evolution, were first discovered in Drosophila and are featured targets for many diagnostic, prognostic, and therapeutic tools [43].

An excellent online resource is Flybase [44].

Zebrafish — Zebrafish (Danio rerio) is a small freshwater fish that became popular for genetic studies in the 1980s [45].

The zebrafish is useful because its embryos develop freely and are transparent, and because genetic manipulations can be readily performed. Zebrafish are particularly well-suited to studying environmental factors such as drugs and thermal stress because of their small size, aquatic environment, and poikilothermic metabolism. They are also easily amenable to in vivo RNA interference (RNAi), a molecular technique for targeted silencing of specific genes, allowing a wide variety of developmental features to be studied.

As an example, RNAi-mediated suppression of HSP90 (90 kilodalton heat shock protein) across several strains of zebrafish resulted in variable ocular development, with some strains demonstrating severe abnormalities and others a milder phenotype [46]. The issues addressed in this study fully exploit the advantages of zebrafish as a model system.

Zebrafish are also proving to be very useful for studies of gene function in rare genetic metabolic diseases and in more common diseases such as osteoporosis and kidney disease [47]. Approximately 71 percent of human protein-coding genes have at least one zebrafish ortholog; genes without an ortholog cannot be studied in this system [48].

The Zebrafish Information Network provides further information and additional links [49].

GENETICALLY ENGINEERED MICE

Laboratory mice are among the most widely used model systems in biomedical research. The Jackson Laboratory's Mouse Genome Informatics site provides access to information about genes, mutants, inbred strains, mapping and developmental data, and homology to genes in other organisms [50-52]. Data on the rat genome can be accessed at the Rat Genome Database [53].

Mice are often used to study aspects of mammalian physiology that cannot be modeled in invertebrates or nonmammalian vertebrates. Mice are easy to breed, have been extensively studied genetically, and are readily available. There are hundreds of inbred strains, as well as congenic, recombinant inbred, recombinant congenic strains, transgenic, knockout, knock-in, lineage tracing and pathway reporting strains, and immunodeficient and humanized mice.

Transgenic mice terminology and types of manipulation

Transgenic mice – Transgenic mice are those in which foreign DNA has been incorporated into the genome. According to this broad definition, all types of genetically engineered mice are transgenic. In a narrower sense, transgenic mice have an "extra" gene introduced to accomplish any of several experimental objectives such as introducing an abnormal gene to create a disease model, introducing a wild-type gene to confirm functional roles, or introducing a reporter gene to track gene expression [54-58]. The altered gene is referred to as a transgene.

Gene editing tools including zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and the clustered regularly interspersed short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) system have greatly reduced the time and expense of creating knockout and knock-in animals. These methods are discussed separately. (See "Overview of gene therapy, gene editing, and gene silencing", section on 'Gene editing'.)

ES cells – Embryonic stem (ES) cells are the pluripotent cells used to create embryos with the desired germline manipulations. The foreign DNA is introduced into these cells in vitro, either by direct microinjection of DNA into the pronucleus of a fertilized egg (picture 1), or by transfection into the ES cells [59-61]. ES cells can be grown in culture, and specific cells can be selected for, in the same manner as other cultured cells. (See "Overview of stem cells", section on 'Embryonic stem (ES) cells'.)

ES cells shown to carry the desired construct are then injected into mouse blastocysts to yield chimeric embryos. The blastocysts can then be implanted into foster mothers and the transgenic offspring and wild-type littermate controls can be used for functional studies.

Knockout mice – Knockout mice carry a specific germline gene disruption created by disrupting a gene in ES cells to produce a "null" version of the gene [62]. This allows researchers to study the consequences of loss of function of the targeted gene in vivo using phenotypic studies. Additional studies include comparison to other knockout phenotypes and breeding with other knockout mice to determine phenotypes of dual gene knockouts.

Knock-in mice – Knock-in mice carry a specific germline gene alteration in place of the naturally occurring allele. Knock-in technology allows researchers to examine the functional effects of different variants in the same gene. This is particularly informative if the variants thought to cause human disease result in gain of function rather than loss of function (eg, oncogenes) or if the goal is to investigate the effect of a gene variant in a single tissue using a conditional knock-in.

Conditional systems – These are knockout or knock-in transgenic mice in which the genetic change is restricted to a specific tissue (using a tissue-specific or cell type-specific promoter) or a specific point in time (using an inducible promoter). These systems are especially useful when a germline gene disruption is lethal or prevents development of a specific tissue under study, or if the investigator wants to isolate specific effects of the genetic change on a single tissue or process. Collaborative projects in North America and in Europe have developed conditional knockout alleles for thousands of genes [63-66].

Methods include:

Inducible promoters – Expression of the transgene can be placed under the transcriptional control of an inducible promoter such as the bacterial tetracycline resistance operon [67-69]. The investigator can then regulate transgene expression by titrating administration of tetracycline to the transgenic animals.

Tissue-specific knockouts – Expression of the transgene can be restricted to one tissue. This approach uses bacteriophage P1 Cre recombinase to mediate site-specific recombination at specific short sequence elements called loxP sites (Cre-Lox technology) [70].

Two engineered mice must be produced. One strain is transgenic for a construct that introduces the loxP sites into the target gene, flanking sufficient DNA so that its deletion will cause loss of the target gene activity. The second strain is transgenic for a construct that introduces a functional gene for Cre recombinase driven by a tissue-specific promoter in the target tissue.

Mice homozygous for the target gene construct are then crossed with (mated to) mice carrying the tissue-specific Cre construct. Offspring that carry the Cre construct (expected to be 50 percent of the progeny) will express Cre recombinase in a tissue-specific manner, leading to excision of the portion of the target gene flanked by the loxP sites, thus creating a tissue-specific knockout of the gene (figure 1) [71,72].
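
The expected 50 percent figure follows from a simple Mendelian cross. The Python sketch below enumerates gamete combinations for two loci under assumptions that the text does not state: the Cre parent is hemizygous for the Cre transgene, it is also homozygous for the loxP-flanked allele, and the two loci assort independently. The allele labels ("fl", "Cre", "-") and the function name are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def cross(parent1, parent2):
    """Expected offspring genotype fractions for independently assorting loci.

    Each parent is a tuple of loci; each locus is a pair of alleles.
    Returns a dict mapping offspring genotype to its expected fraction.
    """
    def gametes(parent):
        # One allele from each locus; every combination is equally likely
        return list(product(*parent))

    counts = Counter()
    for g1, g2 in product(gametes(parent1), gametes(parent2)):
        # Sort alleles within each locus so allele order does not matter
        genotype = tuple(tuple(sorted(pair)) for pair in zip(g1, g2))
        counts[genotype] += 1
    total = sum(counts.values())
    return {g: Fraction(n, total) for g, n in counts.items()}

# Locus 1: target gene ("fl" = loxP-flanked allele); locus 2: Cre transgene
floxed_parent = (("fl", "fl"), ("-", "-"))   # homozygous floxed, no Cre
cre_parent = (("fl", "fl"), ("Cre", "-"))    # assumed hemizygous for Cre

offspring = cross(floxed_parent, cre_parent)
cre_fraction = sum(f for g, f in offspring.items() if "Cre" in g[1])
print(cre_fraction)  # fraction of progeny expected to carry Cre
```

Changing the genotypes passed to the function shows how the expected proportions shift under other breeding schemes, for example when the Cre parent carries only one floxed allele.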

Chimeric systems – The extraembryonic tissues are important in early developmental steps including gastrulation. One group developed a method by which early embryos can be tetraploidized, and chimeric embryos produced from tetraploidized embryos and cultured ES cells [73,74]. Under these conditions, the tetraploid cells give rise exclusively to extraembryonic tissue, while the diploid ES cells produce all of the embryo proper. This approach allows investigators to overcome some early developmental defects, thus allowing later functions of the disrupted gene to be studied [75,76].

Reporter mice – Reporter mice carry a reporter gene whose product generates a signal that can be used to assay expression of a gene of interest.

Early work used beta-galactosidase as the reporter, which can be used to generate a blue histochemical dye specifically in tissues where the gene is expressed, and targeted a locus, Rosa26, that allows ubiquitous expression and whose disruption is phenotypically silent [77].

Subsequent work relies on fluorescent reporters, such as green fluorescent protein, that improve resolution and eliminate the need for histochemical processing [78-83].

Expression of a reporter can be restricted by use of a tissue-specific or lineage-specific Cre construct.

Lineage tracing is a common application in tissue-specific or lineage-specific reporter mice [84]. Its use to validate tissue-specificity of Cre activity is considered best practice [85]. Lineage tracing with more elaborate reporters is also used to perform dynamic studies such as tissue healing, stem cell tracking, or localization of signaling pathways [57,86-89].

The International Mouse Phenotyping Consortium seeks to generate floxed knockout alleles for every mouse gene, phenotype these animals, annotate the findings, and make these resources widely available to investigators [90]. The Mouse Genome Informatics site curates phenotype data from other published genetically engineered mice, including those not generated by the International Mouse Phenotyping Consortium [91]. They provide search tools to help identify possible mouse models of human disease.

Humanized immunodeficient mice – These are immunodeficient mice into which investigators can engraft human hematopoietic and immune cells and that can then be manipulated experimentally. These animals are especially useful for studying immune function, transplantation, infectious diseases, and tumor biology.

Nude mouse – The nude mouse (Foxn1nu/nu) was first noted as a spontaneous mutant in 1962, initially recognized for being hairless and having a short lifespan [92,93]. Subsequent work revealed absence of the thymus, impaired T cell development, and ability to tolerate xenografts (tissues from other species) [94,95]. Nude mice have great historical importance, but contemporary humanized mice used to engraft human hematopoietic and immune cells are based on different mouse mutants.

Humanized mice – Humanized mice have three principal elements of immunodeficiency to allow robust human immune and hematopoietic cell engraftment:

-Disruption in the recombinational machinery necessary for immune cell differentiation, satisfied by loss of function of PRKDC, RAG1, or RAG2.

-Loss-of-function mutations of PRKDC, which encodes the catalytic subunit of DNA-dependent protein kinase, prevent repair of double-stranded DNA breaks, resistance to ionizing radiation, and VDJ rearrangement in B and T cells [96]. Prior work established the severe combined immunodeficiency (SCID, PRKDCscid/scid) phenotype and demonstrated that SCID mice can serve as experimental hosts for engrafting human immune cells [97-100]. This property led to SCID mice rapidly becoming a widely used platform in hematologic malignancy, infectious disease, and autoimmunity research, even before the molecular pathogenesis of the SCID mutation was understood. Two additional genes, recombination activating genes 1 and 2 (RAG1 and RAG2), are also needed for VDJ rearrangement, and mice deficient in them show profound impairment of B and T cell maturation and, consequently, of immunity [101-104].

-Disruption of the shared gamma subunit of multiple interleukin receptors, encoded by IL2RG. Loss of IL2RG function interferes with high-affinity binding of multiple cytokines including IL-2, IL-4, IL-7, IL-9, IL-15, and IL-21. This causes X-linked severe combined immune deficiency (X-SCID) in humans and a similar phenotype in mice [105-108]. (See "X-linked severe combined immunodeficiency (X-SCID)".)

The mice must also carry a "human-like" allele of SIRPA, encoding signal regulatory protein alpha type 1 [109,110]. This protein is expressed on macrophages and mediates an inhibitory signal that prevents phagocytosis [111]. Nonobese diabetic (NOD) mice carry polymorphisms in SIRPA that make the encoded protein resemble the human protein and thereby protect human cells from phagocytosis. For this reason, many humanized mice feature a NOD genetic background, although a few strains have a BALB/c background, which also carries a more favorable SIRPA allele [112]. A few strains have a C57BL/6 background together with SIRPA or CD47 alleles that allow human cell engraftment.

Over 20 mouse strains and stocks that can be humanized are available from the major commercial laboratory mouse vendors; descriptions can be found on their websites. Additional strains have been developed by individual laboratories.

Human cells are introduced into an immunodeficient mouse by one of several delivery methods. Human peripheral blood mononuclear cells or human hematopoietic stem cells can be infused into host mice that have been irradiated [98,100]. Alternatively, fetal liver and thymus tissue can be placed under the kidney capsule [99]. All mouse host strains and engraftment strategies have limitations, and careful planning is necessary to choose the best system to study the relevant biology. Some common problems include graft-versus-host disease (GVHD), reduced development of lymph nodes or lymphocyte maturation, and limited survival of engrafted tissue.

Transgenic mice limitations

Unclear or mixed genetic background – Different mouse strains are highly variable, and the background in which a genetic construct has been studied affects experimental interpretation. Many genetically engineered mice have been generated on one of the 129 strains, which are more diverse than previously believed [113-115]. The limitations related to strain background are significant, particularly because 129-related strains have been the source of many ES cell lines. Generation of ES cells from other strains may mitigate this problem in the future [116,117].

Investigators routinely breed founder animals to a recipient strain, most often C57BL/6. Consequently, the constructs are often studied on a poorly defined "mixed 129 X C57BL/6 background," without further information regarding the relative contributions of the progenitor genomes, number of generations of subsequent inbreeding, or often even the correct strain information for the progenitors. Efforts to improve reporting of strain background in the future are underway [118].

Risk of insertional mutagenesis – For mice in which the construct has not been targeted to a specific locus, incorporation of the transgene may result in insertional mutagenesis, with the resulting phenotype arising not from the transgene, but from disruption of the gene into which the transgene was placed [119]. The transgene's expression may also vary according to the properties of the insertion site [120].

Variation in expression levels – When multiple copies of the transgene are inserted into the genome, there will be differences in transgene expression level. Targeting transgenes to specific sites can help reduce this problem. One site that allows insertion of a single copy of the transgene while allowing transcription to be mediated by sequences included in the targeting vector is the HPRT locus, encoding the salvage purine utilization enzyme hypoxanthine/guanine phosphoribosyl transferase [121,122].

Challenges in defining causality – The major limitation of knockout mice is that, while they are useful for establishing the role of the target gene in a pathway, mutations in the target gene do not necessarily account for human diseases or population variation in downstream phenotypes. Knock-in strategies can address this by allowing the study of a series of mutant alleles. Because knockout constructs are targeted to the homologous locus, unintentional insertional mutation does not arise with knockout mice. (See 'Transgenic mice terminology and types of manipulation' above.)

Off-target effects and Cre toxicity – Cre constructs used to create tissue-specific knockouts are neither perfectly efficient nor perfectly specific. Tissue-specific constructs are sometimes active outside the target tissue, and inducible constructs may be active in the absence of the inducing substance [123,124]. Cre constructs can also cause "Cre toxicity" and Cre-mediated suppression of tumor growth [125-129].

The practical response to these limitations is that when transgenic mice are generated, investigators routinely study animals derived from several different founders. The minimal characterization will generally include estimation of copy number, transgene mRNA level, and transgene protein level. More detailed analysis is then conducted on one or a small number of the transgenic lines. Appropriate and complete control groups need to be characterized in parallel. Careful investigators will also report the breeding history between founder and the analyzed animals.

ONLINE SOURCES OF GENETIC INFORMATION

National Institutes of Health (NIH) – The National Center for Biotechnology Information (NCBI), operated by the National Library of Medicine, consists of a series of interconnected databases (the full list is given at www.ncbi.nlm.nih.gov/sites/gquery?itool=toolbar) and serves as a repository for all publicly submitted genomic data. The site includes nucleic acid sequences generated through federally funded genome projects, genetic sequence variants, results from gene expression profiling studies, and genome-wide association studies. The Entrez browser system is one possible entry point (www.ncbi.nlm.nih.gov/gquery/gquery.fcgi). Many readers are already familiar with this site's medical literature search system, PubMed.

An Entrez portal (the genotypes and phenotypes database, dbGaP www.ncbi.nlm.nih.gov/sites/entrez?Db=gap) allows users to search by study or disease. The National Human Genome Research Institute (NHGRI) and the European Molecular Biology Laboratory (EMBL) also collaborate to produce a database of published GWAS at http://www.ebi.ac.uk/gwas/.
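Entrez databases can also be queried programmatically through NCBI's E-utilities web service. The following is a minimal sketch, not an official client: it assumes the public `esearch.fcgi` endpoint and its `db`, `term`, `retmax`, and `retmode` parameters, and the function name `build_esearch_url` is hypothetical. It only constructs the query URL and does not send a request; NCBI's current usage guidelines should be consulted before issuing queries.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities service (assumed current; check NCBI documentation).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db, term, retmax=20):
    """Build an ESearch URL for an Entrez database.

    db     -- Entrez database name, e.g. "gap" for dbGaP or "pubmed" for PubMed
    term   -- free-text search term, e.g. a disease name
    retmax -- maximum number of record identifiers to return
    """
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    # urlencode escapes spaces and special characters in the search term.
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Illustrative query: dbGaP studies matching a disease name.
    print(build_esearch_url("gap", "type 2 diabetes", retmax=5))
```

The database name `gap` corresponds to the `Db=gap` value in the dbGaP portal address given above; the search term shown is purely illustrative.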

The Entrez system also allows users to conduct a variety of analyses. For example, BLAST (Basic Local Alignment Search Tool) enables searching of all nucleotide (DNA) or protein databases for sequence homologies to a query sequence [130-136]. BLAST reports include all sequence matches and provide metrics of the strength of alignment, confidence scores for the alignment, and graphical depictions of the alignments.
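To illustrate what a "local alignment" and its score mean, the sketch below implements the Smith-Waterman dynamic programming algorithm on two short DNA strings. This is not how BLAST itself is implemented: BLAST uses faster heuristics to approximate this kind of search across very large databases and adds statistical measures of significance that are not computed here. The scoring values (`match=2`, `mismatch=-1`, `gap=-2`) and the function name are assumptions chosen for illustration only.

```python
def smith_waterman(query, subject, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_query, aligned_subject) for one best local alignment.

    Uses a simple linear gap penalty; scoring values are illustrative.
    """
    rows, cols = len(query) + 1, len(subject) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; a cell never drops below zero,
    # which is what makes the alignment local rather than global.
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if query[i - 1] == subject[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + step,  # align the two residues
                score[i - 1][j] + gap,       # gap in subject
                score[i][j - 1] + gap,       # gap in query
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    aligned_q, aligned_s = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        step = match if query[i - 1] == subject[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + step:
            aligned_q.append(query[i - 1])
            aligned_s.append(subject[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            aligned_q.append(query[i - 1])
            aligned_s.append("-")
            i -= 1
        else:
            aligned_q.append("-")
            aligned_s.append(subject[j - 1])
            j -= 1

    return best, "".join(reversed(aligned_q)), "".join(reversed(aligned_s))


if __name__ == "__main__":
    print(smith_waterman("ACGT", "TTACGTTT"))
```

With these illustrative scores, a query that occurs exactly within a longer subject sequence scores 2 points per matched base, and two sequences with no bases in common score 0 with an empty alignment.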

While Entrez is extremely flexible and comprehensive, its interface is not as easy to use as that of several other resources that are more graphics based.

Two easily used graphics-based genome browsers are Ensembl [137] and the UCSC Genome Browser [138]. These browsers provide graphical representation of the genome with sequence annotation of genes, variants, functional elements, and other types of genomic data. Much of the information presented is derived from the NCBI database, and daily cross-references and updates of NCBI, Ensembl, and the UCSC browser ensure similar query retrievals regardless of which interface is used for the query.

The most clinically oriented genome-based database within NCBI is OMIM (Online Mendelian Inheritance in Man). OMIM is a manually curated compendium of genes and genetic phenotypes, regularly updated by reviewers who cull the published literature. Each entry is a full-text overview, with references hyperlinked to PubMed entries. OMIM's emphasis is the relationship between gene locus, genetic variation, and their phenotypic consequences. OMIM also provides the historical context in which understanding about various genes and diseases has emerged.

For clinicians needing guidance in management of patients with genetic conditions, the Genetic Testing Registry (GTR) is a database that provides information on genetic conditions and laboratories that offer testing for pathogenic variants in disease genes. Like OMIM, it is managed by the National Library of Medicine.

GeneCards – GeneCards (www.genecards.org/) is a well-organized, gene-centered database and suite of interactive tools curated as a partnership between the Weizmann Institute and LifeMap Sciences. It provides genomic, proteomic, and functional information on all known and predicted human genes; its scope and functionality have grown substantially since inception [139-148]. In addition to the parent GeneCard entry, the suite includes tools designed specifically to explore diseases, noncoding RNAs, regulatory elements, pathways, and related genes. The included information is primarily geared toward researchers but also includes information of interest to clinicians.

Human Genetics Amplifier (HuGeAmp) – HuGeAmp is a collaborative initiative anchored by a partnership among academic, government, and industry sponsors. It hosts a series of knowledge portals on various topics, including common metabolic diseases, type 1 and type 2 diabetes, cerebrovascular disease, lung disease, and many others [149]. Because the resources are developed and curated by active researchers in each area, the portals vary somewhat in scope and structure.

HuGeAmp also hosts a series of researcher tools via its Bring Your Own Results (BYOR.Science) platform, which allows users to analyze data with analytical tools and to contribute datasets for sharing.

Broad Institute – The Broad Institute hosts a suite of tools for analyzing sequence data, the Genome Analysis Toolkit (GATK, https://gatk.broadinstitute.org/hc/en-us). In addition to software for analyzing next-generation sequencing data, the site features extensive documentation that helps users follow best practices in identifying sequence variants.

The Genome Aggregation Database (gnomAD) is another Broad Institute resource, integrating reference sequence data collected across various experimental platforms and facilitating variant interpretation [150].

The Genotype-Tissue Expression (GTEx) Portal is another such resource (https://www.gtexportal.org/home/) [151]. This portal allows the user to explore whole genome sequencing, RNA-seq, and expression quantitative trait locus (eQTL) data for 54 human tissues. Although the portal does not represent every tissue type in the body, it allows exploration of tissue-specific expression and other gene features such as alternative splicing.

Public datasets – In addition to special interest datasets available through the knowledge portal network, there are notable large public datasets.

The UK Biobank is perhaps the best known among these [152]. It includes deidentified data from approximately 500,000 UK (United Kingdom) residents. Genetic and core phenotypic data are available for all participants, while subsets also have undergone additional phenotyping.

FinnGen is a similar effort conducted in Finland [153].

The All of Us Research Program (https://allofus.nih.gov/) and the Million Veteran Program (https://www.research.va.gov/mvp/) in the United States are similar in that they are collecting genomic data for all participants, but their phenotype datasets are less complete. Repository projects are cataloged in an online biobank directory.

SUMMARY

Genome sequences – Understanding the genetic basis for human disease involves knowledge of genome sequences and gene function. The genomes of many organisms have been sequenced, including the human genome in 2001. (See 'Historical background' above.)

Choice of experimental system – Model organisms can provide a useful means of studying gene function in ways that cannot be used in human research. The type of system used depends on the research question and on the available models and human DNA-based methods. The table (table 1) summarizes commonly used model systems for studying human disease and their advantages and disadvantages. Human DNA-based methods of investigation include next-generation DNA sequencing (NGS), genome-wide association studies (GWAS), epigenetic studies, Mendelian randomization studies, transcriptomics, and chromatin immunoprecipitation (ChIP). Details and links to more extensive discussions are listed above. (See 'How experimental systems are chosen' above.)

Ethical considerations – Vertebrate animal use is overseen by a local Institutional Animal Care and Use Committee (IACUC), analogous to an Institutional Review Board (IRB). IACUC protocols require investigators to address a legitimate scientific question, justify the number and characteristics of animals used, and limit studies to items specified in approved protocols. Animal use is guided by the "3 R's" (replace, reduce, and refine), as defined above. (See 'Ethical considerations and oversight' above.)

Viruses and yeast – Viral, prokaryotic (bacterial), and yeast models derive value from the ability to identify rare individuals within large experimental populations. Bacteriophages (viruses that infect bacteria) have been used to study gene transcription. The budding yeast Saccharomyces cerevisiae can be used for targeted gene deletions and selection. Approximately one-third of human genes have yeast homologs. The yeast two-hybrid system is used to study protein interactions required for various biological pathways. (See 'Viruses, prokaryotes, and yeast' above.)

Worms, flies, and zebrafish – Caenorhabditis elegans is a worm in which all cell fates and neural circuitry have been mapped, making it especially useful for studies of developmental biology and neurologic disorders. It is also useful for aging studies, as many aging genes are conserved with humans and lifespan mutants have been identified. The fruit fly Drosophila melanogaster has been particularly informative in studies of how spatial information is established during development to create the body plan. Zebrafish (Danio rerio) is a small fish that became popular for genetic studies in the 1980s because it is amenable to genetic manipulations and RNA interference and has a transparent body. Zebrafish are well-suited to studying drugs and thermal stress. (See 'Complex multicellular non-mammalian model systems' above.)

Mouse models – Transgenic mice have foreign DNA inserted into their genome; they can be especially useful for studying functional implications of gene loss or gene alterations (figure 1). The foreign DNA can be introduced by microinjecting it into the pronucleus of a fertilized egg (picture 1) or by transfecting it into embryonic stem (ES) cells. The ES cells that carry the desired construct are injected into mouse blastocysts to yield chimeric embryos that can be implanted into foster mothers. Differences among knockout mice, knock-in mice, immunodeficient mice, and mice with conditional systems (tissue-specific or inducible) are discussed above. (See 'Genetically engineered mice' above.)

Online resources – Links to online resources are provided above. (See 'Online sources of genetic information' above.)

  1. Venter JC, Adams MD, Myers EW, et al. The sequence of the human genome. Science 2001; 291:1304.
  2. Sherry ST, Ward MH, Kholodov M, et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res 2001; 29:308.
  3. Gong B, Li D, Zhang Y, et al. Extend the benchmarking indel set by manual review using the individual cell line sequencing data from the Sequencing Quality Control 2 (SEQC2) project. Sci Rep 2024; 14:7028.
  4. Truty R, Paul J, Kennemer M, et al. Prevalence and properties of intragenic copy-number variation in Mendelian disease genes. Genet Med 2019; 21:114.
  5. Chen FC, Vallender EJ, Wang H, et al. Genomic divergence between human and chimpanzee estimated from large-scale alignments of genomic sequences. J Hered 2001; 92:481.
  6. Fujiyama A, Watanabe H, Toyoda A, et al. Construction and analysis of a human-chimpanzee comparative clone map. Science 2002; 295:131.
  7. Eppig JT, Nadeau JH. Comparative maps: the mammalian jigsaw puzzle. Curr Opin Genet Dev 1995; 5:709.
  8. DeBry RW, Seldin MF. Human/mouse homology relationships. Genomics 1996; 33:337.
  9. Nadeau JH. Maps of linkage and synteny homologies between mouse and man. Trends Genet 1989; 5:82.
  10. Seldin MF. Genome surfing: using internet-based informatic tools toward functional genetic studies in mouse and humans. Methods 1997; 13:445.
  11. Barolo S, Posakony JW. Three habits of highly effective signaling pathways: principles of transcriptional control by developmental cell signaling. Genes Dev 2002; 16:1167.
  12. Wang Z, Gerstein M, Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 2009; 10:57.
  13. Cookson W, Liang L, Abecasis G, et al. Mapping complex disease traits with global gene expression. Nat Rev Genet 2009; 10:184.
  14. Rao A, Barkley D, França GS, Yanai I. Exploring tissue architecture using spatial transcriptomics. Nature 2021; 596:211.
  15. Acevedo LG, Iniguez AL, Holster HL, et al. Genome-scale ChIP-chip analysis using 10,000 human cells. Biotechniques 2007; 43:791.
  16. Nance J, Frøkjær-Jensen C. The Caenorhabditis elegans Transgenic Toolbox. Genetics 2019; 212:959.
  17. Xu RG, Wang X, Shen D, et al. Perspectives on gene expression regulation techniques in Drosophila. J Genet Genomics 2019; 46:213.
  18. Liu K, Petree C, Requena T, et al. Expanding the CRISPR Toolbox in Zebrafish for Studying Development and Disease. Front Cell Dev Biol 2019; 7:13.
  19. Low BE, Kutny PM, Wiles MV. Simple, Efficient CRISPR-Cas9-Mediated Gene Editing in Mice: Strategies and Methods. Methods Mol Biol 2016; 1438:19.
  20. Meek S, Mashimo T, Burdon T. From engineering to editing the rat genome. Mamm Genome 2017; 28:302.
  21. https://olaw.nih.gov/policies-laws/guide-care-use-lab-animals (Accessed on February 12, 2025).
  22. Winzeler EA, Shoemaker DD, Astromoff A, et al. Functional characterization of the S. cerevisiae genome by gene deletion and parallel analysis. Science 1999; 285:901.
  23. Michan S, Sinclair D. Sirtuins in mammals: insights into their biological function. Biochem J 2007; 404:1.
  24. Brent R, Finley RL Jr. Understanding gene and allele function with two-hybrid methods. Annu Rev Genet 1997; 31:663.
  25. Uetz P, Giot L, Cagney G, et al. A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 2000; 403:623.
  26. Parrish JR, Gulyas KD, Finley RL Jr. Yeast two-hybrid contributions to interactome mapping. Curr Opin Biotechnol 2006; 17:387.
  27. Gong Y, Kakihara Y, Krogan N, et al. An atlas of chaperone-protein interactions in Saccharomyces cerevisiae: implications to protein folding pathways in the cell. Mol Syst Biol 2009; 5:275.
  28. Engel SR, Balakrishnan R, Binkley G, et al. Saccharomyces Genome Database provides mutant phenotype data. Nucleic Acids Res 2010; 38:D433.
  29. Hodgkin J, Horvitz HR, Jasny BR, Kimble JC. Elegans: Sequence to Biology. Science 1998; 282:2011.
  30. C. elegans Sequencing Consortium. Genome sequence of the nematode C. elegans: a platform for investigating biology. Science 1998; 282:2012.
  31. Bargmann CI. Neurobiology of the Caenorhabditis elegans genome. Science 1998; 282:2028.
  32. Pennisi E. Worming secrets from the C. elegans genome. Science 1998; 282:1972.
  33. Blaxter M. Caenorhabditis elegans is a nematode. Science 1998; 282:2041.
  34. Ruvkun G, Hobert O. The taxonomy of developmental control in Caenorhabditis elegans. Science 1998; 282:2033.
  35. Clarke ND, Berg JM. Zinc fingers in Caenorhabditis elegans: finding families and probing pathways. Science 1998; 282:2018.
  36. Bloom FE. Staying afloat on the seas of data. Science 1998; 282:1989.
  37. Chervitz SA, Aravind L, Sherlock G, et al. Comparison of the complete protein sets of worm and yeast: orthology and divergence. Science 1998; 282:2022.
  38. Deppe U, Schierenberg E, Cole T, et al. Cell lineages of the embryo of the nematode Caenorhabditis elegans. Proc Natl Acad Sci U S A 1978; 75:376.
  39. Sulston JE, Schierenberg E, White JG, Thomson JN. The embryonic cell lineage of the nematode Caenorhabditis elegans. Dev Biol 1983; 100:64.
  40. Prahlad V, Morimoto RI. Integrating the stress response: lessons for neurodegenerative diseases from C. elegans. Trends Cell Biol 2009; 19:52.
  41. Harris TW, Antoshechkin I, Bieri T, et al. WormBase: a comprehensive resource for nematode research. Nucleic Acids Res 2010; 38:D463.
  42. Worm Genome. Sanger Institute. Available at: https://www.sanger.ac.uk/data/worm-genome/ (Accessed on October 25, 2022).
  43. Perrimon N, Pitsouli C, Shilo BZ. Signaling mechanisms controlling cell fate and embryonic patterning. Cold Spring Harb Perspect Biol 2012; 4:a005975.
  44. Tweedie S, Ashburner M, Falls K, et al. FlyBase: enhancing Drosophila Gene Ontology annotations. Nucleic Acids Res 2009; 37:D555.
  45. Adhish M, Manjubala I. Effectiveness of zebrafish models in understanding human diseases-A review of models. Heliyon 2023; 9:e14557.
  46. Yeyati PL, Bancewicz RM, Maule J, van Heyningen V. Hsp90 selectively modulates phenotype in vertebrate development. PLoS Genet 2007; 3:e43.
  47. Ghatge MS, Al Mughram M, Omar AM, Safo MK. Inborn errors in the vitamin B6 salvage enzymes associated with neonatal epileptic encephalopathy and other pathologies. Biochimie 2021; 183:18.
  48. Howe K, Clark MD, Torroja CF, et al. The zebrafish reference genome sequence and its relationship to the human genome. Nature 2013; 496:498.
  49. Sprague J, Bayraktaroglu L, Clements D, et al. The Zebrafish Information Network: the zebrafish model organism database. Nucleic Acids Res 2006; 34:D581.
  50. Eppig JT, Blake JA, Bult CJ, et al. The Mouse Genome Database (MGD): comprehensive resource for genetics and genomics of the laboratory mouse. Nucleic Acids Res 2012; 40:D881.
  51. Blake JA, Bult CJ, Kadin JA, et al. The Mouse Genome Database (MGD): premier model organism resource for mammalian genomics and genetics. Nucleic Acids Res 2011; 39:D842.
  52. Smith CM, Finger JH, Hayamizu TF, et al. The mouse Gene Expression Database (GXD): 2007 update. Nucleic Acids Res 2007; 35:D618.
  53. Laulederkind SJ, Hayman GT, Wang SJ, et al. The Rat Genome Database 2013--data, tools and users. Brief Bioinform 2013; 14:520.
  54. Hardin JD, Boast S, Mendelsohn M, et al. Transgenes encoding both type I and type IV c-abl proteins rescue the lethality of c-abl mutant mice. Oncogene 1996; 12:2669.
  55. Kalajzic I, Kalajzic Z, Kaliterna M, et al. Use of type I collagen green fluorescent protein transgenes to identify subpopulations of cells at different stages of the osteoblast lineage. J Bone Miner Res 2002; 17:15.
  56. Chai Y, Jiang X, Ito Y, et al. Fate of the mammalian cranial neural crest during tooth and mandibular morphogenesis. Development 2000; 127:1671.
  57. Snippert HJ, van der Flier LG, Sato T, et al. Intestinal crypt homeostasis results from neutral competition between symmetrically dividing Lgr5 stem cells. Cell 2010; 143:134.
  58. Khillan JS, Olsen AS, Kontusaari S, et al. Transgenic mice that express a mini-gene version of the human gene for type I procollagen (COL1A1) develop a phenotype resembling a lethal form of osteogenesis imperfecta. J Biol Chem 1991; 266:23373.
  59. Bradley A, Evans M, Kaufman MH, Robertson E. Formation of germ-line chimaeras from embryo-derived teratocarcinoma cell lines. Nature 1984; 309:255.
  60. Robertson E, Bradley A, Kuehn M, Evans M. Germ-line transmission of genes introduced into cultured pluripotential cells by retroviral vector. Nature 1986; 323:445.
  61. Martin GR. Isolation of a pluripotent cell line from early mouse embryos cultured in medium conditioned by teratocarcinoma stem cells. Proc Natl Acad Sci U S A 1981; 78:7634.
  62. Mansour SL, Thomas KR, Capecchi MR. Disruption of the proto-oncogene int-2 in mouse embryo-derived stem cells: a general strategy for targeting mutations to non-selectable genes. Nature 1988; 336:348.
  63. Schnütgen F, De-Zolt S, Van Sloun P, et al. Genomewide production of multipurpose alleles for the functional analysis of the mouse genome. Proc Natl Acad Sci U S A 2005; 102:7221.
  64. Schnütgen F. Generation of multipurpose alleles for the functional analysis of the mouse genome. Brief Funct Genomic Proteomic 2006; 5:15.
  65. Floss T, Schnütgen F. Conditional gene trapping using the FLEx system. Methods Mol Biol 2008; 435:127.
  66. Friedel RH, Seisenberger C, Kaloff C, Wurst W. EUCOMM--the European conditional mouse mutagenesis program. Brief Funct Genomic Proteomic 2007; 6:180.
  67. Fedorov LM, Tyrsin OY, Krenn V, et al. Tet-system for the regulation of gene expression during embryonic development. Transgenic Res 2001; 10:247.
  68. Bohl D, Heard JM. Modulation of erythropoietin delivery from engineered muscles in mice. Hum Gene Ther 1997; 8:195.
  69. Paulus W, Baur I, Boyce FM, et al. Self-contained, tetracycline-regulated retroviral vector system for gene delivery to mammalian cells. J Virol 1996; 70:62.
  70. Hoess RH, Ziese M, Sternberg N. P1 site-specific recombination: nucleotide sequence of the recombining sites. Proc Natl Acad Sci U S A 1982; 79:3398.
  71. Rajewsky K, Gu H, Kühn R, et al. Conditional gene targeting. J Clin Invest 1996; 98:600.
  72. Rossant J, McMahon A. "Cre"-ating mouse mutants-a meeting review on conditional mouse genetics. Genes Dev 1999; 13:142.
  73. Nagy A, Rossant J, Nagy R, et al. Derivation of completely cell culture-derived mice from early-passage embryonic stem cells. Proc Natl Acad Sci U S A 1993; 90:8424.
  74. Nagy A, Rossant J. Production of completely ES cell-derived fetuses. In: Gene Targeting: A Practical Approach, Joyner AL (Ed), Oxford University Press, New York 1993. p.147.
  75. Duncan SA, Nagy A, Chan W. Murine gastrulation requires HNF-4 regulated gene expression in the visceral endoderm: tetraploid rescue of Hnf-4(-/-) embryos. Development 1997; 124:279.
  76. Li J, Ning G, Duncan SA. Mammalian hepatocyte differentiation requires the transcription factor HNF-4alpha. Genes Dev 2000; 14:464.
  77. Soriano P. Generalized lacZ expression with the ROSA26 Cre reporter strain. Nat Genet 1999; 21:70.
  78. Lobe CG, Koop KE, Kreppner W, et al. Z/AP, a double reporter for cre-mediated recombination. Dev Biol 1999; 208:281.
  79. Mao X, Fujiwara Y, Orkin SH. Improved reporter strain for monitoring Cre recombinase-mediated DNA excisions in mice. Proc Natl Acad Sci U S A 1999; 96:5037.
  80. Kawamoto S, Niwa H, Tashiro F, et al. A novel reporter mouse strain that expresses enhanced green fluorescent protein upon Cre-mediated recombination. FEBS Lett 2000; 470:263.
  81. Novak A, Guo C, Yang W, et al. Z/EG, a double reporter mouse line that expresses enhanced green fluorescent protein upon Cre-mediated excision. Genesis 2000; 28:147.
  82. Mao X, Fujiwara Y, Chapdelaine A, et al. Activation of EGFP expression by Cre-mediated excision in a new ROSA26 reporter mouse strain. Blood 2001; 97:324.
  83. Srinivas S, Watanabe T, Lin CS, et al. Cre reporter strains produced by targeted insertion of EYFP and ECFP into the ROSA26 locus. BMC Dev Biol 2001; 1:4.
  84. Kretzschmar K, Watt FM. Lineage tracing. Cell 2012; 148:33.
  85. Heffner CS, Herbert Pratt C, Babiuk RP, et al. Supporting conditional mouse mutagenesis with a comprehensive cre characterization resource. Nat Commun 2012; 3:1218.
  86. DasGupta R, Fuchs E. Multiple roles for activated LEF/TCF transcription complexes during hair follicle development and differentiation. Development 1999; 126:4557.
  87. Hackl MJ, Burford JL, Villanueva K, et al. Tracking the fate of glomerular epithelial cells in vivo using serial multiphoton imaging in new mouse models with fluorescent lineage tags. Nat Med 2013; 19:1661.
  88. Debnath S, Yallowitz AR, McCormick J, et al. Discovery of a periosteal stem cell mediating intramembranous bone formation. Nature 2018; 562:133.
  89. Huang Z. Simplifying cell fate map by determining lineage history of core pathway activation during fate specification. Trends Dev Biol 2022; 15:53.
  90. Ringwald M, Iyer V, Mason JC, et al. The IKMC web portal: a central point of entry to data and resources from the International Knockout Mouse Consortium. Nucleic Acids Res 2011; 39:D849.
  91. https://www.informatics.jax.org/ (Accessed on March 02, 2025).
  92. Isaacson JH, Cattanach BM. Two new 'hairless' mutants - Sha and Hfh11. Mouse News Lett 1962; 27:31.
  93. Flanagan SP. 'Nude', a new hairless gene with pleiotropic effects in the mouse. Genet Res 1966; 8:295.
  94. Pantelouris EM. Absence of thymus in a mouse mutant. Nature 1968; 217:370.
  95. Rygaard J, Povlsen CO. Heterotransplantation of a human malignant tumour to "Nude" mice. Acta Pathol Microbiol Scand 1969; 77:758.
  96. Blunt T, Gell D, Fox M, et al. Identification of a nonsense mutation in the carboxyl-terminal region of DNA-dependent protein kinase catalytic subunit in the scid mouse. Proc Natl Acad Sci U S A 1996; 93:10285.
  97. Bosma GC, Custer RP, Bosma MJ. A severe combined immunodeficiency mutation in the mouse. Nature 1983; 301:527.
  98. Mosier DE, Gulizia RJ, Baird SM, Wilson DB. Transfer of a functional human immune system to mice with severe combined immunodeficiency. Nature 1988; 335:256.
  99. McCune JM, Namikawa R, Kaneshima H, et al. The SCID-hu mouse: murine model for the analysis of human hematolymphoid differentiation and function. Science 1988; 241:1632.
  100. Kamel-Reid S, Dick JE. Engraftment of immune-deficient mice with human hematopoietic stem cells. Science 1988; 242:1706.
  101. Schatz DG, Oettinger MA, Baltimore D. The V(D)J recombination activating gene, RAG-1. Cell 1989; 59:1035.
  102. Oettinger MA, Schatz DG, Gorka C, Baltimore D. RAG-1 and RAG-2, adjacent genes that synergistically activate V(D)J recombination. Science 1990; 248:1517.
  103. Mombaerts P, Iacomini J, Johnson RS, et al. RAG-1-deficient mice have no mature B and T lymphocytes. Cell 1992; 68:869.
  104. Shinkai Y, Rathbun G, Lam KP, et al. RAG-2-deficient mice lack mature lymphocytes owing to inability to initiate V(D)J rearrangement. Cell 1992; 68:855.
  105. Noguchi M, Yi H, Rosenblatt HM, et al. Interleukin-2 receptor gamma chain mutation results in X-linked severe combined immunodeficiency in humans. Cell 1993; 73:147.
  106. DiSanto JP, Müller W, Guy-Grand D, et al. Lymphoid development in mice with a targeted deletion of the interleukin 2 receptor gamma chain. Proc Natl Acad Sci U S A 1995; 92:377.
  107. Cao X, Shores EW, Hu-Li J, et al. Defective lymphoid development in mice lacking expression of the common cytokine receptor gamma chain. Immunity 1995; 2:223.
  108. Ohbo K, Suda T, Hashiyama M, et al. Modulation of hematopoiesis in mice with a truncated mutant of the interleukin-2 receptor gamma chain. Blood 1996; 87:956.
  109. Takenaka K, Prasolava TK, Wang JC, et al. Polymorphism in Sirpa modulates engraftment of human hematopoietic stem cells. Nat Immunol 2007; 8:1313.
  110. Yamauchi T, Takenaka K, Urata S, et al. Polymorphic Sirpa is the genetic determinant for NOD-based mouse lines to achieve efficient human cell engraftment. Blood 2013; 121:1316.
  111. Yamao T, Noguchi T, Takeuchi O, et al. Negative regulation of platelet clearance and of the macrophage phagocytic response by the transmembrane glycoprotein SHPS-1. J Biol Chem 2002; 277:39833.
  112. Iwamoto C, Takenaka K, Urata S, et al. The BALB/c-specific polymorphic SIRPA enhances its affinity for human CD47, inhibiting phagocytosis against human cells to promote xenogeneic engraftment. Exp Hematol 2014; 42:163.
  113. Threadgill DW, Yee D, Matin A, et al. Genealogy of the 129 inbred strains: 129/SvJ is a contaminated inbred strain. Mamm Genome 1997; 8:390.
  114. Simpson EM, Linder CC, Sargent EE, et al. Genetic variation among 129 substrains and its importance for targeted mutagenesis in mice. Nat Genet 1997; 16:19.
  115. Threadgill DW, Matin A, Yee D, et al. SSLPs to map genetic differences between the 129 inbred strains and closed-colony, random-bred CD-1 mice. Mamm Genome 1997; 8:441.
  116. Kitani H, Takagi N, Atsumi T, et al. Isolation of a germline-transmissible embryonic stem (ES) cell line from C3H/He mice. Zoolog Sci 1996; 13:865.
  117. Schuster-Gossler K, Lee AW, Lerner CP, et al. Use of coisogenic host blastocysts for efficient establishment of germline chimeras with C57BL/6J ES cell lines. Biotechniques 2001; 31:1022.
  118. Festing MF, Simpson EM, Davisson MT, Mobraaten LE. Revised nomenclature for strain 129 mice. Mamm Genome 1999; 10:836.
  119. Tutois S, Salaun J, Mattei MG, Guénet JL. Tg (9 HSA-MYC), a homozygous lethal insertion in the mouse. Mamm Genome 1991; 1:184.
  120. DeLoia JA, Solter D. A transgene insertional mutation at an imprinted locus in the mouse genome. Dev Suppl 1990; :73.
  121. Bronson SK, Plaehn EG, Kluckman KD, et al. Single-copy transgenic mice with chosen-site integration. Proc Natl Acad Sci U S A 1996; 93:9067.
  122. Misra RP, Bronson SK, Xiao Q, et al. Generation of single-copy transgenic mouse embryos directly from ES cells by tetraploid embryo complementation. BMC Biotechnol 2001; 1:12.
  123. Jeffery E, Berry R, Church CD, et al. Characterization of Cre recombinase models for the study of adipose tissue. Adipocyte 2014; 3:206.
  124. Kristianto J, Johnson MG, Zastrow RK, et al. Spontaneous recombinase activity of Cre-ERT2 in vivo. Transgenic Res 2017; 26:411.
  125. Schmidt-Supprian M, Rajewsky K. Vagaries of conditional gene targeting. Nat Immunol 2007; 8:665.
  126. Loonstra A, Vooijs M, Beverloo HB, et al. Growth inhibition and DNA damage induced by Cre recombinase in mammalian cells. Proc Natl Acad Sci U S A 2001; 98:9209.
  127. Jeannotte L, Aubin J, Bourque S, et al. Unsuspected effects of a lung-specific Cre deleter mouse line. Genesis 2011; 49:152.
  128. Thanos A, Morizane Y, Murakami Y, et al. Evidence for baseline retinal pigment epithelium pathology in the Trp1-Cre mouse. Am J Pathol 2012; 180:1917.
  129. Li Y, Choi PS, Casey SC, Felsher DW. Activation of Cre recombinase alone can induce complete tumor regression. PLoS One 2014; 9:e107589.
  130. Altschul SF, Gish W, Miller W, et al. Basic local alignment search tool. J Mol Biol 1990; 215:403.
  131. Altschul SF, Madden TL, Schäffer AA, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997; 25:3389.
  132. Gish W, States DJ. Identification of protein coding regions by database similarity search. Nat Genet 1993; 3:266.
  133. Madden TL, Tatusov RL, Zhang J. Applications of network BLAST server. Methods Enzymol 1996; 266:131.
  134. Zhang J, Madden TL. PowerBLAST: a new network BLAST application for interactive or automated sequence analysis and annotation. Genome Res 1997; 7:649.
  135. Zhang Z, Schwartz S, Wagner L, Miller W. A greedy algorithm for aligning DNA sequences. J Comput Biol 2000; 7:203.
  136. Morgulis A, Coulouris G, Raytselis Y, et al. Database indexing for production MegaBLAST searches. Bioinformatics 2008; 24:1757.
  137. Hubbard TJ, Aken BL, Beal K, et al. Ensembl 2007. Nucleic Acids Res 2007; 35:D610.
  138. Kent WJ, Sugnet CW, Furey TS, et al. The human genome browser at UCSC. Genome Res 2002; 12:996.
  139. Stelzer G, Rosen N, Plaschkes I, et al. The GeneCards Suite: From Gene Data Mining to Disease Genome Sequence Analyses. Curr Protoc Bioinformatics 2016; 54:1.30.1.
  140. Rappaport N, Twik M, Plaschkes I, et al. MalaCards: an amalgamated human disease compendium with diverse clinical and genetic annotation and structured search. Nucleic Acids Res 2017; 45:D877.
  141. Barshir R, Fishilevich S, Iny-Stein T, et al. GeneCaRNA: A Comprehensive Gene-centric Database of Human Non-coding RNAs in the GeneCards Suite. J Mol Biol 2021; 433:166913.
  142. Fishilevich S, Nudel R, Rappaport N, et al. GeneHancer: genome-wide integration of enhancers and target genes in GeneCards. Database (Oxford) 2017; 2017.
  143. Stelzer G, Plaschkes I, Oz-Levi D, et al. VarElect: the phenotype-based variation prioritizer of the GeneCards Suite. BMC Genomics 2016; 17 Suppl 2:444.
  144. Ben-Ari Fuchs S, Lieder I, Stelzer G, et al. GeneAnalytics: An Integrative Gene Set Analysis Tool for Next Generation Sequencing, RNAseq and Microarray Data. OMICS 2016; 20:139.
  145. Belinky F, Nativ N, Stelzer G, et al. PathCards: multi-source consolidation of human biological pathways. Database (Oxford) 2015; 2015.
  146. Fishilevich S, Zimmerman S, Kohn A, et al. Genic insights from integrated human proteomics in GeneCards. Database (Oxford) 2016; 2016.
  147. Rosen N, Chalifa-Caspi V, Shmueli O, et al. GeneLoc: exon-based integration of human genome maps. Bioinformatics 2003; 19 Suppl 1:i222.
  148. Stelzer G, Inger A, Olender T, et al. GeneDecks: paralog hunting and gene-set distillation with GeneCards annotation. OMICS 2009; 13:477.
  149. Lawrence MS, Stojanov P, Mermel CH, et al. Discovery and saturation analysis of cancer genes across 21 tumour types. Nature 2014; 505:495.
  150. Gudmundsson S, Singer-Berk M, Watts NA, et al. Variant interpretation using population databases: Lessons from gnomAD. Hum Mutat 2022; 43:1012.
  151. GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 2020; 369:1318.
  152. UK Biobank. Available at: https://www.ukbiobank.ac.uk/ (Accessed on April 11, 2024).
  153. Kurki MI, Karjalainen J, Palta P, et al. FinnGen provides genetic insights from a well-phenotyped isolated population. Nature 2023; 613:508.
Topic 2903 Version 36.0
