DATA SOURCES
Polymer Genomics aggregates data from the following public databases and published literature. Each source retains its original license. Please cite the original data providers when using their data in publications.
Note on licenses: Human Protein Atlas data is licensed under CC BY-SA 3.0 (ShareAlike). Any derivative works incorporating HPA data must be shared under the same license. GTEx data served here consists of summary statistics (median TPM by tissue) and does not include individual-level data, which requires dbGaP authorization.
Copyleft obligations: gnomAD data is licensed under ODC-ODbL 1.0, which requires derivative databases to remain open. CpG islands, RepeatMasker annotations, and conservation scores are independently computed by Polymer Genomics using open-source tools and public-domain reference data — no non-commercial restrictions apply to these layers.
Epigenetic clock IP: Some epigenetic clocks are protected by patents (e.g., GrimAge: US Patent 10,706,957). DunedinPACE may have separate intellectual property protections. Clock probe annotations served here are for informational and reference purposes only. Computation of epigenetic age using these coefficients may require separate licensing from the respective intellectual property holders.
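As a rough illustration of how the share-alike obligations described above might be tracked in a downstream project, the sketch below tags a handful of the listed sources by license and reports which ones carry copyleft terms. The `SOURCES` mapping, `COPYLEFT_MARKERS`, and `copyleft_sources` are hypothetical names chosen for this example and are not part of any Polymer Genomics API. The license strings are copied from the tables on this page.

```python
# Hypothetical helper for illustration only; not a Polymer Genomics API.
# License strings are copied from the tables on this page.
SOURCES = {
    "Human Protein Atlas": "CC BY-SA 3.0",
    "gnomAD Constraint": "ODC-ODbL 1.0",
    "gnomAD Structural Variants": "ODC-ODbL 1.0",
    "Reactome": "CC BY 4.0",
    "EBI GWAS Catalog": "CC0 1.0",
    "CpG Islands": "MIT (computed)",
}

# Substrings identifying the share-alike licenses named in the notes above.
COPYLEFT_MARKERS = ("CC BY-SA", "ODbL")


def copyleft_sources(selected):
    """Return the selected sources whose license carries share-alike terms."""
    return sorted(
        name for name in selected
        if any(marker in SOURCES[name] for marker in COPYLEFT_MARKERS)
    )


if __name__ == "__main__":
    used = ["Reactome", "Human Protein Atlas", "gnomAD Constraint"]
    print(copyleft_sources(used))
```

A check like this only flags which licenses need review. It does not decide whether a particular use counts as a derivative work.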
Sequence & Assembly
| Source | Version | Rows | License | Citation |
|---|---|---|---|---|
| GRCh38 / hg38 | GCA_000001405.15 | — | Public Domain | Genome Reference Consortium |
| GRCh37 / hg19 | GCA_000001405.1 | — | Public Domain | Genome Reference Consortium |
| UCSC Genome Browser | 2024 | — | Free for non-commercial use | Kent et al., Genome Res 2002 |
Gene Annotation
Methylation & CpG
| Source | Version | Rows | License | Citation |
|---|---|---|---|---|
| Illumina EPIC v2 Manifest | v2.0 | — | Coordinates only (factual data) | Illumina Inc. Probe IDs and genomic coordinates via sesameData. |
| Illumina EPIC v1 Manifest | v1.0 | — | Coordinates only (factual data) | Pidsley et al., Genome Biol 2016. Probe IDs and genomic coordinates via sesameData. |
| Illumina 450K Manifest | v1.2 | — | Coordinates only (factual data) | Bibikova et al., Genomics 2011. Probe IDs and genomic coordinates via sesameData. |
| CpG Islands | hg38 | — | MIT (computed) | Gardiner-Garden & Frommer 1987. Computed from reference FASTA. |
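The CpG Islands row cites the Gardiner-Garden & Frommer criteria. As a minimal sketch of what those criteria test, the snippet below evaluates a single sequence window using the commonly cited thresholds (length of at least 200 bp, GC content above 50%, observed/expected CpG ratio above 0.6). The function names and default thresholds are assumptions for illustration. The actual Polymer Genomics pipeline, including how windows are scanned and merged across the reference FASTA, is not described on this page.

```python
def cpg_stats(seq):
    """Return (length, GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_frac = (c + g) / n if n else 0.0
    # Observed/expected CpG = (#CpG * N) / (#C * #G); defined as 0 if C or G is absent.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return n, gc_frac, obs_exp


def is_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the three thresholds to one window (assumed defaults, see lead-in)."""
    n, gc_frac, obs_exp = cpg_stats(seq)
    return n >= min_len and gc_frac > min_gc and obs_exp > min_obs_exp


if __name__ == "__main__":
    print(is_cpg_island("CG" * 100))  # 200 bp of alternating C and G
    print(is_cpg_island("AT" * 100))  # 200 bp with no C or G
```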
Expression
| Source | Version | Rows | License | Citation |
|---|---|---|---|---|
| GTEx | v10 | — | dbGaP (summary statistics: open access) | GTEx Consortium, Science 2020 |
| Human Protein Atlas | v23 | — | CC BY-SA 3.0 | Uhlén et al., Science 2015 |
| PaxDb | v5.0 | — | CC BY 4.0 | Wang et al., Mol Cell Proteomics 2015 |
Constraint & Variation
| Source | Version | Rows | License | Citation |
|---|---|---|---|---|
| gnomAD Constraint | v4.1 | — | ODC-ODbL 1.0 | Chen et al., Nature 2024 |
| gnomAD Structural Variants | v4.1 | — | ODC-ODbL 1.0 | Collins et al., Nature 2020 |
| ClinVar | 2024-09 | — | Public Domain | Landrum et al., Nucleic Acids Res 2020 |
Pathways & Gene Sets
| Source | Version | Rows | License | Citation |
|---|---|---|---|---|
| Reactome | v88 | — | CC BY 4.0 | Gillespie et al., Nucleic Acids Res 2022 |
| MSigDB Hallmark | v2024.1 | — | CC BY 4.0 | Liberzon et al., Cell Syst 2015 |
Chromatin & Epigenomics
| Source | Version | Rows | License | Citation |
|---|---|---|---|---|
| ENCODE cCREs | v4 | — | CC BY 4.0 | ENCODE Consortium, Nature 2012 |
| ENCODE Histone Peaks | v3 | — | CC BY 4.0 | ENCODE Consortium, Nature 2012 |
| ENCODE TF Binding Sites | v3 | — | CC BY 4.0 | ENCODE Consortium, Nature 2020 |
| ENCODE Accessibility Signal | v3 | — | CC BY 4.0 | ENCODE Consortium, Nature 2020 |
| ChromHMM 15-state Model | 2012 | — | Public Domain (NIH) | Ernst & Kellis, Nat Methods 2012 |
| Roadmap Epigenomics | 2015 | — | Public Domain (NIH) | Roadmap Consortium, Nature 2015 |
3D Genome
| Source | Version | Rows | License | Citation |
|---|---|---|---|---|
| ENCODE TAD Domains | Arrowhead | — | CC BY 4.0 | Rao et al., Cell 2014 |
| Hi-C A/B Compartments | v1 | — | CC BY 4.0 | Lieberman-Aiden et al., Science 2009 |
| Insulation Score | v1 | — | CC BY 4.0 | Crane et al., Nature 2015 |
| LADs | Meuleman 2013 | — | Published literature | Meuleman et al., Genome Res 2013 |
| NADs | v1 | — | Published literature | Nemeth et al., PLoS Genet 2010 |
Biophysics & Thermodynamics
| Source | Version | Rows | License | Citation |
|---|---|---|---|---|
| SantaLucia NN Parameters | 1998/2004 | — | Published literature | SantaLucia, Proc Natl Acad Sci 1998; SantaLucia & Hicks, Annu Rev Biophys Biomol Struct 2004 |
| Sugimoto RNA/DNA Parameters | 1995 | — | Published literature | Sugimoto et al., Biochemistry 1995 |
| Xia RNA Parameters | 1998 | — | Published literature | Xia et al., Biochemistry 1998 |
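To show how nearest-neighbor parameters of this kind are typically applied, the sketch below sums stacking free energies along a short DNA duplex. The numeric values are the unified DNA ΔG°37 parameters (kcal/mol, 1 M NaCl) as commonly tabulated from SantaLucia 1998, and they should be checked against the publication before any real use. The function and constant names are hypothetical, and this is not necessarily how Polymer Genomics computes its own tracks.

```python
# Unified nearest-neighbor ΔG°37 values (kcal/mol), keyed by the 5'->3'
# dinucleotide on one strand. Values as commonly tabulated from
# SantaLucia 1998; verify against the publication before relying on them.
NN_DG37 = {
    "AA": -1.00, "AT": -0.88, "TA": -0.58, "CA": -1.45, "GT": -1.44,
    "CT": -1.28, "GA": -1.30, "CG": -2.17, "GC": -2.24, "GG": -1.84,
}
INIT_GC = 0.98   # initiation with a terminal G·C pair
INIT_AT = 1.03   # initiation with a terminal A·T pair
SYMMETRY = 0.43  # correction for self-complementary duplexes
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def duplex_dg37(seq):
    """Predicted ΔG°37 (kcal/mol) for seq paired with its perfect complement."""
    seq = seq.upper()
    total = 0.0
    # One initiation term per duplex end.
    for base in (seq[0], seq[-1]):
        total += INIT_GC if base in "GC" else INIT_AT
    # One stacking term per adjacent pair; steps missing from the table
    # are looked up through their reverse complement.
    for i in range(len(seq) - 1):
        step = seq[i:i + 2]
        total += NN_DG37[step] if step in NN_DG37 else NN_DG37[revcomp(step)]
    if seq == revcomp(seq):
        total += SYMMETRY
    return total


if __name__ == "__main__":
    print(round(duplex_dg37("CGTTGA"), 2))  # prints -5.35
```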
Epigenetic Clocks
| Source | Version | Rows | License | Citation |
|---|---|---|---|---|
| Horvath Clock | 2013 | — | Published literature | Horvath, Genome Biol 2013 |
| Hannum Clock | 2013 | — | Published literature | Hannum et al., Mol Cell 2013 |
| PhenoAge | 2018 | — | Published literature | Levine et al., Aging 2018 |
| GrimAge | 2019 | — | Published literature | Lu et al., Aging 2019 |
| DunedinPACE | 2022 | — | Published literature | Belsky et al., eLife 2022 |
Mutations
| Source | Version | Rows | License | Citation |
|---|---|---|---|---|
| SBS Mutation Thermodynamics | v1.0 | — | MIT (computed) | SantaLucia, Proc Natl Acad Sci 1998. 96-channel trinucleotide stacking energy perturbations. |
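For readers unfamiliar with the 96-channel convention, the sketch below enumerates the channels: six pyrimidine-centered substitution types, each in sixteen flanking-base contexts. It only illustrates the channel layout. The function name and label format are assumptions, and the stacking energy perturbation computed for each channel is not reproduced here.

```python
from itertools import product

# Six substitution classes, written from the pyrimidine strand.
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]
BASES = "ACGT"


def sbs96_channels():
    """Return the 96 trinucleotide channel labels, e.g. 'A[C>A]A'."""
    return [
        f"{five}[{sub}]{three}"
        for sub in SUBSTITUTIONS
        for five, three in product(BASES, repeat=2)
    ]


if __name__ == "__main__":
    channels = sbs96_channels()
    print(len(channels), channels[0], channels[-1])
```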
Conservation & Evolution
| Source | Version | Rows | License | Citation |
|---|---|---|---|---|
| Zoonomia phyloP 447-way | 2023 | — | Freely usable for any purpose | Zoonomia Consortium, Science 2023; Cactus 447-way vertebrate alignment |
| phastCons 470-way | 2023 | — | Freely usable for any purpose | Siepel et al., Genome Res 2005; 470-way Cactus alignment |
| Ensembl Compara | v112 | — | Apache 2.0 | Herrero et al., Database 2016 |
| Ultraconserved Elements | Bejerano 2004 | — | Published literature | Bejerano et al., Science 2004 |
| Human Accelerated Regions | v1 | — | Published literature | Pollard et al., Nature 2006 |
| Archaic Introgression | Vernot 2016 | — | Published literature | Vernot et al., Science 2016 |
| Selection Sweeps | v1 | — | Published literature | Sabeti et al., Nature 2007 |
Repeat Elements & Transposons
| Source | Version | Rows | License | Citation |
|---|---|---|---|---|
| RepeatMasker | v4.1.5 | — | Open Source + Dfam CC0 (self-computed) | Smit, Hubley & Green, RepeatMasker Open-4.0 + Dfam 3.x |
| Telescope HERV Loci | v2 | — | MIT | Bendall et al., PLoS Comput Biol 2019 |
| TE Exaptation Catalog | v1 | — | CC BY 4.0 | Chuong et al., Nat Rev Genet 2017 |
GWAS
| Source | Version | Rows | License | Citation |
|---|---|---|---|---|
| EBI GWAS Catalog | v1.0.4 | — | CC0 1.0 | Buniello et al., Nucleic Acids Res 2019 |
Cell-Type Methylation
| Source | Version | Rows | License | Citation |
|---|---|---|---|---|
| FlowSorted.Blood.EPIC | 2018 | — | Artistic License 2.0 | Salas et al., Genome Biol 2018 |
| WGBS Hematopoietic | BLUEPRINT | — | CC BY 4.0 | Stunnenberg et al., Cell 2016 |
| Archaic Methylation | v1 | — | Published literature | Gokhman et al., Science 2014 |
Protein Properties & Turnover
| Source | Version | Rows | License | Citation |
|---|---|---|---|---|
| UniProt | 2024_05 | — | CC BY 4.0 | UniProt Consortium, Nucleic Acids Res 2023 |
| Protein Half-Lives | 2018 | — | Published literature | Mathieson et al., Nat Commun 2018 |
QTLs
| Source | Version | Rows | License | Citation |
|---|---|---|---|---|
| GTEx eQTLs | v8 | — | dbGaP (summary statistics: open access) | GTEx Consortium, Science 2020 |
| GoDMC meQTLs | v1 | — | Published literature | Min et al., Nat Genet 2021 |
Regulatory Elements
| Source | Version | Rows | License | Citation |
|---|---|---|---|---|
| ABC Enhancer-Gene Links | v1 | — | CC BY 4.0 | Fulco et al., Nat Genet 2019 |
| dbSUPER Super-Enhancers | v1 | — | Published literature | Khan & Zhang, Nucleic Acids Res 2016 |
| DNA Methylation Valleys | v1 | — | Published literature | Jeong et al., Genome Res 2014 |
Recombination
| Source | Version | Rows | License | Citation |
|---|---|---|---|---|
| Crossover Hotspots | Palsson 2025 | — | Published literature | Palsson et al., Nature 2025 |
| DMC1 Meiotic Hotspots | Pratto 2014 | — | Published literature | Pratto et al., Science 2014 |
Structural DNA & Fragility
| Source | Version | Rows | License | Citation |
|---|---|---|---|---|
| Non-B DNA Structures | v1 | — | MIT | Polymer Genomics — G4, Z-DNA, cruciform, triplex, R-loop predictions |
| Fragility Composite Score | v1 | — | MIT | Polymer Genomics — integrated from non-B, stacking, curvature |
| Recurrent Breakpoints | v1 | — | Published literature | HumCFS (Mrasek 2010) + Mitelman Database. Fragile sites and translocation breakpoints. |
HLA
| Source | Version | Rows | License | Citation |
|---|---|---|---|---|
| IPD-IMGT/HLA Database | 3.55 | — | CC BY 4.0 | Robinson et al., Nucleic Acids Res 2020 |
Mutation Density
| Source | Version | Rows | License | Citation |
|---|---|---|---|---|
| PCAWG Mutation Rates | v1 | — | CC BY 4.0 | ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium, Nature 2020 |
Computed Tracks
| Source | Version | Rows | License | Citation |
|---|---|---|---|---|
| Polymer Evolution Layer 0 | v1.0 | — | MIT | Polymer Genomics — computed from SantaLucia/published parameters |