dbNSFP version 5.2a

Release: July 2, 2025

Copyright: Copyright © 2025 Genos Bioinformatics LLC (Texas, USA). All rights reserved.
dbNSFP is free for academic and non-commercial use under the CC BY-NC-ND 4.0 license. For commercial use of dbNSFP, please contact license@dbnsfp.org.

Website: https://dbnsfp.org

License Notice:
dbNSFP version 5.2a is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0).
You are free to share (copy and redistribute) the material in any medium or format under the following terms:
    Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
    NonCommercial: You may not use the material for commercial purposes.
    NoDerivatives: If you remix, transform, or build upon the material, you may not distribute the modified material.
For more details, refer to the full license here: https://creativecommons.org/licenses/by-nc-nd/4.0/

Major sources:

Variant determination:
    Gencode release 48/Ensembl 114, released May, 2025 (hg38)

Functional predictions:
    SIFT ensembl 66, released Jan, 2015 http://provean.jcvi.org/index.php
    SIFT4G 2.4, released Nov. 1, 2016 http://sift.bii.a-star.edu.sg/sift4g/public//Homo_sapiens/
    PROVEAN 1.1 ensembl 66, released Jan, 2015 http://provean.jcvi.org/index.php
    Polyphen-2 v2.2.2, released Feb, 2012 http://genetics.bwh.harvard.edu/pph2/
    MutationTaster 2021, https://www.genecascade.org/MutationTaster2021/
    MutationAssessor release 3, http://mutationassessor.org/
    fathmm-XF, http://fathmm.biocompute.org.uk/fathmm-xf/
    CADD v1.7, http://cadd.gs.washington.edu/
    VEST v4.0, http://karchinlab.org/apps/appVest.html
    DANN, https://cbcl.ics.uci.edu/public_data/DANN/
    MetaSVM and MetaLR, doi: 10.1093/hmg/ddu733
    MetaRNN v1.0, http://www.liulab.science/metarnn.html
    Eigen & Eigen PC v1.1, http://www.columbia.edu/~ii2135/eigen.html
    M-CAP v1.3, http://bejerano.stanford.edu/MCAP/
    REVEL release May 3, 2021, https://sites.google.com/site/revelgenomics/
    MutPred2, http://mutpred.mutdb.org/index.html
    MVP 1.0, https://github.com/ShenLab/missense
    gMVP, https://github.com/ShenLab/gMVP/
    MPC release1, ftp://ftp.broadinstitute.org/pub/ExAC_release/release1/regional_missense_constraint/
    PrimateAI, https://github.com/Illumina/PrimateAI
    deogen2, https://deogen2.mutaframe.com/
    ALoFT 1.0, http://aloft.gersteinlab.org/
    BayesDel v1, http://fengbj-laboratory.org/BayesDel/BayesDel.html
    ClinPred, https://sites.google.com/site/clinpred/home
    LIST-S2 v1.10, https://precomputed.list-s2.msl.ubc.ca/
    VARITY, http://varity.varianteffect.org/
    ESM1b, https://huggingface.co/spaces/ntranoslab/esm_variants/tree/main
    AlphaMissense, https://console.cloud.google.com/storage/browser/dm_alphamissense
    PHACTboost, https://github.com/CompGenomeLab/PHACTboost
    MutFormer, https://github.com/WGLab/mutformer
    MutScore, https://iob-genetic.shinyapps.io/mutscore/

Conservation scores:
    phyloP100way_vertebrate (hg38) http://hgdownload.soe.ucsc.edu/goldenPath/hg38/phyloP100way/
    phyloP470way_mammalian (hg38) https://hgdownload.soe.ucsc.edu/goldenPath/hg38/phyloP470way/
    phyloP17way_primate (hg38) http://hgdownload.soe.ucsc.edu/goldenPath/hg38/phyloP17way/
    phastCons100way_vertebrate (hg38) http://hgdownload.soe.ucsc.edu/goldenPath/hg38/phastCons100way/
    phastCons470way_mammalian (hg38) https://hgdownload.cse.ucsc.edu/goldenpath/hg38/phastCons470way/
    phastCons17way_primate (hg38) http://hgdownload.soe.ucsc.edu/goldenPath/hg38/phastCons17way/
    GERP++ http://mendel.stanford.edu/SidowLab/downloads/gerp/
    GERP_91_mammals https://ftp.ensembl.org/pub/current_compara/conservation_scores/91_mammals.gerp_conservation_score/
    bStatistic in CADDv1.7 http://cadd.gs.washington.edu/

Other variant annotation sources:
    Interpro v105 http://www.ebi.ac.uk/interpro/
    1000 Genomes project http://www.1000genomes.org/
    TOPMed freeze8 https://legacy.bravo.sph.umich.edu/freeze8/hg38/downloads
    dbSNP b157 (hg38) https://ftp.ncbi.nih.gov/snp/archive/b157/VCF/GCF_000001405.40.gz
    clinvar release 20250504 (hg38) ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/
    gnomAD exome subset v2.1.1 http://gnomad.broadinstitute.org/downloads
    gnomAD joint v4.1 http://gnomad.broadinstitute.org/downloads
    ALFA (Allele Frequency Aggregator) release 4 https://www.ncbi.nlm.nih.gov/snp/docs/gsr/alfa/
    Ancestral alleles (hg38) in Ensembl 114 https://ftp.ensembl.org/pub/release-114/fasta/ancestral_alleles/homo_sapiens_ancestor_GRCh38.tar.gz
    Altai Neanderthal genotypes: http://cdna.eva.mpg.de/neandertal/Vindija/VCF/Altai/
    Denisova genotypes: http://cdna.eva.mpg.de/neandertal/Vindija/VCF/Denisova/
    Vindija33.19 genotypes: http://cdna.eva.mpg.de/neandertal/Vindija/VCF/Vindija33.19/
    Chagyrskaya genotype: http://cdna.eva.mpg.de/neandertal/Chagyrskaya/VCF/
    MANE release 1.4: https://ftp.ncbi.nlm.nih.gov/refseq/MANE/MANE_human/release_1.4/
    Regeneron Genetics Center Million Exome: https://rgc-research.regeneron.com/me/home
    All of Us 250K: Human Genome Sequencing Center, BCM, Houston, TX

Other gene annotation sources:
    HGNC, downloaded on 05/19/2025, https://www.genenames.org/download/
    Uniprot, Release 2025_02, https://www.uniprot.org/uniprotkb
    IntAct, release 250, https://www.ebi.ac.uk/intact/download
    GWAS catalog, r2025-05-13, https://www.ebi.ac.uk/gwas/downloads
    Haploinsufficiency probability data, from doi:10.1371/journal.pgen.1001154
    Recessive probability data, from DOI:10.1126/science.1215040
    Residual Variation Intolerance Score (RVIS), v3 http://genic-intolerance.org/
    Genome-wide haploinsufficiency score (GHIS), from doi: 10.1093/nar/gkv474
    ExAC Functional Gene Constraint, from release0.3.1, https://gnomad.broadinstitute.org/downloads#exac
    ExAC CNV gene score, from release0.3.1, https://gnomad.broadinstitute.org/downloads#exac
    Gene Ontology (GO), 2025-03-16, https://geneontology.org/docs/download-ontology/
    ConsensusPathDB, Release 35, http://cpdb.molgen.mpg.de/MCPDB/daccess
    Essential genes, from doi:10.1371/journal.pgen.1003484, doi: 10.1126/science.aac7041, doi: 10.1016/j.cell.2015.11.015, doi: 10.1126/science.aac7557, doi:10.1371/journal.pcbi.1002886
    Mouse genes, from Mouse Genome Informatics (MGI), 6.24 update 05/06/2025
    Zebrafish genes, from The Zebrafish Information Network (ZFIN), reports from 05/18/2025
    KEGG pathway, from http://www.openbioinformatics.org/gengen/tutorial_calculate_gsea.html
    BioCarta pathway, from http://www.openbioinformatics.org/gengen/tutorial_calculate_gsea.html
    GDI, from doi: 10.1073/pnas.1518646112
    LoFtool, from DOI:10.1093/bioinformatics/btv602
    HIPred, from doi:10.1093/bioinformatics/btx028
    HPO, data release 2025-05-06, https://hpo.jax.org/app/download/annotation
    ClinGen Dosage Sensitivity, 2025-05-18, https://search.clinicalgenome.org/kb/gene-dosage
    The Human Protein Atlas, downloaded on October 22, 2024, https://www.proteinatlas.org/about/download
    OMIM, downloaded on 05/19/2025, https://omim.org/
    Orphanet, update 12/03/2024, https://www.orpha.net/
    GenCC, downloaded on 05/24/2025, https://search.thegencc.org/download

Files:
    dbNSFP5.2a_variant.chr<#>.gz - gzipped dbNSFP variant database files by chromosomes
    dbNSFP5.2_gene.gz - gzipped dbNSFP gene database file
    dbNSFP5.2a.readme.txt - this file
    search_dbNSFP52a.jar - companion GUI Java program for searching dbNSFP5.2a
    search_dbNSFP52a.class - companion command-line Java program for searching dbNSFP5.2a
    search_dbNSFP52a.readme.pdf - README file for search_dbNSFP52a.class
    tryhg19.in - an example input file with hg19 genome positions
    tryhg18.in - an example input file with hg18 genome positions
    tryhg38.in - an example input file with hg38 genome positions
    try.vcf - an example VCF input file

Description:
dbNSFP is an integrated database of functional annotations from multiple sources for the comprehensive collection of human non-synonymous SNVs (nsSNVs). Its current version includes a total of 81,529,581 nsSNVs and 2,230,506 ssSNVs (splice site SNVs). It compiles prediction scores from 34 prediction algorithms (SIFT, SIFT4G, PROVEAN, Polyphen2-HDIV, Polyphen2-HVAR, MutationTaster 2021, MutationAssessor, FATHMM-XF coding, CADD, VEST4, DANN, MetaSVM, MetaLR, MetaRNN, Eigen, Eigen-PC, M-CAP, REVEL, MutPred2, MVP, gMVP, MPC, PrimateAI, DEOGEN2, ALoFT, BayesDel, ClinPred, LIST-S2, VARITY, ESM1b, AlphaMissense, PHACTboost, MutFormer, MutScore), 9 conservation scores (bStatistic, phyloP100way_vertebrate, phyloP470way_mammalian, phyloP17way_primate, phastCons100way_vertebrate, phastCons470way_mammalian, phastCons17way_primate, GERP++ and GERP_91) and other functional annotations. Since version 2.0, dbNSFP has been separated into two parts, dbNSFP_variant and dbNSFP_gene. As their names indicate, the former focuses on variant annotations (including prediction scores and conservation scores), and the latter focuses on gene annotations. Since version 2.6, dbscSNV has been included as an attached database, which covers all potential human SNVs within splicing consensus regions (−3 to +8 at the 5’ splice site and −12 to +2 at the 3’ splice site), i.e. scSNVs, and predictions of their potential to alter splicing.
Since version 3, two branches of dbNSFP are provided: the "a" branch is suitable for academic use and includes all the resources; the "c" branch is suitable for commercial use and excludes the resources that do not allow commercial use or that require a license for it, including Polyphen-2, CADD, VEST, M-CAP, REVEL, MutPred, ClinPred, MutScore, and PHACTboost.

Columns of dbNSFP_variant:
1 chr: chromosome number
2 pos(1-based): physical position on the chromosome as to hg38 (1-based coordinate). For mitochondrial SNV, this position refers to the rCRS (GenBank: NC_012920).
3 ref: reference nucleotide allele (as on the + strand)
4 alt: alternative nucleotide allele (as on the + strand)
5 aaref: reference amino acid; "." if the variant is a splicing site SNP (2bp on each end of an intron)
6 aaalt: alternative amino acid; "." if the variant is a splicing site SNP (2bp on each end of an intron)
7 rs_dbSNP: rs number from dbSNP
8 hg19_chr: chromosome as to hg19, "." means missing
9 hg19_pos(1-based): physical position on the chromosome as to hg19 (1-based coordinate). For mitochondrial SNV, this position refers to a YRI sequence (GenBank: AF347015)
10 hg18_chr: chromosome as to hg18, "." means missing
11 hg18_pos(1-based): physical position on the chromosome as to hg18 (1-based coordinate). For mitochondrial SNV, this position refers to a YRI sequence (GenBank: AF347015)
12 aapos: amino acid position as to the protein. "-1" if the variant is a splicing site SNP (2bp on each end of an intron).
    Multiple entries separated by ";", corresponding to Ensembl_proteinid
13 genename: gene name; if the nsSNV can be assigned to multiple genes, gene names are separated by ";"
14 Ensembl_geneid: Ensembl gene id
15 Ensembl_transcriptid: Ensembl transcript ids (Multiple entries separated by ";")
16 Ensembl_proteinid: Ensembl protein ids. Multiple entries separated by ";", corresponding to Ensembl_transcriptids
17 Uniprot_acc: Uniprot accession number matching the Ensembl_proteinid. Multiple entries separated by ";".
18 Uniprot_entry: Uniprot entry ID matching the Ensembl_proteinid. Multiple entries separated by ";".
19 HGVSc_snpEff: HGVS coding variant presentation from snpEff. Multiple entries separated by ";", corresponds to Ensembl_transcriptid
20 HGVSp_snpEff: HGVS protein variant presentation from snpEff. Multiple entries separated by ";", corresponds to Ensembl_proteinid
21 HGVSc_VEP: HGVS coding variant presentation from VEP. Multiple entries separated by ";", corresponds to Ensembl_transcriptid
22 HGVSp_VEP: HGVS protein variant presentation from VEP. Multiple entries separated by ";", corresponds to Ensembl_proteinid
23 APPRIS: APPRIS annotation for the transcripts matching Ensembl_transcriptid. Multiple entries separated by ";". Potential values: principal1, principal2, principal3, principal4, principal5, alternative1, alternative2. See https://useast.ensembl.org/info/genome/genebuild/transcript_quality_tags.html
24 GENCODE_basic: Whether the transcript belongs to GENCODE_basic (5' and 3' complete transcripts). Multiple entries separated by ";", matching Ensembl_transcriptid. See https://useast.ensembl.org/info/genome/genebuild/transcript_quality_tags.html
25 TSL: Transcript Support Level. Multiple entries separated by ";", matching Ensembl_transcriptid. Potential values: 1 to 5, NA. See https://useast.ensembl.org/info/genome/genebuild/transcript_quality_tags.html
26 VEP_canonical: canonical transcript used in Ensembl.
    Multiple entries separated by ";", matching Ensembl_transcriptid. See https://useast.ensembl.org/Help/Glossary?id=521
27 MANE: transcripts annotated by the MANE project. Multiple entries separated by ";", matching Ensembl_transcriptid. Potential values include "Select" (representative transcripts) or "Plus_Clinical" (additional clinically relevant transcripts). See https://www.ncbi.nlm.nih.gov/refseq/MANE/
28 cds_strand: coding sequence (CDS) strand (+ or -)
29 refcodon: reference codon
30 codonpos: position on the codon (1, 2 or 3)
31 codon_degeneracy: degenerate type (0, 2 or 3)
32 Ancestral_allele: ancestral allele based on 8 primates EPO. Ancestral alleles by Ensembl 84. The following comes from its original README file:
    ACTG - high-confidence call, ancestral state supported by the other two sequences
    actg - low-confidence call, ancestral state supported by one sequence only
    N - failure, the ancestral state is not supported by any other sequence
    - - the extant species contains an insertion at this position
    . - no coverage in the alignment
33 AltaiNeandertal: genotype of a deep sequenced Altai Neanderthal
34 Denisova: genotype of a deep sequenced Denisova
35 VindijiaNeandertal: genotype of a deep sequenced Vindija Neandertal
36 ChagyrskayaNeandertal: genotype of a deep sequenced Chagyrskaya Neandertal
37 clinvar_id: clinvar variation ID
38 clinvar_clnsig: clinical significance by clinvar. Possible values: Benign, Likely_benign, Likely_pathogenic, Pathogenic, drug_response, histocompatibility.
    A negative score means the score is for the ref allele
39 clinvar_trait: the trait/disease the clinvar_clnsig refers to
40 clinvar_review: ClinVar Review Status summary. Possible values: "no assertion criteria provided"; "criteria provided, single submitter"; "criteria provided, multiple submitters, no conflicts"; "reviewed by expert panel"; "practice guideline"
41 clinvar_hgvs: variant in HGVS format
42 clinvar_var_source: source of the variant
43 clinvar_MedGen_id: MedGen ID of the trait/disease the clinvar_trait refers to
44 clinvar_OMIM_id: OMIM ID of the trait/disease the clinvar_trait refers to
45 clinvar_Orphanet_id: Orphanet ID of the trait/disease the clinvar_trait refers to
46 Interpro_domain: domain or conserved site in which the variant is located. Domain annotations come from the Interpro database. The number in the brackets following a specific domain is the count of times Interpro assigns the variant position to that domain, typically coming from different predicting databases. Multiple entries separated by ";".
47 SIFT_score: SIFT score (SIFTori). Scores range from 0 to 1. The smaller the score the more likely the SNP has a damaging effect. Multiple scores separated by ";", corresponding to Ensembl_proteinid.
48 SIFT_converted_rankscore: SIFTori scores were first converted to SIFTnew=1-SIFTori, then ranked among all SIFTnew scores in dbNSFP. The rankscore is the ratio of the rank of the SIFTnew score over the total number of SIFTnew scores in dbNSFP. If there are multiple scores, only the most damaging (largest) rankscore is presented. The rankscores range from 0.00964 to 0.91255.
49 SIFT_pred: If SIFTori is smaller than 0.05 (rankscore>0.39575) the corresponding nsSNV is predicted as "D(amaging)"; otherwise it is predicted as "T(olerated)". Multiple predictions separated by ";"
50 SIFT4G_score: SIFT 4G score (SIFT4G). Scores range from 0 to 1. The smaller the score the more likely the SNP has a damaging effect.
    Multiple scores separated by ",", corresponding to Ensembl_transcriptid
51 SIFT4G_converted_rankscore: SIFT4G scores were first converted to SIFT4Gnew=1-SIFT4G, then ranked among all SIFT4Gnew scores in dbNSFP. The rankscore is the ratio of the rank of the SIFT4Gnew score over the total number of SIFT4Gnew scores in dbNSFP. If there are multiple scores, only the most damaging (largest) rankscore is presented.
52 SIFT4G_pred: If SIFT4G is < 0.05 the corresponding nsSNV is predicted as "D(amaging)"; otherwise it is predicted as "T(olerated)". Multiple predictions separated by ",", corresponding to Ensembl_transcriptid
53 Polyphen2_HDIV_score: Polyphen2 score based on HumDiv, i.e. hdiv_prob. The score ranges from 0 to 1. Multiple entries separated by ";", corresponding to Uniprot_acc.
54 Polyphen2_HDIV_rankscore: Polyphen2 HDIV scores were first ranked among all HDIV scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of the scores in dbNSFP. If there are multiple scores, only the most damaging (largest) rankscore is presented. The rankscores range from 0.03061 to 0.91137.
55 Polyphen2_HDIV_pred: Polyphen2 prediction based on HumDiv, "D" ("probably damaging", HDIV score in [0.957,1] or rankscore in [0.55859,0.91137]), "P" ("possibly damaging", HDIV score in [0.454,0.956] or rankscore in [0.37043,0.55681]) and "B" ("benign", HDIV score in [0,0.452] or rankscore in [0.03061,0.36974]). Score cutoff for binary classification is 0.5 for HDIV score or 0.38028 for rankscore, i.e. the prediction is "neutral" if the HDIV score is smaller than 0.5 (rankscore is smaller than 0.38028), and "deleterious" if the HDIV score is larger than 0.5 (rankscore is larger than 0.38028). Multiple entries are separated by ";", corresponding to Uniprot_acc.
56 Polyphen2_HVAR_score: Polyphen2 score based on HumVar, i.e. hvar_prob. The score ranges from 0 to 1. Multiple entries separated by ";", corresponding to Uniprot_acc.
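
As a rough illustration of the Polyphen2_HDIV columns above, the following Python sketch splits a multi-valued score field and maps each value to the D/P/B classes using the score intervals quoted in column 55. The function names are hypothetical, and treating "." as a missing value inside score fields is an assumption; neither is part of dbNSFP or its companion programs.

```python
from typing import List, Optional

# Score intervals copied from the Polyphen2_HDIV_pred description (column 55).
HDIV_CLASSES = [("D", 0.957, 1.0), ("P", 0.454, 0.956), ("B", 0.0, 0.452)]


def parse_multi(field: str, sep: str = ";") -> List[Optional[float]]:
    """Split a multi-valued score field; "." is treated as missing (assumption)."""
    return [None if v in (".", "") else float(v) for v in field.split(sep)]


def hdiv_class(score: Optional[float]) -> Optional[str]:
    """Return "D", "P" or "B" for a HDIV score, or None if it cannot be classified."""
    if score is None:
        return None
    for label, low, high in HDIV_CLASSES:
        if low <= score <= high:
            return label
    # The quoted intervals leave small gaps (e.g. 0.453); such values stay unclassified.
    return None


scores = parse_multi("0.998;0.5;.;0.01")
labels = [hdiv_class(s) for s in scores]
print(labels)
```

Values that fall between the listed intervals are returned as None instead of being forced into a class, since the readme does not say how they are assigned.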
57 Polyphen2_HVAR_rankscore: Polyphen2 HVAR scores were first ranked among all HVAR scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of the scores in dbNSFP. If there are multiple scores, only the most damaging (largest) rankscore is presented. The rankscores range from 0.01493 to 0.97581.
58 Polyphen2_HVAR_pred: Polyphen2 prediction based on HumVar, "D" ("probably damaging", HVAR score in [0.909,1] or rankscore in [0.65694,0.97581]), "P" ("possibly damaging", HVAR score in [0.447,0.908] or rankscore in [0.47121,0.65622]) and "B" ("benign", HVAR score in [0,0.446] or rankscore in [0.01493,0.47076]). Score cutoff for binary classification is 0.5 for HVAR score or 0.48762 for rankscore, i.e. the prediction is "neutral" if the HVAR score is smaller than 0.5 (rankscore is smaller than 0.48762), and "deleterious" if the HVAR score is larger than 0.5 (rankscore is larger than 0.48762). Multiple entries are separated by ";", corresponding to Uniprot_acc.
59 MutationTaster_score: MutationTaster p-value (MTori), ranges from 0 to 1. Multiple scores are separated by ";". Information on corresponding transcript(s) can be found by querying http://www.mutationtaster.org/ChrPos.html
60 MutationTaster_rankscore: The MTori scores were ranked among all MTori scores in dbNSFP. If there are multiple scores of a SNV, only the largest MTori was used in ranking. The rankscore is the ratio of the rank of the score over the total number of MTori scores in dbNSFP.
61 MutationTaster_pred: MutationTaster prediction, "A" ("disease_causing_automatic"), "D" ("disease_causing"), "N" ("polymorphism") or "P" ("polymorphism_automatic"). The score cutoff between "D" and "N" is 0.5 for MTnew and 0.31733 for the rankscore.
62 MutationTaster_model: MutationTaster prediction models.
63 MutationTaster_trees_benign: the number of decision trees of the Random Forest suggesting benign; trees_deleterious/(trees_benign+trees_deleterious) can be used as a measure for deleteriousness.
64 MutationTaster_trees_deleterious: the number of decision trees of the Random Forest suggesting deleterious; trees_deleterious/(trees_benign+trees_deleterious) can be used as a measure for deleteriousness.
65 MutationAssessor_score: MutationAssessor functional impact combined score (MAori). The score ranges from -5.17 to 6.49 in dbNSFP. Multiple entries are separated by ";", corresponding to Uniprot_entry.
66 MutationAssessor_rankscore: MAori scores were ranked among all MAori scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of MAori scores in dbNSFP. The scores range from 0 to 1.
67 MutationAssessor_pred: MutationAssessor's functional impact of a variant - predicted functional, i.e. high ("H") or medium ("M"), or predicted non-functional, i.e. low ("L") or neutral ("N"). The MAori score cutoffs between "H" and "M", "M" and "L", and "L" and "N", are 3.5, 1.935 and 0.8, respectively. The rankscore cutoffs between "H" and "M", "M" and "L", and "L" and "N", are 0.9307, 0.52043 and 0.19675, respectively.
68 PROVEAN_score: PROVEAN score (PROVEANori). Scores range from -14 to 14. The smaller the score the more likely the SNP has a damaging effect. Multiple scores separated by ";", corresponding to Ensembl_proteinid.
69 PROVEAN_converted_rankscore: PROVEANori scores were first converted to PROVEANnew=1-(PROVEANori+14)/28, then ranked among all PROVEANnew scores in dbNSFP. The rankscore is the ratio of the rank of the PROVEANnew score over the total number of PROVEANnew scores in dbNSFP. If there are multiple scores, only the most damaging (largest) rankscore is presented. The scores range from 0 to 1.
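
The formulas and cutoffs in columns 63 to 69 can be sketched as three small Python helpers. This is only an illustration: the function names are hypothetical, and the choice to assign a score exactly equal to a MutationAssessor cutoff to the lower class is an assumption, because the readme does not state which side of each cutoff is inclusive.

```python
from typing import Optional


def mt_deleterious_fraction(trees_benign: int, trees_deleterious: int) -> Optional[float]:
    """trees_deleterious/(trees_benign+trees_deleterious), as suggested in columns 63-64."""
    total = trees_benign + trees_deleterious
    if total == 0:
        return None  # no trees reported, so the fraction is undefined
    return trees_deleterious / total


def provean_new(provean_ori: float) -> float:
    """PROVEANnew = 1-(PROVEANori+14)/28, so that larger means more damaging."""
    return 1 - (provean_ori + 14) / 28


def mutationassessor_class(ma_ori: float) -> str:
    """Map MAori to H/M/L/N with cutoffs 3.5, 1.935 and 0.8 (boundary handling assumed)."""
    if ma_ori > 3.5:
        return "H"
    if ma_ori > 1.935:
        return "M"
    if ma_ori > 0.8:
        return "L"
    return "N"


print(mt_deleterious_fraction(25, 75))
print(provean_new(-2.5))
print(mutationassessor_class(2.0))
```

The PROVEAN conversion maps the stated range of -14 to 14 onto 1 to 0, which is why the most damaging (smallest) original score ends up with the largest converted value before ranking.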
70 PROVEAN_pred: If PROVEANori <= -2.5 (rankscore>=0.54382) the corresponding nsSNV is predicted as "D(amaging)"; otherwise it is predicted as "N(eutral)". Multiple predictions separated by ";", corresponding to Ensembl_proteinid.
71 VEST4_score: VEST 4.0 score. Score ranges from 0 to 1. The larger the score the more likely the mutation may cause functional change. Multiple scores separated by ";", corresponding to Ensembl_transcriptid. Please note this score is free for non-commercial use. For more details please refer to http://wiki.chasmsoftware.org/index.php/SoftwareLicense. Commercial users should contact the Johns Hopkins Technology Transfer office.
72 VEST4_rankscore: VEST4 scores were ranked among all VEST4 scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of VEST4 scores in dbNSFP. In case there are multiple scores for the same variant, the largest score (most damaging) is presented. The scores range from 0 to 1. Please note VEST score is free for non-commercial use. For more details please refer to http://wiki.chasmsoftware.org/index.php/SoftwareLicense. Commercial users should contact the Johns Hopkins Technology Transfer office.
73 MetaSVM_score: Our support vector machine (SVM) based ensemble prediction score, which incorporated 10 scores (SIFT, PolyPhen-2 HDIV, PolyPhen-2 HVAR, GERP++, MutationTaster, Mutation Assessor, FATHMM, LRT, SiPhy, PhyloP) and the maximum frequency observed in the 1000 genomes populations. Larger value means the SNV is more likely to be damaging. Scores range from -2 to 3 in dbNSFP.
74 MetaSVM_rankscore: MetaSVM scores were ranked among all MetaSVM scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of MetaSVM scores in dbNSFP. The scores range from 0 to 1.
75 MetaSVM_pred: Prediction of our SVM based ensemble prediction score, "T(olerated)" or "D(amaging)". The score cutoff between "D" and "T" is 0.
    The rankscore cutoff between "D" and "T" is 0.82257.
76 MetaLR_score: Our logistic regression (LR) based ensemble prediction score, which incorporated 10 scores (SIFT, PolyPhen-2 HDIV, PolyPhen-2 HVAR, GERP++, MutationTaster, Mutation Assessor, FATHMM, LRT, SiPhy, PhyloP) and the maximum frequency observed in the 1000 genomes populations. Larger value means the SNV is more likely to be damaging. Scores range from 0 to 1.
77 MetaLR_rankscore: MetaLR scores were ranked among all MetaLR scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of MetaLR scores in dbNSFP. The scores range from 0 to 1.
78 MetaLR_pred: Prediction of our MetaLR based ensemble prediction score, "T(olerated)" or "D(amaging)". The score cutoff between "D" and "T" is 0.5. The rankscore cutoff between "D" and "T" is 0.81101.
79 Reliability_index: Number of observed component scores (except the maximum frequency in the 1000 genomes populations) for MetaSVM and MetaLR. Ranges from 1 to 10. As MetaSVM and MetaLR scores are calculated based on imputed data, the fewer missing component scores, the higher the reliability of the scores and predictions.
80 MetaRNN_score: Our recurrent neural network (RNN) based ensemble prediction score, which incorporated 16 scores (SIFT, Polyphen2_HDIV, Polyphen2_HVAR, MutationAssessor, PROVEAN, VEST4, M-CAP, REVEL, MutPred, MVP, PrimateAI, DEOGEN2, CADD, fathmm-XF, Eigen and GenoCanyon), 8 conservation scores (GERP, phyloP100way_vertebrate, phyloP30way_mammalian, phyloP17way_primate, phastCons100way_vertebrate, phastCons30way_mammalian, phastCons17way_primate and SiPhy), and allele frequency information from the 1000 Genomes Project (1000GP), ExAC, and gnomAD. Larger value means the SNV is more likely to be damaging. Scores range from 0 to 1.
81 MetaRNN_rankscore: MetaRNN scores were ranked among all MetaRNN scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of MetaRNN scores in dbNSFP.
    The scores range from 0 to 1.
82 MetaRNN_pred: Prediction of our MetaRNN based ensemble prediction score, "T(olerated)" or "D(amaging)". The score cutoff between "D" and "T" is 0.5. The rankscore cutoff between "D" and "T" is 0.6149.
83 M-CAP_score: M-CAP is a hybrid ensemble score (details in DOI: 10.1038/ng.3703). Scores range from 0 to 1. The larger the score the more likely the SNP has a damaging effect.
84 M-CAP_rankscore: M-CAP scores were ranked among all M-CAP scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of M-CAP scores in dbNSFP.
85 M-CAP_pred: Prediction of M-CAP score based on the authors' recommendation, "T(olerated)" or "D(amaging)". The score cutoff between "D" and "T" is 0.025.
86 REVEL_score: REVEL is an ensemble score based on 13 individual scores for predicting the pathogenicity of missense variants. Scores range from 0 to 1. The larger the score the more likely the SNP has a damaging effect. "REVEL scores are freely available for non-commercial use. For other uses, please contact Weiva Sieh" (weiva.sieh@mssm.edu). Multiple entries are separated by ";", corresponding to Ensembl_transcriptid.
87 REVEL_rankscore: REVEL scores were ranked among all REVEL scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of REVEL scores in dbNSFP.
88 MutPred2_score: General MutPred2 score. Scores range from 0 to 1. The larger the score the more likely the SNP has a damaging effect. Multiple entries are separated by ";", corresponding to Ensembl_transcriptid.
89 MutPred2_rankscore: MutPred2 scores were ranked among all MutPred2 scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of MutPred2 scores in dbNSFP.
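
Several columns above (for example REVEL_score and MutPred2_score) hold one entry per transcript, in the same order as Ensembl_transcriptid. A minimal Python sketch of pairing the two fields is shown below. The helper names and the transcript identifiers are hypothetical placeholders, and it is an assumption that both fields always contain the same number of entries and that "." marks a missing score.

```python
from typing import Dict, Optional


def align_scores(transcript_field: str, score_field: str,
                 sep: str = ";") -> Dict[str, Optional[float]]:
    """Pair each transcript id with the score at the same position."""
    ids = transcript_field.split(sep)
    raw = score_field.split(sep)
    if len(ids) != len(raw):
        # Assumed invariant; fail loudly rather than misalign scores.
        raise ValueError("transcript and score fields differ in length")
    return {tid: (None if v == "." else float(v)) for tid, v in zip(ids, raw)}


def max_score(aligned: Dict[str, Optional[float]]) -> Optional[float]:
    """Largest non-missing score, mirroring the 'most damaging' convention."""
    present = [v for v in aligned.values() if v is not None]
    return max(present) if present else None


# Placeholder ids, not real Ensembl transcripts.
aligned = align_scores("ENST_A;ENST_B;ENST_C", "0.912;.;0.354")
print(aligned)
print(max_score(aligned))
```

Taking the maximum is appropriate only for scores where larger means more damaging; for scores such as SIFT, where smaller is more damaging, the minimum would be the analogous choice.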
90 MutPred2_pred: Prediction of MutPred2 score after calibration (https://doi.org/10.1016/j.ajhg.2022.10.013): PS (Strong Evidence of Pathogenic, score >=0.932), PM (Moderate Evidence of Pathogenic, score >=0.829 and <0.932), PP (Supporting Evidence of Pathogenic, score >=0.737 and <0.829), UC (Uncertain, score >0.391 and <0.737), BP (Supporting Evidence of Benign, score >0.197 and <=0.391), BM (Moderate Evidence of Benign, score >0.010 and <=0.197), BS (Strong Evidence of Benign, score <=0.010). Multiple entries are separated by ";", corresponding to Ensembl_transcriptid.
91 MutPred2_top5_mechanisms: Top 5 molecular mechanisms as predicted by MutPred2 with p values. The Pr is the "posterior probability of the loss/gain of certain structural and functional properties due to the substitution". The P is the "empirical P-value calculated as the fraction of benign substitutions in MutPred2's training set with Pr values >= to the Pr value for the given substitution." More details can be found at http://mutpred.mutdb.org/help.html. Multiple entries are separated by ";", corresponding to Ensembl_transcriptid.
92 MVP_score: A pathogenicity prediction score for missense variants using a deep learning approach. The range of MVP score is from 0 to 1. The larger the score, the more likely the variant is pathogenic. The authors suggest thresholds of 0.7 and 0.75 for separating damaging vs tolerant variants in constrained genes (ExAC pLI >=0.5) and non-constrained genes (ExAC pLI<0.5), respectively. Details see doi: http://dx.doi.org/10.1101/259390. Multiple entries are separated by ";", corresponding to Ensembl_transcriptid.
93 MVP_rankscore: MVP scores were ranked among all MVP scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of MVP scores in dbNSFP.
94 gMVP_score: A pathogenicity prediction score for missense variants using a graph attention neural network model.
    The range of gMVP score is from 0 to 1. The larger the score, the more likely the variant is pathogenic. Details see doi: https://www.nature.com/articles/s42256-022-00561-w. Multiple entries are separated by ";", corresponding to Ensembl_transcriptid.
95 gMVP_rankscore: gMVP scores were ranked among all gMVP scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of gMVP scores in dbNSFP.
96 MPC_score: A deleteriousness prediction score for missense variants based on regional missense constraint. The range of MPC score is 0 to 5. The larger the score, the more likely the variant is pathogenic. Details see doi: http://dx.doi.org/10.1101/148353. Multiple entries are separated by ";", corresponding to Ensembl_transcriptid.
97 MPC_rankscore: MPC scores were ranked among all MPC scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of MPC scores in dbNSFP.
98 PrimateAI_score: A pathogenicity prediction score for missense variants based on common variants of non-human primate species using a deep neural network. The range of PrimateAI score is 0 to 1. The larger the score, the more likely the variant is pathogenic. The authors suggest a threshold of 0.803 for separating damaging vs tolerant variants. Details see https://doi.org/10.1038/s41588-018-0167-z
99 PrimateAI_rankscore: PrimateAI scores were ranked among all PrimateAI scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of PrimateAI scores in dbNSFP.
100 PrimateAI_pred: Prediction of PrimateAI score based on the authors' recommendation, "T(olerated)" or "D(amaging)". The score cutoff between "D" and "T" is 0.803.
101 DEOGEN2_score: A deleteriousness prediction score "which incorporates heterogeneous information about the molecular effects of the variants, the domains involved, the relevance of the gene and the interactions in which it participates". It ranges from 0 to 1.
    The larger the score, the more likely the variant is deleterious. The authors suggest a threshold of 0.5 for separating damaging vs tolerant variants.
102 DEOGEN2_rankscore: DEOGEN2 scores were ranked among all DEOGEN2 scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of DEOGEN2 scores in dbNSFP.
103 DEOGEN2_pred: Prediction of DEOGEN2 score based on the authors' recommendation, "T(olerated)" or "D(amaging)". The score cutoff between "D" and "T" is 0.5.
104 BayesDel_addAF_score: A deleteriousness prediction meta-score for SNVs and indels with inclusion of MaxAF. See https://doi.org/10.1002/humu.23158 for details. The range of the score in dbNSFP is from -1.11707 to 0.750927. The higher the score, the more likely the variant is pathogenic. The author suggested cutoff between deleterious ("D") and tolerated ("T") is 0.0692655.
105 BayesDel_addAF_rankscore: BayesDel_addAF scores were ranked among all BayesDel_addAF scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of BayesDel_addAF scores in dbNSFP.
106 BayesDel_addAF_pred: Prediction of BayesDel_addAF score based on the authors' recommendation, "T(olerated)" or "D(amaging)". The score cutoff between "D" and "T" is 0.0692655.
107 BayesDel_noAF_score: A deleteriousness prediction meta-score for SNVs and indels without inclusion of MaxAF. See https://doi.org/10.1002/humu.23158 for details. The range of the score in dbNSFP is from -1.31914 to 0.840878. The higher the score, the more likely the variant is pathogenic. The author suggested cutoff between deleterious ("D") and tolerated ("T") is -0.0570105.
108 BayesDel_noAF_rankscore: BayesDel_noAF scores were ranked among all BayesDel_noAF scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of BayesDel_noAF scores in dbNSFP.
109 BayesDel_noAF_pred: Prediction of BayesDel_noAF score based on the authors' recommendation, "T(olerated)" or "D(amaging)".
The score cutoff between "D" and "T" is -0.0570105. 110 ClinPred_score: A deleteriousness prediction meta-score for nonsynonymous SNVs. See https://doi.org/10.1016/j.ajhg.2018.08.005 for details. The range of the score in dbNSFP is from 0 to 1. The higher the score, the more likely the variant is pathogenic. The author suggested cutoff between deleterious ("D") and tolerated ("T") is 0.5. 111 ClinPred_rankscore: ClinPred scores were ranked among all ClinPred scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of ClinPred scores in dbNSFP. 112 ClinPred_pred: Prediction of ClinPred score based on the authors' recommendation, "T(olerated)" or "D(amaging)". The score cutoff between "D" and "T" is 0.5. 113 LIST-S2_score: A deleteriousness prediction score for nonsynonymous SNVs. See https://doi.org/10.1093/nar/gkaa288 for details. The range of the score in dbNSFP is from 0 to 1. The higher the score, the more likely the variant is pathogenic. The author suggested cutoff between deleterious ("D") and tolerated ("T") is 0.85. 114 LIST-S2_rankscore: LIST-S2 scores were ranked among all LIST-S2 scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of LIST-S2 scores in dbNSFP. 115 LIST-S2_pred: Prediction of LIST-S2 score based on the authors' recommendation, "T(olerated)" or "D(amaging)". The score cutoff between "D" and "T" is 0.85. 116 VARITY_R_score: VARITY_R scores are pathogenicity prediction scores for rare human missense variants. The range of VARITY_R score is from 0 to 1. The larger the score, the more likely the variant is pathogenic. Details see doi: https://doi.org/10.1016/j.ajhg.2021.08.012 117 VARITY_R_rankscore: VARITY_R scores were ranked among all VARITY_R scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of VARITY_R scores in dbNSFP.
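The *_pred columns described above (PrimateAI, DEOGEN2, BayesDel, ClinPred, LIST-S2) share one pattern: a score is compared with the authors' suggested cutoff. A minimal Python sketch of that pattern follows. The cutoff values are the ones quoted in this README; the function name call_prediction, the treatment of "." as a missing value, and the choice to call a score exactly equal to the cutoff "D" are assumptions made for illustration, not documented dbNSFP behavior.

```python
from typing import Optional

# Cutoffs between "D" and "T" as quoted in the column descriptions above.
CUTOFFS = {
    "PrimateAI": 0.803,
    "DEOGEN2": 0.5,
    "BayesDel_addAF": 0.0692655,
    "BayesDel_noAF": -0.0570105,
    "ClinPred": 0.5,
    "LIST-S2": 0.85,
}


def call_prediction(tool: str, raw: str) -> Optional[str]:
    """Return "D" or "T" for a single score field, or None if it is missing."""
    raw = raw.strip()
    if raw in ("", "."):  # assumption: "." marks a missing score
        return None
    # Assumption: a score equal to the cutoff is called "D".
    return "D" if float(raw) >= CUTOFFS[tool] else "T"


if __name__ == "__main__":
    print(call_prediction("ClinPred", "0.73"))
    print(call_prediction("BayesDel_noAF", "-0.2"))
```

All of these scores increase with predicted deleteriousness, so one comparison direction covers every tool in the table. A score whose direction is reversed would need its own rule.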
118 VARITY_ER_score: VARITY_ER scores are pathogenicity prediction scores for extremely rare human missense variants. The range of VARITY_ER score is from 0 to 1. The larger the score, the more likely the variant is pathogenic. Details see doi: https://doi.org/10.1016/j.ajhg.2021.08.012 119 VARITY_ER_rankscore: VARITY_ER scores were ranked among all VARITY_ER scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of VARITY_ER scores in dbNSFP. 120 VARITY_R_LOO_score: "Same as VARITY_R except the prediction on the variants used for training was made using Leave-One-Variant out." Details see doi: https://doi.org/10.1016/j.ajhg.2021.08.012 121 VARITY_R_LOO_rankscore: VARITY_R_LOO scores were ranked among all VARITY_R_LOO scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of VARITY_R_LOO scores in dbNSFP. 122 VARITY_ER_LOO_score: "Same as VARITY_ER except the prediction on the variants used for training was made using Leave-One-Variant out." Details see https://doi.org/10.1016/j.ajhg.2021.08.012 123 VARITY_ER_LOO_rankscore: VARITY_ER_LOO scores were ranked among all VARITY_ER_LOO scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of VARITY_ER_LOO scores in dbNSFP. 124 ESM1b_score: ESM1b scores are log-likelihood ratio (LLR) scores for predicting the pathogenic effects of coding variants based on a 650-million-parameter protein language model, ESM1b. The range of ESM1b score in dbNSFP is from -24.538 to 6.937. The smaller the score, the more likely the variant is pathogenic. Details see doi: https://doi.org/10.1038/s41588-023-01465-0 125 ESM1b_converted_rankscore: ESM1b scores were firstly negated (i.e., -ESM1b_score), then ranked among all -ESM1b_score scores in dbNSFP. The rankscore is the ratio of the rank of the -ESM1b_score over the total number of scores in dbNSFP. 
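The rankscore columns above are all defined as the rank of a score divided by the total number of scores, and ESM1b_converted_rankscore negates the score first so that a larger rankscore still means "more likely pathogenic". The sketch below illustrates that definition in Python. The function name rankscores and the use of the average rank for tied scores are assumptions; this README does not state how dbNSFP handles ties.

```python
from typing import Dict, List


def rankscores(scores: List[float], negate: bool = False) -> List[float]:
    """Rank of each score divided by the number of scores (ascending ranks).

    With negate=True the scores are negated before ranking, as described
    for ESM1b_converted_rankscore.
    """
    values = [-s for s in scores] if negate else list(scores)
    ordered = sorted(values)
    n = len(ordered)
    avg_rank: Dict[float, float] = {}
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values.
        while j + 1 < n and ordered[j + 1] == ordered[i]:
            j += 1
        # Assumption: ties share the average of their 1-based ranks.
        avg_rank[ordered[i]] = ((i + 1) + (j + 1)) / 2.0
        i = j + 1
    return [avg_rank[v] / n for v in values]


if __name__ == "__main__":
    print(rankscores([0.1, 0.5, 0.9]))
    print(rankscores([-10.0, 0.0, 5.0], negate=True))
```

With negate=True the most negative ESM1b score receives the largest rankscore, which matches the direction of the other rankscore columns.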
126 ESM1b_pred: The authors do not recommend a threshold for separating deleterious (D) variants versus tolerated (T) variants. This prediction is based on the threshold of -7.5 described in their paper that yields a true-positive rate of 81% and a true-negative rate of 82% in their ClinVar and HGMD test datasets. 127 AlphaMissense_score: AlphaMissense is an unsupervised model for predicting the pathogenicity of human missense variants by incorporating the structural context of an AlphaFold-derived system. The AlphaMissense score ranges from 0 to 1. The larger the score, the more likely the variant is pathogenic. Details see https://doi.org/10.1126/science.adg7492. License information: Copyright (2023) DeepMind Technologies Limited. All materials are licensed under the Creative Commons Attribution 4.0 International License (CC-BY) (the “License”). You may obtain a copy of the License at: https://creativecommons.org/licenses/by/4.0/legalcode. 128 AlphaMissense_rankscore: AlphaMissense scores were ranked among all AlphaMissense scores in dbNSFP. The rankscore is the ratio of the rank of the AlphaMissense_score over the total number of scores in dbNSFP. 129 AlphaMissense_pred: The AlphaMissense classification of likely (B)enign, (A)mbiguous, or likely (P)athogenic with 90% expected precision estimated from ClinVar for likely benign and likely pathogenic classes. 130 PHACTboost_score: "PHACTboost is a gradient boosting tree based classifier that combines PHACT scores with information from multiple sequence alignment, phylogenetic trees, and ancestral reconstruction." The range of the score is from 0 to 1; the larger the score, the more likely the variant is pathogenic. Details see https://doi.org/10.1093/molbev/msae136. The authors recommend using 0.62 as the cutoff for binary prediction (personal communication). 131 PHACTboost_rankscore: PHACTboost scores were ranked among all PHACTboost scores in dbNSFP.
The rankscore is the ratio of the rank of the PHACTboost_score over the total number of scores in dbNSFP. 132 MutFormer_score: "MutFormer is an application of the BERT (Bidirectional Encoder Representations from Transformers) NLP (Natural Language Processing) model with an added adaptive vocabulary to protein context, for the purpose of predicting the effect of missense mutations on protein function." The range of the score is from 0 to 1; the larger the score, the more likely the variant is pathogenic. Details see https://doi.org/10.1016/j.xinn.2023.100487. The authors recommend using 0.8838 as the cutoff for binary prediction (personal communication). 133 MutFormer_rankscore: MutFormer scores were ranked among all MutFormer scores in dbNSFP. The rankscore is the ratio of the rank of the MutFormer_score over the total number of scores in dbNSFP. 134 MutScore_score: MutScore is an ensemble score which integrates multiple unsupervised scores for DNA substitutions with additional positional clustering information. The range of the score is from 0 to 1; the larger the score, the more likely the variant is pathogenic. Details see https://doi.org/10.1016/j.ajhg.2022.01.006. The authors recommend using 0.5 as the cutoff for binary prediction (personal communication). 135 MutScore_rankscore: MutScore scores were ranked among all MutScore scores in dbNSFP. The rankscore is the ratio of the rank of the MutScore_score over the total number of scores in dbNSFP. 136 Aloft_Fraction_transcripts_affected: the fraction of the transcripts of the gene affected, i.e. No. of transcripts affected by the SNP / Total no. of protein_coding transcripts for the gene. Multiple values are separated by ";", corresponding to Ensembl_proteinid. 137 Aloft_prob_Tolerant: Probability of the SNP being classified as benign by ALoFT. Multiple values are separated by ";", corresponding to Ensembl_proteinid.
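Several columns above hold one value per isoform, separated by ";" and ordered to correspond to Ensembl_transcriptid or Ensembl_proteinid. A minimal Python sketch of pairing such a field with its identifiers is given here. It assumes that both fields contain the same number of entries and that "." marks a missing value; the function name pair_by_isoform and the identifiers in the example are hypothetical.

```python
from typing import Dict, Optional


def pair_by_isoform(ids_field: str, values_field: str) -> Dict[str, Optional[float]]:
    """Map each ";"-separated identifier to the value at the same position."""
    ids = ids_field.split(";")
    values = values_field.split(";")
    if len(ids) != len(values):
        # Assumption: a length mismatch means the fields cannot be aligned.
        raise ValueError(
            "expected %d values, found %d" % (len(ids), len(values))
        )
    return {
        identifier: (None if value in ("", ".") else float(value))
        for identifier, value in zip(ids, values)
    }


if __name__ == "__main__":
    # Hypothetical identifiers and values, for illustration only.
    print(pair_by_isoform("ENSP_A;ENSP_B", "0.25;."))
```

Raising on a length mismatch is a deliberate choice in this sketch: silently truncating with zip would attach values to the wrong isoform.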
138 Aloft_prob_Recessive: Probability of the SNP being classified as recessive disease-causing by ALoFT. Multiple values are separated by ";", corresponding to Ensembl_proteinid. 139 Aloft_prob_Dominant: Probability of the SNP being classified as dominant disease-causing by ALoFT. Multiple values are separated by ";", corresponding to Ensembl_proteinid. 140 Aloft_pred: final classification predicted by ALoFT; values can be Tolerant, Recessive or Dominant. Multiple values are separated by ";", corresponding to Ensembl_proteinid. 141 Aloft_Confidence: Confidence level of Aloft_pred; values can be "High Confidence" (p < 0.05) or "Low Confidence" (p > 0.05). Multiple values are separated by ";", corresponding to Ensembl_proteinid. 142 CADD_raw: CADD raw score for functional prediction of a SNP. Please refer to Kircher et al. (2014) Nature Genetics 46(3):310-5 for details. The larger the score the more likely the SNP has a damaging effect. Scores range from -28.377575 to 25.511592 in dbNSFP. Please note the following copyright statement for CADD: "CADD scores (http://cadd.gs.washington.edu/) are Copyright 2013 University of Washington and Hudson-Alpha Institute for Biotechnology (all rights reserved) but are freely available for all academic, non-commercial applications. For commercial licensing information contact Jennifer McCullar (mccullaj@uw.edu)." 143 CADD_raw_rankscore: CADD raw scores were ranked among all CADD raw scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of CADD raw scores in dbNSFP. Please note the following copyright statement for CADD: "CADD scores (http://cadd.gs.washington.edu/) are Copyright 2013 University of Washington and Hudson-Alpha Institute for Biotechnology (all rights reserved) but are freely available for all academic, non-commercial applications. For commercial licensing information contact Jennifer McCullar (mccullaj@uw.edu)." 144 CADD_phred: CADD phred-like score.
This is a phred-like rank score based on whole genome CADD raw scores. Please refer to Kircher et al. (2014) Nature Genetics 46(3):310-5 for details. The larger the score the more likely the SNP has a damaging effect. Please note the following copyright statement for CADD: "CADD scores (http://cadd.gs.washington.edu/) are Copyright 2013 University of Washington and Hudson-Alpha Institute for Biotechnology (all rights reserved) but are freely available for all academic, non-commercial applications. For commercial licensing information contact Jennifer McCullar (mccullaj@uw.edu)." 145 DANN_score: DANN is a functional prediction score retrained based on the training data of CADD using a deep neural network. Scores range from 0 to 1. A larger score indicates a higher probability of being damaging. More information on this score can be found in doi: 10.1093/bioinformatics/btu703. 146 DANN_rankscore: DANN scores were ranked among all DANN scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of DANN scores in dbNSFP. 147 fathmm-XF_coding_score: fathmm-XF p-values. Scores range from 0 to 1. SNVs with scores >0.5 are predicted to be deleterious, and those <0.5 are predicted to be neutral or benign. Scores close to 0 or 1 are the highest-confidence predictions. Coding scores are trained using 10 groups of features. More details on the score can be found in doi: 10.1093/bioinformatics/btx536. 148 fathmm-XF_coding_rankscore: fathmm-XF coding scores were ranked among all fathmm-XF coding scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of fathmm-XF coding scores in dbNSFP. 149 fathmm-XF_coding_pred: If a fathmm-XF_coding_score is >0.5, the corresponding nsSNV is predicted as "D(AMAGING)"; otherwise it is predicted as "N(EUTRAL)". 150 Eigen-raw_coding: Eigen score for coding SNVs.
A functional prediction score based on conservation, allele frequencies, and deleteriousness prediction using an unsupervised learning method (doi: 10.1038/ng.3477). 151 Eigen-raw_coding_rankscore: Eigen-raw scores were ranked among all Eigen-raw scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of Eigen-raw scores in dbNSFP. 152 Eigen-phred_coding: Eigen score in phred scale. 153 Eigen-PC-raw_coding: Eigen PC score for genome-wide SNVs. A functional prediction score based on conservation, allele frequencies, deleteriousness prediction (for missense SNVs) and epigenomic signals (for synonymous and non-coding SNVs) using an unsupervised learning method (doi: 10.1038/ng.3477). 154 Eigen-PC-raw_coding_rankscore: Eigen-PC-raw scores were ranked among all Eigen-PC-raw scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of Eigen-PC-raw scores in dbNSFP. 155 Eigen-PC-phred_coding: Eigen PC score in phred scale. 156 GERP++_NR: GERP++ neutral rate 157 GERP++_RS: GERP++ RS score, the larger the score, the more conserved the site. Scores range from -12.3 to 6.17. 158 GERP++_RS_rankscore: GERP++ RS scores were ranked among all GERP++ RS scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of GERP++ RS scores in dbNSFP. 159 GERP_91_mammals: GERP conservation score calculated based on multiple sequence alignments of 91 mammals. 160 GERP_91_mammals_rankscore: GERP (91 mammals) scores were ranked among all GERP (91 mammals) scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of GERP_91_mammals scores in dbNSFP. 161 phyloP100way_vertebrate: phyloP (phylogenetic p-values) conservation score based on the multiple alignments of 100 vertebrate genomes (including human). The larger the score, the more conserved the site. Scores range from -20.0 to 10.003 in dbNSFP. 
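CADD_phred and the Eigen phred columns are described above only as scores "in phred scale". As an illustration of what a phred-like rank transform usually means, the Python sketch below computes -10 * log10(rank / total), where rank counts how many scores are at least as large as the given one, so the top 10% of scores get a value of 10 or more and the top 1% get 20 or more. This is an assumed, generic form of the transform: the README does not spell out the exact procedure used by CADD or Eigen, and the real CADD scaling is computed over genome-wide variants rather than over the scores in a single file. The function name phred_like is hypothetical.

```python
import bisect
import math
from typing import List


def phred_like(raw_scores: List[float]) -> List[float]:
    """Phred-like rank transform: -10 * log10(rank / total).

    rank is the number of scores greater than or equal to the score,
    so the largest raw score gets the largest phred-like value.
    """
    n = len(raw_scores)
    ascending = sorted(raw_scores)
    result = []
    for score in raw_scores:
        # Number of scores strictly smaller than this one.
        smaller = bisect.bisect_left(ascending, score)
        rank = n - smaller  # scores >= this one, always at least 1
        result.append(-10.0 * math.log10(rank / n))
    return result


if __name__ == "__main__":
    scaled = phred_like([float(i) for i in range(10)])
    print(scaled[-1], scaled[0])
```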
162 phyloP100way_vertebrate_rankscore: phyloP100way_vertebrate scores were ranked among all phyloP100way_vertebrate scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of phyloP100way_vertebrate scores in dbNSFP. 163 phyloP470way_mammalian: phyloP (phylogenetic p-values) conservation score based on the multiple alignments of 470 mammalian genomes (including human). The larger the score, the more conserved the site. Scores range from -20 to 11.936 in dbNSFP. 164 phyloP470way_mammalian_rankscore: phyloP470way_mammalian scores were ranked among all phyloP470way_mammalian scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of phyloP470way_mammalian scores in dbNSFP. 165 phyloP17way_primate: a conservation score based on the 17-way primate alignment set. The larger the score, the more conserved the site. Scores range from -13.362 to 0.756 in dbNSFP. 166 phyloP17way_primate_rankscore: the rank of the phyloP17way_primate score among all phyloP17way_primate scores in dbNSFP. 167 phastCons100way_vertebrate: phastCons conservation score based on the multiple alignments of 100 vertebrate genomes (including human). The larger the score, the more conserved the site. Scores range from 0 to 1. 168 phastCons100way_vertebrate_rankscore: phastCons100way_vertebrate scores were ranked among all phastCons100way_vertebrate scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of phastCons100way_vertebrate scores in dbNSFP. 169 phastCons470way_mammalian: phastCons conservation score based on the multiple alignments of 470 mammalian genomes (including human). The larger the score, the more conserved the site. Scores range from 0 to 1. 170 phastCons470way_mammalian_rankscore: phastCons470way_mammalian scores were ranked among all phastCons470way_mammalian scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of phastCons470way_mammalian scores in dbNSFP.
171 phastCons17way_primate: a conservation score based on the 17-way primate alignment set. The larger the score, the more conserved the site. Scores range from 0 to 1. 172 phastCons17way_primate_rankscore: the rank of the phastCons17way_primate score among all phastCons17way_primate scores in dbNSFP. 173 bStatistic: Background selection (B) value estimates from doi.org/10.1371/journal.pgen.1000471. Ranges from 0 to 1000. It estimates the expected fraction (*1000) of neutral diversity present at a site. Values close to 0 represent the near complete removal of diversity as a result of background selection, and values near 1000 indicate the absence of background selection. Data from CADD v1.4. 174 bStatistic_converted_rankscore: bStatistic scores were first converted to -bStatistic, then ranked among all -bStatistic scores in dbNSFP. The rankscore is the ratio of the rank of -bStatistic over the total number of -bStatistic scores in dbNSFP. 175 1000Gp3_AC: Alternative allele counts in the whole 1000 genomes phase 3 (1000Gp3) data. 176 1000Gp3_AF: Alternative allele frequency in the whole 1000Gp3 data. 177 1000Gp3_AFR_AC: Alternative allele counts in the 1000Gp3 African descendent samples. 178 1000Gp3_AFR_AF: Alternative allele frequency in the 1000Gp3 African descendent samples. 179 1000Gp3_EUR_AC: Alternative allele counts in the 1000Gp3 European descendent samples. 180 1000Gp3_EUR_AF: Alternative allele frequency in the 1000Gp3 European descendent samples. 181 1000Gp3_AMR_AC: Alternative allele counts in the 1000Gp3 American descendent samples. 182 1000Gp3_AMR_AF: Alternative allele frequency in the 1000Gp3 American descendent samples. 183 1000Gp3_EAS_AC: Alternative allele counts in the 1000Gp3 East Asian descendent samples. 184 1000Gp3_EAS_AF: Alternative allele frequency in the 1000Gp3 East Asian descendent samples. 185 1000Gp3_SAS_AC: Alternative allele counts in the 1000Gp3 South Asian descendent samples.
186 1000Gp3_SAS_AF: Alternative allele frequency in the 1000Gp3 South Asian descendent samples. 187 TOPMed_frz8_AC: Alternative allele counts in the TOPMed freeze 8 samples. 188 TOPMed_frz8_AN: Total allele count in the TOPMed freeze 8 samples. 189 TOPMed_frz8_AF: Alternative allele frequency in the TOPMed freeze 8 samples. 190 AllofUs_ALL_AC: Alternative allele counts in the whole All of Us 250k (~245,393 genomes) data. 191 AllofUs_ALL_AN: Total allele counts in the whole All of Us 250k (~245,393 genomes) data. 192 AllofUs_ALL_AF: Alternative allele frequency in the whole All of Us 250k (~245,393 genomes) data. 193 AllofUs_AFR_AC: Alternative allele counts in the All of Us African descendent samples. 194 AllofUs_AFR_AN: Total allele counts in the All of Us African descendent samples. 195 AllofUs_AFR_AF: Alternative allele frequency in the All of Us African descendent samples. 196 AllofUs_AMR_AC: Alternative allele counts in the All of Us American descendent samples. 197 AllofUs_AMR_AN: Total allele counts in the All of Us American descendent samples. 198 AllofUs_AMR_AF: Alternative allele frequency in the All of Us American descendent samples. 199 AllofUs_EAS_AC: Alternative allele counts in the All of Us East Asian descendent samples. 200 AllofUs_EAS_AN: Total allele counts in the All of Us East Asian descendent samples. 201 AllofUs_EAS_AF: Alternative allele frequency in the All of Us East Asian descendent samples. 202 AllofUs_EUR_AC: Alternative allele counts in the All of Us European descendent samples. 203 AllofUs_EUR_AN: Total allele counts in the All of Us European descendent samples. 204 AllofUs_EUR_AF: Alternative allele frequency in the All of Us European descendent samples. 205 AllofUs_MID_AC: Alternative allele counts in the All of Us Middle Eastern descendent samples. 206 AllofUs_MID_AN: Total allele counts in the All of Us Middle Eastern descendent samples. 
207 AllofUs_MID_AF: Alternative allele frequency in the All of Us Middle Eastern descendent samples. 208 AllofUs_SAS_AC: Alternative allele counts in the All of Us South Asian descendent samples. 209 AllofUs_SAS_AN: Total allele counts in the All of Us South Asian descendent samples. 210 AllofUs_SAS_AF: Alternative allele frequency in the All of Us South Asian descendent samples. 211 AllofUs_OTH_AC: Alternative allele counts in the All of Us other samples. 212 AllofUs_OTH_AN: Total allele counts in the All of Us other samples. 213 AllofUs_OTH_AF: Alternative allele frequency in the All of Us other samples. 214 AllofUs_POPMAX_AF: Maximum alternative allele frequency across all populations in All of Us. 215 AllofUs_POPMAX_AC: Alternative allele count(s) of the population(s) with AllofUs_POPMAX_AF. Multiple entries are separated by ";". 216 AllofUs_POPMAX_AN: Total allele count(s) of the population(s) with AllofUs_POPMAX_AF. Multiple entries are separated by ";". 217 AllofUs_POPMAX_POP: Population(s) with AllofUs_POPMAX_AF. Multiple entries are separated by ";". 218 RegeneronME_ALL_AC: Alternate allele count of all the whole Regeneron Genetics Center Million Exome (RGC-ME) (~983,578 exomes) data. 219 RegeneronME_ALL_AF: Alternate allele frequency of all the whole Regeneron Genetics Center Million Exome (RGC-ME) (~983,578 exomes) data. 220 RegeneronME_ALL_AN: Total number of alleles of all the whole Regeneron Genetics Center Million Exome (RGC-ME) (~983,578 exomes) data. 221 RegeneronME_AFR_AC: Probabilistic alternate allele count of African descendent samples in RGC-ME. 222 RegeneronME_AFR_AF: Probabilistic alternate allele frequency of African descendent samples in RGC-ME. 223 RegeneronME_AFR_AN: Probabilistic total number of alleles of African descendent samples in RGC-ME. 224 RegeneronME_AMI_AC: Alternate allele count of Amish descendent samples in RGC-ME. 225 RegeneronME_AMI_AF: Alternate allele frequency of Amish descendent samples in RGC-ME. 
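The AllofUs_POPMAX_* columns above report the maximum alternative allele frequency across populations, with ";"-separated entries when several populations share that maximum. The Python sketch below shows one way such values could be derived from per-population AC and AN. It is an illustration only: the function name popmax, the skipping of populations with AN of 0, and the decision to consider every population passed in (the README does not say whether OTH takes part) are assumptions.

```python
from typing import Dict, Optional, Tuple


def popmax(counts: Dict[str, Tuple[int, int]]) -> Optional[Dict[str, str]]:
    """Summarise the population(s) with the highest allele frequency.

    counts maps a population label to (AC, AN). Returns None when no
    population has a positive AN.
    """
    freqs = {pop: ac / an for pop, (ac, an) in counts.items() if an > 0}
    if not freqs:
        return None
    best = max(freqs.values())
    # Keep input order; ties yield several populations.
    pops = [pop for pop, af in freqs.items() if af == best]
    return {
        "POPMAX_AF": repr(best),
        "POPMAX_AC": ";".join(str(counts[p][0]) for p in pops),
        "POPMAX_AN": ";".join(str(counts[p][1]) for p in pops),
        "POPMAX_POP": ";".join(pops),
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    example = {"AFR": (5, 1000), "EUR": (10, 2000), "EAS": (1, 1000)}
    print(popmax(example))
```

In the example, AFR and EUR both have a frequency of 0.005, so the AC, AN and POP fields each carry two entries in matching order.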
226 RegeneronME_AMI_AN: Total number of alleles of Amish descendent samples in RGC-ME. 227 RegeneronME_ASH_AC: Probabilistic alternate allele count of Ashkenazi Jewish descendent samples in RGC-ME. 228 RegeneronME_ASH_AF: Probabilistic alternate allele frequency of Ashkenazi Jewish descendent samples in RGC-ME. 229 RegeneronME_ASH_AN: Probabilistic total number of alleles of Ashkenazi Jewish descendent samples in RGC-ME. 230 RegeneronME_BI_AC: Probabilistic alternate allele count of British Isles descendent samples in RGC-ME. 231 RegeneronME_BI_AF: Probabilistic alternate allele frequency of British Isles descendent samples in RGC-ME. 232 RegeneronME_BI_AN: Probabilistic total number of alleles of British Isles descendent samples in RGC-ME. 233 RegeneronME_C_EUR_AC: Probabilistic alternate allele count of Central Europe descendent samples in RGC-ME. 234 RegeneronME_C_EUR_AF: Probabilistic alternate allele frequency of Central Europe descendent samples in RGC-ME. 235 RegeneronME_C_EUR_AN: Probabilistic total number of alleles of Central Europe descendent samples in RGC-ME. 236 RegeneronME_EAS_AC: Probabilistic alternate allele count of East Asian descendent samples in RGC-ME. 237 RegeneronME_EAS_AF: Probabilistic alternate allele frequency of East Asian descendent samples in RGC-ME. 238 RegeneronME_EAS_AN: Probabilistic total number of alleles of East Asian descendent samples in RGC-ME. 239 RegeneronME_EUR_AC: Probabilistic alternate allele count of European descendent samples in RGC-ME. 240 RegeneronME_EUR_AF: Probabilistic alternate allele frequency of European descendent samples in RGC-ME. 241 RegeneronME_EUR_AN: Probabilistic total number of alleles of European descendent samples in RGC-ME. 242 RegeneronME_E_AFR_AC: Probabilistic alternate allele count of East Africa descendent samples in RGC-ME. 243 RegeneronME_E_AFR_AF: Probabilistic alternate allele frequency of East Africa descendent samples in RGC-ME. 
244 RegeneronME_E_AFR_AN: Probabilistic total number of alleles of East Africa descendent samples in RGC-ME. 245 RegeneronME_E_ASIA_AC: Probabilistic alternate allele count of East Asia descendent samples in RGC-ME. 246 RegeneronME_E_ASIA_AF: Probabilistic alternate allele frequency of East Asia descendent samples in RGC-ME. 247 RegeneronME_E_ASIA_AN: Probabilistic total number of alleles of East Asia descendent samples in RGC-ME. 248 RegeneronME_FIN_AC: Probabilistic alternate allele count of Finland descendent samples in RGC-ME. 249 RegeneronME_FIN_AF: Probabilistic alternate allele frequency of Finland descendent samples in RGC-ME. 250 RegeneronME_FIN_AN: Probabilistic total number of alleles of Finland descendent samples in RGC-ME. 251 RegeneronME_GBR_AC: Probabilistic alternate allele count of Great Britain descendent samples in RGC-ME. 252 RegeneronME_GBR_AF: Probabilistic alternate allele frequency of Great Britain descendent samples in RGC-ME. 253 RegeneronME_GBR_AN: Probabilistic total number of alleles of Great Britain descendent samples in RGC-ME. 254 RegeneronME_GHA_AC: Probabilistic alternate allele count of Ghana descendent samples in RGC-ME. 255 RegeneronME_GHA_AF: Probabilistic alternate allele frequency of Ghana descendent samples in RGC-ME. 256 RegeneronME_GHA_AN: Probabilistic total number of alleles of Ghana descendent samples in RGC-ME. 257 RegeneronME_IAM_AC: Probabilistic alternate allele count of Indigenous American descendent samples in RGC-ME. 258 RegeneronME_IAM_AF: Probabilistic alternate allele frequency of Indigenous American descendent samples in RGC-ME. 259 RegeneronME_IAM_AN: Probabilistic total number of alleles of Indigenous American descendent samples in RGC-ME. 260 RegeneronME_IND_AC: Probabilistic alternate allele count of Indian descendent samples in RGC-ME. 261 RegeneronME_IND_AF: Probabilistic alternate allele frequency of Indian descendent samples in RGC-ME. 
262 RegeneronME_IND_AN: Probabilistic total number of alleles of Indian descendent samples in RGC-ME. 263 RegeneronME_IRE_AC: Probabilistic alternate allele count of Ireland of Northern Ireland descendent samples in RGC-ME. 264 RegeneronME_IRE_AF: Probabilistic alternate allele frequency of Ireland of Northern Ireland descendent samples in RGC-ME. 265 RegeneronME_IRE_AN: Probabilistic total number of alleles of Ireland of Northern Ireland descendent samples in RGC-ME. 266 RegeneronME_ITA_AC: Probabilistic alternate allele count of Italy descendent samples in RGC-ME. 267 RegeneronME_ITA_AF: Probabilistic alternate allele frequency of Italy descendent samples in RGC-ME. 268 RegeneronME_ITA_AN: Probabilistic total number of alleles of Italy descendent samples in RGC-ME. 269 RegeneronME_MEA_AC: Probabilistic alternate allele count of Middle East descendent samples in RGC-ME. 270 RegeneronME_MEA_AF: Probabilistic alternate allele frequency of Middle East descendent samples in RGC-ME. 271 RegeneronME_MEA_AN: Probabilistic total number of alleles of Middle East descendent samples in RGC-ME. 272 RegeneronME_MEX_AC: Probabilistic alternate allele count of Mexico descendent samples in RGC-ME. 273 RegeneronME_MEX_AF: Probabilistic alternate allele frequency of Mexico descendent samples in RGC-ME. 274 RegeneronME_MEX_AN: Probabilistic total number of alleles of Mexico descendent samples in RGC-ME. 275 RegeneronME_NIN_AC: Probabilistic alternate allele count of Nigeria-Niger descendent samples in RGC-ME. 276 RegeneronME_NIN_AF: Probabilistic alternate allele frequency of Nigeria-Niger descendent samples in RGC-ME. 277 RegeneronME_NIN_AN: Probabilistic total number of alleles of Nigeria-Niger descendent samples in RGC-ME. 278 RegeneronME_N_EUR_AC: Probabilistic alternate allele count of Northern Europe descendent samples in RGC-ME. 279 RegeneronME_N_EUR_AF: Probabilistic alternate allele frequency of Northern Europe descendent samples in RGC-ME. 
280 RegeneronME_N_EUR_AN: Probabilistic total number of alleles of Northern Europe descendent samples in RGC-ME. 281 RegeneronME_PAK_AC: Probabilistic alternate allele count of Pakistan descendent samples in RGC-ME. 282 RegeneronME_PAK_AF: Probabilistic alternate allele frequency of Pakistan descendent samples in RGC-ME. 283 RegeneronME_PAK_AN: Probabilistic total number of alleles of Pakistan descendent samples in RGC-ME. 284 RegeneronME_SAS_AC: Probabilistic alternate allele count of South Asia descendent samples in RGC-ME. 285 RegeneronME_SAS_AF: Probabilistic alternate allele frequency of South Asia descendent samples in RGC-ME. 286 RegeneronME_SAS_AN: Probabilistic total number of alleles of South Asia descendent samples in RGC-ME. 287 RegeneronME_SE_ASIA_AC: Probabilistic alternate allele count of South East Asia descendent samples in RGC-ME. 288 RegeneronME_SE_ASIA_AF: Probabilistic alternate allele frequency of South East Asia descendent samples in RGC-ME. 289 RegeneronME_SE_ASIA_AN: Probabilistic total number of alleles of South East Asia descendent samples in RGC-ME. 290 RegeneronME_SPA_AC: Probabilistic alternate allele count of Spain descendent samples in RGC-ME. 291 RegeneronME_SPA_AF: Probabilistic alternate allele frequency of Spain descendent samples in RGC-ME. 292 RegeneronME_SPA_AN: Probabilistic total number of alleles of Spain descendent samples in RGC-ME. 293 RegeneronME_S_AFR_AC: Probabilistic alternate allele count of South Africa descendent samples in RGC-ME. 294 RegeneronME_S_AFR_AF: Probabilistic alternate allele frequency of South Africa descendent samples in RGC-ME. 295 RegeneronME_S_AFR_AN: Probabilistic total number of alleles of South Africa descendent samples in RGC-ME. 296 RegeneronME_S_EUR_AC: Probabilistic alternate allele count of Southern Europe descendent samples in RGC-ME. 297 RegeneronME_S_EUR_AF: Probabilistic alternate allele frequency of Southern Europe descendent samples in RGC-ME. 
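The RegeneronME columns above follow a regular naming scheme: a source prefix, a population code that may itself contain underscores (for example C_EUR or SE_ASIA), and a metric suffix (AC, AF or AN). The Python sketch below groups such columns by population and checks that AF is close to AC / AN. The helper names, the "." missing-value convention, and the tolerance used in the check are assumptions for illustration; because the counts are described as probabilistic, they are parsed as floats rather than integers.

```python
from typing import Dict, Tuple


def split_column(name: str) -> Tuple[str, str, str]:
    """Split e.g. "RegeneronME_SE_ASIA_AC" into (source, population, metric)."""
    source, rest = name.split("_", 1)
    population, metric = rest.rsplit("_", 1)
    return source, population, metric


def group_by_population(row: Dict[str, str]) -> Dict[str, Dict[str, float]]:
    """Collect the RegeneronME values of one row, keyed by population."""
    grouped: Dict[str, Dict[str, float]] = {}
    for name, value in row.items():
        if not name.startswith("RegeneronME_") or value in ("", "."):
            continue  # assumption: "." marks a missing value
        _, population, metric = split_column(name)
        grouped.setdefault(population, {})[metric] = float(value)
    return grouped


def af_is_consistent(entry: Dict[str, float], tol: float = 1e-6) -> bool:
    """True when AC, AN and AF are all present and AF is within tol of AC / AN."""
    if not {"AC", "AN", "AF"} <= entry.keys() or entry["AN"] <= 0:
        return False
    return abs(entry["AF"] - entry["AC"] / entry["AN"]) <= tol


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    row = {
        "RegeneronME_SE_ASIA_AC": "3",
        "RegeneronME_SE_ASIA_AN": "1200",
        "RegeneronME_SE_ASIA_AF": "0.0025",
        "RegeneronME_FIN_AC": ".",
    }
    grouped = group_by_population(row)
    print(grouped, af_is_consistent(grouped["SE_ASIA"]))
```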
298 RegeneronME_S_EUR_AN: Probabilistic total number of alleles of Southern Europe descendent samples in RGC-ME. 299 RegeneronME_W_AFR_AC: Probabilistic alternate allele count of West Africa descendent samples in RGC-ME. 300 RegeneronME_W_AFR_AF: Probabilistic alternate allele frequency of West Africa descendent samples in RGC-ME. 301 RegeneronME_W_AFR_AN: Probabilistic total number of alleles of West Africa descendent samples in RGC-ME. 302 gnomAD2.1.1_exomes_flag: information from gnomAD exome data indicating whether the variant falls within low-complexity (lcr) or segmental duplication (segdup) or decoy regions. The flag can be either "." for high-quality PASS or not reported/polymorphic in gnomAD exomes, "lcr" for within lcr, "segdup" for within segdup, or "decoy" for within decoy region. 303 gnomAD2.1.1_exomes_controls_AC: Alternative allele count in the controls subset of whole gnomAD exome samples v2.1.1 304 gnomAD2.1.1_exomes_controls_AN: Total allele count in the controls subset of whole gnomAD exome samples v2.1.1 305 gnomAD2.1.1_exomes_controls_AF: Alternative allele frequency in the controls subset of whole gnomAD exome samples v2.1.1 306 gnomAD2.1.1_exomes_controls_nhomalt: Count of individuals with homozygous alternative allele in the controls subset of whole gnomAD exome samples v2.1.1 307 gnomAD2.1.1_exomes_non_neuro_AC: Alternative allele count in the non-neuro subset of whole gnomAD exome samples v2.1.1 308 gnomAD2.1.1_exomes_non_neuro_AN: Total allele count in the non-neuro subset of whole gnomAD exome samples v2.1.1 309 gnomAD2.1.1_exomes_non_neuro_AF: Alternative allele frequency in the non-neuro subset of whole gnomAD exome samples v2.1.1 310 gnomAD2.1.1_exomes_non_neuro_nhomalt: Count of individuals with homozygous alternative allele in the non-neuro subset of whole gnomAD exome samples v2.1.1 311 gnomAD2.1.1_exomes_non_cancer_AC: Alternative allele count in the non-cancer subset of whole gnomAD exome samples v2.1.1 312
gnomAD2.1.1_exomes_non_cancer_AN: Total allele count in the non-cancer subset of whole gnomAD exome samples v2.1.1 313 gnomAD2.1.1_exomes_non_cancer_AF: Alternative allele frequency in the non-cancer subset of whole gnomAD exome samples v2.1.1 314 gnomAD2.1.1_exomes_non_cancer_nhomalt: Count of individuals with homozygous alternative allele in the non-cancer subset of whole gnomAD exome samples v2.1.1 315 gnomAD2.1.1_exomes_controls_AFR_AC: Alternative allele count in the controls subset of African/African American gnomAD exome samples v2.1.1 316 gnomAD2.1.1_exomes_controls_AFR_AN: Total allele count in the controls subset of African/African American gnomAD exome samples v2.1.1 317 gnomAD2.1.1_exomes_controls_AFR_AF: Alternative allele frequency in the controls subset of African/African American gnomAD exome samples v2.1.1 318 gnomAD2.1.1_exomes_controls_AFR_nhomalt: Count of individuals with homozygous alternative allele in the controls subset of African/African American gnomAD exome samples v2.1.1 319 gnomAD2.1.1_exomes_controls_AMR_AC: Alternative allele count in the controls subset of Latino gnomAD exome samples v2.1.1 320 gnomAD2.1.1_exomes_controls_AMR_AN: Total allele count in the controls subset of Latino gnomAD exome samples v2.1.1 321 gnomAD2.1.1_exomes_controls_AMR_AF: Alternative allele frequency in the controls subset of Latino gnomAD exome samples v2.1.1 322 gnomAD2.1.1_exomes_controls_AMR_nhomalt: Count of individuals with homozygous alternative allele in the controls subset of Latino gnomAD exome samples v2.1.1 323 gnomAD2.1.1_exomes_controls_ASJ_AC: Alternative allele count in the controls subset of Ashkenazi Jewish gnomAD exome samples v2.1.1 324 gnomAD2.1.1_exomes_controls_ASJ_AN: Total allele count in the controls subset of Ashkenazi Jewish gnomAD exome samples v2.1.1 325 gnomAD2.1.1_exomes_controls_ASJ_AF: Alternative allele frequency in the controls subset of Ashkenazi Jewish gnomAD exome samples v2.1.1 326 
gnomAD2.1.1_exomes_controls_ASJ_nhomalt: Count of individuals with homozygous alternative allele in the controls subset of Ashkenazi Jewish gnomAD exome samples v2.1.1 327 gnomAD2.1.1_exomes_controls_EAS_AC: Alternative allele count in the controls subset of East Asian gnomAD exome samples v2.1.1 328 gnomAD2.1.1_exomes_controls_EAS_AN: Total allele count in the controls subset of East Asian gnomAD exome samples v2.1.1 329 gnomAD2.1.1_exomes_controls_EAS_AF: Alternative allele frequency in the controls subset of East Asian gnomAD exome samples v2.1.1 330 gnomAD2.1.1_exomes_controls_EAS_nhomalt: Count of individuals with homozygous alternative allele in the controls subset of East Asian gnomAD exome samples v2.1.1 331 gnomAD2.1.1_exomes_controls_FIN_AC: Alternative allele count in the controls subset of Finnish gnomAD exome samples v2.1.1 332 gnomAD2.1.1_exomes_controls_FIN_AN: Total allele count in the controls subset of Finnish gnomAD exome samples v2.1.1 333 gnomAD2.1.1_exomes_controls_FIN_AF: Alternative allele frequency in the controls subset of Finnish gnomAD exome samples v2.1.1 334 gnomAD2.1.1_exomes_controls_FIN_nhomalt: Count of individuals with homozygous alternative allele in the controls subset of Finnish gnomAD exome samples v2.1.1 335 gnomAD2.1.1_exomes_controls_NFE_AC: Alternative allele count in the controls subset of Non-Finnish European gnomAD exome samples v2.1.1 336 gnomAD2.1.1_exomes_controls_NFE_AN: Total allele count in the controls subset of Non-Finnish European gnomAD exome samples v2.1.1 337 gnomAD2.1.1_exomes_controls_NFE_AF: Alternative allele frequency in the controls subset of Non-Finnish European gnomAD exome samples v2.1.1 338 gnomAD2.1.1_exomes_controls_NFE_nhomalt: Count of individuals with homozygous alternative allele in the controls subset of Non-Finnish European gnomAD exome samples v2.1.1 339 gnomAD2.1.1_exomes_controls_SAS_AC: Alternative allele count in the controls subset of South Asian gnomAD exome samples v2.1.1 340 
gnomAD2.1.1_exomes_controls_SAS_AN: Total allele count in the controls subset of South Asian gnomAD exome samples v2.1.1 341 gnomAD2.1.1_exomes_controls_SAS_AF: Alternative allele frequency in the controls subset of South Asian gnomAD exome samples v2.1.1 342 gnomAD2.1.1_exomes_controls_SAS_nhomalt: Count of individuals with homozygous alternative allele in the controls subset of South Asian gnomAD exome samples v2.1.1 343 gnomAD2.1.1_exomes_controls_POPMAX_AC: Allele count in the controls subset of population with the maximum AF 344 gnomAD2.1.1_exomes_controls_POPMAX_AN: Total number of alleles in the controls subset of population with the maximum AF 345 gnomAD2.1.1_exomes_controls_POPMAX_AF: Maximum allele frequency across populations (excluding samples of Ashkenazi, Finnish, and indeterminate ancestry) in the controls subset 346 gnomAD2.1.1_exomes_controls_POPMAX_nhomalt: Count of homozygous individuals in the controls subset of population with the maximum allele frequency 347 gnomAD2.1.1_exomes_non_neuro_AFR_AC: Alternative allele count in the non-neuro subset of African/African American gnomAD exome samples v2.1.1 348 gnomAD2.1.1_exomes_non_neuro_AFR_AN: Total allele count in the non-neuro subset of African/African American gnomAD exome samples v2.1.1 349 gnomAD2.1.1_exomes_non_neuro_AFR_AF: Alternative allele frequency in the non-neuro subset of African/African American gnomAD exome samples v2.1.1 350 gnomAD2.1.1_exomes_non_neuro_AFR_nhomalt: Count of individuals with homozygous alternative allele in the non-neuro subset of African/African American gnomAD exome samples v2.1.1 351 gnomAD2.1.1_exomes_non_neuro_AMR_AC: Alternative allele count in the non-neuro subset of Latino gnomAD exome samples v2.1.1 352 gnomAD2.1.1_exomes_non_neuro_AMR_AN: Total allele count in the non-neuro subset of Latino gnomAD exome samples v2.1.1 353 gnomAD2.1.1_exomes_non_neuro_AMR_AF: Alternative allele frequency in the non-neuro subset of Latino gnomAD exome samples v2.1.1 354 
gnomAD2.1.1_exomes_non_neuro_AMR_nhomalt: Count of individuals with homozygous alternative allele in the non-neuro subset of Latino gnomAD exome samples v2.1.1 355 gnomAD2.1.1_exomes_non_neuro_ASJ_AC: Alternative allele count in the non-neuro subset of Ashkenazi Jewish gnomAD exome samples v2.1.1 356 gnomAD2.1.1_exomes_non_neuro_ASJ_AN: Total allele count in the non-neuro subset of Ashkenazi Jewish gnomAD exome samples v2.1.1 357 gnomAD2.1.1_exomes_non_neuro_ASJ_AF: Alternative allele frequency in the non-neuro subset of Ashkenazi Jewish gnomAD exome samples v2.1.1 358 gnomAD2.1.1_exomes_non_neuro_ASJ_nhomalt: Count of individuals with homozygous alternative allele in the non-neuro subset of Ashkenazi Jewish gnomAD exome samples v2.1.1 359 gnomAD2.1.1_exomes_non_neuro_EAS_AC: Alternative allele count in the non-neuro subset of East Asian gnomAD exome samples v2.1.1 360 gnomAD2.1.1_exomes_non_neuro_EAS_AN: Total allele count in the non-neuro subset of East Asian gnomAD exome samples v2.1.1 361 gnomAD2.1.1_exomes_non_neuro_EAS_AF: Alternative allele frequency in the non-neuro subset of East Asian gnomAD exome samples v2.1.1 362 gnomAD2.1.1_exomes_non_neuro_EAS_nhomalt: Count of individuals with homozygous alternative allele in the non-neuro subset of East Asian gnomAD exome samples v2.1.1 363 gnomAD2.1.1_exomes_non_neuro_FIN_AC: Alternative allele count in the non-neuro subset of Finnish gnomAD exome samples v2.1.1 364 gnomAD2.1.1_exomes_non_neuro_FIN_AN: Total allele count in the non-neuro subset of Finnish gnomAD exome samples v2.1.1 365 gnomAD2.1.1_exomes_non_neuro_FIN_AF: Alternative allele frequency in the non-neuro subset of Finnish gnomAD exome samples v2.1.1 366 gnomAD2.1.1_exomes_non_neuro_FIN_nhomalt: Count of individuals with homozygous alternative allele in the non-neuro subset of Finnish gnomAD exome samples v2.1.1 367 gnomAD2.1.1_exomes_non_neuro_NFE_AC: Alternative allele count in the non-neuro subset of Non-Finnish European gnomAD exome samples v2.1.1 
368 gnomAD2.1.1_exomes_non_neuro_NFE_AN: Total allele count in the non-neuro subset of Non-Finnish European gnomAD exome samples v2.1.1
369 gnomAD2.1.1_exomes_non_neuro_NFE_AF: Alternative allele frequency in the non-neuro subset of Non-Finnish European gnomAD exome samples v2.1.1
370 gnomAD2.1.1_exomes_non_neuro_NFE_nhomalt: Count of individuals with homozygous alternative allele in the non-neuro subset of Non-Finnish European gnomAD exome samples v2.1.1
371 gnomAD2.1.1_exomes_non_neuro_SAS_AC: Alternative allele count in the non-neuro subset of South Asian gnomAD exome samples v2.1.1
372 gnomAD2.1.1_exomes_non_neuro_SAS_AN: Total allele count in the non-neuro subset of South Asian gnomAD exome samples v2.1.1
373 gnomAD2.1.1_exomes_non_neuro_SAS_AF: Alternative allele frequency in the non-neuro subset of South Asian gnomAD exome samples v2.1.1
374 gnomAD2.1.1_exomes_non_neuro_SAS_nhomalt: Count of individuals with homozygous alternative allele in the non-neuro subset of South Asian gnomAD exome samples v2.1.1
375 gnomAD2.1.1_exomes_non_neuro_POPMAX_AC: Allele count in the non-neuro subset of population with the maximum AF
376 gnomAD2.1.1_exomes_non_neuro_POPMAX_AN: Total number of alleles in the non-neuro subset of population with the maximum AF
377 gnomAD2.1.1_exomes_non_neuro_POPMAX_AF: Maximum allele frequency across populations (excluding samples of Ashkenazi, Finnish, and indeterminate ancestry) in the non-neuro subset
378 gnomAD2.1.1_exomes_non_neuro_POPMAX_nhomalt: Count of homozygous individuals in the non-neuro subset of population with the maximum allele frequency
379 gnomAD2.1.1_exomes_non_cancer_AFR_AC: Alternative allele count in the non-cancer subset of African/African American gnomAD exome samples v2.1.1
380 gnomAD2.1.1_exomes_non_cancer_AFR_AN: Total allele count in the non-cancer subset of African/African American gnomAD exome samples v2.1.1
381 gnomAD2.1.1_exomes_non_cancer_AFR_AF: Alternative allele frequency in the non-cancer subset of African/African American gnomAD exome samples v2.1.1
382 gnomAD2.1.1_exomes_non_cancer_AFR_nhomalt: Count of individuals with homozygous alternative allele in the non-cancer subset of African/African American gnomAD exome samples v2.1.1
383 gnomAD2.1.1_exomes_non_cancer_AMR_AC: Alternative allele count in the non-cancer subset of Latino gnomAD exome samples v2.1.1
384 gnomAD2.1.1_exomes_non_cancer_AMR_AN: Total allele count in the non-cancer subset of Latino gnomAD exome samples v2.1.1
385 gnomAD2.1.1_exomes_non_cancer_AMR_AF: Alternative allele frequency in the non-cancer subset of Latino gnomAD exome samples v2.1.1
386 gnomAD2.1.1_exomes_non_cancer_AMR_nhomalt: Count of individuals with homozygous alternative allele in the non-cancer subset of Latino gnomAD exome samples v2.1.1
387 gnomAD2.1.1_exomes_non_cancer_ASJ_AC: Alternative allele count in the non-cancer subset of Ashkenazi Jewish gnomAD exome samples v2.1.1
388 gnomAD2.1.1_exomes_non_cancer_ASJ_AN: Total allele count in the non-cancer subset of Ashkenazi Jewish gnomAD exome samples v2.1.1
389 gnomAD2.1.1_exomes_non_cancer_ASJ_AF: Alternative allele frequency in the non-cancer subset of Ashkenazi Jewish gnomAD exome samples v2.1.1
390 gnomAD2.1.1_exomes_non_cancer_ASJ_nhomalt: Count of individuals with homozygous alternative allele in the non-cancer subset of Ashkenazi Jewish gnomAD exome samples v2.1.1
391 gnomAD2.1.1_exomes_non_cancer_EAS_AC: Alternative allele count in the non-cancer subset of East Asian gnomAD exome samples v2.1.1
392 gnomAD2.1.1_exomes_non_cancer_EAS_AN: Total allele count in the non-cancer subset of East Asian gnomAD exome samples v2.1.1
393 gnomAD2.1.1_exomes_non_cancer_EAS_AF: Alternative allele frequency in the non-cancer subset of East Asian gnomAD exome samples v2.1.1
394 gnomAD2.1.1_exomes_non_cancer_EAS_nhomalt: Count of individuals with homozygous alternative allele in the non-cancer subset of East Asian gnomAD exome samples v2.1.1
395 gnomAD2.1.1_exomes_non_cancer_FIN_AC: Alternative allele count in the non-cancer subset of Finnish gnomAD exome samples v2.1.1
396 gnomAD2.1.1_exomes_non_cancer_FIN_AN: Total allele count in the non-cancer subset of Finnish gnomAD exome samples v2.1.1
397 gnomAD2.1.1_exomes_non_cancer_FIN_AF: Alternative allele frequency in the non-cancer subset of Finnish gnomAD exome samples v2.1.1
398 gnomAD2.1.1_exomes_non_cancer_FIN_nhomalt: Count of individuals with homozygous alternative allele in the non-cancer subset of Finnish gnomAD exome samples v2.1.1
399 gnomAD2.1.1_exomes_non_cancer_NFE_AC: Alternative allele count in the non-cancer subset of Non-Finnish European gnomAD exome samples v2.1.1
400 gnomAD2.1.1_exomes_non_cancer_NFE_AN: Total allele count in the non-cancer subset of Non-Finnish European gnomAD exome samples v2.1.1
401 gnomAD2.1.1_exomes_non_cancer_NFE_AF: Alternative allele frequency in the non-cancer subset of Non-Finnish European gnomAD exome samples v2.1.1
402 gnomAD2.1.1_exomes_non_cancer_NFE_nhomalt: Count of individuals with homozygous alternative allele in the non-cancer subset of Non-Finnish European gnomAD exome samples v2.1.1
403 gnomAD2.1.1_exomes_non_cancer_SAS_AC: Alternative allele count in the non-cancer subset of South Asian gnomAD exome samples v2.1.1
404 gnomAD2.1.1_exomes_non_cancer_SAS_AN: Total allele count in the non-cancer subset of South Asian gnomAD exome samples v2.1.1
405 gnomAD2.1.1_exomes_non_cancer_SAS_AF: Alternative allele frequency in the non-cancer subset of South Asian gnomAD exome samples v2.1.1
406 gnomAD2.1.1_exomes_non_cancer_SAS_nhomalt: Count of individuals with homozygous alternative allele in the non-cancer subset of South Asian gnomAD exome samples v2.1.1
407 gnomAD2.1.1_exomes_non_cancer_POPMAX_AC: Allele count in the non-cancer subset of population with the maximum AF
408 gnomAD2.1.1_exomes_non_cancer_POPMAX_AN: Total number of alleles in the non-cancer subset of population with the maximum AF
409 gnomAD2.1.1_exomes_non_cancer_POPMAX_AF: Maximum allele frequency across populations (excluding samples of Ashkenazi, Finnish, and indeterminate ancestry) in the non-cancer subset
410 gnomAD2.1.1_exomes_non_cancer_POPMAX_nhomalt: Count of homozygous individuals in the non-cancer subset of population with the maximum allele frequency
411 gnomAD4.1_joint_flag: Information from gnomAD joint (genome+exome) data indicating whether the variant falls within a low-complexity (lcr), segmental duplication (segdup), or decoy region. The flag is "." for high-quality PASS or not reported/polymorphic in the gnomAD joint data, "lcr" for within lcr, "segdup" for within segdup, or "decoy" for within a decoy region.
412 gnomAD4.1_joint_AC: Alternative allele count in the whole gnomAD joint (genome+exome) samples v4.1
413 gnomAD4.1_joint_AN: Total allele count in the whole gnomAD joint (genome+exome) samples v4.1
414 gnomAD4.1_joint_AF: Alternative allele frequency in the whole gnomAD joint (genome+exome) samples v4.1
415 gnomAD4.1_joint_nhomalt: Count of individuals with homozygous alternative allele in the whole gnomAD joint (genome+exome) samples v4.1
416 gnomAD4.1_joint_POPMAX_AC: Allele count in the population with the maximum AF
417 gnomAD4.1_joint_POPMAX_AN: Total number of alleles in the population with the maximum AF
418 gnomAD4.1_joint_POPMAX_AF: Maximum allele frequency across populations (excluding samples of Ashkenazi, Finnish, and indeterminate ancestry)
419 gnomAD4.1_joint_POPMAX_nhomalt: Count of homozygous individuals in the population with the maximum allele frequency
420 gnomAD4.1_joint_AFR_AC: Alternative allele count in the African/African American gnomAD joint (genome+exome) samples v4.1
421 gnomAD4.1_joint_AFR_AN: Total allele count in the African/African American gnomAD joint (genome+exome) samples v4.1
422 gnomAD4.1_joint_AFR_AF: Alternative allele frequency in the African/African American gnomAD joint (genome+exome) samples v4.1
423 gnomAD4.1_joint_AFR_nhomalt: Count of individuals with homozygous alternative allele in the African/African American gnomAD joint (genome+exome) samples v4.1
424 gnomAD4.1_joint_AMI_AC: Alternative allele count in the Amish gnomAD joint (genome+exome) samples v4.1
425 gnomAD4.1_joint_AMI_AN: Total allele count in the Amish gnomAD joint (genome+exome) samples v4.1
426 gnomAD4.1_joint_AMI_AF: Alternative allele frequency in the Amish gnomAD joint (genome+exome) samples v4.1
427 gnomAD4.1_joint_AMI_nhomalt: Count of individuals with homozygous alternative allele in the Amish gnomAD joint (genome+exome) samples v4.1
428 gnomAD4.1_joint_AMR_AC: Alternative allele count in the Latino gnomAD joint (genome+exome) samples v4.1
429 gnomAD4.1_joint_AMR_AN: Total allele count in the Latino gnomAD joint (genome+exome) samples v4.1
430 gnomAD4.1_joint_AMR_AF: Alternative allele frequency in the Latino gnomAD joint (genome+exome) samples v4.1
431 gnomAD4.1_joint_AMR_nhomalt: Count of individuals with homozygous alternative allele in the Latino gnomAD joint (genome+exome) samples v4.1
432 gnomAD4.1_joint_ASJ_AC: Alternative allele count in the Ashkenazi Jewish gnomAD joint (genome+exome) samples v4.1
433 gnomAD4.1_joint_ASJ_AN: Total allele count in the Ashkenazi Jewish gnomAD joint (genome+exome) samples v4.1
434 gnomAD4.1_joint_ASJ_AF: Alternative allele frequency in the Ashkenazi Jewish gnomAD joint (genome+exome) samples v4.1
435 gnomAD4.1_joint_ASJ_nhomalt: Count of individuals with homozygous alternative allele in the Ashkenazi Jewish gnomAD joint (genome+exome) samples v4.1
436 gnomAD4.1_joint_EAS_AC: Alternative allele count in the East Asian gnomAD joint (genome+exome) samples v4.1
437 gnomAD4.1_joint_EAS_AN: Total allele count in the East Asian gnomAD joint (genome+exome) samples v4.1
438 gnomAD4.1_joint_EAS_AF: Alternative allele frequency in the East Asian gnomAD joint (genome+exome) samples v4.1
439 gnomAD4.1_joint_EAS_nhomalt: Count of individuals with homozygous alternative allele in the East Asian gnomAD joint (genome+exome) samples v4.1
440 gnomAD4.1_joint_FIN_AC: Alternative allele count in the Finnish gnomAD joint (genome+exome) samples v4.1
441 gnomAD4.1_joint_FIN_AN: Total allele count in the Finnish gnomAD joint (genome+exome) samples v4.1
442 gnomAD4.1_joint_FIN_AF: Alternative allele frequency in the Finnish gnomAD joint (genome+exome) samples v4.1
443 gnomAD4.1_joint_FIN_nhomalt: Count of individuals with homozygous alternative allele in the Finnish gnomAD joint (genome+exome) samples v4.1
444 gnomAD4.1_joint_MID_AC: Alternative allele count in the Middle Eastern gnomAD joint (genome+exome) samples v4.1
445 gnomAD4.1_joint_MID_AN: Total allele count in the Middle Eastern gnomAD joint (genome+exome) samples v4.1
446 gnomAD4.1_joint_MID_AF: Alternative allele frequency in the Middle Eastern gnomAD joint (genome+exome) samples v4.1
447 gnomAD4.1_joint_MID_nhomalt: Count of individuals with homozygous alternative allele in the Middle Eastern gnomAD joint (genome+exome) samples v4.1
448 gnomAD4.1_joint_NFE_AC: Alternative allele count in the Non-Finnish European gnomAD joint (genome+exome) samples v4.1
449 gnomAD4.1_joint_NFE_AN: Total allele count in the Non-Finnish European gnomAD joint (genome+exome) samples v4.1
450 gnomAD4.1_joint_NFE_AF: Alternative allele frequency in the Non-Finnish European gnomAD joint (genome+exome) samples v4.1
451 gnomAD4.1_joint_NFE_nhomalt: Count of individuals with homozygous alternative allele in the Non-Finnish European gnomAD joint (genome+exome) samples v4.1
452 gnomAD4.1_joint_SAS_AC: Alternative allele count in the South Asian gnomAD joint (genome+exome) samples v4.1
453 gnomAD4.1_joint_SAS_AN: Total allele count in the South Asian gnomAD joint (genome+exome) samples v4.1
454 gnomAD4.1_joint_SAS_AF: Alternative allele frequency in the South Asian gnomAD joint (genome+exome) samples v4.1
455 gnomAD4.1_joint_SAS_nhomalt: Count of individuals with homozygous alternative allele in the South Asian gnomAD joint (genome+exome) samples v4.1
456 ALFA_European_AC: Alternative allele count of the European samples in the Allele Frequency Aggregator
457 ALFA_European_AN: Total allele count of the European samples in the Allele Frequency Aggregator
458 ALFA_European_AF: Alternative allele frequency of the European samples in the Allele Frequency Aggregator
459 ALFA_African_Others_AC: Alternative allele count of the individuals with African ancestry in the Allele Frequency Aggregator
460 ALFA_African_Others_AN: Total allele count of the individuals with African ancestry in the Allele Frequency Aggregator
461 ALFA_African_Others_AF: Alternative allele frequency of the individuals with African ancestry in the Allele Frequency Aggregator
462 ALFA_East_Asian_AC: Alternative allele count of the East Asian samples in the Allele Frequency Aggregator
463 ALFA_East_Asian_AN: Total allele count of the East Asian samples in the Allele Frequency Aggregator
464 ALFA_East_Asian_AF: Alternative allele frequency of the East Asian samples in the Allele Frequency Aggregator
465 ALFA_African_American_AC: Alternative allele count of the African American samples in the Allele Frequency Aggregator
466 ALFA_African_American_AN: Total allele count of the African American samples in the Allele Frequency Aggregator
467 ALFA_African_American_AF: Alternative allele frequency of the African American samples in the Allele Frequency Aggregator
468 ALFA_Latin_American_1_AC: Alternative allele count of the Latin American individuals with Afro-Caribbean ancestry in the Allele Frequency Aggregator
469 ALFA_Latin_American_1_AN: Total allele count of the Latin American individuals with Afro-Caribbean ancestry in the Allele Frequency Aggregator
470 ALFA_Latin_American_1_AF: Alternative allele frequency of the Latin American individuals with Afro-Caribbean ancestry in the Allele Frequency Aggregator
471 ALFA_Latin_American_2_AC: Alternative allele count of the Latin American individuals with mostly European and Native American Ancestry in the Allele Frequency Aggregator
472 ALFA_Latin_American_2_AN: Total allele count of the Latin American individuals with mostly European and Native American Ancestry in the Allele Frequency Aggregator
473 ALFA_Latin_American_2_AF: Alternative allele frequency of the Latin American individuals with mostly European and Native American Ancestry in the Allele Frequency Aggregator
474 ALFA_Other_Asian_AC: Alternative allele count of the Asian individuals excluding South or East Asian in the Allele Frequency Aggregator
475 ALFA_Other_Asian_AN: Total allele count of the Asian individuals excluding South or East Asian in the Allele Frequency Aggregator
476 ALFA_Other_Asian_AF: Alternative allele frequency of the Asian individuals excluding South or East Asian in the Allele Frequency Aggregator
477 ALFA_South_Asian_AC: Alternative allele count of the South Asian samples in the Allele Frequency Aggregator
478 ALFA_South_Asian_AN: Total allele count of the South Asian samples in the Allele Frequency Aggregator
479 ALFA_South_Asian_AF: Alternative allele frequency of the South Asian samples in the Allele Frequency Aggregator
480 ALFA_Other_AC: Alternative allele count of the samples whose self-reported population is inconsistent with the GRAF-assigned population in the Allele Frequency Aggregator
481 ALFA_Other_AN: Total allele count of the samples whose self-reported population is inconsistent with the GRAF-assigned population in the Allele Frequency Aggregator
482 ALFA_Other_AF: Alternative allele frequency of the samples whose self-reported population is inconsistent with the GRAF-assigned population in the Allele Frequency Aggregator
483 ALFA_African_AC: Alternative allele count of all African samples (African_Others and African_American) in the Allele Frequency Aggregator
484 ALFA_African_AN: Total allele count of all African samples (African_Others and African_American) in the Allele Frequency Aggregator
485 ALFA_African_AF: Alternative allele frequency of all African samples (African_Others and African_American) in the Allele Frequency Aggregator
486 ALFA_Asian_AC: Alternative allele count of all Asian individuals (East_Asian and Other_Asian, excluding South_Asian) in the Allele Frequency Aggregator
487 ALFA_Asian_AN: Total allele count of all Asian individuals (East_Asian and Other_Asian, excluding South_Asian) in the Allele Frequency Aggregator
488 ALFA_Asian_AF: Alternative allele frequency of all Asian individuals (East_Asian and Other_Asian, excluding South_Asian) in the Allele Frequency Aggregator
489 ALFA_Total_AC: Alternative allele count of the total samples in the Allele Frequency Aggregator
490 ALFA_Total_AN: Total allele count of the total samples in the Allele Frequency Aggregator
491 ALFA_Total_AF: Alternative allele frequency of the total samples in the Allele Frequency Aggregator
492 dbNSFP_POPMAX_AF: Maximum alternative allele frequency across all populations in dbNSFP.
493 dbNSFP_POPMAX_AC: Alternative allele count(s) of the population(s) with dbNSFP_POPMAX_AF. Multiple entries are separated by ";".
494 dbNSFP_POPMAX_POP: Population(s) with dbNSFP_POPMAX_AF. Multiple entries are separated by ";".
Note 1: Missing data is designated as '.'.
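Example (an illustrative sketch, not part of the dbNSFP distribution): the two conventions stated above, '.' for missing data and ';' between multiple entries, could be handled as follows when reading the three dbNSFP_POPMAX columns. The function names, the sample row, and its population labels are hypothetical. Pairing the i-th AC entry with the i-th POP entry is an assumption about ordering that this README does not state explicitly.

```python
# Hypothetical helpers for the '.' (missing) and ';' (multi-entry) conventions.

def parse_value(raw):
    """Return None for the missing-data marker '.', otherwise a float."""
    return None if raw == "." else float(raw)


def parse_multi(raw):
    """Split a ';'-separated field into a list; '.' becomes an empty list."""
    return [] if raw == "." else raw.split(";")


def popmax_summary(row):
    """Summarize the dbNSFP_POPMAX_* columns of one row (a dict of strings).

    Assumption: AC and POP entries are listed in the same order.
    """
    af = parse_value(row["dbNSFP_POPMAX_AF"])
    pops = parse_multi(row["dbNSFP_POPMAX_POP"])
    counts = [int(x) for x in parse_multi(row["dbNSFP_POPMAX_AC"])]
    return {"af": af, "by_population": dict(zip(pops, counts))}


# Made-up row: the labels POP_A and POP_B are placeholders, not real values.
example_row = {
    "dbNSFP_POPMAX_AF": "0.0125",
    "dbNSFP_POPMAX_AC": "5;10",
    "dbNSFP_POPMAX_POP": "POP_A;POP_B",
}
print(popmax_summary(example_row))
```

Mapping '.' to None (or to an empty list) before any numeric conversion keeps the missing-data marker from being read as a number or as a population name.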
Columns of dbNSFP_gene:
Gene_name: Gene symbol from HGNC
Ensembl_gene: Ensembl gene id (from HGNC)
chr: Chromosome number (from HGNC)
495 Gene_old_names: Old gene symbol (from HGNC)
496 Gene_other_names: Other gene names (from HGNC)
497 Uniprot_acc(HGNC/Uniprot): Uniprot acc number (from HGNC and Uniprot)
498 Uniprot_id(HGNC/Uniprot): Uniprot id (from HGNC and Uniprot)
499 Entrez_gene_id: Entrez gene id (from HGNC)
500 CCDS_id: CCDS id (from HGNC)
501 Refseq_id: Refseq gene id (from HGNC)
502 ucsc_id: UCSC gene id (from HGNC)
503 MIM_id: MIM gene id (from HGNC)
504 OMIM_id: MIM gene id from OMIM
505 Gene_full_name: Gene full name (from HGNC)
506 Pathway(Uniprot): Pathway description from Uniprot
507 Pathway(BioCarta)_short: Short name of the Pathway(s) the gene belongs to (from BioCarta)
508 Pathway(BioCarta)_full: Full name(s) of the Pathway(s) the gene belongs to (from BioCarta)
509 Pathway(ConsensusPathDB): Pathway(s) the gene belongs to (from ConsensusPathDB)
510 Pathway(KEGG)_id: ID(s) of the Pathway(s) the gene belongs to (from KEGG)
511 Pathway(KEGG)_full: Full name(s) of the Pathway(s) the gene belongs to (from KEGG)
512 Function_description: Function description of the gene (from Uniprot)
513 Disease_description: Disease(s) the gene causes or is associated with (from Uniprot)
514 MIM_phenotype_id: MIM id(s) of the phenotype the gene causes or is associated with (from Uniprot)
515 MIM_disease: MIM disease name(s) with MIM id(s) in "[]" (from Uniprot)
516 Orphanet_disorder_id: Orphanet Number of the disorder the gene causes or is associated with
517 Orphanet_disorder: Disorder name from Orphanet
518 Orphanet_association_type: the type of association between the gene and the disorder
519 GenCC_id: uuid of the GenCC data
520 GenCC_disease_title: the disease_title from the GenCC data
521 GenCC_impact_class: the classification_title from the GenCC data
522 GenCC_model_of_inheritance: the moi_title from the GenCC data
523 GenCC_pmid: the submitted_as_pmids from the GenCC data
524 Trait_association(GWAS): Trait(s) the gene is associated with (from GWAS catalog)
525 MGI_mouse_gene: Homolog mouse gene name from MGI
526 MGI_mouse_phenotype: Phenotype description for the homolog mouse gene from MGI
527 ZFIN_zebrafish_gene: Homolog zebrafish gene name from ZFIN
528 ZFIN_zebrafish_structure: Affected structure of the homolog zebrafish gene from ZFIN
529 ZFIN_zebrafish_phenotype_quality: Phenotype description for the homolog zebrafish gene from ZFIN
530 ZFIN_zebrafish_phenotype_tag: Phenotype tag for the homolog zebrafish gene from ZFIN
531 HPO_id: ID of the mapped Human Phenotype Ontology. Multiple IDs are separated by ";"
532 HPO_name: Name of the mapped Human Phenotype Ontology. Multiple names are separated by ";"
533 GO_biological_process: GO terms for biological process
534 GO_cellular_component: GO terms for cellular component
535 GO_molecular_function: GO terms for molecular function
536 P(HI): Estimated probability of haploinsufficiency of the gene (from doi:10.1371/journal.pgen.1001154)
537 HIPred_score: Estimated probability of haploinsufficiency of the gene (from doi:10.1093/bioinformatics/btx028)
538 HIPred: HIPred prediction of haploinsufficiency of the gene. Y(es) or N(o). (from doi:10.1093/bioinformatics/btx028)
539 GHIS: A score predicting the gene haploinsufficiency. The higher the score, the more likely the gene is haploinsufficient. (from doi: 10.1093/nar/gkv474)
540 ClinGen_Haploinsufficiency_Score: Haploinsufficiency score from ClinGen
541 ClinGen_Haploinsufficiency_Description: description of haploinsufficiency from ClinGen
542 ClinGen_Haploinsufficiency_PMID: PMIDs describing the haploinsufficiency from ClinGen
543 ClinGen_Haploinsufficiency_Disease: diseases associated with the haploinsufficiency from ClinGen
544 P(rec): Estimated probability that the gene is a recessive disease gene (from DOI:10.1126/science.1215040)
545 Known_rec_info: Known recessive status of the gene (from DOI:10.1126/science.1215040) "lof-tolerant = seen in homozygous state in at least one 1000G individual" "recessive = known OMIM recessive disease" (original annotations from DOI:10.1126/science.1215040)
546 RVIS_EVS: Residual Variation Intolerance Score, a measure of intolerance to mutational burden. The higher the score, the more tolerant the gene is to mutational burden. Based on EVS (ESP6500) data. From doi:10.1371/journal.pgen.1003709
547 RVIS_percentile_EVS: The percentile rank of the gene based on RVIS. The higher the percentile, the more tolerant the gene is to mutational burden. Based on EVS (ESP6500) data.
548 LoF-FDR_ExAC: "A gene's corresponding FDR p-value for preferential LoF depletion among the ExAC population. Lower FDR corresponds with genes that are increasingly depleted of LoF variants." cited from RVIS document.
549 RVIS_ExAC: "ExAC-based RVIS; setting 'common' MAF filter at 0.05% in at least one of the six individual ethnic strata from ExAC." cited from RVIS document.
550 RVIS_percentile_ExAC: "Genome-Wide percentile for the new ExAC-based RVIS; setting 'common' MAF filter at 0.05% in at least one of the six individual ethnic strata from ExAC." cited from RVIS document.
551 ExAC_pLI: "the probability of being loss-of-function intolerant (intolerant of both heterozygous and homozygous lof variants)" based on ExAC r0.3 data
552 ExAC_pRec: "the probability of being intolerant of homozygous, but not heterozygous lof variants" based on ExAC r0.3 data
553 ExAC_pNull: "the probability of being tolerant of both heterozygous and homozygous lof variants" based on ExAC r0.3 data
554 ExAC_nonTCGA_pLI: "the probability of being loss-of-function intolerant (intolerant of both heterozygous and homozygous lof variants)" based on ExAC r0.3 nonTCGA subset
555 ExAC_nonTCGA_pRec: "the probability of being intolerant of homozygous, but not heterozygous lof variants" based on ExAC r0.3 nonTCGA subset
556 ExAC_nonTCGA_pNull: "the probability of being tolerant of both heterozygous and homozygous lof variants" based on ExAC r0.3 nonTCGA subset
557 ExAC_nonpsych_pLI: "the probability of being loss-of-function intolerant (intolerant of both heterozygous and homozygous lof variants)" based on ExAC r0.3 nonpsych subset
558 ExAC_nonpsych_pRec: "the probability of being intolerant of homozygous, but not heterozygous lof variants" based on ExAC r0.3 nonpsych subset
559 ExAC_nonpsych_pNull: "the probability of being tolerant of both heterozygous and homozygous lof variants" based on ExAC r0.3 nonpsych subset
560 gnomAD_pLI: "the probability of being loss-of-function intolerant (intolerant of both heterozygous and homozygous lof variants)" based on gnomAD 2.1 data
561 gnomAD_pRec: "the probability of being intolerant of homozygous, but not heterozygous lof variants" based on gnomAD 2.1 data
562 gnomAD_pNull: "the probability of being tolerant of both heterozygous and homozygous lof variants" based on gnomAD 2.1 data
563 ExAC_del.score: "Winsorised deletion intolerance z-score" based on ExAC r0.3.1 CNV data
564 ExAC_dup.score: "Winsorised duplication intolerance z-score" based on ExAC r0.3.1 CNV data
565 ExAC_cnv.score: "Winsorised cnv intolerance z-score" based on ExAC r0.3.1 CNV data
566 ExAC_cnv_flag: "Gene is in a known region of recurrent CNVs mediated by tandem segmental duplications and intolerance scores are more likely to be biased or noisy." from ExAC r0.3.1 CNV release
567 GDI: gene damage index score, "a genome-wide, gene-level metric of the mutational damage that has accumulated in the general population" from doi: 10.1073/pnas.1518646112. The higher the score, the less likely the gene is to be responsible for monogenic diseases.
568 GDI-Phred: Phred-scaled GDI scores
569 Gene damage prediction (all disease-causing genes): gene damage prediction (low/medium/high) by GDI for all diseases
570 Gene damage prediction (all Mendelian disease-causing genes): gene damage prediction (low/medium/high) by GDI for all Mendelian diseases
571 Gene damage prediction (Mendelian AD disease-causing genes): gene damage prediction (low/medium/high) by GDI for Mendelian autosomal dominant diseases
572 Gene damage prediction (Mendelian AR disease-causing genes): gene damage prediction (low/medium/high) by GDI for Mendelian autosomal recessive diseases
573 Gene damage prediction (all PID disease-causing genes): gene damage prediction (low/medium/high) by GDI for all primary immunodeficiency diseases
574 Gene damage prediction (PID AD disease-causing genes): gene damage prediction (low/medium/high) by GDI for primary immunodeficiency autosomal dominant diseases
575 Gene damage prediction (PID AR disease-causing genes): gene damage prediction (low/medium/high) by GDI for primary immunodeficiency autosomal recessive diseases
576 Gene damage prediction (all cancer disease-causing genes): gene damage prediction (low/medium/high) by GDI for all cancer disease
577 Gene damage prediction (cancer recessive disease-causing genes): gene damage prediction (low/medium/high) by GDI for cancer recessive disease
578 Gene damage prediction (cancer dominant disease-causing genes): gene damage prediction (low/medium/high) by GDI for cancer dominant disease
579 LoFtool_score: a percentile score for gene intolerance to functional change. The lower the score, the higher the gene's intolerance to functional change. For details see doi: 10.1093/bioinformatics/btv602.
580 Essential_gene: Essential ("E") or Non-essential phenotype-changing ("N") based on Mouse Genome Informatics database. From doi:10.1371/journal.pgen.1003484
581 Essential_gene_CRISPR: Essential ("E") or Non-essential phenotype-changing ("N") based on large scale CRISPR experiments. From doi: 10.1126/science.aac7041
582 Essential_gene_CRISPR2: Essential ("E"), context-Specific essential ("S"), or Non-essential phenotype-changing ("N") based on large scale CRISPR experiments. From http://dx.doi.org/10.1016/j.cell.2015.11.015
583 Essential_gene_gene-trap: Essential ("E"), HAP1-Specific essential ("H"), KBM7-Specific essential ("K"), or Non-essential phenotype-changing ("N"), based on large scale mutagenesis experiments. From doi: 10.1126/science.aac7557
584 Gene_indispensability_score: A probability prediction of the gene being essential. From doi:10.1371/journal.pcbi.1002886
585 Gene_indispensability_pred: Essential ("E") or loss-of-function tolerant ("N") based on Gene_indispensability_score.
586 Tissue_specificity(Uniprot): Tissue specificity description from Uniprot
587 HPA_consensus_adipose_tissue: The consensus nTPM value for the gene in tissue type adipose_tissue, representing the maximum nTPM value based on the Human Protein Atlas (HPA) and GTEx.
588 HPA_consensus_adrenal_gland: The consensus nTPM value for the gene in tissue type adrenal_gland, representing the maximum nTPM value based on the Human Protein Atlas (HPA) and GTEx.
589 HPA_consensus_amygdala: The consensus nTPM value for the gene in tissue type amygdala, representing the maximum nTPM value based on the Human Protein Atlas (HPA) and GTEx.
590 HPA_consensus_appendix: The consensus nTPM value for the gene in tissue type appendix, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
591 HPA_consensus_basal_ganglia: The consensus nTPM value for the gene in tissue type basal_ganglia, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
592 HPA_consensus_bone_marrow: The consensus nTPM value for the gene in tissue type bone_marrow, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
593 HPA_consensus_breast: The consensus nTPM value for the gene in tissue type breast, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
594 HPA_consensus_cerebellum: The consensus nTPM value for the gene in tissue type cerebellum, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
595 HPA_consensus_cerebral_cortex: The consensus nTPM value for the gene in tissue type cerebral_cortex, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
596 HPA_consensus_cervix: The consensus nTPM value for the gene in tissue type cervix, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
597 HPA_consensus_choroid_plexus: The consensus nTPM value for the gene in tissue type choroid_plexus, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
598 HPA_consensus_colon: The consensus nTPM value for the gene in tissue type colon, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
599 HPA_consensus_duodenum: The consensus nTPM value for the gene in tissue type duodenum, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
600 HPA_consensus_endometrium: The consensus nTPM value for the gene in tissue type endometrium, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
601 HPA_consensus_epididymis: The consensus nTPM value for the gene in tissue type epididymis, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
602 HPA_consensus_esophagus: The consensus nTPM value for the gene in tissue type esophagus, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
603 HPA_consensus_fallopian_tube: The consensus nTPM value for the gene in tissue type fallopian_tube, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
604 HPA_consensus_gallbladder: The consensus nTPM value for the gene in tissue type gallbladder, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
605 HPA_consensus_heart_muscle: The consensus nTPM value for the gene in tissue type heart_muscle, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
606 HPA_consensus_hippocampal_formation: The consensus nTPM value for the gene in tissue type hippocampal_formation, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
607 HPA_consensus_hypothalamus: The consensus nTPM value for the gene in tissue type hypothalamus, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
608 HPA_consensus_kidney: The consensus nTPM value for the gene in tissue type kidney, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
609 HPA_consensus_liver: The consensus nTPM value for the gene in tissue type liver, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
610 HPA_consensus_lung: The consensus nTPM value for the gene in tissue type lung, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
611 HPA_consensus_lymph_node: The consensus nTPM value for the gene in tissue type lymph_node, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
612 HPA_consensus_midbrain: The consensus nTPM value for the gene in tissue type midbrain, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
613 HPA_consensus_ovary: The consensus nTPM value for the gene in tissue type ovary, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
614 HPA_consensus_pancreas: The consensus nTPM value for the gene in tissue type pancreas, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
615 HPA_consensus_parathyroid_gland: The consensus nTPM value for the gene in tissue type parathyroid_gland, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
616 HPA_consensus_pituitary_gland: The consensus nTPM value for the gene in tissue type pituitary_gland, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
617 HPA_consensus_placenta: The consensus nTPM value for the gene in tissue type placenta, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
618 HPA_consensus_prostate: The consensus nTPM value for the gene in tissue type prostate, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
619 HPA_consensus_rectum: The consensus nTPM value for the gene in tissue type rectum, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
620 HPA_consensus_retina: The consensus nTPM value for the gene in tissue type retina, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
621 HPA_consensus_salivary_gland: The consensus nTPM value for the gene in tissue type salivary_gland, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
622 HPA_consensus_seminal_vesicle: The consensus nTPM value for the gene in tissue type seminal_vesicle, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
623 HPA_consensus_skeletal_muscle: The consensus nTPM value for the gene in tissue type skeletal_muscle, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
624 HPA_consensus_skin: The consensus nTPM value for the gene in tissue type skin, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
625 HPA_consensus_small_intestine: The consensus nTPM value for the gene in tissue type small_intestine, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
626 HPA_consensus_smooth_muscle: The consensus nTPM value for the gene in tissue type smooth_muscle, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
627 HPA_consensus_spinal_cord: The consensus nTPM value for the gene in tissue type spinal_cord, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
628 HPA_consensus_spleen: The consensus nTPM value for the gene in tissue type spleen, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
629 HPA_consensus_stomach: The consensus nTPM value for the gene in tissue type stomach, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
630 HPA_consensus_testis: The consensus nTPM value for the gene in tissue type testis, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
631 HPA_consensus_thymus: The consensus nTPM value for the gene in tissue type thymus, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
632 HPA_consensus_thyroid_gland: The consensus nTPM value for the gene in tissue type thyroid_gland, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
633 HPA_consensus_tongue: The consensus nTPM value for the gene in tissue type tongue, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
634 HPA_consensus_tonsil: The consensus nTPM value for the gene in tissue type tonsil, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
635 HPA_consensus_urinary_bladder: The consensus nTPM value for the gene in tissue type urinary_bladder, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
636 HPA_consensus_vagina: The consensus nTPM value for the gene in tissue type vagina, represents the maximum nTPM value based on the Human Protein Atlas HPA and GTEx.
637 HPA_consensus_highly_expressed: The tissue type(s) in which the gene is highly expressed, identified as outliers (nTPM > Q3 + 1.5 * IQR) based on the box plot of the gene's nTPM values in all tissues.

Columns of dbscSNV1.1:
chr: chromosome number
pos: physical position on the chromosome as to hg19 (1-based coordinate)
ref: reference nucleotide allele (as on the + strand)
alt: alternative nucleotide allele (as on the + strand)
hg38_chr: chromosome number as to hg38
hg38_pos: physical position on the chromosome as to hg38 (1-based coordinate)
RefSeq?: whether the SNV is a scSNV according to RefSeq
Ensembl?: whether the SNV is a scSNV according to Ensembl
RefSeq_region: functional region in which the SNV is located according to RefSeq
RefSeq_gene: gene name according to RefSeq
RefSeq_functional_consequence: functional consequence of the SNV according to RefSeq
RefSeq_id_c.change_p.change: SNV in format of c.change and p.change according to RefSeq
Ensembl_region: functional region in which the SNV is located according to Ensembl
Ensembl_gene: gene id according to Ensembl
Ensembl_functional_consequence: functional consequence of the SNV according to Ensembl
Ensembl_id_c.change_p.change: SNV in format of c.change and p.change according to Ensembl
ada_score: ensemble prediction score based on ada-boost. Ranges 0 to 1. The larger the score, the higher the probability that the scSNV will affect splicing. The suggested cutoff for a binary prediction (affecting splicing vs. not affecting splicing) is 0.6.
rf_score: ensemble prediction score based on random forests. Ranges 0 to 1. The larger the score, the higher the probability that the scSNV will affect splicing. The suggested cutoff for a binary prediction (affecting splicing vs. not affecting splicing) is 0.6.

Note 1: Missing data is designated as '.'.
Note 2: Multiple annotations are separated by ';'

Please cite:
Liu X, Jian X, and Boerwinkle E. 2011. dbNSFP: a lightweight database of human non-synonymous SNPs and their functional predictions. Human Mutation. 32:894-899.
Liu X, Li C, Mou C, Dong Y, and Tu Y. 2020. dbNSFP v4: a comprehensive database of transcript-specific functional predictions and annotations for human nonsynonymous and splice-site SNVs. Genome Medicine. 12:103.

Contact:
Genos Bioinformatics LLC, Texas, USA
dbNSFP Team
Website: https://dbnsfp.org
Email: feedback@dbnsfp.org

Changelog:
February 23, 2011: dbNSFP and search_dbNSFP v0.9 released.
April 4, 2011: A bug related to the prediction scores of MutationTaster is fixed. dbNSFP v1.0 released. A change to the chromosome search order of the search_dbNSFP. A readme file added. search_dbNSFP v1.0 released.
May 30, 2011: dbNSFP and search_dbNSFP v1.1 released. Version 1.1 added the following entries: rs numbers from UniSNP (a cleaned version of dbSNP build 129), allele frequency recorded in dbSNP, allele frequency reported by 1000 Genomes Project, alternative gene names, descriptive gene name, database cross references (gene IDs of HGNC, MIM, Ensembl and HPRD). The unzipped database is 18Gb.
May 31, 2011: dbNSFP_light and search_dbNSFP_light v1.0 released. dbNSFP_light v1.0 is a light version of dbNSFP, which contains fewer annotation entries but some additional 9,285,316 NSs that are not in CCDS version 20090327. Scores of PhyloP, SIFT, Polyphen2, LRT and MutationTaster are included but missing data are not imputed. Prediction of LRT and MutationTaster are also included, as well as the omega estimated by LRT.
The unzipped database is 6Gb.
October 24, 2011: dbNSFP_light v1.1 and search_dbNSFP_light v1.1 released. dbNSFP v1.2 and search_dbNSFP v1.2 released. The new versions added GERP++ neutral rates and RS scores.
October 25, 2011: dbNSFP v1.3 released. It added Uniprot ID, accession number and amino acid position based on Polyphen-2 annotation. Users now can search amino acid change directly referring to a Uniprot ID or accession number.
November 3, 2011: dbNSFP_light v1.2 released. It added Uniprot ID, accession number and amino acid position based on Polyphen-2 annotation. Users now can search amino acid change directly referring to a Uniprot ID or accession number.
November 10, 2011: A bug fixed in the companion search program for dbNSFP v1.3, which caused invalid searches using AA mutations with Uniprot ID or accession number.
December 16, 2011: dbNSFP_light v1.3 released. It updated SIFT scores (August, 2011 version) and Polyphen-2 scores (May, 2011 version). Uniprot ID, accession number and amino acid position based on the Polyphen-2 annotations have been updated too.
April 11, 2012: dbNSFP2.0b1_variant released. This is a beta test version of the variant sub-database of dbNSFP v2.0, which is rebuilt based on Gencode release 9 / Ensembl version 64.
June 2, 2012: dbNSFP v2.0b2 released. It includes both the dbNSFP_variant and dbNSFP_gene sub-databases. Slight changes have been made to the Ensembl gene and transcript ids of dbNSFP_variant in order to be compatible with other database sources.
July 2, 2012: dbNSFP v2.0b3 released. An additional 2.2 million splicing site SNPs have been added to dbNSFP_variant. In the table those SNPs have missing (".") in aaref, aaalt and "-1" in aapos. There's no change to the format of the search input file.
August 28, 2012: The companion java search program search_dbNSFP20b3 is updated. Added features include supporting vcf file as input file and options for output contents (columns).
October 27, 2012: dbNSFP v2.0b4 was released.
A new functional prediction score MutationAssessor is added (I thank Mr. Yevgeniy Antipin for his recommendation). Allele frequencies from ESP 5400 data set are replaced by ESP 6500 data set.
February 25, 2013: dbNSFP v2.0 was released. A new functional prediction score FATHMM is added.
March 22, 2013: A bug which caused a lot of missing FATHMM scores has been fixed.
May 31, 2013: The source code of the companion Java search program is now available under the RECEX SHARED SOURCE LICENSE.
October 3, 2013: dbNSFP v2.1 is released. MutationTaster and FATHMM scores have been updated. Converted scores of SIFT, LRT, MutationTaster, MutationAssessor and FATHMM have been added. Columns of SIFT and FATHMM predictions have been added. The gene database has also been updated. Database IDs are updated. GO Slim terms, pathway and protein interaction information from the ConsensusPathDB, and list of essential and non-essential genes (based on phenotypes of mouse homologs) have been added.
January 23, 2014: dbNSFP v2.2 is released. SIFT and FATHMM now have multiple scores corresponding to different Ensembl ENSP ids and amino acid positions (aapos_SIFT and aapos_FATHMM). Accordingly, our companion search program now supports SNP searches based on Ensembl ENSP ids and amino acid positions. A bug is fixed for a small proportion of MutationTaster scores.
January 26, 2014: dbNSFP v2.3 is released. Two ensemble scores (RadialSVM and LR) and their predictions have been added.
February 12, 2014: A bug was fixed in dbNSFP v2.2 and v2.3, which caused missing delimiters in columns aapos_SIFT, SIFT_score_converted and SIFT_pred. (I thank Mr. Yevgeniy Antipin for his reminder).
March 5, 2014: dbNSFP v2.4 is released. A whole genome functional prediction score called CADD was added, along with five more conservation scores (phyloP46way_primate, phyloP100way_vertebrate, phastCons46way_primate, phastCons46way_placental, phastCons100way_vertebrate).
To facilitate comparison between scores, we added rank scores for most functional prediction scores and conservation scores, which replace the "converted" scores in the previous versions.
June 1, 2014: dbNSFP v2.5 is released. A new functional score VEST 3.0 has been added. We thank Dr. Karchin for kindly providing the score. A bug that causes the MutationTaster score error since v2.1 for variants with a prediction of "Polymorphism_automatic" has been fixed. We thank John McGuigan and James Ireland for reporting this bug. As MutationTaster can also predict splicing change and other functional effects, in case a variant has multiple predictions based on its different models, we took the most damaging score and prediction for dbNSFP.
July 26, 2014: dbNSFP v2.6 is released. rs numbers from dbSNP 141 have been added to the variant database files. Mouse and zebra fish homolog genes and phenotypes have been added to the gene database file (I thank Alex Li for his suggestion and help). Trait_association(GWAS) was also updated. An attached database called dbscSNV is available for download. It includes all potential human SNVs within splicing consensus regions (−3 to +8 at the 5’ splice site and −12 to +2 at the 3’ splice site), i.e. scSNVs, related functional annotations and two ensemble prediction scores for predicting their potential of altering splicing. A manuscript describing those scores has been submitted. search_dbNSFP26 now supports searching dbNSFP along with dbscSNV using option "-s".
September 12, 2014: dbNSFP v2.7 was released. Chromosomes and positions of human reference hg38 have been added. search_dbNSFP27.class now supports querying dbNSFP using the positions based on hg38 with the "-v hg38" option. clinvar (freeze 20140902) annotations have been added. Allele frequencies from 2303 exomes of African Americans and 3203 exomes of European Americans from the Atherosclerosis Risk in Communities Study (ARIC) cohort study have been added.
As the columns for gene interactions in dbNSFP_gene table contain very long strings, especially for gene UBC, which may cause problems when viewing the results in Excel, now we only report the number of interacting genes in those columns. Full information is retained in the dbNSFP_gene.complete table.
November 21, 2014: dbNSFP v2.8 is released. COSMIC (Catalogue Of Somatic Mutations In Cancer) annotations have been added. Pathway information from BioCarta and KEGG (old version) has been added to the dbNSFP2.8_gene. A bug causing inconsistency between MutationTaster scores and MutationTaster_pred, which affects v2.5 to v2.7, has been fixed. I thank Adam Novak for reporting this bug.
February 3, 2015: dbNSFP v2.9 is released. SIFT score has been updated to ensembl66 version. PROVEAN score (Protein Variation Effect Analyzer) v1.1 has been added. I thank Yongwook Choi from jcvi for providing the SIFT and PROVEAN scores. CADD score has been updated to 1.3 version. Please note the following copyright statement for CADD: "CADD scores (http://cadd.gs.washington.edu/) are Copyright 2013 University of Washington and Hudson-Alpha Institute for Biotechnology (all rights reserved) but are freely available for all academic, non-commercial applications. For commercial licensing information contact Jennifer McCullar (mccullaj@uw.edu)." Allele frequency v0.3 of ~60,706 unrelated individuals from The Exome Aggregation Consortium (ExAC) has been added. ExAC data are released under a Fort Lauderdale Agreement. Please refer to http://exac.broadinstitute.org/terms for terms of use. I also want to thank Dr. CS (Jonathan) Liu from Softgenetics for providing hosting space.
April 6, 2015: dbNSFP v3.0b1 is released. The core set of nsSNVs and ssSNVs has been rebuilt based on Gencode 22/ Ensembl 79 with human reference sequence hg38. Putative genes have been included. Genes with incomplete 5' have been excluded (I thank Chris Gillies for reporting the issues for genes with incomplete 5' end.)
Genes on mitochondrial DNA have been included. Allele frequencies from the UK10K cohorts and genotypes of two Neanderthals have been added. Some resources have been updated, including the MutationTaster (I thank Dr. Dominik Seelow for kindly providing the scores), allele frequencies from the 1000 Genomes Project populations, ancestral alleles, dbSNP, ClinVar and InterPro. The presentation of the prediction scores has been improved by adding columns for the corresponding transcript/protein ids. PhyloP and PhastCons conservation scores based on hg19 have been replaced by the scores based on hg38. Some resources have been dropped due to various reasons, including SLR test statistic, UniSNP ids, allele frequencies from the ARIC cohorts and allele counts in COSMIC. dbNSFP_gene has also been completely rebuilt using the up-to-date resources. Residual Variation Intolerance Scores (RVIS) have been added. GO Slim terms have been replaced by full GO terms. Two branches of dbNSFP are now provided: dbNSFP3.0b1a suitable for academic use, which includes all the resources, and dbNSFP3.0b1c suitable for commercial use, which does not include VEST3 and CADD.
April 12, 2015: dbNSFP v3.0b2 is released. This update fixed the issues due to inconsistent mitochondrial reference sequences used by different resources. I thank Dr. Lishuang Shen at MEEI for helping solve the issues. For mitochondrial SNV, the pos (i.e. hg38) refers to the rCRS (GenBank: NC_012920) and hg19_pos refers to a YRI sequence (GenBank: AF347015). The ancestral allele of mitochondrial SNV now comes from the Reconstructed Sapiens Reference Sequence (RSRS, doi:10.1016/j.ajhg.2012.03.002). The affected content includes ancestral alleles, Neanderthal/Denisova genotypes and MutationTaster columns of the chrM file. The rankscores of MutationTaster have also been updated to reflect the update of its chrM scores. dbscSNV has been updated to v1.1 and added hg38 positions lifted over from its hg19 positions.
Using search_dbNSFP30b2a or search_dbNSFP30b2c you can search dbscSNV1.1 along with dbNSFP v3.0b2 with either hg19 coordinates or hg38 coordinates.
August 3, 2015: dbNSFP v3.0 is released. Three new functional prediction scores (DANN, fathmm-MKL and fitCons) and two conservation scores (phyloP20way_mammalian and phastCons20way_mammalian) have been added to dbNSFP v3.0a. All five scores except DANN are also included in dbNSFP v3.0c. For commercial application of DANN, please contact Daniel Quang (dxquang@uci.edu). CADD scores have been updated to v1.3. I thank Dr. Xueqiu Jian and Kirill Prusov for suggestions on README files. dbNSFP v3.0 will be integrated into our new whole genome annotation pipeline WGSA version 0.6. Please join our Email group for news and updates from dbNSFP. Columns updated: CADD_raw (dbNSFP v3.0a only), CADD_raw_rankscore (dbNSFP v3.0a only), CADD_phred (dbNSFP v3.0a only). New columns: DANN_score (dbNSFP v3.0a only), DANN_rankscore (dbNSFP v3.0a only), fathmm-MKL_coding_score, fathmm-MKL_coding_rankscore, fathmm-MKL_coding_pred, fathmm-MKL_coding_group, integrated_fitCons_score, integrated_fitCons_rankscore, integrated_confidence_value, GM12878_fitCons_score, GM12878_fitCons_rankscore, GM12878_confidence_value, H1-hESC_fitCons_score, H1-hESC_fitCons_rankscore, H1-hESC_confidence_value, HUVEC_fitCons_score, HUVEC_fitCons_rankscore, HUVEC_confidence_value.
November 24, 2015: dbNSFP v3.1 is released. Significant eQTLs from GTEx V6 have been added. dbSNP rs has been updated to build 144. Gene expression information (rpkm of RNAseq) of 53 tissues from GTEx V6 has been added to dbNSFP_gene. Three gene intolerance scores (RVIS based on ExAC r0.3, GDI and LoFtool) have been added to dbNSFP_gene.
March 20, 2016: dbNSFP v3.2 is released. Eigen score, Eigen PC score (doi: 10.1038/ng.3477) and GenoCanyon score (doi:10.1038/srep10576) have been added. Allele frequencies of two commonly used subsets of ExAC data (nonTCGA and nonpsych) have been added.
Mutation Assessor scores have been updated to release 3. PhyloP7way_vertebrate and PhastCons7way_vertebrate conservation scores have been updated to PhyloP100way_vertebrate and PhastCons100way_vertebrate, respectively. rankscores have been updated accordingly. Ancestral alleles have been updated based on Ensembl 84. dbSNP has been updated to build 146. Clinvar has been updated to 20160302. InterPro has been updated to v56. Gene name cross-links, IntAct, Uniprot, GWAS catalog, BioGRID, GO, ConsensusPathDB, mouse genes and zebra fish genes information for the dbNSFP_gene table have been updated.
November 30, 2016: dbNSFP v3.3 and v2.9.2 are released. M-CAP score (DOI: 10.1038/ng.3703) has been added. We thank Dr. Gill Bejerano for providing the score. Eigen and Eigen PC scores have been updated to v1.1. dbSNP has been updated to v147. clinvar has been updated to 20161101.
March 12, 2017: dbNSFP v3.4 and v2.9.3 are released. REVEL score (doi: 10.1016/j.ajhg.2016.08.016) and MutPred score (doi: 10.1093/bioinformatics/btp528) have been added. SORVA gene ranking scores (doi: 10.1101/103218) have been added to gene annotation.
August 6, 2017: dbNSFP v3.5 is released. Allele frequencies from the exomes and genomes of the Genome Aggregation Database (gnomAD) have been added. Interpro, dbSNP, clinvar, ancestral alleles, Altai Neanderthal genotypes, Denisova genotypes and GTEx eQTLs have been updated. dbNSFP_gene has been rebuilt with updated annotations. Other changes to dbNSFP_gene include: Interactions columns now show the gene list instead of the total number; GTEx gene expression annotations have been removed; LoF FDR p-value from RVIS has been added; Genome-wide haploinsufficiency score (GHIS) has been added; LoF and CNV intolerance/tolerance scores based on ExAC data have been added.
December 8, 2018: dbNSFP v4.0b1 is released for beta testing. The core set of nsSNVs and ssSNVs has been rebuilt based on Gencode 29/ Ensembl 94 with human reference sequence hg38.
Eight deleteriousness prediction scores (ALoFT, DEOGEN2, FATHMM-XF, MPC, MVP, PrimateAI, LINSIGHT, SIFT4G) have been added. Three conservation scores (phyloP17way_primate, phastCons17way_primate, bStatistic) have been added. Allele frequencies from the gnomAD consortium, eQTLs from the Geuvadis project, and genotypes of a Vindija33.19 Neanderthal have been added. Some resources have been updated, including VEST (We thank Dr. Karchin), CADD, M-CAP, ancestral alleles, dbSNP, ClinVar, GTEx and InterPro. The presentation of the prediction scores has been further improved by adding the correspondence to transcript/protein ids in a systematic way. APPRIS, GENCODE_basic, TSL and VEP_canonical have been added to facilitate the choice of appropriate transcripts. dbNSFP_gene has also been completely rebuilt using the up-to-date resources. HIPred, gene constraint scores from the gnomAD data, essential genes predictions based on CRISPR, gene-trap and gene networks have been added. Two branches of dbNSFP are provided: dbNSFP4.0b1a suitable for academic use, which includes all the resources, and dbNSFP4.0b1c suitable for commercial use, which does not include Polyphen2, VEST, REVEL, CADD, LINSIGHT, and GenoCanyon.
December 30, 2018: A bug causing an id mapping issue from Uniprot to Ensembl, which in turn caused increased missing rates of Polyphen2, MutationAssessor and DEOGEN2, has been found and fixed (We thank Dr. Daniele Raimondi).
February 20, 2019: sprot_varsplic was included in the mapping from Uniprot to Ensembl. Fixed column title inconsistency between the README file and data file. (We thank Kevin Xin and Julius Jacobsen for pointing out the inconsistency.) dbMTS was added as an attached database. search_dbNSFP added support for searching dbMTS with option '-m'.
May 3, 2019: dbNSFP v4.0 is released. HGVS c. and p. presentations from ANNOVAR, SnpEff and VEP have been added. search_dbNSFP now supports search based on HGVS c. and p. presentations.
Please refer to search_dbNSFP40a.readme.pdf or search_dbNSFP40c.readme.pdf for details. MedGen ID, OMIM ID and Orphanet ID from clinvar have been added.
December 5, 2019: A minor bug is fixed in dbNSFP v4.0. In the previous release the content of the following columns was compressed, i.e. if annotations for all transcripts are identical, only one annotation was presented: genename, cds_strand, refcodon, codonpos, codon_degeneracy, FATHMM_score, FATHMM_pred, Interpro_domain. In this release those columns are decompressed, i.e. have the same number of annotations as the number of transcripts. A Java-based graphic user interface (GUI) search program (search_dbNSFP40a.jar or search_dbNSFP40c.jar) has been added. Users can double-click the jar file to launch the GUI (it supports commandline also, please check the search_dbNSFP readme pdf for details).
May 15, 2020: A minor bug is fixed in dbNSFP v4.0. In the previous release, the column Primate_AI_pred was not 100% correct. We thank Alex Kouris for reporting this issue.
June 16, 2020: dbNSFP v4.1 is released. BayesDel (https://doi.org/10.1002/humu.23158), ClinPred (https://doi.org/10.1016/j.ajhg.2018.08.005) and LIST-S2 (https://doi.org/10.1093/nar/gkaa288) scores have been added. CADD has been updated to v1.6, CADD score based on hg19 model has been added. Clinvar, GTEx and gnomAD genomes have been updated. HPO terms have been added to the dbNSFP_gene. search_dbNSFP programs now support searching SpliceAI as an attached database.
Jan 27, 2021: The command-line only version of the search programs for v4.1a and v4.1c were added.
Feb 10, 2021: A bug fixed. In the previous release, the gnomAD_pLI, gnomAD_pRec and gnomAD_pNull scores in dbNSFP4.1_gene.gz and dbNSFP4.1_gene.complete.gz did not always correspond to the canonical transcripts of the genes. We thank Dr. Raphaël Helaers for reporting this bug.
March 12, 2021: A bug fixed.
In the previous release, some ALoFT scores/information are missing in dbNSFP. We thank Dr. Shuwei Li for reporting this bug.
April 6, 2021: dbNSFP v4.2 is released. MetaRNN scores have been added. Allele frequencies of gnomAD exome have been updated to r2.1.1. Allele frequencies of gnomAD genome have been updated to v3.1. dbSNP has been updated to 154. clinvar has been updated to 20210131.
February 18, 2022: dbNSFP v4.3 is released. REVEL scores have been updated with transcript ids, i.e., the scores are now transcript-specific. Genotypes of Chagyrskaya neandertals have been added. dbSNP has been updated to b155. clinvar has been updated to 20220122.
May 6, 2023: dbNSFP v4.4 is released. gMVP and VARITY scores have been added. Allele frequencies of ALFA (Allele Frequency Aggregator) have been added. dbSNP has been updated to b156. clinvar has been updated to 20230430. phyloP30way_mammalian has been replaced by phyloP470way_mammalian. phastCons30way_mammalian has been replaced by phastCons470way_mammalian. A bug in MutPred scores (not all SNVs causing the same AA change have scores) has been fixed.
November 2, 2023: dbNSFP v4.5 is released. ClinVar has been updated to 20231028. ESM1b, EVE and AlphaMissense scores have been added. AlphaMissense scores are for non-commercial research use only: "AlphaMissense Database Copyright (2023) DeepMind Technologies Limited. All predictions are provided for non-commercial research use only under CC BY-NC-SA license." This distribution of the derived AlphaMissense_score, AlphaMissense_rankscore, and AlphaMissense_pred in dbNSFP is also under CC BY-NC-SA license and only included in the "a" branch of dbNSFP. A copy of CC BY-NC-SA license can be found at https://creativecommons.org/licenses/by-nc-sa/4.0/.
February 18, 2024: dbNSFP v4.6 is released. ClinVar has been updated to 20240215. GTEx V8 splicing QTLs (sQTLs) have been added. eQTLs from eQTLGen phase I have been added.
A bug in v4.5 that caused a large proportion of ESM1b scores to be misaligned has been fixed. We thank Dr. In-Hee Lee for reporting this bug.
March 3, 2024: dbNSFP v4.7 is released. CADD has been updated to v1.7. Allele frequencies of gnomAD exomes and genomes have been updated to v4.0.0. A bug in v4.6 that caused eQTLGen eQTLs for some tissues to be missing has been fixed.
March 13, 2024: AlphaMissense scores are now licensed under the Creative Commons Attribution 4.0 International License (CC-BY) and have therefore been added to the dbNSFP v4.7 "c" branch.
June 13, 2024: dbNSFP v4.8 is released. MutFormer and PHACTboost scores have been added. A GERP conservation score calculated from 91 mammals has been added.
August 8, 2024: dbNSFP v4.9 is released. MutScore has been added. ClinVar has been updated.
January 1, 2025: dbNSFP v5.0 is released. Fully rebuilt based on Gencode 46. Allele frequencies from TOPMed freeze8 and from the gnomAD v2.1.1 exome controls, non-neuro, and non-cancer subsets have been added. Transcript annotations from the MANE project have been added. MutationTaster has been updated to MutationTaster2021. Allele frequencies from gnomAD have been updated to the v4.1 joint data set (genome+exome). bStatistic, ancestral alleles, and ClinVar have been updated. The LRT, FATHMM, fathmm-MKL, fitCons, LINSIGHT, GenoCanyon, EVE, and Siphy scores; the allele frequencies from ESP, ExAC, and UK10K; and the GTEx, eQTLGen, and Geuvadis eQTLs have been retired. The gene table has also been rebuilt. ClinGen Dosage Sensitivity and Human Protein Atlas consensus gene expression levels have been added. HGNC, UniProt, IntAct, GWAS catalog, InterPro, Gene Ontology, ConsensusPathDB, HPO, mouse homolog genes, zebrafish homolog genes, OMIM, and Orphanet have been updated. The egenetics and GNF/Atlas expression, gene interactions, and SORVA statistic have been retired.
March 21, 2025: dbNSFP v5.1 is released. Fully rebuilt based on Gencode 47.
Allele frequencies from All of Us 250k genomes and Regeneron Genetics Center Million Exome (RGC-ME) data have been added (thanks to Moez Dawood and Dr. Richard Gibbs). The maximum alternative allele frequency across all populations in dbNSFP has been added. MutationTaster2021 has been updated to include more missense mutations (thanks to Franziska Fritz and Dr. Dominik Seelow). The gene table has also been rebuilt. Other resources updated include ClinVar, GWAS catalog, OMIM, Orphanet, ClinGen Dosage Sensitivity, InterPro, HGNC, UniProt, IntAct, Gene Ontology, MGI, ZFIN, UniParc, RefSeq, MANE, and HPO. Please note that RGC-ME allele frequency data are available only in v5.1a, for academic use.
July 2, 2025: dbNSFP v5.2 is released. Fully rebuilt based on Gencode 48. MutPred2 scores have been added to both the a and c branches, replacing the MutPred v1.2 scores. ESM1b scores have been updated. ALFA, ancestral alleles, ClinVar, dbSNP, GWAS catalog, and InterPro have been updated. GenCC gene annotations have been added. Other updated gene annotation resources include ClinGen, Gene Ontology, HGNC, HPO, IntAct, MGI, OMIM, UniProt, Ensembl, RefSeq, and ZFIN.
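As an illustration of the column decompression described in the December 5, 2019 entry above, the following is a minimal sketch and not the code used to build dbNSFP. It assumes that per-transcript annotations within a field are separated by ";", and the function name decompress_column and its parameters are hypothetical. A field holding a single annotation is repeated once per transcript, and a field that already holds one annotation per transcript is returned unchanged.

```python
def decompress_column(value, n_transcripts, sep=";"):
    """Return one annotation per transcript for a single field.

    Assumption: annotations for multiple transcripts are joined by `sep`.
    A "compressed" field holds one annotation shared by all transcripts.
    """
    parts = value.split(sep)
    if len(parts) == n_transcripts:
        # Already one annotation per transcript.
        return parts
    if len(parts) == 1:
        # Compressed field: repeat the shared annotation for every transcript.
        return parts * n_transcripts
    raise ValueError(
        f"expected 1 or {n_transcripts} annotations, found {len(parts)}"
    )


# Hypothetical example: a variant annotated on three transcripts.
print(sep_joined := ";".join(decompress_column("BRCA1", 3)))
```

Raising an error on any other annotation count is a design choice for this sketch: a mismatch would indicate that the field cannot be aligned with the transcript list, so it is safer to stop than to guess.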