TY - JOUR
T1 - Reference exome data for Australian Aboriginal populations to support health-based research
AU - Weeks, Alexia L.
AU - D’Antoine, Heather A.
AU - McKinnon, Melita
AU - Syn, Genevieve
AU - Bessarab, Dawn
AU - Brown, Ngiare
AU - Tong, Steven Y.C.
AU - Reményi, Bo
AU - Steer, Andrew
AU - Gray, Lesley Ann
AU - Inouye, Michael
AU - Carapetis, Jonathan R.
AU - Blackwell, Jenefer M.
AU - Lassmann, Timo
PY - 2020/4/29
Y1 - 2020/4/29
AB - Whole exome sequencing (WES) is a popular and successful technology which is widely used in both research and clinical settings. However, there is a paucity of reference data for Aboriginal Australians to underpin the translation of health-based genomic research. Here we provide a catalogue of variants called after sequencing the exomes of 50 Aboriginal individuals from the Northern Territory (NT) of Australia and compare these to 72 previously published exomes from a Western Australian (WA) population of Martu origin. Sequence data for both NT and WA samples were processed using an ‘intersect-then-combine’ (ITC) approach, using GATK and SAMtools to call variants. A total of 289,829 variants were identified in at least one individual in the NT cohort and 248,374 variants in at least one individual in the WA cohort. Of these, 166,719 variants were present in both cohorts, whilst 123,110 variants were private to the NT cohort and 81,655 were private to the WA cohort. Our data set provides a useful reference point for genomic studies on Aboriginal Australians.
UR - http://www.scopus.com/inward/record.url?scp=85084005363&partnerID=8YFLogxK
U2 - 10.1038/s41597-020-0463-1
DO - 10.1038/s41597-020-0463-1
M3 - Article
C2 - 32350262
AN - SCOPUS:85084005363
SN - 2052-4463
VL - 7
SP - 1
EP - 7
JO - Scientific Data
JF - Scientific Data
M1 - 129
ER -