Caste in India: genetics and heredity (academic studies)

From Indpaedia

These are abstracts of academic papers, archived for the excellence of their content.
To read the complete, original papers, kindly click the hyperlinks given.
Additional information may be sent as messages to the Facebook community.
All information used will be gratefully acknowledged in your name.


Indian population

Reconstructing Indian population history

Nature 461, 489-494 (24 September 2009) | doi:10.1038/nature08365

Reconstructing Indian population history

Authors David Reich, Kumarasamy Thangaraj, Nick Patterson, Alkes L. Price & Lalji Singh

Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA

Broad Institute of Harvard and MIT, Cambridge, Massachusetts 02142, USA

Centre for Cellular and Molecular Biology, Hyderabad 500 007, India

Departments of Epidemiology and Biostatistics, Harvard School of Public Health, Boston, Massachusetts 02115, USA

These authors contributed equally to this work.

Correspondence and requests for materials should be addressed to David Reich (D.R.) or Lalji Singh (L.S.).


India has been underrepresented in genome-wide surveys of human variation. We analyse 25 diverse groups in India to provide strong evidence for two ancient populations, genetically divergent, that are ancestral to most Indians today. One, the 'Ancestral North Indians' (ANI), is genetically close to Middle Easterners, Central Asians, and Europeans, whereas the other, the 'Ancestral South Indians' (ASI), is as distinct from ANI and East Asians as they are from each other. By introducing methods that can estimate ancestry without accurate ancestral populations, we show that ANI ancestry ranges from 39–71% in most Indian groups, and is higher in traditionally upper caste and Indo-European speakers. Groups with only ASI ancestry may no longer exist in mainland India.

However, the indigenous Andaman Islanders are unique in being ASI-related groups without ANI ancestry. Allele frequency differences between groups in India are larger than in Europe, reflecting strong founder effects whose signatures have been maintained for thousands of years owing to endogamy. We therefore predict that there will be an excess of recessive diseases in India, which should be possible to screen and map genetically.

Andhra Pradesh

Brahmin and Jalari castes

Human Heredity, 1985 | Karger

A Sero-Biochemical Genetic Study of Jalari and Brahmin Caste Populations of Andhra Pradesh, India

Authors Naidu J.M. · Mohrenweiser H.W. · Neel J.V.


Blood specimens from Jalari and Brahmin caste populations of Andhra Pradesh, India, were examined for blood groups, red cell enzymes, and serum proteins. Of 33 genetic loci studied, 16 were observed to be invariant among both the castes, while common polymorphism or rare variants were observed in one or both populations for the other loci. Three rare heterozygotes at the phosphoglucoisomerase locus, two different peptidase A variants occurring once each and single cases of rare 6-phosphogluconate dehydrogenase and transferrin variants were recorded. Also a few cases of hemoglobin AS and anhaptoglobinemia were observed. The difference in rare variants between the two castes is conspicuous but large differences in their gene frequencies at the polymorphic loci were not observed. It is pointed out that the frequency of rare variants in the tribal and caste populations of Southern India appears to be higher than observed in temperate-dwelling civilized populations.

© 1985 S. Karger AG, Basel

Brahmin and Kamma

Human Heredity, 1983 | Karger

Acid Phosphatase among Brahmin and Kamma Caste Populations of Coastal Andhra Pradesh, India

Authors Naidu J.M. · Veerraju P.


Haemolysate samples from two caste populations, namely Brahmin and Kamma from coastal Andhra Pradesh, India, were typed for acid phosphatase by using starch gel electrophoresis with the discontinuous buffer system. The sample includes 225 Brahmins and 221 Kammas. Only A, B and AB phenotypes were observed and a statistically significant difference was found between the two caste groups in their acid phosphatase distribution. An association of higher B gene frequency with non-vegetarian diet is also suggested.
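The allele frequencies behind such a comparison are usually obtained by direct gene counting, since the A, B and AB acid phosphatase phenotypes are codominant. The sketch below illustrates that calculation only; the phenotype counts are hypothetical (the abstract does not report them), and the function name is an assumption made for the example.

```python
def allele_frequencies(n_a, n_ab, n_b):
    """Estimate A and B allele frequencies by gene counting.

    Phenotype A is taken as genotype AA, AB as AB, and B as BB,
    which holds when only the two codominant alleles are present.
    """
    total_alleles = 2 * (n_a + n_ab + n_b)
    freq_a = (2 * n_a + n_ab) / total_alleles
    freq_b = (2 * n_b + n_ab) / total_alleles
    return freq_a, freq_b


# Hypothetical phenotype counts for illustration only (NOT from the paper).
freq_a, freq_b = allele_frequencies(n_a=10, n_ab=40, n_b=50)
print(f"A allele: {freq_a:.3f}, B allele: {freq_b:.3f}")
```

With real counts for each caste, the two sets of frequencies could then be compared with a contingency chi-square test, which is presumably the kind of test behind the reported significant difference.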

© 1983 S. Karger AG, Basel

Dyslipidemia in the population

Lipids in Health and Disease 2017, 16:116 | 13 June 2017

Quantitative trait loci at the 11q23.3 chromosomal region related to dyslipidemia in the population of Andhra Pradesh, India

Authors Rayabarapu Pranavchand and Battini Mohan Reddy

© The Author(s). 2017



Given the characteristic atherogenic dyslipidemia of the south Indian population and the crucial role of the APOA1, APOC3, APOA4 and APOA5 genes, clustered in the 11q23.3 chromosomal region, in regulating lipoprotein metabolism and cholesterol homeostasis, a large number of recently identified variants remain to be explored for their role in regulating serum lipid parameters among south Indians.


Using the Fluidigm SNP genotyping platform, a prioritized set of 96 SNPs of the 11q23.3 chromosomal region was genotyped in 516 individuals aged >45 years from Hyderabad, India, and its vicinity.


Linear regression of the individual lipid traits, viz. TC, LDLC, HDLC, VLDL and TG, on each of the 78 SNPs that conform to Hardy-Weinberg equilibrium (HWE) and have a minor allele frequency > 1% suggests that 23 of them are significantly associated (p ≤ 0.05) with at least one of these quantitative traits. Most importantly, the variant rs632153 is involved in elevating TC, LDLC, TG and VLDL and probably plays a crucial role in the manifestation of dyslipidemia. Additionally, another three SNPs, rs633389, rs2187126 and rs1263163, are found to confer risk of dyslipidemia by elevating LDLC and TC levels in the present population. Further, the ROC (receiver operating characteristic) analysis of the risk scores against dyslipidemia status yielded a significant area under the curve (AUC) = 0.675, suggesting high discriminative power of the risk variants towards the condition. The interaction analysis suggests that the rs10488699-rs2187126 pair of the BUD13 gene confers significant risk (interaction odds ratio = 14.38, P = 7.17 × 10⁻⁵) of dyslipidemia by elevating TC levels (β = 37.13, p = 6.614 × 10⁻⁵). On the other hand, interactions between variants of the APOA1 gene and the BUD13 and/or ZPR1 regulatory genes in this region are associated with elevated TG and VLDL.
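The quality-control step mentioned above (keeping SNPs that conform to HWE and have a minor allele frequency above 1%) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the chi-square goodness-of-fit test, the 0.05 significance level without multiple-testing correction, the function names and the genotype counts are all assumptions.

```python
# Chi-square critical value for 1 degree of freedom at alpha = 0.05 (assumed level).
CRITICAL_CHI2_1DF = 3.841


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (chi-square statistic, minor allele frequency) for one biallelic SNP."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A by gene counting
    q = 1.0 - p
    maf = min(p, q)
    if maf == 0.0:
        # Monomorphic SNP: no expected heterozygotes, nothing to test.
        return 0.0, 0.0
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, maf


def passes_qc(n_aa, n_ab, n_bb, maf_threshold=0.01):
    """Keep a SNP only if it shows no HWE deviation and MAF exceeds the threshold."""
    chi2, maf = hwe_chi_square(n_aa, n_ab, n_bb)
    return chi2 < CRITICAL_CHI2_1DF and maf > maf_threshold


# Hypothetical genotype counts for 516 individuals (NOT data from the study).
print(passes_qc(330, 160, 26))  # near HWE proportions
print(passes_qc(300, 0, 216))   # no heterozygotes at all: strong deviation
```

Studies of this kind often use an exact test rather than the chi-square approximation when genotype counts are small; the chi-square form is used here only because it is compact.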


The variants at the 11q23.3 chromosomal region seem to determine the quantitative lipid traits and, in turn, dyslipidemia in the population of Hyderabad. In particular, the variants rs632153, rs633389, rs2187126 and rs1263163 might confer risk of dyslipidemia by elevating LDLC and TC levels, while the variants of the APOC3 and APOA1 genes might be the genetic determinants of elevated triglycerides in the present population.

In conclusion, the variants at the 11q23.3 chromosomal region seem to determine the quantitative lipid traits and, in turn, dyslipidemia in the population of Hyderabad. In particular, the variants rs632153, rs633389, rs2187126 and rs1263163 might confer risk of dyslipidemia by elevating LDLC and TC levels in the present population. These four SNPs exhibited a dominant mode of genotypic association with dyslipidemia, which implies that the BUD13, ZPR1 and APOA5-APOA4 intergenic regions might have a direct role in regulating these traits through their pleiotropic effects. Further, the variants of the APOC3 and APOA1 genes might be the genetic determinants of elevated triglycerides in the present population. We suggest confirmation of the observed characteristics of the 11q23.3 chromosomal region in multi-ethnic studies in India that are based on much larger sample sizes, as well as through more focused chromatin-level studies with subsequent functional validation.

Golla (Southern Andhra Pradesh)

Genomic Diversity

Human Biology | Volume 73, Number 2, April 2001, pp. 175-190

Genomic Diversity at Thirteen Short Tandem Repeat Loci in a Substructured Caste Population, Golla, of Southern Andhra Pradesh, India

Authors B. Mohan Reddy, Guangyun Sun, Javier Rodriguez Luis, Michael H. Crawford, Narabar Shyam Hemam, Ranjan Deka


Genomic diversity based on 13 short tandem repeat (STR) loci was studied in seven population groups of a substructured Golla caste from Chittoor district in southern Andhra Pradesh, India. These groups are traditionally pastoral, culturally homogeneous, and strictly endogamous. Blood samples were drawn from 317 individuals from 30 Golla villages. The 13 STR loci analyzed in five standard multiplex polymerase chain reactions were: (1) CSF1R, TH01, and PLA2A; (2) F13A1, CYP19, and LPL; (3) D21S1446 and D21S1435; (4) D20S481, D20S473, and D20S604; and (5) D5S1453 and D6S1006.

The average heterozygosity was found to be low among the Golla subgroups (0.64-0.70) in comparison to that of groups at the upper levels of the hierarchy. The coefficient of gene differentiation was found to be moderate (average GST = 0.031; range between 0.018 and 0.049 among the loci) when compared to that observed for a similar class of markers among populations with relatively higher levels of hierarchy, for example, among castes. It is, however, much higher when compared to the average observed for Indian caste and tribal populations, based on classical markers. Genetic distance measures revealed clusters of populations that are consistent with the known ethnohistorical and geographical backgrounds of the groups. We claim that these hypervariable markers are quite useful in understanding the process of substructuring within the Indian castes, leading to the formation of smaller breeding isolates, the basic Mendelian units within which microevolutionary forces operate.
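The two summary statistics quoted above, average heterozygosity and the coefficient of gene differentiation (GST), can be illustrated for a single locus as follows. This is a simplified sketch: it weights subpopulations equally and omits the sample-size corrections usually applied to real data, and the allele frequencies are hypothetical rather than taken from the Golla study.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity at one locus: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def gst(subpop_freqs):
    """Nei's G_ST = (H_T - H_S) / H_T for one locus, with equal subpopulation weights.

    subpop_freqs is a list of allele-frequency lists, one per subpopulation,
    all listing the same alleles in the same order.
    """
    k = len(subpop_freqs)
    # H_S: mean within-subpopulation heterozygosity.
    h_s = sum(expected_heterozygosity(f) for f in subpop_freqs) / k
    # H_T: heterozygosity of the pooled population (mean allele frequencies).
    n_alleles = len(subpop_freqs[0])
    mean_freqs = [sum(f[i] for f in subpop_freqs) / k for i in range(n_alleles)]
    h_t = expected_heterozygosity(mean_freqs)
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies at one three-allele locus in two subgroups.
subgroups = [[0.5, 0.3, 0.2], [0.3, 0.4, 0.3]]
print(f"G_ST = {gst(subgroups):.3f}")
```

In a multi-locus study the per-locus values would be computed for each of the 13 STR loci and then averaged, which is how a range across loci and an average GST arise.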

Genetic variation

Human Immunology | Volume 62, Issue 9, September 2001, Pages 1031-1041

Genetic variation among the Golla pastoral caste subdivisions of Andhra Pradesh, India, according to the HLA system

Authors Michael H Crawford (a), B. Mohan Reddy (b), Jorge Martinez-Laso (c), Steven J Mack (d), Henry A Erlich (d)

a Laboratory of Biological Anthropology (M.H.C.), Department of Anthropology, University of Kansas, Lawrence, KS, USA

b Anthropometry and Human Genetics Unit (B.M.R.), Indian Statistical Institute, Calcutta, India

c Departamento de Inmunología y Biología Molecular (J.M.-L.), H. 12 de Octubre, Universidad Complutense, Carretera Andalucía, Madrid, Spain

d Roche Molecular Systems (S.J.M., H.A.E.), Alameda, CA, USA

e Children’s Hospital (S.J.M., H.A.E.), Oakland Research Institute, Oakland, CA, USA


The HLA allele frequency distributions have been characterized for the HLA class I and class II loci of the Golla pastoral caste, from Southeast India, subdivided into the subcastes (Puja, Punugu, Kurava, Pokanati, Karnam, and Doddi). Genetic distances, neighbor-joining, correspondence, and haplotype analyses all indicate that the subcastes exhibit a high haplotype variability and that their genetic substratum may be the result of European-Middle East/Asian admixture with the autochthonous populations. The Karnam subcaste seems to be the one that has undergone a higher degree of admixture, when compared with the other subcastes. The Golla speak an old Indian Dravidian language and should theoretically represent the basic Indian substratum that existed before the postulated “Aryan” invasion.


The autochthonous [indigenous] origin of Brahmins and the caste system

Journal of Human Genetics (2009) | 9 January 2009

The Indian origin of paternal haplogroup R1a1* substantiates the autochthonous origin of Brahmins and the caste system

Authors Swarkar Sharma1,2,4, Ekta Rai1,2,4, Prithviraj Sharma1,3, Mamata Jena1, Shweta Singh1, Katayoon Darvishi1, Audesh K Bhat1, A J S Bhanwer2, Pramod Kumar Tiwari3 and Rameshwar N K Bamezai1

1National Centre of Applied Human Genetics, School of Life Sciences, Jawaharlal Nehru University, New Delhi, India

2Department of Human Genetics, Guru Nanak Dev University, Amritsar, India

3Centre for Genomics, School of Studies in Zoology, Jiwaji University, Gwalior, India

Correspondence: Professor RNK Bamezai, National Centre of Applied Human Genetics, School of Life Sciences, Jawaharlal Nehru University, New Delhi 110067, India.

4These authors contributed equally to this work.


Many major rival models of the origin of the Hindu caste system co-exist despite extensive studies, each with associated genetic evidences. One of the major factors that has still kept the origin of the Indian caste system obscure is the unresolved question of the origin of Y-haplogroup R1a1*, at times associated with a male-mediated major genetic influx from Central Asia or Eurasia, which has contributed to the higher castes in India. Y-haplogroup R1a1* has a widespread distribution and high frequency across Eurasia, Central Asia and the Indian subcontinent, with scanty reports of its ancestral (R*, R1* and R1a*) and derived lineages (R1a1a, R1a1b and R1a1c). To resolve these issues, we screened 621 Y-chromosomes (of Brahmins occupying the upper-most caste position and schedule castes/tribals occupying the lower-most positions) with 55 Y-chromosomal binary markers and seven Y-microsatellite markers and compiled an extensive dataset of 2809 Y-chromosomes (681 Brahmins, and 2128 tribals and schedule castes) for conclusions. A peculiar observation of the highest frequency (up to 72.22%) of Y-haplogroup R1a1* in Brahmins hinted at its presence as a founder lineage for this caste group. Further, observation of R1a1* in different tribal population groups, existence of Y-haplogroup R1a* in ancestors and extended phylogenetic analyses of the pooled dataset of 530 Indians, 224 Pakistanis and 276 Central Asians and Eurasians bearing the R1a1* haplogroup supported the autochthonous origin of R1a1 lineage in India and a tribal link to Indian Brahmins. However, it is important to discover novel Y-chromosomal binary marker(s) for a higher resolution of R1a1* and confirm the present conclusions.


India comprises one of the largest ethnic populations with enormous cultural, morphological and genetic diversity,1, 2, 3, 4 which linguistically belong to Austro-Asiatic (AA), Dravidian (DR), Tibeto-Burman (TB) and Indo-European (IE) families.5 Indian populations are culturally stratified as tribals and non-tribals.6 Tribals constitute 8.08% of the total population7, 8 and the majority of them speak languages belonging to AA, DR and TB families;8 also, most of them are believed to be autochthones of India.9 On the contrary, most of the contemporary non-tribal populations belong to Hindu religion and speak languages of IE and DR descent. In addition, there are several other religious communities contributing a fraction to the total Indian population structure.6 The Hindu caste system has played a major role in the social and economic organization of India10 and is constituted by four major classes (varna), namely Brahmin (priestly class), Kshatriya (warrior class), Vaishya (business class) and Shudra (menial labor class).6 The fifth class, ‘Panchama’ (standing for tribals), was added at a later date, giving them the lowest rank.11 The co-existence of IE tribes and DR castes indicates a complex historical interaction and suggests no ‘one to one correlation’ between language and this social organization.12 In spite of the consensus on the relatively uniform maternal gene pool of Indian populations and the large efforts through many philological,13, 14 archaeological15, 16 and recent molecular genetic approaches to elucidate rival models,6, 7, 9, 11, 12, 17, 18, 19, 20, 21 the history and concepts of the origin of the caste system are still controversial and unclear.
The competing main models (the first of them based on shared IE languages) suggest that contemporary Hindu Indians are descendants of primarily West Eurasians who migrated from the Near East, Anatolia and the Caucasus 3000–8000 years ago,13, 14 which has been supported by the demic diffusion model1, 22 and validated by molecular genetic data.7, 11

The second model, based on molecular genetic data, mainly the Y-chromosomal M17 marker (R1a1 haplogroup), suggests the migration of IE people from Central Asia to India.23 Another model suggests that later on ‘not alone but a package’ of Y-haplogroups migrated from Central Asia, introducing the caste system to India.7 Yet another model suggests the late Pleistocene heritage of tribal and caste populations, with limited recent gene flow between them24 and the largely South Asian origin of Indian caste communities, indicating no major genetic influx either with the development of agriculture or with the spread of the Indo-Aryan (IE) language family.12 It has also been suggested that there was a minor influence from Central Asia and the pre-Holocene and Holocene era, not Indo-European expansions, which shaped the pre-existing South Asian gene pool.9 Alternatively, another recent study21 has suggested that distinct paternal distribution patterns exist among caste and tribal populations and tribals have contributed to the lower caste groups (schedule castes) as well as expansions and establishment of Indo-European populations as upper castes. All these proposed hypotheses make the question of the origin of the caste system and the relationship among these hazy and obscure.

This study was designed to evaluate these competing hypotheses of the origin of the caste system by taking into account the information available in the literature about the cultural practices of the Hindu caste system being endogamous.2, 11, 25, 26, 27 To test the concepts of Central Asian introduction of the Indian caste system7 by Indo-Aryans, who plausibly and predominantly appointed themselves to castes of higher rank to legitimize and maintain their power on land, labor and resources;14 and to test the rank-related West Eurasian admixture,11, 21 we chose the Brahmin class, occupying similar socioeconomic upper-most caste positions, and the schedule castes and tribals, occupying the lower-most positions in the Indian caste hierarchy, irrespective of linguistic and geographic affiliations within India, for an ideal comparative study.

Here, we report our analyses based on Y-chromosomal data of 621 Brahmins and schedule caste/tribal samples and its extension with the compiled data of Brahmin, scheduled caste/tribal populations from published sources. The total dataset represents 2809 Y-chromosomes (767 Brahmins, 2042 scheduled caste/tribal samples) constituting an extensive dataset of Brahmins and tribals (including 621 samples from this study). We also attempted to assess the affinities among Brahmins from different regions speaking different languages, and evaluated the hypothesis of large migration of IE people and introduction of the caste system to India with the purpose of elucidating their genetic relationship with other Indian and worldwide populations, using the data available in the literature. Further, taking into consideration the recent study28 that found a high level of male genetic substructure as a result of the founder effect and social stratification among the Brahmins and Kshatriyas of Jaunpur district, we also explored the probability of any such phenomenon or other genetic patterns in other regions of India.


The observation of R1a* in high frequency for the first time in the literature, as well as analyses using different phylogenetic methods, resolved the controversy of the origin of R1a1*, supporting its origin in the Indian subcontinent. Simultaneously, the presence of R1a1* in very high frequency in Brahmins, irrespective of linguistic and geographic affiliations, suggested it as the founder haplogroup for the population. The co-presence of this haplogroup in many of the tribal populations of India, its existence in high frequency in Saharia (present study) and Chenchu tribes, the high frequency of R1a* in Kashmiri Pandits (KPs—Brahmins) as well as Saharia (tribe) and associated phylogenetic ages supported the autochthonous origin and tribal links of Indian Brahmins, confronting the concepts of recent Central Asian introduction and rank-related Eurasian contribution of the Indian caste system.

However, there is a scanty representation of Y-haplogroup R1a1 subgroups in the literature as well as in this study. The known subgroups (R1a1a, R1a1b and R1a1c), which are defined by binary markers M56, M157 or M87, respectively (Supplementary Figure 1), were not observed. In such a situation, it is likely that this haplogroup (R1a1*) is a polyphyletic (or paraphyletic) group of Y-lineages. It is, therefore, very important to discover novel Y chromosomal binary marker(s) for defining monophyletic subhaplogroup(s) belonging to Y-R1a1* with a higher resolution to confirm the present conclusion. Further, the under-representation of phylogenetic data of the population groups of North India in the literature and our observations hint at the immense need of phylogenetic explorations in the northern most Himalayan regions of India, which might have acted as an incubator of many ancient lineages, to obtain a clearer picture of the peopling of India and Eurasia.


Siddi (Andhra Pradesh)

European Journal of Human Genetics (2001) 9 | Nature Publishing Group

Y-chromosome SNP haplotypes suggest evidence of gene flow among caste, tribe, and the migrant Siddi populations of Andhra Pradesh, South India

Authors Gutala Venkata Ramana1, Bing Su1, Li Jin1, Lalji Singh2, Ning Wang1, Peter Underhill3 and Ranajit Chakraborty*,1

1 Human Genetics Center, University of Texas, School of Public Health, Houston, TX 77030, USA

2 Center for Cellular and Molecular Biology, Hyderabad, India

3 Department of Genetics, Stanford University, Stanford, CA 94305, USA

From observations of lack of haplotype sharing based on Y-chromosome specific short tandem repeat (STR) loci, previous reports suggested negligible gene flow among different geographic populations of India. Using Single Nucleotide Polymorphism (SNP) sites in combination with STRs, we observed evidence of haplotype sharing across caste-tribe boundaries in South India. We examined 27 SNPs in the non-recombining region of the Y chromosome to investigate gene flow in 204 individuals belonging to three caste groups (Vizag Brahmins, Peruru Brahmins, Kammas), three tribes (Bagata, Poroja, Valmiki) and an additional group (the Siddis) of African ancestry. Principal component and AMOVA analyses show that the between group component of variation is non-significant (P>0.05), while that among populations within the caste and tribal groups is significant (P<0.001). In particular, the Valmikis and Siddis are close to the caste groups. Of a total of 11 distinct SNP-haplotypes observed, the two tribal groups (Bagata and Poroja) lack the haplotypes H4, H4A, H5A and H16, which are seen in the caste groups. In contrast, all three tribal groups exhibit the Southeast Asian haplotype H11 that is absent in the caste populations. The presence of haplotypes H4, H5, H14, and H16 in the Siddis indicates that they have assimilated considerable non-African admixture. The evidence of haplotype sharing between castes and tribes is also found when the H14 lineage was further subdivided by five STR loci. We conclude that even though these SNP-based Y-haplotypes are able to distinguish the populations, gene flow in these South Indian populations is not as negligible as that inferred from other studies based on Y-specific short tandem repeat markers.


Uniparental transmission along the male lineage, small effective population size and absence of recombination (except in the pseudo-autosomal region) are the salient features of the Y chromosome1,2 that make it suitable for tracing male-initiated migrations. Extensive studies using DNA sequencing and HPLC have made it possible to identify numerous single nucleotide polymorphisms (SNPs) on the Y chromosome.3,4 These SNPs are single base changes or insertion/deletions, which evolve slowly in comparison with the short tandem repeat markers. India represents one of the most diverse regions in the world, wherein the populations exhibit enormous diversity in terms of language, culture, and ethnicity. A vast majority of Indian populations belong to the Hindu religion, which has over 2000 castes, each of which belongs to a socially stratified Hindu caste cluster.5 There are over 400 tribal populations in India in addition to other religious groups like Muslims, Sikhs, Christians, Jains and migrant groups such as the Parsees and Siddis.6 Earlier studies from India, based on Y chromosome short tandem repeat (STR) polymorphisms, have shown that there is either negligible or no male gene flow among populations of India.7,8 In contrast, mtDNA D-loop sequence variation7 showed higher levels of female gene flow between related caste groups. In this research article, we provide new data on 27 Y-chromosome SNP sites in three castes, three tribes, and Siddis (a migrant population of African ancestry) of Andhra Pradesh, South India, and demonstrate that while these SNP markers reveal substantial genetic variation among these groups, they also detect evidence of male gene flow among these population groups.


Using the 27 biallelic markers we identified eleven haplotypes in 204 Y-chromosomes. The haplotype frequencies in various populations are shown in Table 1. Also, we present a phylogenetic tree for the present study (Figure 1) under the parsimony assumption,4,12 which assigns H1 as the ancestral haplotype (also observed in chimpanzees). Both H1 and H2 are ancient haplotypes present in African and non-African populations. Further, H5, defined by the M9 (C→G) mutation site, appears to be the common ancestor for all haplotypes that are distributed in worldwide populations. H11, which is specific to Southeast Asia,12 is present exclusively in the tribal populations. Counting the mutation events shown in Figure 1 of Underhill et al.,4 in total five mutation events are needed to derive the haplotype H16 from H1.

The principal component analysis of the haplotype distributions reveals that more than 87% of the haplotype variation is explained by the three principal components. The positions of the populations, by the three principal component scores, do not generally cluster them by their caste or tribal affiliation. For example, the Valmikis are closest to the Peruru Brahmins, both of which are also close to the Siddis (particularly based on the first two principal components). Also, the Vizag Brahmins and Peruru Brahmins are distant from each other (particularly based on PC1), although they belong to the same social hierarchy.

The variance decomposition (AMOVA) analysis of the SNP haplotype frequencies provides a quantitative support of the same trend of genetic affiliation of these populations. With the seven populations divided into three groups (caste, tribe, and the migrant), and using the phylogeny of the 11 haplotypes as described in Su et al.,12 the estimates of the variance components are shown in Table 2 (second column), along with their empirical levels of significance (column 3). While the populations are clearly distinguishable (Vp=8.8%, P<0.001), the variance component ascribed to among group difference (Vg=6.2%) is not significant (P≈0.063). Genetic contact of the Valmikis and Siddis with the caste populations alone does not explain this. Excluding them from the analysis, while the numerical value between group differences becomes larger (Vg=12.5%), it still remains nonsignificant. Haplotype sharing and frequency differences of haplotypes can be examined in the light of these observations. It is true that the caste populations (both Brahmin groups and the Kammas) can be distinguished from the two tribal groups (Bagata, Poroja), since the caste populations exhibit the haplotypes H4, H4A, H5A and H16, which are not present in the two tribal groups. In contrast, all the tribal groups show the presence of the Southeast Asian specific haplotype H11.12,13 However, the Valmikis share haplotypes H1A, H4, H4A, H5A and H16 with caste populations. They also exhibit the Southeast Asian haplotype H11, which is present in the other two tribes, but neither in any of the caste populations nor in Siddis.

The Siddis exhibit H1 and H2 haplotypes, a signature of their African ancestry. Since the H1 individual showed ancestral alleles at M1 (Yap-), M89 (C), and M130 (C) loci, we sequenced this individual for the M168 locus and observed ancestral allele (C). We also sequenced for the M60 locus, and observed insertion of T (1 bp insertion) at this locus (belonging to the haplogroup II).4 This haplogroup has been previously shown to occur widely in Africa. Thus, its presence in the Siddis corroborates the ancestry of the Siddis from Africa. In addition, they also have non-African haplotypes viz., H4, H5, H14, and H16 in their male gene pool, suggesting extensive admixture with the local Indian groups.


Our study on the haplotypic diversity based on Y-chromosome SNPs demonstrates that the caste and tribal populations of Andhra Pradesh, South India can be distinguished by the presence of some haplotypes that are unique to these groups (H4, H4A, H5A, and H16 in the caste groups, and H11 in the tribals). However, the presence of haplotypes H4B, H5, and H14 in all of the caste and tribal groups studied, and the presence of haplotypes H4A, H5A, H14 and H16 in the Valmikis raise the possibility of extensive gene flow across the caste-tribe distinction of populations in this region of the country. The AMOVA analysis of the frequency distributions of the 11 haplotypes supports this assertion. Of the haplotypes (H4B, H5, H14, H4A, and H5A) providing the suggestion of caste-tribe gene flow, a more detailed study of the H14 haplotype (defined by M45, a Eurasian marker, and presumably the youngest among this group of haplotypes) provides a further confirmation of our assertion. We have typed Y STR markers viz., DYS 19, DYS 389I, DYS 390, DYS 391 and DYS 393 in all individuals (except in seven individuals, because the DNA was exhausted) carrying H14 (n=49 in the combined sample of seven populations). The haplotypes constructed for these five STRs based on repeat size for each locus showed 30 distinct haplotypes, of which five are shared between the seven population groups (Table 3). Even more important is the observation that three of the haplotypes are shared between caste and tribal groups, pointing at the possibility of a recent gene flow between castes and tribes. Using analysis of molecular variance,10,11 we partitioned the allele size variances of the five STR loci of the H14 lineage, according to their population affiliation (Table 2, columns 4 and 5). There is no significant difference between the three population groups (Vg=79.7, P≈0.987), while the distinction among populations is significant (Vp=23.2%, P<0.001).
As in the analysis of SNP haplotype data, exclusion of the Valmikis and Siddis does not affect this result. A longer antiquity of haplotypes, as compared to formation of caste and tribal groups, may be proposed to explain the observation of SNP-haplotype sharing of the Valmikis with the caste populations. Two lines of evidence suggest that this may not be the case. First, the non-significant group differences of SNP-haplotype diversity as well as STR-haplotype sharing between the castes and tribes (Tables 2 and 3) suggest evidence of gene flow across caste-tribe boundaries, rather than antiquity of haplotypes. Second, Underhill et al.4 estimated that the average time of adding a new mutation in the non-recombining region of the Y chromosome is approximately 6900 years, which places H14 to have evolved (with three mutations) 20 700 years after H1, and H16 (with five mutations) 34 500 years after H1. With H1 estimated to be 44 000 years old,4 these may indicate that H14 and H16 may have existed at a time predating the separation of caste and tribes in India.14 However, we observed haplotype sharing between castes and tribes at the STR level as well within the H14 lineage (Table 3), some of which are at least two mutation steps different from each other. The non-significant caste-tribe group difference of the STR-haplotypes of the H14 lineage supports the gene flow hypothesis rather than the antiquity of the haplotypes. Our data also suggest that Siddis have assimilated considerable non-African Y chromosomes (haplotypes H4, H5, H14, and H16) from the local Indian populations. The arrival of the Siddis in India dates back to AD 1100 (refs 15–17) and they have had social contacts with several local Indian populations.
From the combined frequencies of African (H1 and H2) and non-African (H4, H5, H14, and H16) haplotypes, the data shown in Table 1 indicate that at least 56% of the male genes of the Siddis could be of Indian origin, consistent with the estimate based on five STR loci that we reported elsewhere.18
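The dating argument above is simple arithmetic on a mutation clock, and can be reproduced with the following minimal sketch. It uses only the figures quoted in the text (6900 years per mutation, H1 at 44 000 years, three and five mutations for H14 and H16). The function names are hypothetical, and the approximate ages come from a naive subtraction that the text implies but does not state.

```python
# Illustrative arithmetic only; names are hypothetical, figures are those quoted in the text.
YEARS_PER_MUTATION = 6900   # average time to add one mutation (as cited from Underhill et al)
H1_AGE_YEARS = 44000        # estimated age of haplotype H1

# Number of mutations separating each haplotype from H1, as stated in the text.
MUTATIONS_AFTER_H1 = {"H14": 3, "H16": 5}


def years_after_h1(n_mutations, years_per_mutation=YEARS_PER_MUTATION):
    """Time elapsed between the origin of H1 and the derived haplotype."""
    return n_mutations * years_per_mutation


def approximate_age(n_mutations, h1_age=H1_AGE_YEARS):
    """Naive age of the derived haplotype: age of H1 minus the elapsed time."""
    return h1_age - years_after_h1(n_mutations)


for name, k in MUTATIONS_AFTER_H1.items():
    print(name, "arose about", years_after_h1(k), "years after H1;",
          "approximate age", approximate_age(k), "years")
```

Under these assumptions both haplotypes remain thousands of years old, which is why haplotype antiquity alone cannot be ruled out from the SNP data and the STR evidence is needed.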

The tribal population of India

Diverse histories of tribal populations revealed by DNA analysis

European Journal of Human Genetics (2003) 11, 253–264

Mitochondrial DNA analysis reveals diverse histories of tribal populations from India

Authors Richard Cordaux1, Nilmani Saha2, Gillian R Bentley3, Robert Aunger4, S M Sirajuddin5 and Mark Stoneking1

1Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany

2Department of Pediatrics, National University of Singapore, Singapore

3Department of Anthropology, University College London, UK

4Department of Biological Anthropology, Cambridge, UK

5Anthropological Survey of India, Mysore, Karnataka State, India

Correspondence: Dr R Cordaux, Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, D-04103 Leipzig, Germany. Tel: +49 341 9952 537; Fax: +49 341 9952 555; E-mail:

Accepted 3 December 2002.


We analyzed 370 bp of the first hypervariable region of the mitochondrial DNA (mtDNA) control region in 752 individuals from 17 tribal and four nontribal groups from the Indian subcontinent, to address questions concerning the origins, genetic structure and relationships of these groups. Southern Indian tribes showed reduced diversity and large genetic distances, both among themselves and when compared with other groups, and no signal of prehistoric demographic expansions. These results probably reflect enhanced genetic drift because of small population sizes and/or bottlenecks in these groups. By contrast, northern groups exhibited more diversity and signals of prehistoric demographic expansions. Phylogenetic analyses revealed that southern and northern groups (except northeastern ones) have related mtDNA sequences albeit at different frequencies, further supporting the larger impact of drift on the genetic structure of southern groups. The Indian mtDNA gene pool appears to be more closely related to the east Eurasian gene pool (including central, east and southeast Asian populations) than the west Eurasian one (including European and Caucasian populations). Within India, northeastern tribes are quite distinct from other groups; they are more closely related to east Asians than to other Indians. This is consistent with linguistic evidence in that these populations speak Tibeto-Burman languages of east Asian origin. Otherwise, analyses of molecular variance suggested that caste and tribal groups are genetically similar with respect to mtDNA variation.


Archeological, fossil and genetic evidence points to a major expansion of anatomically modern humans out of Africa some 100 000 years ago,1,2 but the migration routes remain poorly understood. In this respect, the Indian subcontinent is considered to be a crucial geographic area for human migrations,3,4 since it is located at the crossroads of Africa, the Pacific and west and east Eurasia.

More than one billion people with enormous morphological, genetic, cultural and linguistic diversity inhabit the Indian subcontinent.5,6 At least four potential sources of genes contributing to the current Indian gene pool can be envisaged.3 The first one is an old Paleolithic component, probably almost extinct nowadays. The second component would have witnessed early Neolithic migrations of farmers from the eastern horn of the Fertile Crescent, probably speaking proto-Dravidian languages. The third source is responsible for the arrival of Indo-European speakers approximately 3500 years ago, who most probably introduced the caste system that hierarchically organized the vast majority of Indian society. The fourth component is associated with Austro-Asiatic and Tibeto-Burman speakers inhabiting east and northeast India, with ties to east Asia.

The molecular genetic data generated so far concerning the people of the Indian subcontinent have largely focused on caste populations rather than on tribal groups.7,8,9,10,11,12,13 According to the 1991 census, approximately 8% of the Indian population belong to tribal communities.14 They represent minorities that have not been absorbed by the caste system.3 They are generally thought to be the aboriginal inhabitants of the Indian subcontinent that were present in the region before the arrival of Indo-European speakers. There are currently about 400 tribes in India that vary in size from a few hundred to a few million; they speak languages belonging to all four of the major language families represented in India (Austro-Asiatic, Dravidian, Indo-European and Tibeto-Burman). Their origins and genetic affinities remain largely unknown, although such information is of primary importance in understanding the possible role of India in early migrations of modern humans, since any remnants of genetic contributions from pre-Indo-European migrants would presumably be present in tribal populations rather than in caste populations.

The molecular genetic evidence on Indian tribal origins and relationships is rather scanty. The mitochondrial DNA (mtDNA) intergenic COII/tRNALys 9-bp deletion marker, present at high frequency in some Asian15 and African16 populations, is found at low frequencies in India and arose multiple times independently.17,18,19 Based on a study of autosomal and mtDNA markers in eight Indian tribes speaking Austro-Asiatic, Dravidian or Tibeto-Burman languages, it was concluded that these different language groups represented distinct founding groups, with Austro-Asiatic speakers being the most ancient inhabitants of the region.19

With the aim of furthering our understanding of their origins and relationships, we report here a comprehensive study of mtDNA variation in Indian tribal groups. We analyzed sequences of the first hypervariable segment (HV1) of the mtDNA control region in more than 750 individuals belonging to different language families and compared these to available data from caste populations and from world populations, in order to obtain a global picture of the genetic structure and the relationships of populations inhabiting the Indian subcontinent.


Diversity indices and demographic parameters

Sequence data corresponding to nucleotide positions (np) 16 022–16 391 in the reference sequence24 were obtained from 752 individuals. Nucleotide substitutions were observed at 153 sites, which defined 316 different HV1 sequences. Some individuals exhibited length variation between np 16 181 and 16 183 (these positions were removed from the subsequent analyses). Deletions were observed at five sites (np 16 166, 16 179, 16 194, 16 195 and 16 258) and insertions at two sites (a C between np 16 169 and 16 170 and an A between np 16 189 and 16 190).

The 752 HV1 sequences from the present study were subsequently analyzed together with 219 previously published sequences from 10 Indian tribes, enabling a more comprehensive study of HV1 variation in Indian tribal populations. Since small sample sizes could potentially affect the reliability of the analyses, some populations were pooled. Pooling was performed according to several criteria including geographical proximity, linguistic affiliation, historical record and population relationships deduced from Fst distances. For example, the Mullukurunan and Mullukurumba are both south Indian tribes speaking a Kannada language (a Dravidian language-subfamily). Some scholars have argued that they are the same although they nowadays live in different areas, and it has been reported that the Mullukurunan are also known as 'Mullu Kurumba'.47 Moreover, the two tribes were separated by an Fst distance of -0.030, which is not significantly different from zero (P=0.84). Therefore the data from these two groups were pooled. In summary, this approach resulted in 23 groups, each of which was composed of at least 20 individuals.

Diversity indices and demographic parameters estimated for these groups are reported in Table 2. Overall, haplotype diversity in Indian tribals ranged from 0.671 to 0.995 and nucleotide diversity from 0.005 to 0.023. Haplotype diversity was significantly higher (Mann–Whitney U-test: Z=3.24, P<0.01) in north, east and northeast India (0.940–0.995) than in south India (0.671–0.939). Intermediate haplotype diversity values were observed in central India (0.884–0.985); they were not significantly different from north, east and northeast India (Z=1.51, P=0.13) or south India (Z=1.70, P=0.09). Similarly, nucleotide diversity in north, east and northeast India (0.014–0.023) was significantly higher (Z=2.78, P<0.01) than in south India (0.005–0.017). Again, central India exhibited intermediate values (0.012–0.017); they were not significantly different from north, east and northeast India (Z=1.61, P=0.11) or south India (Z=1.78, P=0.07). Therefore, north, east and northeast Indian tribes showed greater mtDNA diversity than south Indian tribes.
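For readers unfamiliar with the two indices reported above, the following minimal sketch shows how haplotype diversity and nucleotide diversity are commonly computed from a set of aligned sequences. It assumes Nei's unbiased estimator for haplotype diversity and defines nucleotide diversity as the mean number of pairwise differences divided by sequence length. The toy sequences and function names are hypothetical, and the paper's own software and settings are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(sequences):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(sequences)
    counts = Counter(sequences)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def pairwise_differences(a, b):
    """Number of aligned sites at which two sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def mean_pairwise_differences(sequences):
    """Average number of differences over all pairs of sequences."""
    pairs = list(combinations(sequences, 2))
    return sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)


def nucleotide_diversity(sequences):
    """Mean pairwise differences per site (all sequences assumed equal length)."""
    return mean_pairwise_differences(sequences) / len(sequences[0])


# Hypothetical toy alignment: four 10-bp sequences, not real HV1 data.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"]
print(round(haplotype_diversity(toy), 3))
print(round(nucleotide_diversity(toy), 4))
```

A population dominated by a few identical sequences, as expected after a bottleneck, scores low on both indices.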

These patterns of genetic diversity in Indian tribes were further strengthened by the analysis of mean pairwise differences (MPD). MPD for south tribes (1.77–5.80) were significantly lower than MPD from north, east and northeast tribes (5.06–7.69; Z=2.78, P<0.01) or from central tribes (4.46–5.91; Z=2.04, P=0.04), whereas MPD from central and north, east and northeast tribes were not significantly different (Z=1.70, P=0.09). Mismatch distributions were computed for the 14 tribal and four nontribal populations from Table 1 whose sample size is greater than or equal to 20. Unimodal distributions were observed mostly in northeast tribes and some central tribes, whereas all of the south tribes exhibited multimodal distributions with a higher frequency of the low difference classes (0 and 1): 25–61% of the pairwise differences for the south tribes were in the 0/1 classes vs only 1–13% for the other tribes. Unimodal distributions are interpreted as signs of demographic expansions while multimodal distributions are interpreted as signs of constant population size over time.27 Moreover, the peaks observed at 0/1 classes in the mismatch distributions indicated bottlenecks in these populations.48 In parallel, the raggedness index r was generally less than 0.03 in north, east and northeast tribes but more than 0.07 in south tribes. Values of r lower than 0.05 suggest demographic expansions while values of r greater than 0.05 are more consistent with constant population sizes.27 Fu's Fs values also support these patterns of demographic history in India. Negative values of Fs that differ significantly from zero, indicative of population demographic expansions,26 were obtained in 86% of north, east and northeast tribes, 50% of central tribes and only 25% of south tribes. Therefore, several approaches provided congruent evidence for different demographic histories in Indian tribes.
In general, north, east and northeast tribes showed signs of expansion while south tribes, and to a lesser extent central tribes, were likely to have experienced bottlenecks and/or constant population sizes over time.
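The mismatch-distribution reasoning above can be illustrated with a short sketch. It builds the distribution of pairwise differences, computes the share of pairs in the 0/1 classes, and evaluates a raggedness index, assuming Harpending's definition (the sum of squared differences between successive class frequencies, with one trailing empty class). The toy data and function names are hypothetical, and a sample this small is far too small to interpret against the 0.05 threshold quoted in the text.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(sequences):
    """Relative frequency of sequence pairs differing at 0, 1, 2, ... sites."""
    diffs = [sum(x != y for x, y in zip(a, b))
             for a, b in combinations(sequences, 2)]
    counts = Counter(diffs)
    total = len(diffs)
    return [counts.get(i, 0) / total for i in range(max(diffs) + 1)]


def raggedness_index(freqs):
    """Assumed Harpending form: sum over i = 1..d+1 of (x_i - x_{i-1})^2, with x_{d+1} = 0."""
    padded = list(freqs) + [0.0]
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))


def low_class_share(freqs):
    """Fraction of pairwise comparisons falling in the 0 and 1 difference classes."""
    return sum(freqs[:2])


# Hypothetical toy alignment, not real HV1 data.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"]
dist = mismatch_distribution(toy)
print(dist)
print(round(raggedness_index(dist), 3), round(low_class_share(dist), 3))
```

A smooth, single-peaked distribution yields a small r, whereas a jagged distribution with a spike at the 0/1 classes yields a larger r and a high low-class share, which is the pattern the text reports for the south tribes.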

The four nontribal populations exhibited high gene and nucleotide diversities. In addition, the Fs statistic as well as the raggedness index and mismatch distributions suggested demographic expansions in these populations.

Genetic structure of Indian populations

AMOVA was used to investigate the genetic structure of Indian populations, focussing either on tribes only or on both tribes and castes (Table 4). In the total tribal sample (model 1), 88% of the variance was found within populations and 12% among populations. Indian tribes were then grouped according to geographic proximity (model 2), to linguistic affinities (model 3) and to the results suggested by the MDS analysis, namely two groups separating northeast tribes from all others (model 4). Under these models, 86–88% of the variance was found within populations, 10% among populations within groups and 2–4% among groups. A model that accurately reflects the genetic structure should maximize the variance among groups and minimize the variance among populations within groups; therefore, none of these models provides a good description of the genetic structure, although model 4 is the best.
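The variance partition described above can be sketched for the simplest case, corresponding to model 1 (populations only, no higher-level groups). The sketch below assumes the standard single-level AMOVA decomposition of summed squared pairwise distances, with the number of differing sites used as the squared distance. It omits the hierarchical "among groups" level used in models 2 to 4 as well as the permutation test for significance. The toy populations and all names are hypothetical.

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared distance between two haplotypes, taken as the number of differing sites."""
    return sum(x != y for x, y in zip(a, b))


def ssd(seqs):
    """Sum of squared deviations: sum of pairwise squared distances divided by sample size."""
    return sum(sq_dist(a, b) for a, b in combinations(seqs, 2)) / len(seqs)


def amova_one_level(populations):
    """Partition variance into among-population and within-population components."""
    sizes = [len(p) for p in populations]
    n_total, n_pops = sum(sizes), len(populations)
    pooled = [s for p in populations for s in p]

    ssd_within = sum(ssd(p) for p in populations)
    ssd_among = ssd(pooled) - ssd_within

    ms_among = ssd_among / (n_pops - 1)
    ms_within = ssd_within / (n_total - n_pops)
    # Average effective sample size, correcting for unequal population sizes.
    n0 = (n_total - sum(s * s for s in sizes) / n_total) / (n_pops - 1)

    var_among = (ms_among - ms_within) / n0
    var_within = ms_within
    total = var_among + var_within
    return {
        "var_among": var_among,
        "var_within": var_within,
        "pct_among": 100 * var_among / total,
        "pct_within": 100 * var_within / total,
        "phi_st": var_among / total,
    }


# Hypothetical toy data: two strongly differentiated populations.
pop_a = ["AAAA", "AAAA", "AAAT"]
pop_b = ["TTTT", "TTTT", "TTTA"]
result = amova_one_level([pop_a, pop_b])
print(result)
```

When populations are no more different from each other than individuals within them, the among-population estimate can come out slightly negative; such values are usually read as zero structure.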

When the northeast, central and south groups of populations were analyzed separately, 22% of the variance was among populations in south tribes but only 3–5% in central and northeast tribes. These results emphasize the distinctiveness of south tribes from one another that was also evident in the MDS plots (Figure 3 and Figure 4). Therefore, the analyses corresponding to models 1–4 were repeated with the south tribes excluded (not shown). Again, none of these models provided a good description of the genetic structure. The best model grouped populations on the basis of linguistic criteria, and was the only model for which the 'among groups' component was larger than the 'among populations within groups' component.

We also compared tribes to caste populations. East and northeast tribes were excluded from the analyses since: (i) no HV1 data from east and northeast castes are available, and (ii) these tribes speak Austro-Asiatic and Tibeto-Burman languages (respectively) which are spoken exclusively by tribal populations (preventing linguistically based analyses between tribes and castes). For the model based on social criteria (ie castes vs tribes; Table 4, model 5), 91% of the variance was within populations and 9% among populations within groups. The variance among groups did not differ significantly from zero (P=0.33), suggesting that the social distinction of castes vs tribes does not accurately reflect the genetic structure of Indian populations. Removing south groups from the analysis did not change the picture (not shown), with the variance among groups still not differing significantly from zero (P=0.57).


Origins of tribal groups

Our analyses of mtDNA variation in tribal populations of India indicate that groups in different geographic regions have different demographic histories. In general, southern tribes have reduced mtDNA diversity and mismatch distributions strongly indicative of recent bottlenecks. The distinctiveness of southern groups is also emphasized by the MDS analyses and AMOVA. However, it is difficult to distinguish from these data between old and severe bottlenecks or more recent and less severe bottlenecks. Present-day population sizes in southern tribes tend to be small (ie generally less than 30 000; Table 1), as compared to northern tribes (ie generally over 100 000; Table 1). Consequently, genetic drift could have generated large genetic distances both among southern groups and between southern and other groups, thereby masking their real affinities to other populations.48 According to this scenario, southern populations have related mtDNA sequences (albeit at different frequencies) and hence a shared history with other Indian populations. An alternative hypothesis is that southern tribes have a specific mtDNA gene pool as compared to other Indian populations, indicating a long period of isolation and/or different history from other tribal groups. An example of the latter is Papua New Guinea (PNG),41,49 in which most sequences are found in two clusters (within clusters III and XI, Figure 5) characterized by long branches (not shown). However, southern Indian sequences are intermingled throughout the tree, clustering with sequences from multiple populations. In addition, south, central, and east Indian tribes all have similar mtDNA haplogroup compositions (Table 3). These results provide strong support for the hypothesis that southern tribes have mtDNA sequences closely related to those of other tribes, but with different frequencies, which would suggest fairly recent bottlenecks occurring in these populations.

A possible cause of these bottlenecks, put forth by Excoffier and Schneider,48 involves Neolithic human expansions. According to this hypothesis, the recent settlement of Indo-European speakers in India some 3500 years ago3 might have had a major impact on the demography of south tribes. West Eurasian mtDNA haplogroups H, JT and W represent 6–7% of north and central tribes (Table 3), which are located in the area where Indo-European languages are spoken. In contrast, these west Eurasian mtDNA types are virtually absent in south tribes, which are located where Dravidian languages are spoken. This might reflect different responses of local people to the Indo-European settlement of India. In the north and center, Indo-Europeans may have admixed with local people,50 concomitant with the spread of Indo-European languages. In contrast, in the southern part of India, local populations may have challenged the arrival of Indo-European newcomers, resulting in limited admixture, reduction of tribal population sizes and retention of their original languages, thus explaining why Dravidian languages survived the spread of Indo-European languages in south India.

Tibeto-Burman speakers from northeast India show closer genetic affinities with east Asian groups than with other Indian groups. This is suggested by the MDS analyses and mtDNA haplogroup composition, in that northeast Indian tribes possess haplogroups A and F, which are frequent in east Asians but virtually absent from other regions of India. The mtDNA evidence (Clark et al18, this study) thus agrees with Y chromosome evidence51 as well as linguistic evidence,52 indicating a probable east Asian origin of these particular tribes. Archeological, linguistic and genetic evidence51 suggests that proto-Tibeto-Burman languages arose 5000–6000 years ago in east Asia. The fact that Tibeto-Burman speaking tribes from India have retained genetic traces of east Asian origins for such a long time suggests that, despite the more recent migrations to India, these populations remained relatively isolated, explaining the close correlation between genetic and linguistic results. This contrasts with the situation observed in other regions in the world, for example in Scandinavia53 and the Caucasus,34 where migrations led to language replacements and hence to incongruencies between genetic and linguistic results.

Apart from northeast tribes, all other Indian tribes exhibited a similar and high frequency of mtDNA haplogroup M (ie 56% in northeast vs approximately 75% for others). These results are strikingly similar to a previous study based on RFLP analysis of a smaller set of Indian tribes,19 according to which haplogroup M had a frequency of approximately 51% in northeast tribes and approximately 76% in east, central and south tribes. This homogeneity in frequency of haplogroup M further supports the view that south tribes have sequences closely related to those of their neighboring populations. Thus, our mtDNA data are compatible with at least three major sources for the present-day mtDNA gene pool of Indian tribes: (i) a major one associated with all non-northeast tribes (whatever their linguistic or geographic ties), with a high frequency of mtDNA haplogroup M; (ii) one associated with Tibeto-Burman speakers from northeast India, with affinities to east Asians; and (iii) a third one associated with the presence of west Eurasian-typical mtDNA haplogroups (ie haplogroups H, JT and W, which represent 6–7% of mtDNA types in north and central tribes), most probably attributable to admixture with recent Indo-European-speaking migrants to India.50

Relationships between castes and tribes

The comparison between Indian castes and tribes revealed no strong difference between them, as pointed out by the AMOVA. A possible explanation for the observed similarities in caste and tribal mtDNA gene pools is common ancestry, with a proto-Asian origin of Indian castes.9 An alternative hypothesis involves a proto-west-Eurasian origin of castes, with the present-day similarities in caste and tribal mtDNA gene pools then being attributable to recent admixture with local Indian populations. The latter hypothesis would require extensive gene flow, which could seem a priori to be incompatible with the mating practices imposed by the caste system in India.6 However, there is evidence for female gene flow between Hindu castes10 and it has been suggested that male gene flow in south Indian populations may not be as negligible54 as previously thought.11 Our results show the presence of west-Eurasian typical mtDNA haplogroups in Indian tribes, presumably resulting from admixture with Indo-Europeans (ie who probably introduced the caste system in India). This interpretation would suggest that caste people initially possessed west-Eurasian mtDNAs rather than Asian mtDNAs. This view is reinforced by the fact that caste groups are more similar to west Eurasians (average Fst: 0.080) than are the tribals (average Fst: 0.149; and 0.117 if south tribes are excluded). Therefore, the similarities in caste and tribal mtDNA gene pools might reflect extensive maternal gene flow rather than common ancestry. However, caste and tribal populations are separated by an average Fst distance of 0.049 (if south tribes are excluded; 0.082 if they are included), suggesting that overall, castes are closer to Indian tribes than to west Eurasians. This makes the hypothesis of proto-Asian ancestry of castes equally plausible. The mtDNA data alone do not support one hypothesis over the other. 
Moreover, caste populations from different regions of India may have different origins,55 some of them being derived from west Eurasian ancestors with subsequent admixture with local populations, others being derived from local population ancestors via acculturation.

Relationships with other populations

mtDNA variation in India suggests that overall, Indian tribes show more affinities to east Eurasians than to west Eurasians. This means that migrations from the west (ie involving Indo-Europeans and Neolithic expansions of farmers) have not had a major distorting impact on the original gene pool. This view is consistent with the relatively small proportion of west Eurasian typical mtDNA haplogroups present in Indian tribes. On the other hand, three typical east-Asian mtDNA haplogroups (A, B and F) are absent or virtually absent from non-northeast India (Bamshad et al,9 Kivisild et al,12 Roychoudhury et al,19 this study). Furthermore, the fourth typical east Asian mtDNA haplogroup M has a different structure in India as compared to other Asian areas.9 This suggests that, although they show close affinities, the east Asian and Indian mtDNA gene pools are fairly distinct. This result is consistent with the suggestion that the east Asian and Indian mtDNA pools have been separated from each other for about 30 000 years.49

It has been hypothesized that the peopling of Sahul (PNG and Australia) may have been the result of an early migration from east Africa through the Indian subcontinent following the 'southern route'.1,3,56 Australian populations exhibited an average Fst distance of 0.067 with east Eurasians and of 0.089 with Indians (but only 0.062 if south tribes are excluded), whereas the average Fst values separating Australians from PNG or African (!Kung excluded) populations were 0.194 and 0.145, respectively. These results suggested close genetic affinities between Australian populations and both Indian and east Eurasian populations. An India–Australia connection is consistent with other mtDNA41 and Y chromosome57 evidence. Taken together with other conclusions,41,57 the present results give credence to the trihybrid model of peopling of Australia58 involving 'Negrito', east Asian and Indian sources. The Indian influence on Australia may be recent (ie <5000 years),41,57 thus much later than (and therefore independent from) the early migration that would have followed the southern route approximately 60 000 years ago.

In addition, Forster et al49 have proposed an mtDNA control region motif (16223C and 16357C) which could represent a signature of an early migration from Africa to Sahul through the southern route. This motif was not found in any Indian tribal mtDNA; 16357C had a frequency of only 2.1% and was always associated with 16223T, while 16223C had a frequency of 27.9%. Furthermore, Indians do not show particular affinities to Africans. A possible exception is the typical African HV1 sequence found in a Kuruchian from south India. However, there are communities in India such as the Siddis, who are known to be recent migrants from Africa.6 The African-like sequence found in India could therefore originate from admixture between recent African migrants and Indian tribals, or it may represent a remnant of an ancient migration from Africa to India; it is difficult to draw conclusions from a single sequence.

In summary, although the data support a recent India–Australia connection, we could not find in Indian tribals any unquestionable genetic signature of the approximately 60 000-year-old migration from Africa to Sahul following the postulated southern route. A possible explanation would be that such migration never occurred along that route. Alternatively, the early migrants from Africa may have made their way to Sahul following the southern route without settling in India. Another possibility, which is probably the most reasonable one, is that in India the genetic traces of early migrations along the southern route were erased by the subsequent migrations which shaped the present-day mtDNA gene pool of India.3


Vysya (Andhra Pradesh)

Annals of Human Biology | Volume 29, 2002 - Issue 5

Population structure and genetic differentiation among the substructured Vysya caste population in comparison to the other populations of Andhra Pradesh, India

Authors N. Lakshmi, D. A. Demarchi, P. Veerraju & T. V. Rao

Pages 538-549 | Published online: 09 Jul 2009


Objectives: The present paper focuses on the study of the patterns of genetic microdifferentiation among one of the substructured caste populations of Andhra Pradesh, namely Vysya, with reference to 17 other Telugu-speaking populations from the same region of India.

Subjects and methods: A total of 302 individuals from the three Vysya subgroups (101 from Arya Vysya, 100 from Kalinga Vysya and 101 from Thrivarnika) were typed for 17 blood group and protein polymorphisms. Nei's gene diversity analysis, as well as neighbour-joining tree and UPGMA cluster diagrams, derived from standard genetic distances, R-matrix analysis and a regression model for investigating the patterns of external gene flow and genetic drift due to isolation under the island model, were done at two levels: (1) considering only the three Vysya populations and (2) considering common loci among 20 populations of Andhra Pradesh.

Results: Seven of the 17 systems investigated were found to be monomorphic among all three Vysya groups. The UPGMA tree and bidimensional scaling of the D² distances derived from R-matrix analysis show a very distinct cluster of Vysya populations. Application of the model of regression of average heterozygosity versus the distance of populations from the centroid shows the three Vysya populations placed as clear outliers above the theoretical regression line.

Conclusions: Different approaches employed in this study give support to the hypothesis of different origin and/or demographic history for the three Vysya groups compared with other populations of Andhra Pradesh.

See also

Caste in India: genetics and heredity (easy reading)

Caste among Hindus

Caste in India: genetics and heredity (academic studies)

Racial Classification of Indian People

Anthropology in India
