Language Continuums and People Clusters:  Adjusting Ethnic Entities in Reference to Language Research Updates
Dr. Orville Boyd Jenkins

An editor at PeopleGroups.org contacted me with a question about the entities Ndengereko and Rufiji.  He was referring to research indicating that the name Rufiji refers to a group of people who actually speak the same language as the people called Ndengereko.

Linguists investigating the speech forms in the southeastern area of Tanzania indicated their findings led to these conclusions.  This editor asked for my evaluation of proposed changes in the PeopleGroups.org database concerning these and a related entry in their database.

Clarifying the Context
First, these are primarily language questions.  The language analysis is important in determining the discrete groups of African peoples.  Language is an important factor, and sometimes the determining factor, in classifying ethnicities, particularly in the Bantu family.

All these languages are forms of speech in a continuum across three major groupings: Matumbi, Yao-Makonde and Zigula-Zaramo.  The continuum also involves the Swahili cluster, in regard to some forms of speech called "Maraba" (related to Makwe and Makonde).

There has been active investigation and vigorous discussion among linguists for over a hundred years on the dialects that are grouped into these 4 language clusters.  There are differences in where different linguists draw the line between dialects/languages, and which dialects they place together into larger "languages."

Analysis by Pioneer Bible Translators (PBT) indicates that the speech of the group called Rufiji is the same as the speech of the group called Ndengereko.  The Ethnologue has assigned separate language codes to these two, based on earlier indications that they were different enough to be considered separate languages.

PBT report they have experienced resistance from the editors of the Ethnologue in making definitive changes to the Ethnologue language trees based on the new research presented by PBT.  We need to watch for published studies or reports from PBT, to remain aware of their ongoing findings.  These and other field linguists can provide details that will affect our understanding of the ethnicities related to these languages.

In some of these cases, the information from PBT, reported to us by a contact in the field, is inconclusive, especially as relating to ethnicity.  Perhaps we can follow up later.  I'll mention some critical items in basic comments below.

Important work has been done just in the past year affecting these specific questions, and I will mention some considerations from that, published in a doctoral dissertation at the University of Leiden in Holland.  The author, Pieter Kraal, references many linguistic sources to provide background into the related dialects, which he had to reference as he prepared to develop a dictionary of one of the dialects of Makonde.

I am also in communication with a Swedish PhD student who performed a field study in July 2006 on the Ndengereko and related speech forms.  She will be developing a comparative dictionary for Ndengereko.  This study will provide important clues for our understanding of the related ethnicities.

This question, based on my previous detailed discussions with field personnel working in the Rufiji River area of Tanzania, was very complicated.  After some extensive probing and detailed investigation into this complex group, the immediate answer to the question appeared easier than I first thought.

A confusion arose from the term "Greater Ndengereko" in an initial communication from a related researcher.  This term at first seemed to refer to the whole cluster of languages in which SIL and others have classified the Ndengereko language (called Matumbi in the Ethnologue).

As I got farther and farther into the details, it appeared the focus was on the various forms of Ndengereko speech, such as the Rufiji, and secondarily the Matumbi.  These are all part of the larger language-culture cluster called Matumbi by the Ethnologue.  So I answer the question in regard to these three languages/dialects.

In my original analysis of the situation, I suggested that Rufiji be considered as a sub-group of Ndengereko. My Swedish contact also agreed from her separate field analysis just finished at that time.

In our email exchanges, she makes the following initial determination relating to Ndengereko:

Regarding your analysis, I think I can say now with some certainty that Rufiji is a dialect of Ndengereko. There are other dialects as well, but differences seem to be small and they all refer to themselves as Ndengereko, depending on what level of identity they answer on, as you wrote. Matumbi are almost always referred to as another group, although most Ndengereko say they understand Kimatuumbi to a large extent. Only one old man referred to the Matumbi as a subgroup of Ndengereko. A peculiar difference between these (probably) two ethnic groups is that the Matumbi seem to take more pride in being Matumbi and speaking Kimatuumbi than do the Ndengereko. The Ndengereko also seem to be more looked down upon by others.

        Eva-Marie Ström, personal email communication, 15 September 2006

Suggested Solution:
I found that PeopleGroups.org already had these two as one entity.  I agree with the suggestions to make two segment entities under that.  To implement the data manager's suggestion, he would just change the name of his current entity, Rufiji-Ndengereko, to Ndengereko of Tanzania, as he proposed.

This seems simpler than deleting the current entity named Rufiji-Ndengereko, and creating a new entity named Ndengereko.  All he is doing in the edit he proposes is changing the name of the parent entity, to which he will add two segments for added information.

Coding the Segments
Currently the Registry of Peoples (ROP) assigns separate codes for these two entities, Rufiji and Ndengereko.  In later edits, these may be combined.  The purpose of the ROP codes is to enable various databases to make matches when comparing and exchanging data with one another.

A database that wishes to show these entities as segments of one people can use the applicable ROP people codes and ROL (SIL) language codes for the two entities.  As PeopleGroups.org had the single entry, Rufiji-Ndengereko, this was not an option for them.  Thus before the addition of the segments under the primary entity, the PeopleGroups.org info did not fully correlate to the Ethnologue language info and the ROP ethnic codes.

Thus the idea of filling this out with two segments offers an option which did not appear with only one record (entry).  Adding two segments under the main entry of Ndengereko or Ndengereko-Rufiji enables the database to match the segments more specifically to the SIL and ROP codes representing these.  

If at some later edit, either the ROP people code or the ROL (SIL) language code should change, the single code would still be applicable to the parent entity, as well as to the two segment entities.

Ndengereko ROL [ndg]  ROP code 107158
Rufiji     ROL [rui]  ROP code 108434

For now this indicates that the two names Rufiji and Ndengereko refer to two smaller groups of the primary people Ndengereko, with the name Rufiji indicating those who live along the River and on its islands, according to the demographic information indicated by the agency's research.

In regard to this edit, the relationship within the greater Matumbi cluster or the related Yao or Makonde families is a side issue.  This editor and researchers sourcing the database can also keep separate population and other statistics, giving them better flexibility for future adjustments as language data changes or becomes firmer.  Additional changes can be made after other relationships are clearer.
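The parent-and-segment arrangement described in this section can be sketched in code.  The Python below is only an illustration and not the actual PeopleGroups.org schema: the class names Segment and ParentEntity and their field names are assumptions made for the example.  Only the entity names and the ROL and ROP codes are taken from the text above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Segment:
    """Hypothetical segment entity carrying its own codes."""
    name: str
    rol: str                          # Registry of Languages (SIL) code
    rop: int                          # Registry of Peoples code
    population: Optional[int] = None  # kept per segment; not given in the article


@dataclass
class ParentEntity:
    """Hypothetical parent entity; its own codes are optional."""
    name: str
    segments: List[Segment] = field(default_factory=list)
    rol: Optional[str] = None
    rop: Optional[int] = None

    def codes(self) -> List[Tuple[str, int]]:
        """Return every (ROL, ROP) pair reachable from this entity."""
        pairs = [(s.rol, s.rop) for s in self.segments]
        if self.rol is not None and self.rop is not None:
            pairs.append((self.rol, self.rop))
        return pairs


# The renamed parent with its two segments, left without codes of its own.
ndengereko = ParentEntity(
    name="Ndengereko of Tanzania",
    segments=[
        Segment("Ndengereko", "ndg", 107158),
        Segment("Rufiji", "rui", 108434),
    ],
)
```

Keeping population on each segment rather than on the parent reflects the point above that separate statistics leave room for later adjustment.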

Codes for Parent Entities
As for the parent entity, Ndengereko (now named Rufiji-Ndengereko), there is no specific language or people code that matches this parent entity of the two segments.  Because the two codesets currently have a separate code for the two segments as separate entities, this parent entity of Ndengereko will remain without a code.

When matchups and comparisons are done with other databases, the codes assigned to the separate segment entities in PeopleGroups.org will turn up a match with other entities at any level in the partner database, where the corresponding codes for Ndengereko or Rufiji language or people have been assigned.

On the parent entity, you can leave off both these codes, as with most of the parent entities in PeopleGroups.org, or you can apply the one associated with the parent entity name (Ndengereko: ndg, 107158).
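How segment codes produce matches against a partner database can be shown with a small sketch.  The record layout, the "level" field and the function name match_by_codes are assumptions for illustration; the partner records are invented, and only the codes themselves come from this article.

```python
# Hypothetical local segments; only the codes come from the article.
local_segments = [
    {"name": "Ndengereko", "rol": "ndg", "rop": 107158},
    {"name": "Rufiji", "rol": "rui", "rop": 108434},
]

# A hypothetical partner database that lists these as separate peoples.
partner_records = [
    {"name": "Ndengereko", "level": "people", "rol": "ndg", "rop": 107158},
    {"name": "Rufiji", "level": "people", "rol": "rui", "rop": 108434},
    {"name": "Matumbi", "level": "people", "rol": "mgw", "rop": 106409},
]


def match_by_codes(local, partner):
    """Pair local entries with partner entries sharing a ROP or ROL code."""
    matches = []
    for mine in local:
        for theirs in partner:
            if mine["rop"] == theirs["rop"] or mine["rol"] == theirs["rol"]:
                matches.append((mine["name"], theirs["name"], theirs["level"]))
    return matches


matches = match_by_codes(local_segments, partner_records)
```

The match depends only on the codes, so it works whatever level the partner database assigns to the entity.  A parent entity left without codes simply takes no part in the comparison.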

Name Change
In displaying their information on their website, PeopleGroups.org encounters one technical limitation with the new entry.  On parent entities with segments, the web interface is set to show only the primary (parent) entity.  Here, if the new name were only Ndengereko (instead of the current Rufiji-Ndengereko), there would be no notice to users that Rufiji is included (has been accounted for) in the new entity.  This could be indicated by retaining the older entity name Rufiji-Ndengereko, or changing the name to Ndengereko-Rufiji, instead of just Ndengereko.

The Broader Clusters
There are other closely-related speech forms that are involved in the broad evaluation from which the Rufiji-Ndengereko question arose.

Matumbi came up as another factor, and we can monitor the results of research as this comes up.  The names used here reflect a common naming pattern all over Africa.  The Bantu trend is to identify in clan or family units as they move, and to refer to them by where they live.  Sometimes a famous ancestor's name will be given to the place, or the clan will take his name and that becomes a tribal designation.

Thus we find that the use of words meaning "river," "mountain," "plain," "rock," etc., is common.  These help us to understand the history and sometimes the current relationship.  Determining factors lie in how similar their speech remains, and how they identify themselves in relation to others.

The speech forms might be classified as one language, while their self-identification indicates separate ethnicity.  Often it is a gradation of variations.  This is why the analyses and classifications vary so much.

PeopleGroups.org is primarily meant to be an ethnic database, with language as a prominent factor in the identification.  So language is often, but not always, the deciding factor.

The related problem in Africa has been the tendency to subdivide too much, so linguistically, the PBT researchers who are trying to simplify the family tree of languages and dialects are on the right track.  We still would need to know more relational and social factors to determine ethnicity.

The Matumbi people appear to be well accounted for, as the name represents an identifiably different form of speech (as in the Ethnologue), a distinct location (mountains) and an established ethnic classification.  We should monitor ongoing investigation to see whether further information discovered (or known but not yet reported to us) by the translators would affect the ethnic classification status related to this name.

Another name mentioned in this discussion was "Ngatwa," from the word for "island."  I can find absolutely no reference at all to the use of the word "ngatwa" (island) as either a linguistic (kingatwa) or ethnic (wangatwa) designation.  It appears that, if locally the term "Ngatwa" is used to refer to some group of people, the "Ngatwa" referred to are accounted for either in the Rufiji or the Ndengereko.

Researchers for PBT report that the name Machinga might not really refer to a language.  If SIL's Ethnologue still lists a Machinga (MVW/mvw) language, but new research indicates this was not really a separate language, and researchers cannot find a separate people called this today, should we continue to list an entity called Machinga?

Two matters are involved here.
1. One question is the validity of the older information on which the Ethnologue report is based.  The Ethnologue has an entry for a Machinga language, indicating that this name at some time was reported to refer to a distinct, identifiable speech form.  Secondary indications are that this name was also likely associated with a distinct ethnicity speaking that language variety.  

The PBT researchers' report means that current indications lead them to believe there is no evidence for this entry, and that it should therefore be removed in an update to the SIL language database.  (The editing decision is a further step, taken after field reports or recommendations are submitted by various linguists.)

Keep in mind that Ethnologue entries and linguistic reports, such as information from field surveys reported by PBT, relate to speech.  On the other hand, ethnic databases must consider additional cultural factors.  PeopleGroups.org, for instance, focuses on ethnicity, so language is only one of multiple factors involved here.

Initial Editing Strategy
I suggest an entity be retained under the name Machinga, pending further information from field investigations.  Hopefully, PBT will publish their findings.  I also expect other field analysts will have contributions.  We need further clarification on what is involved here.

Multiple Groups
2. A second matter is whether other listings of the name "Machinga" as a language or dialect are referring to the same cluster related to the Ndengereko.  This word is used to refer to at least two different clusters of languages or dialects, in a broader related set of 4 clusters. Pieter Kraal (PhD in Linguistics, University of Leiden, 2005), in his October 2005 dissertation, reports on extensive personal field investigations related to this along with many other ethnic and language groups.

Kraal refers to and quotes other linguists who refer to both people and speech forms by this name.  Their wording consistently mentions "the Machinga" or "the Machinga people" as well as "Machinga dialect."

There is a reported Machinga dialect of Yao, spoken in Masasi district.  The other uses of Machinga are related to Mozambique, also related to Yao, but different enough to be considered a separate language.  It is also proposed by some that this speech is related to Mwera or Makonde, since the related forms of speech all run together.

There are other factors here inconsistent with the minimal report we had from the Pioneer translators.  The word machinga, according to Kraal, means "mountains" in Yao-related speech, and is used to refer to the people of this cluster who live in hilly areas among the Yao.

This is obviously a different factor than the Pioneer people have noted, and might indicate a productive alternate track to investigate for clarification.  This is a factor that indicates caution in considering the derivation reported by the Pioneer translators.  There may be various uses of this word other than as a tribal self-name.

Kraal also reports that the name "Machinga" seems -- at least in one case -- to be associated with the town of Mchinga, near Lindi.  Perhaps this line of investigation has been tried, but it indicates a possible source of clues to this discrepancy.

Thus we have three possible meanings for the name:
hawker (from Swahili?),
mountain/hill (from Yao, the associated speech form) and
Mchinga, a town name.

The PBT researchers indicate they have been able to find no person who calls himself by the name Machinga.  The fact that the translators have found no one who CALLS HIMSELF "Machinga" is not exactly definitive, or even relevant, for ethnic identification.

The way Bantu peoples refer to themselves, their speech, their location and the other groups of people they are related to is complex.  The point here is not what name is used, but what this word was meant to represent as used by linguists or ethnic researchers.

The name could refer to a speech form documented in a mountainous or hilly area, but the people might call themselves simply "Yao," or some clan or ancestor or village name.  This is a maddeningly common factor among Bantu people.

The name could refer to a certain group or groups of people or families in hilly areas associated with Yao speech, while the individual peoples might be referred to by a village name, river name, individual hill name, ancestor name, etc.

Thus we need more information before we make any changes to the entry for Machinga people or language entities.

Current Recommendation for Editing Related to "Machinga"
This strengthens my initial recommendation that the Machinga entry be retained while further verification and clarification are obtained.

Still in Current Use
Working discussions on recent reports from the field indicate this name is still used to refer to existing entities.

There was one odd report of the name "Machinga" associated with the Maasai, who are not contiguous with these Bantu clusters we are considering here.  The word form and existing instances of the word "machinga" mean "mountain."

I expect the indication about a Machinga group in Masasi is wrong.  Perhaps this is based on referenced older information.  Specific investigation by linguists or ethnologists with the meaning "mountain" or "hill" in mind might uncover definitive information on this question.

Kwere and Nghwele
Evidence seems to indicate that entities Kwere and Nghwele should be merged under the heading of Nghwele, to represent one entity.  I address this in another article about the Ndengereko and related cluster of peoples [Language, Tribe and Ethnic Clusters: Analyzing the Ndengereko Cluster in Tanzania].

It seems that these entries should be considered duplicates.  However, I note that in the database in question, there are different populations associated with the two PeopleGroups.org entities.  This should be clarified.  Should the populations of the two entities be simply added?  Or is the smaller Nghwele meant to be a sub-group out of the total under the current name Kwere?  I could not get anything definitive on this from other sources.

Internal data should be evaluated to determine what was intended by the earlier separate entities.  Maybe some Tanzania source has a current population estimate.  Data managers should check to see if their sources list either or both names, and if there is any related description that might clarify separate population and other demographic info.  I did not find any such sources.

The PeopleGroups.org data review suggested retaining the entry for Kami, with the added note that there appears to be an ongoing assimilation of the Kami people into neighbouring peoples and language groups.

I agree.  However, I found an interesting anomaly here.  I found the name Kami in listings for Tanzania in PeopleGroups.org, but it had language code kmi, which is Kami in Nigeria, and people code 112407, which is Kami people in Nepal and India.  I found it is also missing from the Joshua Project list.  I have investigated this and found this is a valid entity, accounted for in other sources.  It apparently was just "missed out" somewhere along the line.

Is this entity already a segment of something, so that it does not show up in PeopleGroups.org?

Even more interesting, I also found these three entities with the name Kami had been combined under one ROP code in the proto-data from the original compilation of all the source databases comprising the Registry of Peoples (ROP).  Multiple sources currently indicate this is a valid ethnic group and language name for an entity in Tanzania (unrelated to another people of the same name in Nigeria).

The Kami entity in Tanzania now appears in the ROP, with code 114977.  The Registry of Languages (Ethnologue) code for Kami is kcu.

The other Kami entities appear as:
        Kami, Nigeria, 114978, kmi
        Kami, India/Nepal, 112407, nep (Nepali language)
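A simple consistency check of the kind that would have caught the Kami mix-up can be sketched as follows.  This is an illustration only: the lookup table REGISTRY and the function check_entry are hypothetical names, and the table holds just the three Kami entries listed above, not the real ROP or ROL registries.

```python
# Hypothetical lookup keyed by (name, country); values are (ROL, ROP).
# The three entries are the Kami codes listed in this article.
REGISTRY = {
    ("Kami", "Tanzania"): ("kcu", 114977),
    ("Kami", "Nigeria"): ("kmi", 114978),
    ("Kami", "India/Nepal"): ("nep", 112407),
}


def check_entry(name, country, rol, rop):
    """Return a list of mismatches between an entry and the lookup table."""
    expected = REGISTRY.get((name, country))
    if expected is None:
        return [f"no registry entry for {name}, {country}"]
    expected_rol, expected_rop = expected
    problems = []
    if rol != expected_rol:
        problems.append(f"ROL {rol} should be {expected_rol}")
    if rop != expected_rop:
        problems.append(f"ROP {rop} should be {expected_rop}")
    return problems


# The Tanzania record as found: Nigerian language code, Nepal/India people code.
problems = check_entry("Kami", "Tanzania", "kmi", 112407)
```

Keying the lookup on both name and country matters here, since the same name belongs to three unrelated peoples.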

Chart of languages and peoples involved with respective codes
 Name                     ROL      ROP code

 Kami                     [kcu]    114977
 Kwere                    [cwe]    107266
 Machinga                 [mvw]    105974
 Machinga (Yao dialect)   [yao]    110980
 Matumbi                  [mgw]    106409
 Ndengereko               [ndg]    107158
 Rufiji                   [rui]    108434
 Yao                      [yao]    110980

Also related:
Language, Tribe and Ethnic Clusters:  Analyzing the Ndengereko Cluster in Tanzania

Initially addressed 4 July 2006 in an email response to questions from a researcher.
Finalized and posted on Thoughts and Resources 05 December 2006
Last edited 4 August 2010

Orville Boyd Jenkins, EdD, PhD
Copyright © 2006 Orville Boyd Jenkins
Permission granted for free download and transmission for personal or educational use.  Other rights reserved.

Email:  orville@jenkins.nu
