Research and Development

Our research & development in HLT


We bring together research experts and the essential technical and administrative support to conduct cutting-edge research in text technology, and we use that research as the basis for developing innovative and relevant technological applications.

We establish ourselves as leaders in the field of Human Language Technology within South Africa and promote multilingualism and diversity within the digital environment.





Our Research


Research outputs in 2023


Deep learning and low-resource languages: How much data is enough? A case study of three linguistically distinct South African languages

Gaustad, T. & Eiselen, E.R. 2023. Data in Brief. April 2023

Read Article

A dataset of self-reported attitudes to Afrikaans swearwords

Van Huyssteen, G.B., Eiselen, E.R., Du Toit, J.S. 2023. Journal of Open Humanities Data. 2023

Read Article

Translation Technology in South Africa

Van Huyssteen, G.B., Puttkammer, M.J., McKellar, C.A., Griesel, M. 2023. In Routledge Encyclopaedia of Translation Technology, edited by S.W. Chan. Routledge, 373-383

Read Article

Ouderdoms- en inhoudsadvies vir Afrikaanse boeke vir kinders: resultate van ’n eerste kwalitatiewe en kwantitatiewe ondersoek

Van Huyssteen, G.B., Rabé, M. and Puttkammer, M.J. 2023. LitNet Akademies (Geesteswetenskappe) 20(1):185–212

Read Article

'n Empiriese vergelyking van die potensiële aanstootlikheid van enkele skelnaampare in Afrikaans [An empirical comparison of the potential offensiveness of some epithet pairs in Afrikaans]

Van Huyssteen, G.B., Koekemoer, S. 2023. Tydskrif vir Geesteswetenskappe 63:560–584

Read Article

Investigating the extent and usability of webtext available in South Africa's official languages

De Wet, F., Eiselen, E.R., Schillack, E., Puttkammer, M.J. 2023. Artificial Intelligence Research. SACAIR 2023. Communications in Computer and Information Science. Springer

Read Article

IsiXhosa Named Entity Recognition Resources

Eiselen, E.R. & Bukula, A. 2023. ACM Trans. Asian Low-Resour. Lang. Inf. Process. 22:2. pp. 1-19

Read Article

A methodology for the description of constructionalisation networks: Constructions with [in] as a case study

Van Huyssteen, G.B., Breed, A., Butler, A., Botha, L., Partridge, M., and Pilon, S. 2023. Stellenbosch Papers in Linguistics

A comparison of Statistical Tests for Likert-type data: The case of swearwords

Eiselen, E.R. and Van Huyssteen, G.B. 2023. Journal of Open Humanities Data

Read Article

Research outputs in 2022


isiXhosa named entity recognition resources

Eiselen, E.R.E., & Bukula, A. 2022. IsiXhosa Named Entity Recognition Resources. ACM Trans. Asian Low-Resour. Lang. Inf. Process, 22:2 pp. 1-19

Read Article

Linguistically annotated dataset for four official South African languages with a conjunctive orthography: IsiNdebele, isiXhosa, isiZulu, and Siswati

Gaustad, T. & Puttkammer, M.J. 2022. Data in Brief. Volume 41, April 2022, 107994

Read Article


Research outputs in 2021


Standaardisering as ’n produk van die tydsgees

Van Huyssteen, G.B. & Pilon, S. 2021. Ontlaering – Geworteldheid: Die onderrig van Afrikaans in spesifieke ruimtes

Read Article

When a word is befok

Van Huyssteen, G.B. 2021. Afrikaans Grammar Workshop III

Read Article

How Afrikaans women became fierce-tempered

Van Huyssteen, G.B. & Eiselen, E.R. 2021. Zürich Workshop on Afrikaans Linguistics.

Read Article

Swearing in South Africa: Multidisciplinary research on language taboos

Van Huyssteen, G.B. 2021. International Conference of the Digital Humanities Association of Southern Africa 2021

Read Article

Using ordinal logistic regression to analyse self-reported usage of, and attitudes towards swearwords

Van Huyssteen, G.B. & Eiselen, E.R. 2021. International Conference of the Digital Humanities Association of Southern Africa 2022

Read Article

Development of linguistically annotated parallel language resources for four South African languages.

Gaustad, T. & Puttkammer, M.J. 2021. 2nd Workshop on Resources for African Indigenous Languages (RAIL 2021), co-located with DHASA 2021.

Read Article

Canonical Segmentation and Syntactic Morpheme Tagging of Four Resource-scarce Nguni Languages

Du Toit, J.S. & Puttkammer, M.J. 2021. 2nd Workshop on Resources for African Indigenous Languages (RAIL 2021), co-located with DHASA 2021

Read Article

Oor feekse en helleveë [On shrews and harridans]

Van Huyssteen, G.B. & Eiselen, E.R.E. 2021. Tydskrif vir Geesteswetenskappe

Read Article

Quantitative analysis of Sesotho sa Leboa part of speech taggers

Mathe, D.S. and Eiselen, E.R.E. 2021. South African Journal of African Languages

Read Article

Content developers as stakeholders in the blended learning ecosystem: The Virtual Institute for Afrikaans’ Language Education Portal as a case study

Breed, A., Fouché, N., Brink, N., Coetzee, M., Erasmus, C., Kapp, S., Pilon, S., Van Huyssteen, G.B. and Wierenga, R. 2021. Re-Envisioning and Restructuring Blended Learning for Underprivileged Communities

Read Article

Research outputs from 2016 to 2020



NCHLT Web Services and CTexTools

Puttkammer, M. 2020. Tour de CLARIN, vol. III

Read Article

Viability of Neural Networks for Core Technologies for Resource-Scarce Languages

Loubser, M. & Puttkammer, M.J. 2020. Information 11(1), 41

Read Article

Dataset for comparable evaluation of machine translation between 11 South African languages

McKellar, C.A. & Puttkammer, M.J. 2020. Data in Brief, Volume 29, 105146. ISSN 2352-3409

Read Article

Die /r/ in Afrikaans: Fonetiese en fonologiese eienskappe

Wissing, D.P. & Pienaar, W. 2020. Literator

Read Article


Wissing, D.P. 2020. Journal of the International Phonetic Association

Read Article


“Wat gaan word van geskrewe Standaardafrikaans? [What is going to happen to written Standard Afrikaans?]”

VAN HUYSSTEEN, G.B. 2019. In Van der Elst, J. (ed.). SA Akademie vir Wetenskap en Kuns: Verlede, hede toekoms (1909-2019). 86-89. ISBN: 978-0-949976-97-0. Pretoria: SAAWK.

Read Article

Herbesoek aan Afrikaanse klemtoon: is dit (nog) ’n inisiëleklemtoontaal?

Wissing, D.P. LitNet Akademies, 16.2 (2019): 214-239.

Read Article

Perspektief op /ɛ/-verlaging in Afrikaans.

Wissing, D.P. LitNet Akademies, 16.1 (2019): 166-206.

Read Article


The Hulle en Goed Constructions in Afrikaans.


Read Article

Stabilising determinants in the transmission of phonotactic systems: Diachrony and acquisition of coda clusters in Dutch and Afrikaans

Wissing, D.P. 2018.

Read Article

The Status of Tone in Sesotho: A Production and Perception Study.

Wissing, D.P. 2018.

Read Article

Naar een Wikifonia.

VAN OOSTENDORP, M., VISSER, W. & WISSING, D. Nederlandse Taalkunde, 23.2 (2018): 141-150.

Read Article

Die ontwikkeling van [ʃ] in Afrikaans

WISSING, D. 2018. Literator. 39. 10.4102/lit.v39i2.1486.

Read Article


Afrikaanse Woordelys en Spelreëls [Afrikaans Wordlist and Spelling Rules].

TAALKOMMISSIE VAN DIE SUID-AFRIKAANSE AKADEMIE VIR WETENSKAP EN KUNS (COMP). 2017. Eleventh edition. ISBN (printed): 978-1-86890-207-1; ISBN (online): 978-1-86890-208-8. Cape Town, Pharos, 775pp.

Read Article

Voorwoord [Preface].

VAN HUYSSTEEN, G.B. 2017. In: Suid-Afrikaanse Akademie vir Wetenskap en Kuns. Afrikaanse woordelijs en spelreëls. Faksimilee-uitgawe [Afrikaans wordlist and spelling rules. Facsimile edition]. Pretoria: Protea Boekhuis.

Read Article

Morfologie. [Morphology].

VAN HUYSSTEEN, G.B. 2017. In: Carstens, WAM & Bosman, N. (reds.). Kontemporêre Afrikaanse Taalkunde. [Contemporary Afrikaans Linguistics]. Second edition. ISBN 978-0-627-03437-4. Pretoria: Van Schaik Uitgewers. pp. 177-214.

Read Article

Plosive voicing in Afrikaans: differential cue weighting and sound change.

WISSING, D.P. 2017. Journal of Linguistics

Read Article

Elektroniese woordeboeke en die Afrikaanse gemeenskap [Electronic dictionaries and the Afrikaans community]

VAN HUYSSTEEN, G.B. & Luther, J. 2017. Gents colloquium over het Afrikaans [Ghent colloquium on Afrikaans], University of Ghent, Ghent, Belgium.

Read Article

Constructionist perspectives on two competing associative plural constructions.

VAN HUYSSTEEN, G.B. 2017. 11th International Mediterranean Morphology Meeting, Nicosia, Cyprus.

Read Article


South African Language Resources: Phrase Chunking.

EISELEN, R. 2016. Tenth International Conference on Language Resources and Evaluation, LREC 2016, Portorož, Slovenia. pp. 689-693

Read Article

Government Domain Named Entity Recognition for South African Languages.

EISELEN, R. 2016. Tenth International Conference on Language Resources and Evaluation, LREC 2016, Portorož, Slovenia. pp. 3344-3348.

Read Article

Optical character recognition for South African Languages.


Read Article

The effect of respondents' skill levels in collaborative data annotation.

PUTTKAMMER, M.J. & VAN HUYSSTEEN, G.B. 2016. "Under-resourced Languages, Collaborative Approaches and Linked Open Data : Resources, Methods and Applications". Springer. Language Resources and Evaluation Journal Special Issue.

AfriBooms: An Online Treebank for Afrikaans.


Read Article

A stepwise methodology for establishing natural language processing evaluation reliability.

EISELEN, R. & VAN HUYSSTEEN, G.B. 2016. Language Resources and Evaluation.

What French for Gabonese French Lexicography.

NDINGA-KOUMBA, S., ASSAM, B.N. & OMPOUSSA, V. 2016. Lexikos. 26(2016): 1-31

Read Article

Die Virtuele Instituut vir Afrikaans (VivA) en markbehoeftes in die Afrikaanse gemeenskap.

VAN HUYSSTEEN, G. B., BOTHA, M. & ANTONITES, A. 2016. Tydskrif vir Geesteswetenskappe. 56(2-1): 410-437.

Read Article

Research outputs from 2011 to 2015


Afrikaans and Dutch as closely-related languages: A comparison to West Germanic languages and Dutch dialects.

HEERINGA, W., DE WET, F. & VAN HUYSSTEEN, G.B. 2015. Stellenbosch Papers in Linguistics Plus. 47(2015): 1-18.

Read Article

Planning and Macrostructural Elements for a Multilingual Culinary Dictionary of Gabonese Languages.

OMPOUSSA, V. & NDINGA-KOUMBA-BINZA, S. 2015. Lexikos. 25 (2015): 507-524.

Read Article

Translation Technology in South Africa.

VAN HUYSSTEEN, G.B. & GRIESEL, M. 2015. In: Chan, S-W. (ed.). Routledge Encyclopedia of Translation Technology. ISBN: 978-0-415-52484-1. New York: Routledge. 326-336pp.

Read Article

Aan die en besig in Afrikaanse progressiwiteitskonstruksies : 'n korpusondersoek (2) : navorsings- en oorsigartikel.

VAN HUYSSTEEN, G.B. & BREED, A. 2015. Tydskrif vir Geesteswetenskappe. 55(2):251-269.

Read Article

Palatalisation of /s/ in Afrikaans.

WISSING, D.P., PIENAAR, W. & VAN NIEKERK, D. 2015. Spilplus. 48(2015): 137-158.

Read Article

Bilingual speech rhythm: Spanish-Afrikaans in Patagonia.

COETZEE, A.W., LORENZO, G.A., HENRIKSEN, A. & WISSING, D.P. 2015. In The Scottish Consortium for ICPhS 2015, eds. Proceedings of the 18th International Congress of Phonetic Sciences. London: International Phonetic Association.

Read Article

HLT and the changing face of translation - a CTexT perspective.

FOURIE, W. 2015. In: Boers, M. (ed.). Proceedings of the South African Translators' Institute's Second Triennial Conference. Johannesburg: SATI. pp. 18-20. ISBN: 978-0-620-68208-4



Research outputs from 2007 to 2010



Accelerating the Annotation of Lexical Data for Less-Resourced Languages.

Van Huyssteen, G.B. & Puttkammer, M.J. 2007. Accelerating the Annotation of Lexical Data for Less-Resourced Languages. (In Proceedings of the 8th Annual Conference of the International Speech Communication Association (Interspeech 2007). p. 1505-1508.)

Read Article

Evaluating Wrapped Progressive Sampling for Automatic Algorithmic Parameter Optimisation.

GROENEWALD, H.J., VAN HUYSSTEEN, G.B. & PUTTKAMMER, M.J. 2007. (In Angelova, G., Bontcheva, K., Mitkov, R., & Nikolov, N., eds. Proceedings of Recent Advances in Natural Language Processing 2007, Borovets, Bulgaria. p. 251-255.)

Read Article

Using Machine Learning to Annotate Data for NLP Tasks Semi-Automatically

Van Huyssteen, G.B., Puttkammer, M.J., Pilon, S., & Groenewald, H.J. 2007. (In Orasan, C. & Kuebler, S., eds. Proceedings of International Workshop on Computer-Aided Language Processing, Borovets, Bulgaria.)

Read Article

Accelerating the Annotation of Lexical Data for Less-Resourced Languages

Van Huyssteen, G.B. & Puttkammer, M.J. 2007. Presentation delivered at the 8th Annual Conference of the International Speech Communication Association (Interspeech 2007), Antwerp

Read Article

Datagebaseerde Aspekte van Afrikaanse Reduplikasies

Van Huyssteen, G.B. & Wissing, D.P. 2007. Southern African Linguistics and Applied Language Studies, 25(3): 419-439

Read Article

Global and local durational properties in three varieties of South African English

Coetzee, A.W. & Wissing, D.P. 2007. Linguistic Review, 24:263-289

Read Article

Gevorderde akoestiese korrelate van Afrikaanse klemtoon

Wissing, D.P. 2007. Southern African Linguistics and Applied Language Studies, 25

Read Article

Basiese akoestiese korrelate van klemtoon in Afrikaans

Wissing, D.P. 2007. Southern African Linguistics and Applied Language Studies, 25:441-458

Read Article

Testing the use of Lessac's Tonal NRG as a voice building tool for female students at a South African University

Munro, M. & Wissing, D.P. 2007. Voice and Speech Review

Read Article

Automatic Parameter Selection for Effective Afrikaans Lemmatisation.

Groenewald, H.J., Van Huyssteen, G.B. & Puttkammer, M.J. 2007. Presentation delivered at the Recent Advances in Natural Language Processing (RANLP) 2007, Borovets, Bulgaria.

Read Article

Heroorweging van Fleksie in Afrikaans

Van Huyssteen, G.B. & Groenewald, H.J. 2007. Voordrag gelewer by LVSA/SAALA/SAVTO 2007, NWU Potchefstroomkampus

Read Article

Requirements for Machine-Aided Translation Tools

Van Huyssteen, G.B. & Groenewald, H.J. 2007. Presentation delivered at LVSA/SAALA/SAVTO 2007, NWU, Potchefstroom campus

Feature Selection and Parameter Optimisation for Effective Afrikaans Lemmatisation

Van Huyssteen, G.B. & Groenewald, H.J. 2007. Presentation delivered at the International 17th Meeting of Computational Linguistics in the Netherlands (CLIN) 2007, University of Leuven, Leuven.

Read Article

ʼn Fleksievormgenereerder

Pilon, S. 2007. Voordrag gelewer by LVSA/SAALA/SAVTO 2007, NWU, Potchefstroomkampus

MT for English-isiZulu/Afrikaans

Pilon, S. & Pienaar, J.A. 2007. Presentation delivered at LVSA/SAALA/SAVTO 2007, NWU, Potchefstroom campus

Lexicon Creation and Management: TurboAnnotate

Van Huyssteen, G.B. & Puttkammer, M.J. 2007. Presentation delivered at LVSA/SAALA/SAVTO 2007, NWU, Potchefstroom campus

Developing Web-Based Word-Translators

Van Huyssteen, G.B., Puttkammer, M.J. & Schlemmer, M. 2007. Presentation delivered at LVSA/SAALA/SAVTO 2007, NWU, Potchefstroom campus

Nadruk in Afrikaans: akoestiese kenmerke en metodologiese oorwegings by die vasstellings daarvan

Wissing, D.P. 2007. Voordrag gelewer by LVSA/SAALA/SAVTO 2007, NWU, Potchefstroomkampus

More on acoustic correlates of stress.

Wissing, D.P. 2007. Presentation delivered at the 8th Annual Conference of the International Speech Communication Association (Interspeech 2007), Antwerp.

Read Article


All the latest on our Projects

Have a look at the recent projects we've done for our various clients. The development of the resources and/or software is described, and the output is available for downloading.

Autshumato


Autshumato encompasses a series of projects that develop machine translation systems for South African languages. Here you'll find the work we've done for the various tools offered within Autshumato.       




Project Details

SADiLaR


We are the official text node for SADiLaR, with a focus on the advancement of multilingualism. We develop text resources for South Africa's under-resourced languages, which are crucial for development in big data and artificial intelligence within the South African context. Here we develop linguistically enriched corpora, core technologies and proofing tools.

Full Project Details

VivA Afrikaans

We've been collaborating with the Virtuele Instituut vir Afrikaans (VivA) by maintaining the content and technical services of the Corpus and Dictionary Portals. With over 85 million words in the Corpus Portal and over 50 dictionaries and word lists in the Dictionary Portal, we make sure that the systems are up to date with the latest etymology, spelling, and meanings.





We ensure long-term sustainability for research and development activities. This establishes valuable partnerships with academic and industry partners that have an interest in natural language processing and computational linguistics.



Software & Resources

Have a look at our applied technologies.


We compile collections of texts, with a focus on the resource-scarce languages of South Africa, for further research and development.


Core Technologies

Morphological analysers, part-of-speech (POS) taggers and lemmatisers are the core technologies we develop resources for.


Translation Aids

Have a look at our work in machine translation, the other tools in our Autshumato projects, and our spelling checkers.


Let's explore the possibilities
that human language technology can offer you.

Would you like to collaborate on research, do you have a project in mind, or are you interested in studying language technology?

Contact us