Research and Development

 
 
 
 
Our research & development in HLT

 

We bring together research experts and the essential technical and administrative support to conduct cutting-edge research in text technology, and we use that research as the basis for developing innovative and relevant technological applications.

We establish ourselves as leaders in the field of Human Language Technology within South Africa and promote multilingualism and diversity within the digital environment.

 

 

 

Research

Our Research

Publications

Research outputs in 2023

 

Deep learning and low-resource languages: How much data is enough? A case study of three linguistically distinct South African languages

Gaustad, T. & Eiselen, E.R. 2023. Data in Brief. April 2023

Read Article


A dataset of self-reported attitudes to Afrikaans swearwords

Van Huyssteen, G.B., Eiselen, E.R., Du Toit, J.S. 2023. Journal of Open Humanities Data. 2023

Read Article


Translation Technology in South Africa

Van Huyssteen, G.B., Puttkammer, M.J., McKellar, C.A., Griesel, M. 2023. In Routledge Encyclopedia of Translation Technology, edited by S.W. Chan. Routledge, 373-383

Read Article


Ouderdoms- en inhoudsadvies vir Afrikaanse boeke vir kinders: resultate van ’n eerste kwalitatiewe en kwantitatiewe ondersoek

Van Huyssteen, G.B., Rabé, M. and Puttkammer, M.J. 2023. LitNet Akademies (Geesteswetenskappe) 20(1):185–212

Read Article


'n Empiriese vergelyking van die potensiële aanstootlikheid van enkele skelnaampare in Afrikaans [An empirical comparison of the potential offensiveness of some epithet pairs in Afrikaans]

Van Huyssteen, G.B., Koekemoer, S. 2023. Tydskrif vir Geesteswetenskappe 63:560–584

Read Article


Investigating the extent and usability of webtext available in South Africa's official languages

De Wet, F., Eiselen, E.R., Schillack, E., Puttkammer, M.J. 2023. Artificial Intelligence Research. SACAIR 2023. Communications in Computer and Information Science. Springer

Read Article


IsiXhosa Named Entity Recognition Resources

Eiselen, E.R. & Bukula, A. 2023. ACM Trans. Asian Low-Resour. Lang. Inf. Process. 22:2. pp. 1-19

Read Article


A methodology for the description of constructionalisation networks: Constructions with [in] as a case study

Van Huyssteen, G.B., Breed, A., Butler, A., Botha, L., Partridge, M., and Pilon, S. 2023. Stellenbosch Papers in Linguistics


A comparison of Statistical Tests for Likert-type data: The case of swearwords

Eiselen, E.R. and Van Huyssteen, G.B. 2023. Journal of Open Humanities Data

Read Article

Research outputs in 2022

 

isiXhosa named entity recognition resources

Eiselen, E.R.E. & Bukula, A. 2022. ACM Trans. Asian Low-Resour. Lang. Inf. Process. 22:2. pp. 1-19

Read Article


Linguistically annotated dataset for four official South African languages with a conjunctive orthography: IsiNdebele, isiXhosa, isiZulu, and Siswati

Gaustad, T. & Puttkammer, M.J. 2022. Data in Brief. Volume 41, April 2022, 107994

Read Article


 

Research outputs in 2021

 

Standaardisering as ’n produk van die tydsgees

Van Huyssteen, G.B. & Pilon, S. 2021. Ontlaering – Geworteldheid: Die onderrig van Afrikaans in spesifieke ruimtes

Read Article


When a word is befok

Van Huyssteen, G.B. 2021. Afrikaans Grammar Workshop III

Read Article


How Afrikaans women became fierce-tempered

Van Huyssteen, G.B. & Eiselen, E.R. 2021. Zürich Workshop on Afrikaans Linguistics.

Read Article


Swearing in South Africa: Multidisciplinary research on language taboos

Van Huyssteen, G.B. 2021. International Conference of the Digital Humanities Association of Southern Africa 2021

Read Article


Using ordinal logistic regression to analyse self-reported usage of, and attitudes towards swearwords

Van Huyssteen, G.B. & Eiselen, E.R. 2021. International Conference of the Digital Humanities Association of Southern Africa 2021

Read Article


Development of linguistically annotated parallel language resources for four South African languages.

Gaustad, T. & Puttkammer, M.J. 2021. 2nd Workshop on Resources for African Indigenous Languages (RAIL 2021), co-located with DHASA 2021.

Read Article


Canonical Segmentation and Syntactic Morpheme Tagging of Four Resource-scarce Nguni Languages

Du Toit, J.S. & Puttkammer, M.J. 2021. 2nd Workshop on Resources for African Indigenous Languages (RAIL 2021), co-located with DHASA 2021

Read Article


Oor feekse en helleveë [On shrews and harridans]

Van Huyssteen, G.B. & Eiselen, E.R.E. 2021. Tydskrif vir Geesteswetenskappe

Read Article


Quantitative analysis of Sesotho sa Leboa part of speech taggers

Mathe, D.S. and Eiselen, E.R.E. 2021. South African Journal of African Languages

Read Article


Content developers as stakeholders in the blended learning ecosystem: The Virtual Institute for Afrikaans’ Language Education Portal as a case study

Breed, A., Fouché, N., Brink, N., Coetzee, M., Erasmus, C., Kapp, S., Pilon, S., Van Huyssteen, G.B. and Wierenga, R. 2021. Re-Envisioning and Restructuring Blended Learning for Underprivileged Communities

Read Article


Research outputs from 2016 to 2020

 

2020

NCHLT Web Services and CTexTools

Puttkammer, M. 2020. Tour de CLARIN, Vol. III

Read Article


Viability of Neural Networks for Core Technologies for Resource-Scarce Languages

Loubser, M. & Puttkammer, M.J. 2020. Information 11(1), 41

Read Article


Dataset for comparable evaluation of machine translation between 11 South African languages

McKellar, C.A. & Puttkammer, M.J. 2020. Data in Brief, Volume 29, 105146, ISSN 2352-3409, https://doi.org/10.1016/j.dib.2020.105146.

Read Article


Die /r/ in Afrikaans: Fonetiese en fonologiese eienskappe

Wissing, D.P. & Pienaar, W. 2020. Literator

Read Article


Afrikaans

Wissing, D.P. 2020. Journal of the International Phonetic Association

Read Article


2019

“Wat gaan word van geskrewe Standaardafrikaans? [What is going to happen to written Standard Afrikaans?]”

VAN HUYSSTEEN, G.B. 2019. In Van der Elst, J. (ed.). SA Akademie vir Wetenskap en Kuns: Verlede, hede toekoms (1909-2019). 86-89. ISBN: 978-0-949976-97-0. Pretoria: SAAWK.

Read Article


Herbesoek aan Afrikaanse klemtoon: is dit (nog) ’n inisiëleklemtoontaal?

Wissing, D.P. LitNet Akademies, 16.2 (2019): 214-239.

Read Article


Perspektief op /ɛ/-verlaging in Afrikaans.

Wissing, D.P. LitNet Akademies, 16.1 (2019): 166-206.

Read Article


2018

The Hulle en Goed Constructions in Afrikaans.

VAN HUYSSTEEN, G.B. 2018.

Read Article


Stabilising determinants in the transmission of phonotactic systems: Diachrony and acquisition of coda clusters in Dutch and Afrikaans

Wissing, D.P. 2018.

Read Article


The Status of Tone in Sesotho: A Production and Perception Study.

Wissing, D.P. 2018.

Read Article


Naar een Wikifonia.

VAN OOSTENDORP, M., VISSER, W. & WISSING, D. Nederlandse Taalkunde, 23.2 (2018): 141-150.

Read Article


Die ontwikkeling van [ʃ] in Afrikaans

WISSING, D. 2018. Literator. 39. 10.4102/lit.v39i2.1486.

Read Article

2017

Afrikaanse Woordelys en Spelreëls [Afrikaans Wordlist and Spelling Rules].

TAALKOMMISSIE VAN DIE SUID-AFRIKAANSE AKADEMIE VIR WETENSKAP EN KUNS (COMP). 2017. Eleventh edition. ISBN (printed): 978-1-86890-207-1; ISBN (online): 978-1-86890-208-8. Cape Town, Pharos, 775pp.

Read Article


Voorwoord [Preface].

VAN HUYSSTEEN, G.B. 2017. In: Suid-Afrikaanse Akademie vir Wetenskap en Kuns. Afrikaanse woordelijs en spelreëls. Faksimilee-uitgawe [Afrikaans wordlist and spelling rules. Facsimile edition]. Pretoria: Protea Boekhuis.

Read Article


Morfologie. [Morphology].

VAN HUYSSTEEN, G.B. 2017. In: Carstens, WAM & Bosman, N. (eds.). Kontemporêre Afrikaanse Taalkunde. [Contemporary Afrikaans Linguistics]. Second edition. ISBN 978-0-627-03437-4. Pretoria: Van Schaik Uitgewers. pp. 177-214.

Read Article


Plosive voicing in Afrikaans: differential cue weighting and sound change.

WISSING, D.P. 2017. Journal of Linguistics

Read Article


Elektroniese woordeboeke en die Afrikaanse gemeenskap [Electronic dictionaries and the Afrikaans community]

VAN HUYSSTEEN, G.B. & Luther, J. 2017. Gents colloquium over het Afrikaans [Ghent colloquium on Afrikaans], University of Ghent, Ghent, Belgium.

Read Article


Constructionist perspectives on two competing associative plural constructions.

VAN HUYSSTEEN, G.B. 2017. 11th International Mediterranean Morphology Meeting, Nicosia, Cyprus.

Read Article


2016

South African Language Resources: Phrase Chunking.

EISELEN, R. 2016. Tenth International Conference on Language Resources and Evaluation, LREC 2016, Portorož, Slovenia. pp. 689-693

Read Article


Government Domain Named Entity Recognition for South African Languages.

EISELEN, R. 2016. Tenth International Conference on Language Resources and Evaluation, LREC 2016, Portorož, Slovenia. pp. 3344-3348.

Read Article


Optical character recognition for South African Languages.

PUTTKAMMER, M.J. & HOCKING, J. 2016. PRASA 2016.

Read Article


The effect of respondents' skill levels in collaborative data annotation.

PUTTKAMMER, M.J. & VAN HUYSSTEEN, G.B. 2016. "Under-resourced Languages, Collaborative Approaches and Linked Open Data: Resources, Methods and Applications". Springer. Language Resources and Evaluation Journal Special Issue.


AfriBooms: An Online Treebank for Afrikaans.

VAN HUYSSTEEN, G.B., AUGUSTINUS, L., VAN EYNDE, F., VAN NIEKERK, D., SCHUURMAN, I. & VANDEGHINSTE, V. 2016. LREC 2016.

Read Article


A stepwise methodology for establishing natural language processing evaluation reliability.

EISELEN, R. & VAN HUYSSTEEN, G.B. 2016. Language Resources and Evaluation.


What French for Gabonese French Lexicography.

NDINGA-KOUMBA, S., ASSAM, B.N. & OMPOUSSA, V. 2016. Lexikos. 26(2016): 1-31

Read Article


Die Virtuele Instituut vir Afrikaans (VivA) en markbehoeftes in die Afrikaanse gemeenskap.

VAN HUYSSTEEN, G. B., BOTHA, M. & ANTONITES, A. 2016. Tydskrif vir Geesteswetenskappe. 56(2-1): 410-437.

Read Article


Research outputs from 2011 to 2015

2015

Afrikaans and Dutch as closely-related languages: A comparison to West Germanic languages and Dutch dialects.

HEERINGA, W., DE WET, F. & VAN HUYSSTEEN, G.B. 2015. Stellenbosch Papers in Linguistics Plus. 47(2015): 1-18.

Read Article


Planning and Macrostructural Elements for a Multilingual Culinary Dictionary of Gabonese Languages.

OMPOUSSA, V. & NDINGA-KOUMBA-BINZA, S. 2015. Lexikos. 25 (2015): 507-524.

Read Article


Translation Technology in South Africa.

VAN HUYSSTEEN, G.B. & GRIESEL, M. 2015. In: Chan, S-W. (ed.). Routledge Encyclopedia of Translation Technology. ISBN: 978-0-415-52484-1. New York: Routledge. pp. 326-336.

Read Article


Aan die en besig in Afrikaanse progressiwiteitskonstruksies: 'n korpusondersoek (2): navorsings- en oorsigartikel.

VAN HUYSSTEEN, G.B. & BREED, A. 2015. Tydskrif vir Geesteswetenskappe. 55(2):251-269.

Read Article


Palatalisation of /s/ in Afrikaans.

WISSING, D.P., PIENAAR, W. & VAN NIEKERK, D. 2015. Stellenbosch Papers in Linguistics Plus. 48(2015): 137-158.

Read Article


Bilingual speech rhythm: Spanish-Afrikaans in Patagonia.

COETZEE, A.W., LORENZO, G.A., HENRIKSEN, A. & WISSING, D.P. 2015. In: The Scottish Consortium for ICPhS 2015 (eds.). Proceedings of the 18th International Congress of Phonetic Sciences. London: International Phonetic Association.

Read Article


HLT and the changing face of translation - a CTexT perspective.

FOURIE, W. 2015. In: Boers, M. (ed.). Proceedings of the South African Translators' Institute's Second Triennial Conference. Johannesburg: SATI. pp. 18-20. ISBN: 978-0-620-68208-4

Research outputs from 2007 to 2010

 

2007

Accelerating the Annotation of Lexical Data for Less-Resourced Languages.

Van Huyssteen, G.B. & Puttkammer, M.J. 2007. (In Proceedings of the 8th Annual Conference of the International Speech Communication Association (Interspeech 2007). p. 1505-1508.)

Read Article


Evaluating Wrapped Progressive Sampling for Automatic Algorithmic Parameter Optimisation.

GROENEWALD, H.J., VAN HUYSSTEEN, G.B. & PUTTKAMMER, M.J. 2007. (In Angelova, G., Bontcheva, K., Mitkov, R., & Nikolov, N., eds. Proceedings of Recent Advances in Natural Language Processing 2007, Borovets, Bulgaria. p. 251-255.)

Read Article


Using Machine Learning to Annotate Data for NLP Tasks Semi-Automatically

Van Huyssteen, G.B., Puttkammer, M.J., Pilon, S., & Groenewald, H.J. 2007. (In Orasan, C. & Kuebler, S., eds. Proceedings of International Workshop on Computer-Aided Language Processing, Borovets, Bulgaria.)

Read Article


Accelerating the Annotation of Lexical Data for Less-Resourced Languages

Van Huyssteen, G.B. & Puttkammer, M.J. 2007. Presentation delivered at the 8th Annual Conference of the International Speech Communication Association (Interspeech 2007), Antwerp

Read Article


Datagebaseerde Aspekte van Afrikaanse Reduplikasies

Van Huyssteen, G.B. & Wissing, D.P. 2007. Southern African Linguistics and Applied Language Studies, 25(3): 419-439

Read Article


Global and local durational properties in three varieties of South African English

Coetzee, A.W. & Wissing, D.P. 2007. Linguistic Review, 24:263-289

Read Article


Gevorderde akoestiese korrelate van Afrikaanse klemtoon

Wissing, D.P. 2007. Southern African Linguistics and Applied Language Studies, 25

Read Article


Basiese akoestiese korrelate van klemtoon in Afrikaans

Wissing, D.P. 2007. Southern African Linguistics and Applied Language Studies, 25:441-458

Read Article


Testing the use of Lessac's Tonal NRG as a voice building tool for female students at a South African University

Munro, M. & Wissing, D.P. 2007. Voice and Speech Review

Read Article


Automatic Parameter Selection for Effective Afrikaans Lemmatisation.

Groenewald, H.J., Van Huyssteen, G.B. & Puttkammer, M.J. 2007. Presentation delivered at the Recent Advances in Natural Language Processing (RANLP) 2007, Borovets, Bulgaria.

Read Article


Heroorweging van Fleksie in Afrikaans

Van Huyssteen, G.B. & Groenewald, H.J. 2007. Presentation delivered at LVSA/SAALA/SAVTO 2007, NWU, Potchefstroom campus

Read Article


Requirements for Machine-Aided Translation Tools

Van Huyssteen, G.B. & Groenewald, H.J. 2007. Presentation delivered at LVSA/SAALA/SAVTO 2007, NWU, Potchefstroom campus


Feature Selection and Parameter Optimisation for Effective Afrikaans Lemmatisation

Van Huyssteen, G.B. & Groenewald, H.J. 2007. Presentation delivered at the International 17th Meeting of Computational Linguistics in the Netherlands (CLIN) 2007, University of Leuven, Leuven.

Read Article


ʼn Fleksievormgenereerder

Pilon, S. 2007. Presentation delivered at LVSA/SAALA/SAVTO 2007, NWU, Potchefstroom campus


MT for English-isiZulu/Afrikaans

Pilon, S. & Pienaar, J.A. 2007. Presentation delivered at LVSA/SAALA/SAVTO 2007, NWU, Potchefstroom campus


Lexicon Creation and Management: TurboAnnotate

Van Huyssteen, G.B. & Puttkammer, M.J. 2007. Presentation delivered at LVSA/SAALA/SAVTO 2007, NWU, Potchefstroom campus


Developing Web-Based Word-Translators

Van Huyssteen, G.B., Puttkammer, M.J. & Schlemmer, M. 2007. Presentation delivered at LVSA/SAALA/SAVTO 2007, NWU, Potchefstroom campus


Nadruk in Afrikaans: akoestiese kenmerke en metodologiese oorwegings by die vasstellings daarvan

Wissing, D.P. 2007. Presentation delivered at LVSA/SAALA/SAVTO 2007, NWU, Potchefstroom campus


More on acoustic correlates of stress.

Wissing, D.P. 2007. Presentation delivered at the 8th Annual Conference of the International Speech Communication Association (Interspeech 2007), Antwerp.

Read Article

Development

All the latest on our Projects

Have a look at the recent projects we've completed for our various clients. Each entry describes how the resources and/or software were developed, and the output is available for download.


Autshumato | Coming Soon

Autshumato encompasses a series of projects that develop machine translation systems for South African languages. Here you'll find the work we've done for the various tools offered within Autshumato. 


Coming Soon!


SADiLaR

We are the official text node for SADiLaR, with a focus on the advancement of multilingualism. We develop text resources for our under-resourced languages, which are crucial for development in big data and artificial intelligence in the South African context. Here we develop linguistically enriched corpora, core technologies and proofing tools.


Full Project Details

VivA Afrikaans

We collaborate with the Virtuele Instituut vir Afrikaans (VivA) by maintaining the content and technical services of the Corpus and Dictionary Portals. With over 85 million words in the Corpus Portal and over 50 dictionaries and word lists in the Dictionary Portal, we make sure that the systems are up to date with the latest etymology, spelling, and meanings.

 


Coming Soon!

 


 

We ensure the long-term sustainability of research and development activities. This establishes valuable partnerships with academic and industry partners who have an interest in natural language processing and computational linguistics.

 

 


Software & Resources

Have a look at our applied technologies.


Corpora

Our collections of texts, compiled with a focus on the resource-scarce languages of South Africa, for further research and development.

See our corpora resources

Core Technologies

Morphological analysers, part-of-speech (POS) taggers and lemmatisers are the core technologies we develop resources for.

See our core technology resources

Translation Aids

Have a look at our work in machine translation, the other tools in our Autshumato projects, and our spelling checkers.

See our collection of translation aid software and resources



Let's explore the possibilities
that human language technology can bring you.

Would you like to collaborate on research, do you have a project in mind, or are you interested in studying language technology?

Contact us