TS Corpus is a free and independent project that aims to build Turkish corpora, NLP tools and linguistic datasets…

What is the TS Corpus Project?

TS Corpus is a free and independent project that aims to build Turkish corpora, develop Natural Language Processing tools, and compile linguistic datasets. The project started in 2011, and in March 2012 the first corpus, TS Corpus Version 1, was published. In August 2012 the updated TS Corpus Version 2 was released. This was the first part-of-speech tagged Turkish corpus ever released that was available online.

Since then, many other corpora, NLP tools and linguistic datasets have been published. Please check the relevant pages for further information.

The project is free for academic study and research. All the corpora and NLP tools published by the project are offered without any usage limitations. Users are free to run queries, save them, and download the hit sets to their computers. Together, the 9 published corpora provide a dataset of over 500 million tokens derived from various sources: online newspapers, forums, social media, academic papers, etc.

TS Corpus is a growing project. We strongly believe that information and data should be shared freely. Therefore, the TS Corpus Project is built on free software.

  • TS Corpus Main Screen
  • TS Corpus Basic Query
  • TS Corpus Restricted Query
  • TS Corpus Sort Window
  • TS Corpus Collocations

Our Team

Taner Sezer: Researcher, Computer & Language Addict
Türker Sezer: Linux SysAdmin

TS Corpus is free for academic study and research.
