TS Abstract Corpus
This corpus samples academic writing from various disciplines. The data is organized into two major domains (social sciences and physical sciences) and six genres: humanities & arts, medicine, natural sciences, politics & law & education, social sciences, and technology & engineering. The text-type classification also covers the 32 scientific disciplines from which the data is drawn.
TS Abstract Corpus is an especially useful source for text genre classification studies. A frequency list for each discipline can be downloaded from this link.
The source data of this corpus was obtained from the dataset built for the Turkish Labeled Text Corpus by Öztürk et al.
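For readers who want to reproduce per-discipline frequency lists like the downloadable ones, the following is a minimal sketch rather than the procedure actually used for this corpus. It assumes the abstracts are Turkish text available as (discipline, text) pairs; the sample records, the discipline labels, and the helper names `turkish_lower` and `frequency_lists` are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample records: (discipline, abstract text).
# The real corpus format and labels are not specified here.
SAMPLE = [
    ("physics", "Işık hızı ölçüldü ve ışık dalgaları incelendi."),
    ("sociology", "Toplum ve aile yapısı incelendi."),
    ("physics", "Dalga boyu ve hız hesaplandı."),
]

# Turkish distinguishes dotted and dotless i, so map the two
# uppercase forms explicitly before calling str.lower().
_TR_LOWER = str.maketrans({"I": "ı", "İ": "i"})


def turkish_lower(text):
    """Lowercase text using Turkish casing rules for I and İ."""
    return text.translate(_TR_LOWER).lower()


def frequency_lists(records):
    """Return a mapping from discipline to a Counter of word frequencies."""
    counts = defaultdict(Counter)
    for discipline, text in records:
        # \w+ matches runs of Unicode letters, digits, and underscores.
        tokens = re.findall(r"\w+", turkish_lower(text))
        counts[discipline].update(tokens)
    return counts


freqs = frequency_lists(SAMPLE)
for discipline, counter in sorted(freqs.items()):
    print(discipline, counter.most_common(3))
```

The tokenization here is deliberately simple (word characters only, no stemming or stop-word removal); a genre classification study would likely need a morphological analyzer suited to Turkish.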