TS Corpus Documentation

TS Corpus supports two query modes offered by CQP (Corpus Query Processor). Each query mode has its own advantages, and the choice between them depends entirely on the user's intent. Before using TS Corpus for your study, we recommend trying both query modes to get used to them. Below, you’ll find sample queries for each query mode.

Performing Basic Queries

Your first query

When you reach the corpus interface, the query mode is set to “Basic Query” by default. Simply enter a “key” into the query box and click the “Start Query” button.

The Basic Query mode also allows users to perform lemma and part-of-speech queries.

  • Lemma Queries

To perform a lemma query in Basic Query mode, write your key within curly braces ({}). For instance, to find every occurrence of the lemma “burun”, enter the key “{burun}” (without quotes). This query fetches every occurrence of the given lemma, whatever its surface form. Lemma queries are useful for Turkish because stems frequently undergo phonetic changes, such as the vowel drop in “burnu” (from “burun”).

  • Part-of-Speech Queries

To perform a query using part-of-speech tags in Basic Query mode, enter the key followed by an underscore and the part-of-speech tag. For instance, the key “gel*_Verb” (without quotes) fetches every word in the corpus that is tagged as a verb and starts with the letter sequence g+e+l, followed by any number of characters (including none).
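The two Basic Query conventions above can be related to the bracketed CQP syntax described later in this document. The following Python sketch is only an illustration. The function names `basic_to_cqp` and `_wildcard_to_regex` are hypothetical, the attribute names `Lemma` and `PosTag` are the ones used in the CQP examples in this document, and the translation rules (in particular turning `*` into `.*` on the `word` attribute) are assumptions, not a description of how the interface converts queries internally.

```python
import re


def basic_to_cqp(key: str) -> str:
    """Translate a single-word Basic Query key into a CQP-style token expression.

    Hypothetical helper: the rules below are illustrative assumptions.
    """
    key = key.strip()

    # {burun} -> lemma query
    lemma = re.fullmatch(r"\{(.+)\}", key)
    if lemma:
        return f'[Lemma="{lemma.group(1)}"]'

    # gel*_Verb -> word pattern combined with a part-of-speech tag
    word, sep, tag = key.rpartition("_")
    if sep and word and tag:
        return f'[word="{_wildcard_to_regex(word)}" & PosTag="{tag}"]'

    # Plain key -> word form query
    return f'[word="{_wildcard_to_regex(key)}"]'


def _wildcard_to_regex(pattern: str) -> str:
    # Escape regex metacharacters, then turn the wildcard * into .*
    return re.escape(pattern).replace(r"\*", ".*")


print(basic_to_cqp("{burun}"))    # [Lemma="burun"]
print(basic_to_cqp("gel*_Verb"))  # [word="gel.*" & PosTag="Verb"]
```

The sketch handles single-word keys only; it is meant to show how the two notations correspond, not to replace the Basic Query mode.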

Performing CQP Syntax Queries

CQP can be used from the command line in a Linux shell such as Bash, or through the CQPweb interface. The samples given here are for the CQPweb interface.

Please note that to use CQP Syntax, “Query Mode” must be set to “CQP Syntax”.

CQP Syntax

CQP Syntax refers to writing queries in the predefined syntax of CQP (Corpus Query Processor) itself, which makes it possible to take full advantage of CQP on annotated data.

  • Lemma Queries

To perform a lemma query in CQP Syntax query mode, the key must follow a strict syntax. A lemma query for the word “burun” in CQP Syntax mode is written as follows:

[Lemma="burun"]

  • Part-of-Speech Queries

To perform a part-of-speech query in CQP Syntax query mode, the key is written as follows:

[PosTag="Verb"]

  • Morphological Queries

To perform a morphological query in CQP Syntax query mode, the key is written as follows. The attribute value is a regular expression, in which a literal plus sign is escaped as \+:

[Morph=".*\+While\.*"]
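CQP compares a regular expression against the whole attribute value, not against a substring of it. The sketch below mimics this with Python's `re.fullmatch`. The morphological strings are invented for illustration and may not reflect the actual TS Corpus tagset, and Python's regular expression dialect can differ in details from the one used by CQP.

```python
import re

# Pattern copied from the morphological query above.
PATTERN = r".*\+While\.*"

# Invented morphological analyses, for illustration only.
samples = [
    "Verb+Pos+While",        # ends with +While
    "Verb+Pos+While+Extra",  # +While is followed by another tag
    "Noun+A3sg+Pnon+Nom",    # no +While at all
]


def matches(value: str, pattern: str = PATTERN) -> bool:
    # fullmatch mimics CQP, which requires the pattern to cover the whole value.
    return re.fullmatch(pattern, value) is not None


for value in samples:
    print(f"{value}: {matches(value)}")
```

Only the first sample matches. As written, the trailing `\.*` accepts only literal dots, so the pattern matches values that end in `+While`; a value carrying further tags after `+While` would need an unescaped `.*` at the end instead.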

  • Complex Queries

To perform a complex query in CQP Syntax query mode, combine several conditions on the same token inside one pair of brackets with &. The key is written as follows:

[Morph=".*\+While\.*" & PosTag="Verb"]
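Inside one pair of square brackets, `&` requires every condition to hold for the same token. A rough Python sketch of that behaviour follows. The `Token` class, the `token_matches` helper, and the sample annotations are all invented for illustration and do not come from TS Corpus.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    word: str
    PosTag: str
    Morph: str


def token_matches(token: Token, **conditions: str) -> bool:
    """Return True if every named attribute fully matches its regex (the & operator)."""
    return all(
        re.fullmatch(pattern, getattr(token, attribute)) is not None
        for attribute, pattern in conditions.items()
    )


# Invented tokens and annotations, for illustration only.
tokens = [
    Token("gelirken", "Verb", "Verb+Pos+Aor+While"),
    Token("gelir", "Verb", "Verb+Pos+Aor+A3sg"),
    Token("burun", "Noun", "Noun+A3sg+Pnon+Nom"),
]

# Same conditions as the complex query above.
hits = [t.word for t in tokens if token_matches(t, Morph=r".*\+While\.*", PosTag="Verb")]
print(hits)  # ['gelirken']
```

The second token is a verb but lacks the `+While` tag, and the third satisfies neither condition, so only the first token is returned.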