Definitions

Token - one letter or a sequence of letters which is treated as one item (if there is tokenization step).

Alphabet ( $\Sigma$ - sigma) is a finite, non-empty set of tokens. Example: $\Sigma = \{a, b\}$ .

String ( $w$ ) is a finite sequence of tokens chosen from $\Sigma$ . Example: $ababba$ .

Language ( $L$ ) a (possibly infinite) set of strings. Example: $L = \{a, aa, aaa\}$ .

String length is a number of tokens in it: $|aba| = 3$ .

String concatenation: $w_1w_2$ . Example: $ab \cdot ba = abba$ . Alternative notations: $w_1 \cdot w_2$ . In plain text space can be used: a b

Empty string: $\varepsilon$ (epsilon). $|\varepsilon| = 0$ , $\forall w: w \varepsilon = \varepsilon w = w$ .

Language concatenation: $L_1L_2 = \{w_1w_2 \text{ | } w_1 \in L_1, w_2 \in L_2\}$ - Cartesian product ↗ of two sets. Example: $\{1,2\}\cdot\{a,b,\ldots\} = \{1a,2a,1b,2b,\ldots\}$ .

String exponentiation: $w^k = ww \ldots w$ (k times). Example: $a^3 = aaa$ .

Language exponentiation: $L^k = LL \ldots L$ (k times). $LL = L^2$ , $L^k = L(L^{K-1})$ , $L^0 = \{\varepsilon\}$ . Example: $\{0,1\}^{32}$ - all binary words of length of 32.

String reversal: $w^R$ . Example: $(aabc)^R=cbaa$ .

Language reversal: $L^R = {w^R \text{ | } w \in L}$ . Example: $\{ab, cd\}^R = \{ba, dc\}$ .

Language union: $L_1 \cup L_2 = \{w \text{ | } w \in L_1 \text{ or } w \in L_2\}$ - set union ↗. Other name: disjunction. Example: $\{a,b\} \cup \{b,aa\} = \{a,b,aa\}$ .

Language intersection: $L_1 \cap L_2 = \{w \text{ | } w \in L_1 \text{ and } w \in L_2\}$ - set intersection ↗. Other name: conjunction. Example: $\{a,b\} \cap \{b,aa\} = \{b\}$ .

Language difference: $L_1 \setminus L_2 = \{w \text{ | } w \in L_1 \text{ and } w \not{\in} L_2\}$ - set difference ↗. Example: $\{a,b\} \setminus \{b,d\} = \{a\}$ .

Kleene closure: $L^\ast = L^0 \cup L^1 \cup L^2 \ldots$ , $L^+ = L^1 \cup L^2 \cup L^3 \ldots$ . Example: $\{a\}^\ast = \{\varepsilon,a,aa,aaa \ldots\}$ , $\{a\}^+ = \{a,aa,aaa \ldots\}$

All finite strings (over alphabet $\Sigma$ ): $\Sigma^\ast$ , $\forall L: L \subseteq \Sigma^\ast$ . Note: strictly speking I didn’t provide definition how to exponentiate alphabet - just treat it as language with strings of length 1.

“Trivial” language: $\epsilon = \{\varepsilon\}$ . $\forall L: \epsilon L = L \epsilon = L$ .

Empty language: $\emptyset = \{\}$ . $\emptyset^\ast = \epsilon$ , $\forall L: \emptyset L = L \emptyset = \emptyset$ .

Notations:

$\forall$ - for all
$\in$ - in a set
$\not{\in}$ - not in a set