# Alphabets and Strings

• Alphabets
• Strings
• Languages

## Alphabets and Symbols

• Alphabet: any (nonempty) finite set
• Usually assume elements are some kind of characters

• Members of an alphabet are called symbols
• Symbols are usually letters and digits

• Example alphabets: {a}, {a, x}, {0}, {0, 1}
• Usually don't use quotes for characters (ie 'a')
• Usually use different font or context to recognize symbols

• Represent an alphabet with the Greek letters (eg Σ and Γ  which are capital Sigma and capital Gamma)
• Example: Σ = {0, 1}
• Example: let x be an element of Σ

## Strings Over an Alphabet

• String: finite sequence of symbols over (ie from) an alphabet
• Symbols are usually written adjacent without commas
• Examples over Σ = {a,b,c,d}: aaa, bac
• Examples over Σ = {0,1}: 000, 01011
• acb is a shorthand for (a, c, b)

• Typically represent strings with letters like u, v, w, x, y, z
• Examples: u = acb, x = 10110

• Frequently the alphabet that strings are formed over is unspecifed and assumed
• Example: 0 or more a's followed by a single b

## Strings Operations

• We can define operation on strings

• The most common operations are
• Concatenation: Join two or more strings and/or symbols
• Example: x = cab, y = aba , xy = ??
• We freely concatenate any combination of strings and symbols
• Example: xay =

• Superscript: xk

• Reverse - Notation: wR

• Length of string w:
• number of symbols in w
• written |w|
• |cab| = 3

## The Empty String

• The string with no symbols
• If x is the empty string, then |x| = 0

• Conundrum: How do we write the empty string?
• Answer: A special symbol: ε
• Symbol λ also used

• Question: Can ε be a symbol in the alphabet we are using to form strings?

• How is ε represented in Ada and Java?

• Plays role of 0 or 1 in a number system
• wε = ?? = ??
• This is called ...

## Lexicographic Ordering

• Need a standard order in which to list strings
• Lexicographic ordering: shorter precede longer (eg {ε, 0, 1, 00, 01, 10, 11, 000, ...})
• Assume symbols have an ordering

## Languages

• Language: A set of strings (over an alphabet)

• Examples:
• Over alphabet {a, b, c}:
• L1 = {aaa, aba}
• L2 = {a, aa, aaa}

• Over alphabet {0, 1}:
• L1 = {0}
• L2 = {0, 1, 010}

• Frequently the alphabet that strings are formed over is unspecifed and assumed

## Empty Strings and Languages

• Are these two languages different? {00, 11} and {ε, 00, 11}

• Is this a language: {ε}

• Is this a language: {}
• Eventually we will see language specifications that turn out to define empty languages, although that might not be obvious at first
• Notice the connection between the string (ie sequence of symbols) with no symbols and the language (ie set of strings) with no elements

• Is this a valid alphabet: Σ = {}
• If it is, what are the strings over Σ?
• If it is, what are the languages over Σ?
• Is it valid? See page 13.

## Infinite Languages

• Languages can be finite or infinite

• Infinite languages contain an infinite number of strings

• Example infinite languages:
• What's the shortest string in the language?

• L = {wwR} | for any string w}
• What's wrong with this definition?
• What's the shortest string in the language?
• What language is this: Even length palindromes
• How do we specify odd length palindromes

## Infinite Alphabets, Strings, and Languages

• Can an alphabet be infinite?

• Can a string be of infinite length?

• Look at infinite languages above: infinite number of finite strings

• In an infinite language, there is no limit to the length of a string, but the length of each string is finite.

• Remember these definitions!!!

## Tools for working with languages on Languages

• We construct 3 tools for working with languages:
• Automata:

• Regular Expressions: Generate (describe) languages

• Grammars: Generate languages

## Some Questions when Working with Languages

• Each tool can be used to ask 2 questions:
• What strings are in the language?
• Is a given string in the language?

ITEC 420 Course Page,