Strings
Goals
- Describe implementation and memory allocation of Ada strings
- Describe the differences in use and implementation between Java and Ada
strings
- Be able to use Ada strings, their attributes, and associated i/o
routines
- Describe the differences in implementation and use of Ada's three
types of strings: fixed length, bounded length, unbounded.
- Describe the tradeoffs involved in using each of Ada's types of
strings and how they relate to Java's strings
- Describe how C strings are implemented
Differences Between Strings in Ada and Java
- So far, Java and Ada strings look similar:
// Java:
System.out.println("Value of x is " + x.toString());
-- Ada:
put("Value of x is " & Integer'Image(x));
Three differences between Ada and Java Strings:
Ada strings are
- not references
- fixed length
- arrays of characters
In other words,
- Java strings are immutable objects that are accessed with reference
semantics
- Characters in the string cannot change, but the reference can
be changed to point to another string
- Ada strings are arrays of characters that are accessed using
value semantics
- Values in the array can change, but the array size cannot
Implementation and Memory Allocation of Java Strings
- A reference-type variable points to a String object, which contains a
reference to the characters of the string:
String s;
s = "Hi Mom!";
for (int i=6; i>=0; i--)
System.out.println(s.charAt(i));
Memory diagram is frequently simplified
Individual characters can't be accessed directly; use s.charAt(i) instead
Implementation and Memory Allocation of Ada Strings
- A string is an array of characters
s: String(1 .. 7);
...
s := "Hi Mom!";
for i in reverse 1 .. 7 loop
put( s(i) );
end loop;
s(4) := 'T';
put(s);
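The fragment above can be fleshed out into a complete program. This is a minimal sketch, assuming Ada.Text_IO for output; the procedure name Array_Demo is made up for illustration.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Array_Demo is
   S : String (1 .. 7);   -- storage for exactly 7 characters
begin
   S := "Hi Mom!";

   -- Index the array directly, last character first.
   for I in reverse S'Range loop
      Put (S (I));
   end loop;
   New_Line;              -- prints "!moM iH"

   S (4) := 'T';          -- overwrite one element in place
   Put_Line (S);          -- prints "Hi Tom!"
end Array_Demo;
```

Unlike the Java version, no method call is needed to reach a character: s(i) is ordinary array indexing.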
Reference Semantics vs Value Semantics
- Java Strings have Reference Semantics
- String variables are references to strings
String s, t;
s = "Hi Mom!";
t = s;
Ada Strings have Value Semantics
- String variables are values
s, t: String(1 .. 7);
...
s := "Hi Mom!";
t := s;
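One way to see value semantics in action is to change s after the assignment and check that t is unaffected. A minimal sketch (the procedure name Value_Demo is made up for illustration):

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Value_Demo is
   S, T : String (1 .. 7);
begin
   S := "Hi Mom!";
   T := S;            -- copies all 7 characters into T
   S (4) := 'T';      -- changes S only; T holds its own copy
   Put_Line (S);      -- prints "Hi Tom!"
   Put_Line (T);      -- prints "Hi Mom!"
end Value_Demo;
```

In the Java example, by contrast, t = s makes both variables refer to the same string object.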
Mutability of Strings
- Java Strings can't change the characters
- But string variables can point to different strings
String s;
s = "Hi Mom!";
s = "Hi Tom!";
Ada Strings can change their characters
- Assignment overwrites the characters in place; the length stays the same
s: String(1 .. 7);
...
s := "Hi Mom!";
s := "Hi Tom!";
Ada String Declarations
- A String is declared implicitly or explicitly:
- Implicitly when it is initialized, or
- Explicitly with or without initialization
s1: String := "Hello"; -- Implicit length 5. Initialized.
s2: String := "Hi"; -- Implicit length 2. Initialized.
s3: String(1 .. 5); -- Explicit length 5. Uninitialized.
s4: String(1 .. 5) := "Hello"; -- Explicit length 5. Initialized.
Contrast with a Java uninitialized String
The declaration determines the length of the string
Ada Strings are Fixed Length
- An array's length never changes:
- As a result, the length of a string never changes
- A String variable can only be assigned strings that match its length
- Array assignment requires arrays of same length on both sides
- Examples:
s: String := "Hi Mom!";
u: String(1 .. 10);
...
-- s := u; -- Length mismatch (Constraint_Error)
-- u := s; -- Length mismatch (Constraint_Error)
s := "Hi Bob!"; -- Okay
-- s := "Hi Billy-Bob!"; -- Length mismatch (Constraint_Error)
-- s := s & "!"; -- Length mismatch (Constraint_Error)
u := s & "!!!"; -- Okay: 7 + 3 = 10 characters
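When the length of the source string is only known at run time, a mismatched assignment fails a length check and raises Constraint_Error. The sketch below shows this with a local helper; Length_Demo and Try_Assign are names made up for illustration, and it assumes run-time checks are enabled, which is the language default.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Length_Demo is
   S : String (1 .. 7) := "Hi Mom!";

   -- Hypothetical helper: tries to copy Source into S and reports the result.
   procedure Try_Assign (Source : String) is
   begin
      S := Source;   -- length check happens here
      Put_Line ("assigned: " & S);
   exception
      when Constraint_Error =>
         Put_Line ("length mismatch:" & Integer'Image (Source'Length)
                   & " characters into" & Integer'Image (S'Length));
   end Try_Assign;
begin
   Try_Assign ("Hi Bob!");         -- 7 characters: fits
   Try_Assign ("Hi Billy-Bob!");   -- 13 characters: rejected
   Try_Assign (S & "!");           -- 8 characters: rejected
end Length_Demo;
```

After the three calls, S still holds "Hi Bob!", since a failed assignment leaves the target unchanged here.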
Equality Test
- Strings of different lengths can be compared
- Example:
r: String(1..5);
s, t: String(1..7);
r := "Hi Mo";
s := "Hi Mom!";
t := "Hi Tom!";
if s = t then -- works
...
if r = s then -- also works
...
When comparing strings:
- If lengths are equal, each pair of characters is compared
- If lengths differ, the strings are unequal (no error is raised)
Slices can be compared
if s(5..7) = t(5..7) then
Any arrays of the same type can be compared
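The comparisons above can be tried out in a small program. This is a sketch using the same three strings; the procedure name Equality_Demo is made up for illustration, and Boolean'Image is used only to print the results.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Equality_Demo is
   R : constant String (1 .. 5) := "Hi Mo";
   S : constant String (1 .. 7) := "Hi Mom!";
   T : constant String (1 .. 7) := "Hi Tom!";
begin
   Put_Line (Boolean'Image (S = T));                     -- FALSE: 4th character differs
   Put_Line (Boolean'Image (R = S));                     -- FALSE: lengths differ
   Put_Line (Boolean'Image (R = S (1 .. 5)));            -- TRUE:  "Hi Mo" = "Hi Mo"
   Put_Line (Boolean'Image (S (5 .. 7) = T (5 .. 7)));   -- TRUE:  "om!" = "om!"
end Equality_Demo;
```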
Attributes
- 'length
- 'first
- 'last
- 'range
- Examples:
for i in 1 .. s'length loop
for i in s'first .. s'last loop
for i in s'range loop
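The three loop forms are equivalent only when the string starts at index 1. The sketch below prints the attributes of a whole string and of a slice, whose first index is not 1; Attribute_Demo and Describe are names made up for illustration.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Attribute_Demo is
   -- Hypothetical helper: prints the bounds and length of any string.
   procedure Describe (Item : String) is
   begin
      Put_Line ("first:" & Integer'Image (Item'First)
                & " last:" & Integer'Image (Item'Last)
                & " length:" & Integer'Image (Item'Length));
      for I in Item'Range loop   -- correct for any bounds
         Put (Item (I));
      end loop;
      New_Line;
   end Describe;

   S : constant String := "Hi Mom!";
begin
   Describe (S);            -- first: 1 last: 7 length: 7, then "Hi Mom!"
   Describe (S (4 .. 6));   -- first: 4 last: 6 length: 3, then "Mom"
end Attribute_Demo;
```

For the slice, a loop over 1 .. Item'Length would index positions 1 to 3, which lie outside the bounds 4 .. 6, so 'range is the safer habit.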
String Slices
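A slice such as s(5..7) names a contiguous run of characters inside a string and can be read, compared, or assigned like any other string. A minimal sketch (the procedure name Slice_Demo is made up for illustration):

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Slice_Demo is
   S : String := "Hi Mom!";
begin
   Put_Line (S (4 .. 6));    -- prints "Mom"
   S (4 .. 6) := "Tom";      -- slice assignment: lengths must match (3 = 3)
   Put_Line (S);             -- prints "Hi Tom!"
end Slice_Demo;
```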
Characters and Strings
Some things to be aware of:
- In Ada.Text_IO, procedure put is defined for both Character and
String, but put_line is defined only for String
- A String of length 1 is NOT the same as a Character
u: String(1 .. 1) := "B";
u := 'C'; -- Compile error
Empty strings are possible
u: String(1 .. 0) := ""; -- Empty range - length 0 string
A string is actually an array of character
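Because a String is an array whose elements are Characters, a Character can be stored in one element, or used to build a one-element array. The sketch below shows a few legal ways to combine the two types; the procedure name Char_Demo is made up for illustration.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Char_Demo is
   C     : constant Character := 'C';
   U     : String (1 .. 1) := "B";
   Empty : constant String (1 .. 0) := "";
begin
   U (1) := C;              -- store the Character in the single element
   Put_Line (U);            -- prints "C"
   U := (1 => 'D');         -- array aggregate: a String built from one Character
   Put_Line (U & C);        -- "&" accepts String & Character: prints "DC"
   Put_Line (Integer'Image (Empty'Length));   -- prints " 0"
end Char_Demo;
```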
Gaining More Flexibility with Strings
- Strings lack flexibility - here are some ways to compensate
- Procedures can have parameters that are strings of
unconstrained length (i.e. can be of any length)
- e.g. put("Hi"); - the actual parameter for put can be a String
of any length
- In some circumstances you can use a declare block to declare a string after the length is known
- Allocate a long string, fill up part of it, and keep track of
how many valid characters it contains (this is how input routine
get_line works)
- Use Bounded or Unbounded strings:
- Bounded length - the current size and the string are combined
in one type. The size of a string can change, but it must
never exceed some predefined maximum - we mostly ignore these.
- Unbounded - size can change as needed - see below
- Tradeoff: convenience, space, runtime efficiency
- Fixed length are time and space efficient, but inconvenient
- Unbounded length are time and space inefficient, but convenient
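The declare-block technique can be sketched as follows. This example assumes the function form of Get_Line, which returns a String of exactly the right length and is available in Ada 2005 and later; the procedure name Declare_Demo is made up for illustration.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Declare_Demo is
begin
   Put ("Name: ");
   declare
      -- Each string takes its length from its initial value,
      -- which is not known until the line has been read.
      Name     : constant String := Get_Line;
      Greeting : constant String := "Hello, " & Name & "!";
   begin
      Put_Line (Greeting);
      Put_Line ("Length:" & Integer'Image (Greeting'Length));
   end;
end Declare_Demo;
```

The strings are still fixed length; the block just delays the declaration until the length can be computed.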
Get_Line
- Problem: How to input a string of unknown length
if its length must be declared first
- One Solution: Use get_Line(S, L)
- S receives the input string
- L receives the number of characters that were input
- Example (getline.adb and prettified):
v: String(1 .. 80);
len: Integer;
begin
get_Line(v, len);
put(len); -- Number of characters read
put(v(1 .. len)); -- Output a slice
put(v); -- Don't do this: outputs all 80 characters, valid or not
put(v(1 .. v'length)); -- Don't do this, either: same as put(v)
Caution: if the input line is as long as or longer than the
string variable (80 or more characters), get_line fills the whole
string, sets len to 80, and leaves the rest of the line (at least
the line terminator) unread for the next input call
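A complete version of the get_line example, with one possible way of handling a line that fills the buffer, might look like this. It is a sketch, not the course's getline.adb; the procedure name Get_Line_Demo is made up, and discarding the leftover input with Skip_Line is just one reasonable policy.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Get_Line_Demo is
   V   : String (1 .. 80);
   Len : Natural;
begin
   Get_Line (V, Len);
   Put_Line ("Read" & Integer'Image (Len) & " characters: " & V (1 .. Len));

   if Len = V'Last then
      -- The buffer filled before the end of the line was reached, so
      -- the rest of the line (at least the line terminator) is unread.
      Skip_Line;   -- discard it so the next read starts on a fresh line
      Put_Line ("(line filled the buffer; remainder discarded)");
   end if;
end Get_Line_Demo;
```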
Ada.Strings.Unbounded
- Unbounded Strings are dynamically sized, similar to Java
Strings
- Allocated on heap, like Java strings
- Implemented with a reference to heap storage, but assignment copies the value
- Automatic garbage collection for strings
- Programmer typically converts between Strings and Unbounded_String
and back, as needed
- Ada.Strings.Unbounded.Text_IO (a GNAT package; the standard
equivalent is Ada.Text_IO.Unbounded_IO) contains i/o routines for
unbounded strings
- Example programs:
- unbounded.adb (prettified)
- getunbounded.adb (prettified)
- documentation.
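As a rough illustration of converting back and forth, the sketch below grows an unbounded string and then converts it to a fixed-length String for output. It is not the course's unbounded.adb; the procedure name Unbounded_Demo is made up, and only standard Ada.Strings.Unbounded operations are used.

```ada
with Ada.Text_IO;           use Ada.Text_IO;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;

procedure Unbounded_Demo is
   U : Unbounded_String := To_Unbounded_String ("Hi");
begin
   Append (U, " Mom!");   -- grows as needed: now 7 characters
   U := U & " Bye";       -- "&" also accepts a String operand

   declare
      -- Convert back to a fixed-length String once the contents are final.
      S : constant String := To_String (U);
   begin
      Put_Line (S);                                 -- prints "Hi Mom! Bye"
      Put_Line (Integer'Image (S'Length));          -- prints " 11"
      Put_Line (Integer'Image (Length (U)));        -- prints " 11"
   end;
end Unbounded_Demo;
```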
Unconstrained Arrays and Strings
- Unconstrained arrays are declared with unspecified length
- Strings are declared as unconstrained arrays in package
Standard, which is automatically available
- type String is array (Positive range <>) of Character;
- Unconstrained arrays can only be used in certain locations, such as
procedure parameters
- e.g. put("Hi"); - the actual parameter for put can be a String
of any length
- procedure put(Item: String)...
- The actual bounds of a string/unconstrained array must be
specified before a variable of that type is declared
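A user-written subprogram can take an unconstrained String parameter in the same way put does. The sketch below counts occurrences of a character in a string of any length; Unconstrained_Demo and Count_Char are names made up for illustration.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Unconstrained_Demo is
   -- Hypothetical helper: Item is unconstrained, so each call supplies the bounds.
   function Count_Char (Item : String; Target : Character) return Natural is
      Total : Natural := 0;
   begin
      for I in Item'Range loop   -- bounds come from the actual parameter
         if Item (I) = Target then
            Total := Total + 1;
         end if;
      end loop;
      return Total;
   end Count_Char;
begin
   Put_Line (Integer'Image (Count_Char ("Hi", 'i')));               -- prints " 1"
   Put_Line (Integer'Image (Count_Char ("Hi Mom, hi Tom", 'i')));   -- prints " 2"
end Unconstrained_Demo;
```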
Bounded Strings
- Bounded length - the current size and the string are combined
in one type. The size of a string can change, but it must
never exceed some predefined maximum
- Essentially, allocate a long string, fill up part of it, and keep track of
how many valid characters it contains
- Similar to how we use get_line(s, len), but the string
and length are combined in a single type
- We ignore bounded length strings.
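For completeness, here is a brief sketch of what a bounded string looks like in use. It assumes a maximum of 80 characters; Bounded_Demo and Str_80 are names made up for illustration, while Generic_Bounded_Length is the standard generic package.

```ada
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Strings.Bounded;

procedure Bounded_Demo is
   -- Each instantiation fixes the maximum length for its Bounded_String type.
   package Str_80 is new Ada.Strings.Bounded.Generic_Bounded_Length (Max => 80);
   use Str_80;

   B : Bounded_String := To_Bounded_String ("Hi");
begin
   Append (B, " Mom!");                      -- current length changes from 2 to 7
   Put_Line (To_String (B));                 -- prints "Hi Mom!"
   Put_Line (Integer'Image (Length (B)));    -- prints " 7"
   Put_Line (Integer'Image (Max_Length));    -- prints " 80"
end Bounded_Demo;
```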
Tradeoffs with Strings
- Tradeoff: convenience, space, runtime efficiency
- Fixed length are time and space efficient, but inconvenient
- Unbounded length are time and space inefficient, but convenient
- Unconstrained are somewhat efficient and somewhat convenient
Strings in C
- Characteristics of Strings in C:
- Strings are arrays of characters
- Strings are terminated by a 0 byte ('\0')
- Flexible: varying length string within maximum length easily implemented
- Problems
- Array bounds are not checked
- Easy to write outside of allocated space (e.g. strcpy)
- Example:
char s[] = "ABC";
- Note: local variables are allocated on the stack
- On most platforms, as the stack grows, addresses get smaller
- Simple Example: simplestrings.c (prettified)
- Example showing how to display characters in decimal and hex: decimalstrings.c (prettified)
- Example showing possible errors with strings (and arrays): errorstrings.c (prettified)
- Example showing possible errors with string copy (ie strcpy): copystrings.c (prettified)
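The 0 terminator can also be seen from the Ada side. The sketch below uses the standard package Interfaces.C to convert an Ada String into the array of characters a C program would see for char s[] = "ABC"; the procedure name C_String_Demo is made up for illustration.

```ada
with Ada.Text_IO;  use Ada.Text_IO;
with Interfaces.C;

procedure C_String_Demo is
   use type Interfaces.C.char;   -- makes "=" on C characters directly visible

   Ada_Str : constant String := "ABC";
   -- To_C appends the terminating nul by default.
   C_Str   : constant Interfaces.C.char_array := Interfaces.C.To_C (Ada_Str);
begin
   Put_Line ("Ada length:" & Integer'Image (Ada_Str'Length));           -- prints "Ada length: 3"
   Put_Line ("C length:" & Integer'Image (C_Str'Length));               -- prints "C length: 4"
   Put_Line (Boolean'Image (C_Str (C_Str'Last) = Interfaces.C.nul));    -- prints "TRUE"
end C_String_Demo;
```

The Ada string carries its bounds with it, so it needs no terminator; the C form spends one extra element on the 0 byte and relies on it to mark the end.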