Strings
Goals
- Describe implementation and memory allocation of Ada strings
- Describe the differences between the use and implementation of Java and Ada
strings
- Be able to use Ada strings, their attributes, and associated i/o
routines
- Describe the differences in implementation and use of Ada's three
types of strings: fixed length, bounded length, unbounded.
- Describe the tradeoffs involved in using each of Ada's types of
strings and how they relate to Java's strings
- Describe how C strings are implemented
Differences Between Strings in Ada and Java
- So far, Java and Ada strings look similar:
// Java:
System.out.println("Value of x is " + x);
-- Ada:
put("Value of x is " & Integer'Image(x));
Three differences between Ada and Java Strings:
Ada strings are
- not references
- fixed length
- arrays of characters
In other words,
- Java strings are immutable objects that are accessed with reference
semantics
- Characters in the string cannot change, but the reference can
be changed to point to another string
- Ada strings are arrays of characters that are accessed using
value semantics
- Values in the array can change, but the array size cannot
Implementation and Memory Allocation of Java Strings
- A reference-type variable points to a String object, which in turn
holds a reference to the characters of the string:
String s;
s = "Hi Mom!";
for (int i=6; i>=0; i--)
System.out.println(s.charAt(i));
- The memory diagram is frequently simplified
- Individual characters can't be accessed directly; use charAt
Implementation and Memory Allocation of Ada Strings
- A string is an array of characters
s: String(1 .. 7);
...
s := "Hi Mom!";
for i in reverse 1 .. 7 loop
put( s(i) );
end loop;
s(4) := 'T';
put(s);
Reference Semantics vs Value Semantics
- Java Strings have Reference Semantics
- String variables are references to strings
String s, t;
s = "Hi Mom!";
t = s;
- Ada Strings have Value Semantics
- String variables are values
s, t: String(1 .. 7);
...
s := "Hi Mom!";
t := s;
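The effect of value semantics can be checked with a small program. This is a minimal sketch, and the procedure name Value_Semantics_Demo is made up for illustration: after the assignment, changing s leaves t untouched, because t holds its own copy of the characters.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Value_Semantics_Demo is
   S, T : String (1 .. 7);
begin
   S := "Hi Mom!";
   T := S;            -- copies all seven characters into T
   S (4) := 'T';      -- changes S only
   Put_Line (S);      -- Hi Tom!
   Put_Line (T);      -- Hi Mom!
end Value_Semantics_Demo;
```

In Java, the corresponding t = s would make both variables refer to the same String object.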
Mutability of Strings
- The characters of a Java String can't be changed
- But String variables can be made to point to different strings
String s;
s = "Hi Mom!";
s = "Hi Tom!";
- Ada Strings are mutable
- Assignment overwrites the characters in place, but the length cannot change
s: String(1 .. 7);
...
s := "Hi Mom!";
s := "Hi Tom!";
Ada String Declarations
- The length of a String is declared implicitly or explicitly:
- Implicitly, from the value it is initialized with, or
- Explicitly, with or without initialization
s2: String := "Hi"; -- Implicit length 2. Initialized.
s3: String(1 .. 5); -- Explicit length 5. Uninitialized.
s4: String(1 .. 5) := "Hello"; -- Explicit length 5. Initialized.
- Contrast with a Java uninitialized String
- The declaration determines the length of the string
Ada Strings are Fixed Length
- An array's length never changes:
- As a result, the length of a string never changes
- A String variable can only be assigned strings that match its length
- Array assignment requires arrays of same length on both sides
- Examples:
s: String := "Hi Mom!";
u: String(1 .. 10);
...
-- s := u; -- Lengths differ: Constraint_Error (compiler warns)
-- u := s; -- Lengths differ: Constraint_Error (compiler warns)
s := "Hi Bob!"; -- Okay
-- s := "Hi Billy-Bob!"; -- Lengths differ: Constraint_Error (compiler warns)
-- s := s & "!"; -- Lengths differ: Constraint_Error (compiler warns)
u := s & "!!!"; -- Okay
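The length rule can be tried out in a small program. The sketch below is illustrative only: the names Fixed_Length_Demo and Repeat are hypothetical, and Repeat exists just to produce a string whose length is only known at run time, so the mismatch shows up as a Constraint_Error.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Fixed_Length_Demo is
   S : String := "Hi Mom!";     -- length 7, fixed by the initial value
   U : String (1 .. 10);

   -- Hypothetical helper: returns a string whose length is only
   -- known at run time.
   function Repeat (C : Character; N : Natural) return String is
      Result : constant String (1 .. N) := (others => C);
   begin
      return Result;
   end Repeat;
begin
   U := S & "!!!";              -- 7 + 3 = 10 characters, so this fits
   Put_Line (U);                -- Hi Mom!!!!

   U (1 .. 7) := "Hi Bob!";     -- slice assignment: both sides have length 7
   Put_Line (U);                -- Hi Bob!!!!

   S := Repeat ('x', 9);        -- 9 characters into a 7-character string
   Put_Line ("not reached");
exception
   when Constraint_Error =>
      Put_Line ("length mismatch");
end Fixed_Length_Demo;
```

Assigning to a slice, as in U (1 .. 7), is one way to change part of a longer string without changing its length.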
Equality Test
- Strings of different lengths can be compared
- Example:
r: String(1..5);
s, t: String(1..7);
r := "Hi Mo";
s := "Hi Mom!";
t := "Hi Tom!";
if s = t then -- works
...
if r = s then -- also works
...
- When comparing strings:
- If lengths differ, the strings are not equal
- If lengths are equal, each pair of characters is compared
- Slices can be compared
if s(5..7) = t(5..7) then
- Any arrays of the same type can be compared
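The comparison rules can be checked directly. A minimal sketch, reusing the values of r, s and t from the example; the procedure name Compare_Demo is made up, and the strings are declared as constants only to keep the program short.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Compare_Demo is
   R : constant String := "Hi Mo";
   S : constant String := "Hi Mom!";
   T : constant String := "Hi Tom!";
begin
   Put_Line (Boolean'Image (S = T));                    -- FALSE
   Put_Line (Boolean'Image (R = S));                    -- FALSE (lengths differ)
   Put_Line (Boolean'Image (R = S (1 .. 5)));           -- TRUE
   Put_Line (Boolean'Image (S (5 .. 7) = T (5 .. 7)));  -- TRUE ("om!" = "om!")
end Compare_Demo;
```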
Attributes
- 'length
- 'first
- 'last
- 'range
- Examples:
for i in 1 .. s'length loop -- correct only if s'first = 1
for i in s'first .. s'last loop
for i in s'range loop
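The attributes matter most when a string does not start at index 1, which happens with slices. The following sketch (Attributes_Demo and Describe are hypothetical names) prints the bounds of a whole string and of a slice of it, and loops with 'Range so that it works for any bounds.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Attributes_Demo is
   S : constant String := "Hi Mom!";

   -- Prints the bounds and length of whatever string it is given,
   -- then the characters one at a time.
   procedure Describe (Item : String) is
   begin
      Put_Line (Item
                & ": first =" & Integer'Image (Item'First)
                & ", last =" & Integer'Image (Item'Last)
                & ", length =" & Integer'Image (Item'Length));
      for I in Item'Range loop      -- safe for any bounds
         Put (Item (I));
      end loop;
      New_Line;
   end Describe;
begin
   Describe (S);            -- first line: Hi Mom!: first = 1, last = 7, length = 7
   Describe (S (4 .. 6));   -- first line: Mom: first = 4, last = 6, length = 3
end Attributes_Demo;
```

For the slice, a loop written as 1 .. Item'Length would index positions 1 to 3, which lie outside the slice's bounds 4 .. 6.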
String Slices
Characters and Strings
Some things to be aware of:
- In Ada.Text_IO, procedure put is defined for both Character and
String, but put_line is defined only for String
- A String of length 1 is NOT the same as a Character
u: String(1 .. 1) := "B";
u := 'C'; -- Compile error
- Empty strings are possible
u: String(1 .. 0) := ""; -- Empty range - length 0 string
- A String is actually an array of Character
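Because a String is an array of Character, a Character can be stored in one element of a String, or wrapped in a one-element aggregate. A minimal sketch with a made-up procedure name:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Char_String_Demo is
   C     : constant Character := 'C';
   U     : String (1 .. 1) := "B";
   Empty : constant String := "";
begin
   U (1) := C;               -- store a Character in an element of the array
   Put_Line (U);             -- C
   U := (1 => 'D');          -- a one-element aggregate is a String
   Put_Line (U);             -- D
   Put_Line (U & C);         -- "&" accepts a String and a Character: DC
   Put_Line (Integer'Image (Empty'Length));   -- " 0"
end Char_String_Demo;
```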
Gaining More Flexibility with Strings
- Strings lack flexibility - here are some ways to compensate
- Procedures can have parameters that are strings of
unconstrained length (i.e., of any length)
- e.g., put("Hi"); - the actual parameter for put can be a String
of any length
- In some circumstances you can use a declare block to declare a string after the length is known
- Allocate a long string, fill up part of it, and keep track of
how many valid characters it contains (this is how the input routine
get_line works)
- Use Bounded or Unbounded strings:
- Bounded length - the current size and the string are combined
in one type. The size of a string can change, but it must
never exceed some predefined maximum - we mostly ignore these.
- Unbounded - size can change as needed - see below
- Tradeoff: convenience, space, runtime efficiency
- Fixed length are time and space efficient, but inconvenient
- Unbounded length are time and space inefficient, but convenient
Options: 2 kinds of Get_Line, 3 kinds of strings
- Ada has two versions of get_line: a procedure and a function
- In this course, we ignore the procedure version
- Ada has three kinds of strings: fixed length, bounded length, and unbounded
Get_Line Function
- Problem: How to input a string of unknown length if its length must be declared first
- Solutions (more details below):
- Use get_line(s, l); [careful if input line exceeds length of s]
- Use the get_line function and a declare block
declare
s: string := get_line; -- reads entire line
begin
put(s'length); -- Number of characters read
end;
- Use an unbounded string (requires Ada.Strings.Unbounded and Ada.Strings.Unbounded.Text_IO):
declare
s: unbounded_string := get_line; -- reads entire line
begin
put(length(s)); -- Number of characters read
end;
- Use a bounded string ...
Another Use of the Get_Line Function
while not end_of_file loop
put_line ( get_line );
end loop;
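The declare block and the end_of_file loop can be combined into a complete program. This is a sketch under the assumption that input comes from standard input; the name Echo_Lengths is made up. Each pass through the loop declares a new string that is exactly as long as the line just read.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Echo_Lengths is
begin
   while not End_Of_File loop
      declare
         Line : constant String := Get_Line;   -- sized to fit this line
      begin
         -- Print the line's length followed by the line itself.
         Put_Line (Integer'Image (Line'Length) & ": " & Line);
      end;
   end loop;
end Echo_Lengths;
```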
Get_Line Procedure
- Another Solution: Use get_Line(S, L)
- S receives the input string
- L receives the number of characters that were input
- Example (getline.adb and prettified):
v: String(1 .. 80);
len: Integer;
begin
get_Line(v, len);
put(len); -- Number of characters read
put(v(1 .. len)); -- Output a slice
put(v); -- Don't do this
put(v(1 .. v'length)); -- Don't do this, either
- Caution: if the input line is as long as or longer than the
string variable (80 or more characters), only the first 80 characters
are read; the rest of the line, including the line terminator, stays
in the input for the next read
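What happens with long lines is easier to see with a deliberately small buffer. In the sketch below (Chunked_Read is a hypothetical name, and the buffer size of 10 is chosen only for the demonstration), a 25-character input line is printed as three bracketed chunks of 10, 10 and 5 characters, because each call to Get_Line stops when the buffer is full.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Chunked_Read is
   Buffer : String (1 .. 10);   -- deliberately small to show the effect
   Last   : Natural;
begin
   while not End_Of_File loop
      Get_Line (Buffer, Last);
      -- Only Buffer (1 .. Last) holds valid characters.
      Put_Line ("[" & Buffer (1 .. Last) & "]");
   end loop;
end Chunked_Read;
```

A line that fills the buffer exactly leaves its line terminator unread, so the following call can return with Last = 0 and print an empty pair of brackets.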
Unbounded Strings
- Unbounded Strings:
- Are dynamically sized, like Java
- Are not arrays of characters
- Require package Ada.Strings.Unbounded
- Require package Ada.Strings.Unbounded.Text_IO for IO (a GNAT package;
the standard equivalent is Ada.Text_IO.Unbounded_IO)
Unbounded Strings: Example
- Example showing dynamic length of an unbounded string:
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Strings.Unbounded.Text_IO; use Ada.Strings.Unbounded.Text_IO;
procedure showUnbounded is
s: Unbounded_String := To_Unbounded_String("first");
begin
s := To_Unbounded_String("different");
s := s & "!!";
put_line(s); -- different!!
s := get_line; -- Read entire line into s
put_line(s); -- Prints line from input
...
Unbounded Strings: Memory Management
- Allocated on heap, like Java Strings
- An unbounded string variable is a reference,
- But assignment copies the string, not the reference
- Example:
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Strings.Unbounded.Text_IO; use Ada.Strings.Unbounded.Text_IO;
...
s: Unbounded_String := To_Unbounded_String("first");
t: Unbounded_String := s;
...
s := s & "!!";
put_line(s); -- first!!
put_line(t); -- first
- Automatic deallocation, like Java Strings
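The copy-on-assignment behaviour can be run as a complete program. This sketch makes one choice that differs from the fragment above: it converts with To_String and prints through Ada.Text_IO, so it needs only the standard Ada.Strings.Unbounded package. The procedure name is made up.

```ada
with Ada.Text_IO;           use Ada.Text_IO;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;

procedure Unbounded_Copy_Demo is
   S : Unbounded_String := To_Unbounded_String ("first");
   T : constant Unbounded_String := S;   -- T gets its own copy of the value
begin
   S := S & "!!";                             -- S grows; T is unaffected
   Put_Line (To_String (S));                  -- first!!
   Put_Line (To_String (T));                  -- first
   Put_Line (Integer'Image (Length (S)));     -- " 7"
end Unbounded_Copy_Demo;
```

To_Unbounded_String and To_String convert between fixed length and unbounded strings in either direction.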
Unbounded Strings: Conversions and More Examples
Unconstrained Arrays and Strings
- Unconstrained arrays are declared with unspecified length
- Strings are declared as unconstrained arrays in package
Standard, which is automatically available
- type String is array (Positive range <>) of Character;
- Unconstrained arrays can only be used in certain locations, such as
procedure parameters
- e.g., put("Hi"); - the actual parameter for put can be a String
of any length
- procedure put(Item: String)...
- The actual bounds of a string/unconstrained array must be
specified when a variable of that type is declared
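Unconstrained parameters and results can be combined with a declare block. In the following sketch, Reversed is a hypothetical helper, not a library routine: it accepts a String of any length and returns one of the same length, and the variable R takes its bounds from the value that initializes it.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Unconstrained_Demo is
   -- Hypothetical helper: the parameter and the result are both
   -- unconstrained, so any length is accepted and returned.
   function Reversed (Item : String) return String is
      Result : String (1 .. Item'Length);
   begin
      for I in Item'Range loop
         Result (Item'Last - I + 1) := Item (I);
      end loop;
      return Result;
   end Reversed;
begin
   Put_Line (Reversed ("Hi"));        -- iH
   Put_Line (Reversed ("Hi Mom!"));   -- !moM iH

   declare
      R : constant String := Reversed ("abc");   -- bounds fixed here: 1 .. 3
   begin
      Put_Line (R & Integer'Image (R'Length));   -- cba 3
   end;
end Unconstrained_Demo;
```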
Bounded Strings
- Bounded length - the current size and the string are combined
in one type. The size of a string can change, but it must
never exceed some predefined maximum
- Essentially, allocate a long string, fill up part of it, and keep track of
how many valid characters it contains
- Similar to how we use get_line(s, len), but the string
and length are combined in a single type
- We mostly ignore bounded length strings.
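For completeness, here is a minimal sketch of a bounded string using the standard generic package Ada.Strings.Bounded.Generic_Bounded_Length. The names Bounded_Demo and Str20 and the maximum of 20 are arbitrary choices for this example. The string grows and keeps track of its current length, and exceeding the maximum raises Ada.Strings.Length_Error.

```ada
with Ada.Text_IO;         use Ada.Text_IO;
with Ada.Strings.Bounded;

procedure Bounded_Demo is
   -- Max => 20 is an arbitrary limit chosen for this example.
   package Str20 is
     new Ada.Strings.Bounded.Generic_Bounded_Length (Max => 20);
   use Str20;

   B : Bounded_String := To_Bounded_String ("Hi");
begin
   Put_Line (To_String (B) & Integer'Image (Length (B)));   -- Hi 2
   B := B & " Mom!";                      -- grows within the limit
   Put_Line (To_String (B) & Integer'Image (Length (B)));   -- Hi Mom! 7

   B := B & " This text does not fit.";   -- 7 + 24 > 20
   Put_Line ("not reached");
exception
   when Ada.Strings.Length_Error =>
      Put_Line ("maximum length exceeded");
end Bounded_Demo;
```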
Tradeoffs with Strings
- Tradeoff: convenience, space, runtime efficiency
- Fixed length are time and space efficient, but inconvenient
- Unbounded length are time and space inefficient, but convenient
- Bounded length are somewhat efficient and somewhat convenient
Strings in C
- Characteristics of Strings in C:
- Strings are arrays of characters
- Strings are terminated by a 0 byte ('\0')
- Flexible: varying length string within maximum length easily implemented
- Problems
- Array bounds are not checked
- Easy to write outside of allocated space (e.g., strcpy)
- Example:
char s[] = "ABC"; /* 4 bytes: 'A', 'B', 'C', '\0' */
- Note: local variables are allocated on the stack
- On most platforms, addresses get smaller as the stack grows
- Simple Example: simplestrings.c (prettified)
- Example showing how to display characters in decimal and hex: decimalstrings.c (prettified)
- Example showing possible errors with strings (and arrays): errorstrings.c (prettified)
- Example showing possible errors with string copy (ie strcpy): copystrings.c (prettified)
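The 0 terminator can also be observed from the Ada side, using the standard package Interfaces.C. This is a sketch for comparison only (the procedure name is made up): To_C converts an Ada String into a C-style character array and, by default, appends the terminating nul, so "ABC" occupies four elements.

```ada
with Ada.Text_IO;  use Ada.Text_IO;
with Interfaces.C; use Interfaces.C;

procedure C_String_Demo is
   Ada_S : constant String     := "ABC";
   C_S   : constant char_array := To_C (Ada_S);   -- appends a nul by default
begin
   Put_Line (Integer'Image (Ada_S'Length));           -- " 3"
   Put_Line (Integer'Image (C_S'Length));             -- " 4": 'A', 'B', 'C', nul
   Put_Line (Boolean'Image (C_S (C_S'Last) = nul));   -- TRUE
   Put_Line (To_Ada (C_S));                           -- ABC (nul trimmed again)
end C_String_Demo;
```

An Ada String carries its bounds with it, so it needs no terminator, and indexing outside those bounds is checked.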