Package: GNAT.Spitbol.Patterns

Dependencies

with Ada.Finalization; use Ada.Finalization;
with Ada.Strings.Maps; use Ada.Strings.Maps;
with Ada.Text_IO;      use Ada.Text_IO;

Description

GNAT.Spitbol.Patterns (files g-spipat.ads/g-spipat.adb). This is a completely general pattern matching package based on the pattern language of SNOBOL4, as implemented in SPITBOL. The pattern language is modeled on context-free grammars, with context-sensitive extensions that provide full (type 0) computational capabilities.

Header

package GNAT.Spitbol.Patterns is
 
pragma Elaborate_Body (Patterns);

Pattern Matching Tutorial

Exceptions

Pattern_Stack_Overflow
Exception raised if internal pattern matching stack overflows. This is typically the result of runaway pattern recursion. If there is a genuine case of stack overflow, then either the match must be broken down into simpler steps, or the stack limit must be reset.

Type Summary

Boolean_Func
Match_Result
Primitive Operations:  Match, Replace
Natural_Func
Pattern
Primitive Operations (each name listed once; most are overloaded):  "&", "*", "**", "+", "or", Any, Arb, Arbno, Bal, Break, BreakX, Cancel, Dump, Fail, Fence, Image, Len, Match, NotAny, NSpan, Pos, Rest, Rpos, Rtab, Setcur, Span, Succeed, Tab
VString_Func

Constants and Named Numbers

Output : constant File_Access := Standard_Output;
Handy synonym for Standard_Output, for use with the pattern write operations ("*" and "**" with a File_Access right operand).
Stack_Size : constant Positive := 2000;
Size used for internal pattern matching stack. Increase this size if complex patterns cause Pattern_Stack_Overflow to be raised.
Terminal : constant File_Access := Standard_Error;
Handy synonym for Standard_Error, for use with the same pattern write operations.

Variables

Anchored_Mode : Boolean := False;
This global variable can be set True to cause all subsequent pattern matches to operate in anchored mode. In anchored mode, no attempt is made to move the anchor point, so that if the match succeeds it must succeed starting at the first character. Note that the effect of anchored mode may be achieved in individual pattern matches by using Fence or Pos(0) at the start of the pattern.
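
The following is a minimal sketch, not taken from the package itself, contrasting the default unanchored behaviour with per-match anchoring via Pos (0) and with the global Anchored_Mode switch. The procedure name Anchored_Demo and the subject string are illustrative assumptions.

```ada
with Ada.Text_IO;
with GNAT.Spitbol.Patterns; use GNAT.Spitbol.Patterns;

procedure Anchored_Demo is
   Subject : constant String := "say hello";
begin
   --  Default (unanchored): the matcher may start at any position.
   Ada.Text_IO.Put_Line (Boolean'Image (Match (Subject, "hello")));
   --  prints TRUE

   --  Anchor a single match: Pos (0) succeeds only at the very start.
   Ada.Text_IO.Put_Line
     (Boolean'Image (Match (Subject, Pos (0) & "hello")));
   --  prints FALSE

   --  Anchor every subsequent match, then restore the default.
   Anchored_Mode := True;
   Ada.Text_IO.Put_Line (Boolean'Image (Match (Subject, "hello")));
   --  prints FALSE
   Anchored_Mode := False;
end Anchored_Demo;
```

Because Anchored_Mode is global state, restoring it after use, as the sketch does, avoids surprising later matches.
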
Debug_Mode : Boolean := False;
This global variable can be set True to generate debugging output on all subsequent calls to Match. The debugging output is a full trace of the actions of the pattern matcher, written to Standard_Output. The level of this information is intended to be comprehensible at the abstract level of this package declaration. However, note that the use of this switch often generates large amounts of output.

Other Items:

type Pattern is private;
Type representing a pattern. This package provides a complete set of operations for constructing patterns that can be used in the pattern matching operations provided.

type Boolean_Func is access function return Boolean;
General Boolean function type. When this type is used as a formal parameter type in this package, it indicates a deferred predicate pattern. The function will be called when the pattern element is matched and failure signalled if False is returned.

type Natural_Func is access function return Natural;
General Natural function type. When this type is used as a formal parameter type in this package, it indicates a deferred pattern. The function will be called when the pattern element is matched to obtain the currently referenced Natural value.

type VString_Func is access function return VString;
General VString function type. When this type is used as a formal parameter type in this package, it indicates a deferred pattern. The function will be called when the pattern element is matched to obtain the currently referenced string value.

subtype PString is String;
This subtype is used in the remainder of the package to indicate a formal parameter that is converted to its corresponding pattern, i.e. a pattern that matches the characters of the string.

subtype PChar is Character;
Similarly, this subtype is used in the remainder of the package to indicate a formal parameter that is converted to its corresponding pattern, i.e. a pattern that matches this one character.

subtype VString_Var is VString;

subtype Pattern_Var is Pattern;
These synonyms (VString_Var and Pattern_Var) are used as formal parameter types to a function where, if the language allowed, we would use in out parameters, but we are not allowed to have in out parameters for functions. Instead we pass actuals which must be variables, and with a bit of trickery in the body, manage to interpret them properly as though they were indeed in out parameters.

function "&"  (L : Pattern; R : Pattern) return Pattern;

function "&"  (L : Pstring; R : Pattern) return Pattern;

function "&"  (L : Pattern; R : Pstring) return Pattern;

function "&"  (L : PChar;   R : Pattern) return Pattern;

function "&"  (L : Pattern; R : PChar)   return Pattern;
Pattern concatenation. Matches L followed by R.

function "or" (L : Pattern; R : Pattern) return Pattern;

function "or" (L : Pstring; R : Pattern) return Pattern;

function "or" (L : Pattern; R : Pstring) return Pattern;

function "or" (L : Pstring; R : Pstring) return Pattern;

function "or" (L : PChar;   R : Pattern) return Pattern;

function "or" (L : Pattern; R : PChar)   return Pattern;

function "or" (L : PChar;   R : PChar)   return Pattern;

function "or" (L : Pstring; R : PChar)   return Pattern;

function "or" (L : PChar;   R : Pstring) return Pattern;
Pattern alternation. Creates a pattern that will first try to match L and then on a subsequent failure, attempts to match R instead.
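
As a small illustration of concatenation and alternation together, the sketch below builds a pattern from string and character operands. The names Concat_Alt_Demo, Greeting, Target and Phrase are hypothetical; Pos (0) and Rpos (0) are used only to force the pattern to cover the whole subject.

```ada
with Ada.Text_IO;
with GNAT.Spitbol.Patterns; use GNAT.Spitbol.Patterns;

procedure Concat_Alt_Demo is
   --  "or" on two string operands yields a Pattern.
   Greeting : constant Pattern := "hello" or "goodbye";
   Target   : constant Pattern := "world" or "moon";

   --  "&" accepts Pattern, string and character operands.
   Phrase   : constant Pattern :=
     Pos (0) & Greeting & ' ' & Target & Rpos (0);
begin
   Ada.Text_IO.Put_Line (Boolean'Image (Match ("goodbye moon", Phrase)));
   --  prints TRUE
   Ada.Text_IO.Put_Line (Boolean'Image (Match ("hello mars", Phrase)));
   --  prints FALSE
end Concat_Alt_Demo;
```
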

function "*" (P : Pattern; Var : VString_Var)  return Pattern;

function "*" (P : Pstring; Var : VString_Var)  return Pattern;

function "*" (P : PChar;   Var : VString_Var)  return Pattern;
Matches P, and if the match succeeds, assigns the matched substring to the given VString variable Var. This assignment happens as soon as the substring is matched, and if the pattern P is matched more than once during the course of the match, then the assignment will occur more than once.

function "**" (P : Pattern; Var : VString_Var) return Pattern;

function "**" (P : Pstring; Var : VString_Var) return Pattern;

function "**" (P : PChar;   Var : VString_Var) return Pattern;
Like "*" above, except that the assignment happens at most once after the entire match is completed successfully. If the match fails, then no assignment takes place.
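
The difference between immediate ("*") and deferred ("**") assignment shows up when a sub-pattern matches but the overall match fails. The sketch below is an illustration under that assumption; the names Assign_Demo, Digs, Immediate and Deferred are hypothetical, and V and S are the String/VString conversion functions from GNAT.Spitbol.

```ada
with Ada.Text_IO;
with GNAT.Spitbol;          use GNAT.Spitbol;
with GNAT.Spitbol.Patterns; use GNAT.Spitbol.Patterns;

procedure Assign_Demo is
   Digs      : constant Pattern := Span ("0123456789");
   Immediate : VString := V ("(unset)");
   Deferred  : VString := V ("(unset)");
   Imm_Ok    : Boolean;
   Def_Ok    : Boolean;
begin
   --  Both matches fail overall: the digits are found at the start,
   --  but no 'x' follows them.
   Imm_Ok := Match ("123abc", Pos (0) & Digs * Immediate & 'x');
   Def_Ok := Match ("123abc", Pos (0) & Digs ** Deferred & 'x');

   Ada.Text_IO.Put_Line ("matched:   " & Boolean'Image (Imm_Ok or Def_Ok));
   --  prints "matched:   FALSE"
   Ada.Text_IO.Put_Line ("immediate: " & S (Immediate));
   --  prints "immediate: 123" (assigned as soon as Digs matched)
   Ada.Text_IO.Put_Line ("deferred:  " & S (Deferred));
   --  prints "deferred:  (unset)" (no assignment on a failed match)
end Assign_Demo;
```

Note that "*" and "**" bind more tightly than "&" in Ada, so Digs * Immediate is grouped before the concatenation without extra parentheses.
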

function "+" (Str : VString_Var)  return Pattern;
Here Str must be a VString variable. This function constructs a pattern which at pattern matching time will access the current value of this variable, and match against these characters.

function "+" (Str : VString_Func) return Pattern;
Constructs a pattern which at pattern matching time calls the given function, and then matches against the string or character value that is returned by the call.

function "+" (P : Pattern_Var)    return Pattern;
Here P must be a Pattern variable. This function constructs a pattern which at pattern matching time will access the current value of this variable, and match against the pattern value.

function "+" (P : Boolean_Func)   return Pattern;
Constructs a predicate pattern function that at pattern matching time calls the given function. If True is returned, then the pattern matches. If False is returned, then failure is signalled.
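
A minimal sketch of the deferred-value form of "+" applied to a VString variable: the pattern is built once, but the string it matches is read from the variable each time a match is attempted. The names Deferred_Value_Demo, Word and Exactly are illustrative. The parentheses around +Word are required by Ada syntax, since a unary operator cannot directly follow "&".

```ada
with Ada.Text_IO;
with GNAT.Spitbol;          use GNAT.Spitbol;
with GNAT.Spitbol.Patterns; use GNAT.Spitbol.Patterns;

procedure Deferred_Value_Demo is
   Word    : VString := V ("cat");
   --  The current value of Word is fetched at match time, not here.
   Exactly : constant Pattern := Pos (0) & (+Word) & Rpos (0);
begin
   Ada.Text_IO.Put_Line (Boolean'Image (Match ("cat", Exactly)));
   --  prints TRUE

   Word := V ("dog");
   Ada.Text_IO.Put_Line (Boolean'Image (Match ("cat", Exactly)));
   --  prints FALSE
   Ada.Text_IO.Put_Line (Boolean'Image (Match ("dog", Exactly)));
   --  prints TRUE
end Deferred_Value_Demo;
```
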

function Arb                                             return Pattern;
Constructs a pattern that will match any string. On the first attempt, the pattern matches a null string, then on each successive failure, it matches one more character, and it fails only once it has already matched the entire rest of the string.

function Arbno  (P : Pattern)                            return Pattern;

function Arbno  (P : Pstring)                            return Pattern;

function Arbno  (P : PChar)                              return Pattern;
Pattern repetition. First matches null, then on a subsequent failure attempts to match an additional instance of the given pattern. Equivalent to (but more efficient than) "" or (P & ("" or (P & ("" or ...
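
As an illustration of repetition, the sketch below accepts one "ab" followed by zero or more ",ab" groups. The names Arbno_Demo and Items are hypothetical, and Pos (0) and Rpos (0) are added only so that the pattern must span the whole subject.

```ada
with Ada.Text_IO;
with GNAT.Spitbol.Patterns; use GNAT.Spitbol.Patterns;

procedure Arbno_Demo is
   --  One "ab", then zero or more ",ab" groups, covering the subject.
   Items : constant Pattern :=
     Pos (0) & "ab" & Arbno (",ab") & Rpos (0);
begin
   Ada.Text_IO.Put_Line (Boolean'Image (Match ("ab", Items)));
   --  prints TRUE (Arbno matched null)
   Ada.Text_IO.Put_Line (Boolean'Image (Match ("ab,ab,ab", Items)));
   --  prints TRUE (Arbno matched two groups)
   Ada.Text_IO.Put_Line (Boolean'Image (Match ("ab,ab,", Items)));
   --  prints FALSE (trailing comma is not a complete group)
end Arbno_Demo;
```
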

function Any    (Str : String)                           return Pattern;

function Any    (Str : VString)                          return Pattern;

function Any    (Str : Character)                        return Pattern;

function Any    (Str : Character_Set)                    return Pattern;

function Any    (Str : access VString)                   return Pattern;

function Any    (Str : VString_Func)                     return Pattern;
Constructs a pattern that matches a single character that is one of the characters in the given argument. The pattern fails if the current character is not in Str.

function Bal                                             return Pattern;
Constructs a pattern that will match any non-empty string that is parentheses balanced with respect to the normal parentheses characters. Attempts to extend the string if a subsequent failure occurs.

function Break  (Str : String)                           return Pattern;

function Break  (Str : VString)                          return Pattern;

function Break  (Str : Character)                        return Pattern;

function Break  (Str : Character_Set)                    return Pattern;

function Break  (Str : access VString)                   return Pattern;

function Break  (Str : VString_Func)                     return Pattern;
Constructs a pattern that matches a (possibly null) string which is immediately followed by a character in the given argument. This character is not part of the matched string. The pattern fails if the remaining characters to be matched do not include any of the characters in Str.
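
A short sketch of Break used to split a key=value string. The names Break_Demo, Key_Part, Val_Part and Pair are illustrative assumptions; the break character itself is consumed by the explicit '=' that follows the Break.

```ada
with Ada.Text_IO;
with GNAT.Spitbol;          use GNAT.Spitbol;
with GNAT.Spitbol.Patterns; use GNAT.Spitbol.Patterns;

procedure Break_Demo is
   Key_Part : VString;
   Val_Part : VString;
   --  Break stops just before '='; Rest takes everything after it.
   Pair     : constant Pattern :=
     Pos (0) & Break ('=') * Key_Part & '=' & Rest * Val_Part;
begin
   if Match ("color=blue", Pair) then
      Ada.Text_IO.Put_Line (S (Key_Part) & " -> " & S (Val_Part));
      --  prints "color -> blue"
   end if;

   --  With no '=' in the subject, Break (and so the match) fails.
   Ada.Text_IO.Put_Line (Boolean'Image (Match ("no separator", Pair)));
   --  prints FALSE
end Break_Demo;
```
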

function BreakX (Str : String)                           return Pattern;

function BreakX (Str : VString)                          return Pattern;

function BreakX (Str : Character)                        return Pattern;

function BreakX (Str : Character_Set)                    return Pattern;

function BreakX (Str : access VString)                   return Pattern;

function BreakX (Str : VString_Func)                     return Pattern;
Like Break, but the pattern attempts to extend on a failure to find the next occurrence of a character in Str, and only fails when the last such instance causes a failure.

function Cancel                                          return Pattern;
Constructs a pattern that immediately aborts the entire match.

function Fail                                            return Pattern;
Constructs a pattern that always fails.

function Fence                                           return Pattern;
Constructs a pattern that matches null on the first attempt, and then causes the entire match to be aborted if a subsequent failure occurs.

function Fence  (P : Pattern)                            return Pattern;
Constructs a pattern that first matches P. If P fails, then the constructed pattern fails. If P succeeds, then the match proceeds, but if subsequent failure occurs, alternatives in P are not sought. The idea of Fence is that each time the pattern is matched, just one attempt is made to match P, without trying alternatives.

function Len    (Count : Natural)                        return Pattern;

function Len    (Count : access Natural)                 return Pattern;

function Len    (Count : Natural_Func)                   return Pattern;
Constructs a pattern that matches exactly the given number of characters. The pattern fails if fewer than this number of characters remain to be matched in the string.

function NotAny (Str : String)                           return Pattern;

function NotAny (Str : VString)                          return Pattern;

function NotAny (Str : Character)                        return Pattern;

function NotAny (Str : Character_Set)                    return Pattern;

function NotAny (Str : access VString)                   return Pattern;

function NotAny (Str : VString_Func)                     return Pattern;
Constructs a pattern that matches a single character that is not one of the characters in the given argument. The pattern fails if the current character is in Str.

function NSpan  (Str : String)                           return Pattern;

function NSpan  (Str : VString)                          return Pattern;

function NSpan  (Str : Character)                        return Pattern;

function NSpan  (Str : Character_Set)                    return Pattern;

function NSpan  (Str : access VString)                   return Pattern;

function NSpan  (Str : VString_Func)                     return Pattern;
Constructs a pattern that matches the longest possible string consisting entirely of characters from the given argument. The string may be empty, so this pattern always succeeds.

function Pos    (Count : Natural)                        return Pattern;

function Pos    (Count : access Natural)                 return Pattern;

function Pos    (Count : Natural_Func)                   return Pattern;
Constructs a pattern that matches the null string if exactly Count characters have already been matched, and otherwise fails.

function Rest                                            return Pattern;
Constructs a pattern that always succeeds, matching the remaining unmatched characters in the subject string.

function Rpos   (Count : Natural)                        return Pattern;

function Rpos   (Count : access Natural)                 return Pattern;

function Rpos   (Count : Natural_Func)                   return Pattern;
Constructs a pattern that matches the null string if exactly Count characters remain to be matched in the string, and otherwise fails.

function Rtab   (Count : Natural)                        return Pattern;

function Rtab   (Count : access Natural)                 return Pattern;

function Rtab   (Count : Natural_Func)                   return Pattern;
Constructs a pattern that matches from the current location until exactly Count characters remain to be matched in the string. The pattern fails if fewer than Count characters remain to be matched.

function Setcur (Var : access Natural)                   return Pattern;
Constructs a pattern that matches the null string, and assigns the current cursor position in the string to the Natural object designated by Var. This value is the number of characters matched so far, so it is zero at the start of the match.

function Span   (Str : String)                           return Pattern;

function Span   (Str : VString)                          return Pattern;

function Span   (Str : Character)                        return Pattern;

function Span   (Str : Character_Set)                    return Pattern;

function Span   (Str : access VString)                   return Pattern;

function Span   (Str : VString_Func)                     return Pattern;
Constructs a pattern that matches the longest possible string consisting entirely of characters from the given argument. The string cannot be empty, so the pattern fails if the current character is not one of the characters in Str.
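
The only difference between Span and NSpan is whether an empty match is allowed. The sketch below illustrates this with optional versus mandatory leading blanks; the names Span_Demo, Need_Blank and Maybe_Blank are hypothetical.

```ada
with Ada.Text_IO;
with GNAT.Spitbol.Patterns; use GNAT.Spitbol.Patterns;

procedure Span_Demo is
   --  Span needs at least one blank; NSpan accepts zero blanks.
   Need_Blank  : constant Pattern := Pos (0) & Span (' ') & 'x';
   Maybe_Blank : constant Pattern := Pos (0) & NSpan (' ') & 'x';
begin
   Ada.Text_IO.Put_Line (Boolean'Image (Match ("  x", Need_Blank)));
   --  prints TRUE
   Ada.Text_IO.Put_Line (Boolean'Image (Match ("x", Need_Blank)));
   --  prints FALSE
   Ada.Text_IO.Put_Line (Boolean'Image (Match ("x", Maybe_Blank)));
   --  prints TRUE
end Span_Demo;
```
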

function Succeed                                         return Pattern;
Constructs a pattern that succeeds matching null, both on the first attempt, and on any rematch attempt, i.e. it is equivalent to an infinite alternation of null strings.

function Tab    (Count : Natural)                        return Pattern;

function Tab    (Count : access Natural)                 return Pattern;

function Tab    (Count : Natural_Func)                   return Pattern;
Constructs a pattern that matches from the current location until exactly Count characters have been matched. The pattern fails if more than Count characters have already been matched.
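
The positional primitives (Pos, Tab, Len, Rtab) lend themselves to fixed-column data. The sketch below, with the hypothetical names Columns_Demo, Year, Month, Day and Date, splits a date of the assumed form YYYY-MM-DD by cursor position alone.

```ada
with Ada.Text_IO;
with GNAT.Spitbol;          use GNAT.Spitbol;
with GNAT.Spitbol.Patterns; use GNAT.Spitbol.Patterns;

procedure Columns_Demo is
   Year, Month, Day : VString;
   --  Tab (N) matches up to cursor position N; Len (1) skips one
   --  separator character; Rtab (0) matches up to the end.
   Date : constant Pattern :=
     Pos (0) & Tab (4) * Year & Len (1) & Tab (7) * Month
       & Len (1) & Rtab (0) * Day;
begin
   if Match ("2024-01-15", Date) then
      Ada.Text_IO.Put_Line (S (Year) & "/" & S (Month) & "/" & S (Day));
      --  prints "2024/01/15"
   end if;
end Columns_Demo;
```
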

function Match
  (Subject : VString;
   Pat     : Pattern)
   return    Boolean;

function Match
  (Subject : VString;
   Pat     : Pstring)
   return    Boolean;

function Match
  (Subject : String;
   Pat     : Pattern)
   return    Boolean;

function Match
  (Subject : String;
   Pat     : Pstring)
   return    Boolean;
Simple match functions. The subject is matched against the pattern. Any immediate or deferred assignments or writes are executed, and the returned value indicates whether or not the match succeeded.

function Match
  (Subject : VString_Var;
   Pat     : Pattern;
   Replace : VString)
   return    Boolean;

function Match
  (Subject : VString_Var;
   Pat     : Pstring;
   Replace : VString)
   return    Boolean;

function Match
  (Subject : VString_Var;
   Pat     : Pattern;
   Replace : String)
   return    Boolean;

function Match
  (Subject : VString_Var;
   Pat     : Pstring;
   Replace : String)
   return    Boolean;
Replacement functions. The subject is matched against the pattern. Any immediate or deferred assignments or writes are executed, and the returned value indicates whether or not the match succeeded. If the match succeeds, then the matched part of the subject string is replaced by the given Replace string.

procedure Match
  (Subject : VString;
   Pat     : Pattern);

procedure Match
  (Subject : VString;
   Pat     : Pstring);

procedure Match
  (Subject : String;
   Pat     : Pattern);

procedure Match
  (Subject : String;
   Pat     : Pstring);
Simple match procedures. The subject is matched against the pattern. Any immediate or deferred assignments or writes are executed. No indication of success or failure is returned.

procedure Match
  (Subject : in out VString;
   Pat     : Pattern;
   Replace : VString);

procedure Match
  (Subject : in out VString;
   Pat     : Pstring;
   Replace : VString);

procedure Match
  (Subject : in out VString;
   Pat     : Pattern;
   Replace : String);

procedure Match
  (Subject : in out VString;
   Pat     : Pstring;
   Replace : String);
Replacement procedures. The subject is matched against the pattern. Any immediate or deferred assignments or writes are executed. No indication of success or failure is returned. If the match succeeds, then the matched part of the subject string is replaced by the given Replace string.

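
A minimal sketch of the replacement forms of Match, using both the function form (which reports success) and the procedure form (which does not). The procedure name Replace_Demo and the subject text are illustrative assumptions.

```ada
with Ada.Text_IO;
with GNAT.Spitbol;          use GNAT.Spitbol;
with GNAT.Spitbol.Patterns; use GNAT.Spitbol.Patterns;

procedure Replace_Demo is
   Subject : VString := V ("hello world");
begin
   --  Function form: reports success and rewrites Subject in place.
   if Match (Subject, "world", "Ada") then
      Ada.Text_IO.Put_Line (S (Subject));
      --  prints "hello Ada"
   end if;

   --  Procedure form: only the first matched part is replaced,
   --  here the first vowel.
   Match (Subject, Any ("aeiou"), "#");
   Ada.Text_IO.Put_Line (S (Subject));
   --  prints "h#llo Ada"
end Replace_Demo;
```

In both forms the subject must be a VString variable, since the replacement is written back into it.
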
Deferred Replacement

type Match_Result is private;
Type used to record the result of a pattern match.

subtype Match_Result_Var is Match_Result;
This synonym is used as a formal parameter type to a function where, if the language allowed, we would use an in out parameter, but we are not allowed to have in out parameters for functions. Instead we pass actuals which must be variables, and with a bit of trickery in the body, manage to interpret them properly as though they were indeed in out parameters.

function Match
  (Subject : VString_Var;
   Pat     : Pattern;
   Result  : Match_Result_Var)
   return    Boolean;

procedure Match
  (Subject : in out VString;
   Pat     : Pattern;
   Result  : out Match_Result);

procedure Replace
  (Result  : in out Match_Result;
   Replace : VString);
Given a previous call to Match which set Result, performs a pattern replacement if the match was successful. Has no effect if the match failed. This call should immediately follow the Match call.
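
Deferred replacement is useful when the replacement text depends on what was matched. The sketch below is one way this might look: it captures the matched word, then builds the replacement from it. The names Deferred_Replace_Demo, Word and Found are hypothetical.

```ada
with Ada.Text_IO;
with GNAT.Spitbol;          use GNAT.Spitbol;
with GNAT.Spitbol.Patterns; use GNAT.Spitbol.Patterns;

procedure Deferred_Replace_Demo is
   Subject : VString := V ("hello world");
   Found   : VString;
   Word    : constant Pattern := Span ("abcdefghijklmnopqrstuvwxyz");
   Result  : Match_Result;
begin
   --  Match records where the match occurred in Result, and the
   --  matched text itself in Found.
   Match (Subject, Word * Found, Result);

   --  Replace immediately after Match, using the captured text.
   Replace (Result, V ("<" & S (Found) & ">"));

   Ada.Text_IO.Put_Line (S (Subject));
   --  prints "<hello> world"
end Deferred_Replace_Demo;
```
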

function "*"  (P : Pattern; Fil : File_Access)           return Pattern;

function "*"  (P : Pstring; Fil : File_Access)           return Pattern;

function "*"  (P : PChar;   Fil : File_Access)           return Pattern;

function "**" (P : Pattern; Fil : File_Access)           return Pattern;

function "**" (P : Pstring; Fil : File_Access)           return Pattern;

function "**" (P : PChar;   Fil : File_Access)           return Pattern;
These are similar to the corresponding pattern assignment operations except that instead of setting the value of a variable, the matched substring is written to the appropriate file. This can be useful in following the progress of a match without generating the full amount of information obtained by setting Debug_Mode to True.
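
A brief sketch of the write operations, using the Output and Terminal constants declared by this package. The names Trace_Demo and Digs are illustrative.

```ada
with GNAT.Spitbol.Patterns; use GNAT.Spitbol.Patterns;

procedure Trace_Demo is
   Digs : constant Pattern := Span ("0123456789");
begin
   --  Immediate write: the substring matched by Digs is written to
   --  standard output as soon as it is matched.
   Match ("abc123", Digs * Output);
   --  writes "123" to standard output

   --  Deferred write: written to standard error only once the
   --  whole match has succeeded.
   Match ("abc123", Digs ** Terminal);
end Trace_Demo;
```
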

function Image (P : Pattern) return String;

function Image (P : Pattern) return VString;
These functions yield a string that corresponds to the syntax needed to create the given pattern using the functions in this package. The form of this string is such that it could actually be compiled and evaluated to yield the required pattern except for references to variables and functions, which are output using one of the following forms:

access Natural     NP(16#...#)
access Pattern     PP(16#...#)
access VString     VP(16#...#)

Natural_Func       NF(16#...#)
VString_Func       VF(16#...#)

where 16#...# is the hex representation of the integer address that corresponds to the given access value.
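
A minimal sketch of calling Image on a small pattern. The names Image_Demo, P and Text are illustrative; the exact text produced is not shown here because it depends on the implementation.

```ada
with Ada.Text_IO;
with GNAT.Spitbol.Patterns; use GNAT.Spitbol.Patterns;

procedure Image_Demo is
   P    : constant Pattern := Pos (0) & Span ("0123456789") & Rpos (0);
   --  The String-returning overload is selected by the declared type.
   Text : constant String := Image (P);
begin
   Ada.Text_IO.Put_Line (Text);
end Image_Demo;
```
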


procedure Dump (P : Pattern);
This procedure writes information about the pattern to Standard_Output. The format of this information is keyed to the internal data structures used to implement patterns. The information provided by Dump is thus more precise than that yielded by Image, but is also a bit more obscure (i.e. it cannot be interpreted solely in terms of this spec; you have to know something about the data structures).

private

   --  Implementation-defined ...
end GNAT.Spitbol.Patterns;