INTRODUCTION
   - Perl history: created by Larry Wall, first released in 1987
     - scripting lang: string processing, pattern match, fast file processing
     - web, cgi
     - like awk and sed and python and ...
     
   - multiple ways to do things (deliberately violates simplicity)

   - different from Java and Ada: 
      - no variable declarations: 
            - a variable's type is determined by what it holds

      - variable type is dynamic (ie can change at runtime)
              $i = 3;   print $i;   # 3
              $i = 'a'; print $i;   # a

      - Prefix character (sigil) - prefix specifies the kind of variable
               $ - scalar: eg, $i, $name, eg $a = 3;  
                     (no structure: number, string, undef, reference)

               @ - array (ie list): eg, @a = (11, 'x', 13); 
                           print @a;   # 11x13

                           $j = $a[1]; 
                           print $j;   # x

                           $a = $a[2];  # scalar $a and array @a are separate variables
                           print $a;    # 13

               % - hash tables: eg, %table

               & - subroutine call (if needed to avoid ambiguity)

      - automatic conversions: 
                 $i="1"; 
                 $i=$i+1; 
                 print $i;   # 2

            - QN: is $i now a string or an integer?  
                     Hard to tell because of conversions

            - QN: "a" + 1;  
                  A: 1.  Non-numeric strings convert to 0.  Warning (not an error) if -w specified.
      
      - lots of C-isms: 
            - no boolean type
            - use = for assignment
            - = is an expression with a value

	  - default variables and parameters

	  - semicolon at end of statements, not required at end of block
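
   - Sketch of the automatic conversions described above (not from the
     original notes). The helper name add_one is hypothetical, and
     no warnings 'numeric' is an assumption made so the non-numeric cases
     run quietly instead of producing the -w warning.

```perl
use strict;
use warnings;

# add_one is a hypothetical helper: the + operator forces numeric context.
sub add_one {
    my ($value) = @_;
    no warnings 'numeric';   # silence "isn't numeric" for this demo
    return $value + 1;
}

print add_one("1"),    "\n";   # 2   the string "1" is converted to the number 1
print add_one("a"),    "\n";   # 1   a non-numeric string counts as 0
print add_one("3abc"), "\n";   # 4   a leading number is used, the rest is dropped
print "1" . 1,         "\n";   # 11  the . operator converts the number to a string
```

     The operator, not the variable, decides which conversion happens:
     + wants numbers and . wants strings.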

HELLO WORLD
   - The following program is in file hello.p 

      print  "Hello World!\n" ;      # Comment to end of line
      print  "Hello"," World\n" ;    # Case sensitive
      print ("Hello"," World\n")    ;
      print 'Hello',' World' , "\n";
      print ('Hello World\n');       # prints Hello World\n
      # More ways are possible

   - execution: invoke interpreter directly (w/ warning option)
         perl -w hello.p 

   - execution as script: 
           - put this line first in file: #!/usr/local/gnu/bin/perl -w
           - chmod 700 hello.p 
           - execute with: ./hello.p   (or hello.p if . is in your PATH)

   - Another example: Print squares (note: {} required)
      $i=0; 
      $limit=5; 
      while ($i<$limit)
         {print $i*$i; $i++}   # prints 014916 (no separators)

BOOLEAN and truth:  
           - These evaluate to false: 0, "" and "0",  undef
           - These evaluate to true: other numbers and strings, any reference
           - Comparison operators and ! return 1 for true and "" for false;
             && and || return the last operand evaluated

      - Consider: $i = 5;
                  while ($i)
                    {print $i; $i = $i - 1;}       # 54321


                  $i=5; while($i--)   {print $i}   # 43210
                  $i=5; while($i=$i-1){print $i}   # 4321
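
      - Sketch of the truth rules above (not from the original notes).
        The helper name truth is hypothetical.

```perl
use strict;
use warnings;

# truth is a hypothetical helper: it reports how Perl treats a value
# when the value is used as a condition.
sub truth {
    my ($value) = @_;
    return $value ? "true" : "false";
}

print truth(0),     "\n";   # false
print truth(""),    "\n";   # false
print truth("0"),   "\n";   # false
print truth(undef), "\n";   # false
print truth("0.0"), "\n";   # true: a non-empty string other than "0"
print truth("00"),  "\n";   # true
print truth(-1),    "\n";   # true: any non-zero number
print truth([]),    "\n";   # true: any reference
```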


INPUT, end of file, AND STANDARD VARIABLE $_ aka $ARG:
   # Read a record from standard input
   # <STDIN> returns undef at EOF, which ends the loop
   while ($temp = <STDIN>)   
       {print $temp;}         

    # Do an explicit check for end of file
    while (!(eof STDIN))
    {
        $line = <STDIN>;
        print $line;
    }




   while (<STDIN>)
      {print $_;}       # DEFAULT variable

   use English;           # Module English makes it possible to use $ARG 
                          #    as an alias for $_
   while (<STDIN>)
      {print $ARG;}       # English name for $_

   while (<STDIN>)
      {print ;}         # default parameter is DEFAULT variable


   while (<STDIN>)
      {chop; print;}

     - chop ; chop var; chop list;
     - modifies argument: $x="abcd"; chop $x; chop $x; print $x # ab
     - chop a list: $x="abcd"; chop ($x, $x); print $x # ab
     - returns character chopped
     - chomp is similar but it 
          - removes only a trailing $/ - record separator (unix \n) - if present
          - safer
          - returns # of characters removed
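
     - Sketch comparing chop and chomp (not from the original notes).
       The variable names are hypothetical.

```perl
use strict;
use warnings;

my $with_newline    = "abc\n";
my $without_newline = "abc";

# chomp removes the trailing record separator only when it is present
# and returns the number of characters removed.
my $removed1 = chomp(my $c1 = $with_newline);      # $c1 is "abc", $removed1 is 1
my $removed2 = chomp(my $c2 = $without_newline);   # $c2 is "abc", $removed2 is 0

# chop always removes the last character and returns it.
my $chopped = chop(my $c3 = $without_newline);     # $c3 is "ab", $chopped is "c"

print "$c1 $removed1\n";   # abc 1
print "$c2 $removed2\n";   # abc 0
print "$c3 $chopped\n";    # ab c
```

       This is why chomp is the safer choice for input lines: the last
       line of a file may have no newline, and chop would then remove
       a real character.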

   # By default, <> reads the files named in @ARGV, or STDIN if there are none
   while (<>)           
      {chomp; print;}


   # OPENING A NAMED FILE
   $myFile = "foo";
   # $OS_ERROR is the English module's name for $!
   open AFILE, $myFile or die "unable to open file $myFile ($OS_ERROR)\n";

   while (<AFILE>)           # Use file handle AFILE
      {chomp; print;}        # Print the contents of the file without newlines
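
   # Sketch of the same read loop in a different style from the notes:
   # three-argument open with a lexical file handle. The helper name
   # read_lines is hypothetical. For the demo the "file" is an in-memory
   # string, which open accepts when given a reference to a scalar.

```perl
use strict;
use warnings;

# read_lines is a hypothetical helper: it returns the lines of its source
# with the trailing newlines removed.
sub read_lines {
    my ($source) = @_;
    # $! holds the system error message ($OS_ERROR under "use English").
    open my $fh, '<', $source
        or die "unable to open file ($!)\n";
    my @lines;
    while (my $line = <$fh>) {   # undef at end of file ends the loop
        chomp $line;
        push @lines, $line;
    }
    close $fh;
    return @lines;
}

# Demo: read from an in-memory string instead of a named file.
my $text = "first\nsecond\n";
print join('|', read_lines(\$text)), "\n";   # first|second
```

   # Passing a file name such as "foo" instead of \$text reads that file.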

   

STRINGS: There are two types of strings: literal and interpolated:
   LITERAL: 
      - enclosed by single quotes: 'abc'
      - no substitutions (except for \' and \\)
           $a = 3;
           print '$a "abc" \n \' stuff';
           # Prints: $a "abc" \n ' stuff

           print '$a "abc" \n \\ stuff';
           # Prints: $a "abc" \n \ stuff

   INTERPOLATED: 
      - enclosed by double quotes: "abc"
      - substitutions of variables and escape characters:
           $a = 3;
           print "$a 'abc' \n \' stuff";
           # Prints: 3 'abc' 
           #          ' stuff

NUMBERS: integer, float, hex (0xff), octal (0377), scientific (1.5e3), underscores (1_000_000)

UNDEF: 
        - function (aka unary operator): undefines a variable 
            what's this do: 
                 undef $/;      # undefine the record separator
                 $l = <STDIN>;  # read entire file
                 print $l;      # prints entire file

	- can also be used as a value 
	    - undef $i same as $i = undef
	    - returned by some functions on failure
	    - also used to undefine hash entries ...

	- check if undef with defined: if (defined $i)

        - uninitialized variables hold undef, which acts as 0 in numeric
          context and "" in string context
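
        - Sketch of telling undef apart from values that are merely false
          (not from the original notes). The helper name describe is
          hypothetical.

```perl
use strict;
use warnings;

# describe is a hypothetical helper: it distinguishes undef from
# defined values that happen to be false.
sub describe {
    my ($value) = @_;
    return "undefined" unless defined $value;
    return $value ? "defined and true" : "defined but false";
}

my $i;                        # declared but never assigned: undef
print describe($i), "\n";     # undefined
$i = 0;
print describe($i), "\n";     # defined but false
$i = "x";
print describe($i), "\n";     # defined and true
undef $i;                     # same effect as $i = undef
print describe($i), "\n";     # undefined
```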


LISTS:
  A list is an ordered collection of values or variables

    - can be stored in arrays or used to create hashes

    - can be used in assignments and as function parameters and return values

    - formed using comma, the list operator 

          - 4 example lists (the last 2 are identical):
                 1,2,3  
                 'a', 'b', 'c', 1,'aa'
                 'aa', 'bb', 'cc'     
                 qw(aa bb cc)         # qw means quote word

    - print 1,2,3;              # 123    No separator
    - print join '+', 1,2,3;    # 1+2+3  Join using + as separator
    
    - lists are frequently enclosed in parens  print (1,2,3)   
        (sometimes precedence requires parentheses to get what you want) 

    - empty list ()

    - can lists contain lists?  
        Yes but they are flattened:   (1,(2,3)) is the same list as (1,2,3)  
        (note, == does not compare lists: it compares its operands in scalar context)

    - range operator: (1..5) is the list (1,2,3,4,5)

    - Assignment:
        ($a,$b) = (1,2);
        ($a,$b,$c) = (1,2);    # $c assigned undef
        ($a) = (1,2);          # 2 is ignored

    - many functions operate on lists
	  $x="aa"; $y="bb"; chop ($x,  $y);print $x,$y  #ab
	  $x="aa"; $y="bb"; chop ($x,  $x);print $x,$y  #bb
    
    - Lists can be indexed using []: 
            print( (1,2,3,4,5)[1]  );  #2
            print( (1,2,3,4,5)[-1] );  #5

    - map: Operate on each element of a list
       @a = (-5, -6, -7, -8);
       @b = map {abs $_} @a;
       print @b;     # 5678

    - split: 
          $a = "one two";
          ($b, $c) = split / /, $a;
          print $c, $b;   # twoone

    - sort: print sort 4,2,5,1;   # 1245

    - reverse

    - grep: return a list of elements that match
       print grep /2/, 4,2,5,1;  # 2
       print grep !/2/, 4,2,5,1;  # 451
       print grep !/2/, 'a',2,'c','d';  # acd

    - a trailing comma is allowed at the end of a list: (1,2,3,)

    - output field and record separators
        $, - output field separator: printed between list elements, default ""
        $\ - output record separator: printed at end of list, default ""

        $,="a"; $\="x"; print 1,2,3,4; # 1a2a3a4x
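
    - Sketch of flattening, list assignment, ranges and list indexing
      (not from the original notes). All variable names are hypothetical.

```perl
use strict;
use warnings;

# Nested lists are flattened into one list; empty lists vanish.
my @flat = (1, (2, 3), (), (4));
print scalar(@flat), "\n";                        # 4

# List assignment: extra variables get undef, extra values are dropped.
my ($a1, $b1, $c1) = (1, 2);
print defined($c1) ? "set" : "undef", "\n";       # undef
my ($first) = (1, 2);
print $first, "\n";                               # 1

# Both sides are lists, so a swap needs no temporary variable.
my ($x, $y) = ('left', 'right');
($x, $y) = ($y, $x);
print "$x $y\n";                                  # right left

# A range is a list, so it can feed map; a list can take several indexes.
my @squares = map { $_ * $_ } 1 .. 5;
print join(',', @squares), "\n";                  # 1,4,9,16,25
print join(',', (10, 20, 30, 40)[1, -1]), "\n";   # 20,40
```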



ARRAYS - allow you to give a name to a list 
   - Example:  
        @a = (1,2,3,4);    # Use the name @a for the list 1,2,3,4 
        print @a ;         # prints 1234
        print $#a ;        # prints 3, the index of the last element in the list

   - Careful, = has higher precedence than ,

       @array = 1,2,3; print @array;      #Prints 1.  2,3 ignored

   - array indexing uses $, not @, because a single element is a scalar:
      print $a[0];  # read as: the scalar at index 0 of @a

   - push, pop, shift, unshift:
         @a = (11,12,13);
         $x = pop @a;          # remove from back of list
         print join ', ', @a;  # 11, 12
         print $x;  # 13
         push  @a, (55, 56);   # add to back of list
         print join ', ', @a;  # 11, 12, 55, 56
         # shift and unshift do the same to the front of the list
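
   - Sketch showing all four operations on one array (not from the
     original notes). The variable names are hypothetical.

```perl
use strict;
use warnings;

my @a = (11, 12, 13);

my $last = pop @a;            # remove from back:  @a is (11, 12)
push @a, 55, 56;              # add to back:       @a is (11, 12, 55, 56)
my $first = shift @a;         # remove from front: @a is (12, 55, 56)
unshift @a, 1, 2;             # add to front:      @a is (1, 2, 12, 55, 56)

print join(', ', @a), "\n";                    # 1, 2, 12, 55, 56
print "$last $first\n";                        # 13 11
print $#a, " ", $a[0], " ", $a[-1], "\n";      # 4 1 56
```

     push and pop treat the array as a stack; push and shift treat it
     as a queue.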


CONTEXT: determines how some expressions are evaluated

      - Two contexts: list and scalar

      - Example:
            @a = (5,6,7,8);
            
            $x = @a;    # @a is evaluated in scalar context
            @y = @a;    # @a is evaluated in list   context

            print $x;   # Prints 4, the length of the array @a
            print @y;   # Prints 5678, the elements of @a

      - Context affects: 
          - a variable - how it is interpreted
          - function results - how they are interpreted
          - operators - what they return

      - Context is set by: 
          - LHS of assignment sets the context for the RHS
          - what a function expects sets the context for its parameters
          - boolean expression - a scalar context that never causes any conversions
          - subroutines can return different things in different contexts 
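
      - Sketch of how context is set and how a subroutine can detect it
        (not from the original notes). wantarray is a Perl built-in;
        the sub name context_name is hypothetical.

```perl
use strict;
use warnings;

# context_name is a hypothetical sub. wantarray is true when the sub
# was called in list context and false in scalar context.
sub context_name {
    return wantarray ? "list" : "scalar";
}

my @a = (5, 6, 7, 8);

my $count  = @a;              # scalar context: number of elements
my ($head) = @a;              # parens on the left give list context: first element
my $x      = context_name();  # scalar assignment gives scalar context
my ($y)    = context_name();  # list assignment gives list context

print "$count $head\n";               # 4 5
print "$x $y\n";                      # scalar list
print "size: " . @a . "\n";           # size: 4   (. forces scalar context)
print "size: ", scalar(@a), "\n";     # size: 4   (scalar forces it explicitly)
```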


COMMAND LINE ARGUMENTS
    - Command line arguments are in the array @ARGV (NOT in @ARG)

         for ($i=0; $i<=$#ARGV; $i++){   
            print $ARGV[$i]
         }
             
OPERATORS: 
        - List operators have different right and left precedence:
              print 1, 3, sort 4, 6, 5     # 13456
                 sort is higher than terms to its left, so done first
                 sort is low wrt terms on its right, so done last

        - String comparison: lt, gt, le, ge, eq, ne, cmp (-1, 0, 1)

               True: 'aa' lt 'ab'     False: '10' < '2'   # converted to numbers
                     'ab' lt 'abc'           '2.00' != '2'
                     'ba' lt 'c'
                     '10' lt '2'
                     '2.00' ne '2'

                     '2.00' == '2'
                     '2.00' ==  2 

        - =~ for pattern match (See regular expressions)
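
        - Sketch of string versus numeric comparison (not from the original
          notes). The <=> operator, the numeric counterpart of cmp, is not
          mentioned above; the variable names are hypothetical.

```perl
use strict;
use warnings;

my @n = (10, 9, 2, 33);

my @by_string = sort @n;                 # default: string comparison (cmp)
my @by_number = sort { $a <=> $b } @n;   # <=> is the numeric version of cmp

print "@by_string\n";                    # 10 2 33 9
print "@by_number\n";                    # 2 9 10 33

print(('2.00' == 2)   ? "equal" : "different", "\n");   # equal      (as numbers)
print(('2.00' eq '2') ? "equal" : "different", "\n");   # different  (as strings)
print(('10' lt '2')   ? "true"  : "false",     "\n");   # true       (as strings)
print(('10' <  '2')   ? "true"  : "false",     "\n");   # false      (as numbers)
```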


CONTROL FLOW (bodies require braces):

if () {} elsif () {}  else {}

    # Equivalent to if (!($i == 5))
    unless ($i == 5) 
        {print "Not 5";} # executes when condition is false
    else 
        {print "Is 5";}

    ####################################################
    LOOPS:

        for ($i=0; $i<10; $i++) { print $i }


        for($i=0; $i < 5; $i++){
            if ($i==3) 
                {next}   # next moves to next iteration
            print $i
        }       

        for($i=0; $i < 5; $i++){
            if ($i==3) 
                {last}  # exit loop immediately
            print $i;
        }


    FOREACH STATEMENT: 
       @a=(1,2,3);
       foreach $x (@a) {print $x;}  # $x is local to loop

       foreach (@a) {print $_;}
       foreach (@a) {print ;}
       for (@a) {print;}            # for is a synonym for foreach


    NAMED LOOPS
        MYNAME: 
        for($i=0; $i < 10; $i++){
            if ($i==3) 
                {next MYNAME}  # next iteration of loop MYNAME
            if ($i==6) 
                {last MYNAME}  # exit loop MYNAME immediately
            print $i;
        }



    MODIFYING STATEMENTS: if, unless, until, while
              print $i if $i > 3;       # only prints if i > 3
              print $i unless $i > 3;   # only prints if i <= 3

     exit - exits program

     die list - prints list to STDERR and exits with a nonzero status
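
     Sketch of where loop names matter: nested loops (not from the
     original notes). The label ROW and the variable names are hypothetical.
     Without a label, next and last apply to the innermost loop only.

```perl
use strict;
use warnings;

my @pairs;
ROW:
for my $row (1 .. 3) {
    for my $col (1 .. 3) {
        next ROW if $col > $row;   # skip the rest of this row
        last ROW if $row == 3;     # leave both loops
        push @pairs, "$row$col";
    }
}
print join(' ', @pairs), "\n";     # 11 21 22
```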
      

FALL 04: SKIP to HASH TABLES 

REGULAR EXPRESSIONS: RE
     Expression that describes a string or collection of strings
       - Compare w/ arithmetic expr: 5 and 3+2 both describe 5
       - Compare w/ boolean expr: false and 3 < 2 both describe false
       - Both have symbols (eg 5, false) and operators (eg + <)

     For RE, 
        - building blocks are characters, which describe themselves
        - concatenation of characters describes strings
        - some characters have special meaning, ie basic operators | *  ?
	  RUCS describes RUCS
	  RU CS describes RU CS  ie space is a character
	  | represents a choice  
	     R|U|C|S describes one character, either R or U or C or S
	     rucs|rucs2 describes one string, either rucs or rucs2
          * represents 0 or more repetitions
	     r* describes empty string or r or rr or rrr or ...
	     ru*cs describes rcs, rucs, ruucs, ruuucs, ...
	  ? represents optional
	     rucs2? describes rucs and rucs2
	     rucs?2 describes rucs2 and ruc2
	       
	  Precedence: HIGH: *?, concatenation, choice low
             ru|c?s describes ru or cs or s
	     
	  () can be used to group
	     ru(cs2)? describes ru and rucs2
	     ((login|logout) (rucs|rucs2|ruacad) now!)* 

     PATTERNS: Put an RE inside slashes to define a pattern: /RE/
        - a string that contains a substring described by an RE is said 
               to MATCH the pattern

        - specify the string to match against a pattern using the operator =~

             if ($name =~ /rucs2?/) # true if $name contains string rucs or rucs2 

             By default (no =~), a pattern is matched against $_ :

               if (/rucs2?/)  # true if $_ contains string rucs or rucs2 

               while ($line = <>){print $line if $line =~ /hello/}

               while (<>){print if /hello/}  # Same thing

               while (<>){print unless /hello/}

        - /RE/ is a shortcut for m/RE/
        - Other delimiters can be used: m{RE}, m'RE'
        - m'RE' does no variable interpolation
        - $& remembers what was matched:
                $line = "login to rucs today!";
                $line =~ m/rucs2?.*d/;    # . matches any character except newline
                print $&;                 # prints rucs tod
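
        - Sketch for checking what a pattern matches, including the
          precedence example above (not from the original notes).
          The helper name matched_part is hypothetical.

```perl
use strict;
use warnings;

# matched_part is a hypothetical helper: it returns the text matched by
# the pattern ($&), or undef when there is no match.
sub matched_part {
    my ($string, $re) = @_;
    return $string =~ /$re/ ? $& : undef;
}

print matched_part("login to rucs today!", 'rucs2?.*d'), "\n";   # rucs tod
print matched_part("rucs2 is up",          'rucs2?'),    "\n";   # rucs2
print matched_part("xx cs yy",             'ru|c?s'),    "\n";   # cs
print matched_part("xx ru yy",             'ru|c?s'),    "\n";   # ru
print defined(matched_part("abc", 'ru|c?s')) ? "match" : "no match", "\n";   # no match
```

          The patterns are passed in single quotes so that Perl does not
          try to interpolate anything before the match.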

	     
     SHORTCUTS: 
        use [rucs] for r|u|c|s
        use [a-z] for a|b|c|d|...
	\d for any digit
	\s for any whitespace character: space, newline, tab, formfeed
	\w for any word character: letter, digit, underline
	\D, \S, \W for not a digit, not whitespace, not a word character

	use x+ for xx* ie one or more x's, what is \w+

	use ^ and $ for beginning and end of line: /^Hello/

	REPETITION: 
	   use x{3} for xxx
	   use x{3,5} for xxx or xxxx or xxxxx
	   use x{3,} for xxx or xxxx or xxxxx or xxxxxx ...
	   what are x{0,} x{1,} x{0,1}

        METACHARACTERS 
            |, *, ?, +, ., (, ), [, ], {, }, ^, $ and \ are metacharacters

            They describe REs, not themselves

            put \ in front of metacharacter to make it describe itself

            \ in front of any other punctuation character is ignored;
                in front of a letter or digit it may form a shortcut such as \d
                \\ means \

            within [], only ^, -, ] and \ need \

       OTHER DELIMITERS FOR PATTERNS: 
         Can use m to specify different delimiter than /
	    In  m!www.runet.edu/~nokie! the delimiter is !
	    In m{www.runet.edu/~nokie}  the delimiter is {}
               () and [] can also be used as delimiter

       SUBSTITUTION
             $a = "abcdef";
             $a =~ s/abc/def/;
             print $a;   # defdef
             while (<>) { s/\n//; print}  # prints input without newline
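
       Sketch combining substitution with the shortcuts, repetition counts
       and escaped metacharacters above (not from the original notes).
       The variable names and the sample line are hypothetical.

```perl
use strict;
use warnings;

my $line = "phone:  555-1234\n";

# (my $copy = $original) =~ s/.../.../ edits a copy and leaves the original alone.
(my $no_newline = $line)       =~ s/\n//;                    # remove the newline
(my $masked     = $no_newline) =~ s/\d{3}-\d{4}/XXX-XXXX/;   # 3 digits, -, 4 digits
(my $one_space  = $masked)     =~ s/\s+/ /;                  # first run of whitespace

print "$one_space\n";                                # phone: XXX-XXXX
print "starts with phone\n" if $line =~ /^phone/;    # starts with phone
print "literal dot\n"       if "a.b" =~ /a\.b/;      # literal dot
print "no match\n"      unless "axb" =~ /a\.b/;      # no match
```

       Without a modifier, s/// replaces only the first match in the string.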


HASH TABLES: provide tables of (key, value) pairs

     # Create a hash table of state capitals

     %caps = ( 
        va => 'richmond',  # Keys are strings by default
        nc => 'raleigh'
             );

     # Print an element
     # Notice the $ and {}

     print $caps{'va'};    # prints richmond


     # Check if an element exists in the table

     exists $caps{'va'};   # evaluates to 1  (ie true)
     exists $caps{'VA'};   # evaluates to "" (ie false)


     # Add an element to the table

     $caps{'tn'} = 'nashville';
     print $caps{'tn'};   # prints nashville


     # Another way to create the table of capitals

     %caps2 = ( 'va',  'richmond', 'nc', 'raleigh');
     print $caps2{'va'};   # prints richmond


     # Yet another way to create the table of capitals
     # Remember qw is quoteword

     %caps3 = qw( va   richmond nc raleigh);
     print $caps3{'va'};   # prints richmond


    # Prints each (key, value) pair, in no particular order.  
    #   each returns the next pair in the table.

    while ( ($state, $cap) = each %caps){
        print "$state, $cap"; 
    }


    # Prints pairs in table, sorted by key (ie state)
    #   keys returns a list of the table's keys

    foreach  $state (sort keys   %caps) {
        print "$state, $caps{$state} "; 
    }

    # Prints values in table

    print "caps from values\n";
    foreach  $aCap (sort values   %caps) {
        print "$aCap "; 
    }


    # Whoops, tries to look up by capitals
    # Prints 3 (caps, null string) pairs

    foreach  $aCap (sort values %caps) {
        print "$aCap, $caps{$aCap} "; 
    }

    # Need to invert caps table for this to work.  In list context a hash
    #      is a key value pair list, so reversing it gives the inverted table:
    #     %revcaps = reverse %caps;


     # Removing an entry:
       $caps{'nc'} = undef;   # key still exists, its value is undef
       delete $caps{'nc'};    # removes the key and its value
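
     # Sketch of inverting the table and of undef versus delete (not from
     # the original notes). The name %state_of is hypothetical, and the
     # inversion assumes that no two states share a capital.

```perl
use strict;
use warnings;

my %caps = (va => 'richmond', nc => 'raleigh', tn => 'nashville');

# reverse turns the (key, value, key, value, ...) list into
# (value, key, value, key, ...), which builds the inverted table.
my %state_of = reverse %caps;
print $state_of{'raleigh'}, "\n";                        # nc

# Setting a value to undef keeps the key; delete removes the entry.
$caps{'nc'} = undef;
print exists($caps{'nc'}) ? "exists" : "gone", "\n";     # exists
delete $caps{'nc'};
print exists($caps{'nc'}) ? "exists" : "gone", "\n";     # gone

print join(' ', sort keys %caps), "\n";                  # tn va
print scalar(keys %caps), "\n";                          # 2
```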


SUBROUTINES:
     - Basic definition: use keyword sub:  
            sub printhello {
                print "hello";
            }

     - Can declare anywhere in program, but it's better to define a sub before using it

     - Variable scope: 
        - variables visible in entire program unless restricted
        - think of the subroutine code being expanded at the point of call
        - restrict to block with my:
             foreach $i (0..6){my $sq = $i*$i; print "$i, $sq "; }
        - use my for local variables

        - keyword local is similar, but more complex.

     - Use return expr to return a value. 

       If no return statement, value of last expr evaluated is returned

     - Parameters:

	- Subs normally take a list of params of any length
	   (prototypes, which we won't discuss, allow specification of params)

        - params are found in the array @_, named @ARG with use English:
           sub printargs {print @ARG;}


           sub lastchar{  # returns last character of last parameter
	       my $lastarg=$ARG[-1];
               my $c=substr($lastarg,-1,1);
               return $c;
           }

	- Careful: the parameter list is flattened

	- params are passed by reference (ie in/out parameters):

            use English;
            sub changefirst{
               print "changefirst";  
               $ARG[0] = 99;
            }

            @a = (1..5); 
            changefirst @a; 
            print "changed list is ", @a;  # 992345


	- Use locals to get effect of named in params.  
           None of these modify the parameter list.

	    sub doesntchangefirst{my $first = $ARG[0]; $first = 99;}

            # The following causes a compiler error:
	    sub doesntchangefirst{my $first = shift $ARG; $first = 99;}

            # The one above should be like this:
	    sub doesntchangefirst{my $first = shift @ARG; $first = 99;}

	    sub doesntchangefirst{my $first = shift ; $first = 99;}

	- Assign to a list of locals:
            sub printFirstTwo{
               my ($first, $second) = @ARG;
               print $first, ", ", $second;
            }

            @a = (1..5); 
            printFirstTwo @a; 

        - Result of return can be based on how called (mysub is any sub):
              $x = mysub();  # result evaluated in scalar context
              @a = mysub();  # result evaluated in list context


	- Returning a list:

            sub returnFirstTwoRev{
               my ($first, $second) = @ARG;
               return ($second, $first);
            }

            @a = (1..5);
            print returnFirstTwoRev @a;  # Prints 21


   sub first_letters{
     my $phrase = shift;    
     my @words = split /\s+/, $phrase;
     return join '', map { substr($ARG, 0, 1) } @words;
   }

   while (<>) {
      chomp;
      $initials = first_letters($ARG);
      print "$ARG -> \U$initials\E\n";
   }
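
   Sketch of the parameter rules above, written with @_ and $_ so that it
   runs without use English (not from the original notes). The sub names
   change_first and leave_first are hypothetical; first_letters is the sub
   from the notes.

```perl
use strict;
use warnings;

# change_first modifies its caller's data: the elements of @_ are
# aliases for the actual arguments.
sub change_first {
    $_[0] = 99;
    return;
}

# leave_first copies its argument into a local, so the caller's data
# is untouched.
sub leave_first {
    my ($first) = @_;
    $first = 99;
    return $first;
}

# first_letters returns the first character of each word of a phrase.
sub first_letters {
    my $phrase = shift;
    my @words  = split /\s+/, $phrase;
    return join '', map { substr($_, 0, 1) } @words;
}

my @a = (1 .. 5);
leave_first(@a);
print @a, "\n";                                                        # 12345
change_first(@a);
print @a, "\n";                                                        # 992345
print uc(first_letters("practical extraction report language")), "\n"; # PERL
```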