[Perl] Opening text files for reading, and simple regexp (regular expressions)

Perl is easy to use, and pretty good for working on text files.

Suppose you have a text file named records.txt in which each line contains a name, class, subject and score, delimited by ",".


The following example opens the file for reading and prints out all the lines in the file.

----- begin example -------
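# open records.txt for reading; FILE is the filehandle
open FILE, "records.txt";

# read one line at a time until the end of the file
while ($line=<FILE>){
    print $line;
}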
----- end example -------

FILE is the filehandle, and its name is up to you: instead of FILE, you can use FILEIN, MYFILE, STUDENT, etc.
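A bareword like FILE is a global name. As a variation (not what the example above uses), Perl also lets you keep the filehandle in an ordinary variable and give the open mode as a separate argument. A minimal sketch, assuming a helper called read_lines, a name made up here for illustration:

```perl
use strict;
use warnings;

# read_lines is a made-up helper name, not part of Perl itself.
# It returns every line of the given file, newlines included.
sub read_lines {
    my ($path) = @_;
    my @lines;
    # Three-argument open: filehandle variable, mode ('<' = read), filename.
    if (open(my $fh, '<', $path)) {
        @lines = <$fh>;    # in list context, <$fh> reads all remaining lines
        close $fh;
    }
    return @lines;         # empty list if the file could not be opened
}

# Print the file only if it is actually there.
print read_lines("records.txt") if -e "records.txt";
```

The variable $fh only exists inside the if block, so it cannot clash with a filehandle of the same name elsewhere in the script.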
What if you want the filename to be specified at runtime, instead of hardcoded in the script?
Use shift, which takes the first command-line argument, as shown in the following example,

----- begin example -------
# take the filename from the first command-line argument
$filename = shift;

open STUDENT, $filename or die "error opening $filename\n";

while ($line=<STUDENT>){
    print $line;
}
----- end example -------

The script can then be executed from the shell as ./ followed by the script's name, with the data file as its argument, e.g. records.txt.
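At the top level of a script, a bare shift removes and returns the first element of @ARGV, the list of command-line arguments. The sketch below shows the same step with a check for a missing argument. The helper name filename_from_args and the usage message are made up for illustration.

```perl
use strict;
use warnings;

# filename_from_args is a made-up helper: it takes the argument list and
# returns the first entry, or dies with a usage message if there is none.
sub filename_from_args {
    my @args = @_;
    my $filename = shift @args;   # at file scope, a bare "shift" does this to @ARGV
    die "usage: $0 FILENAME\n" unless defined $filename;
    return $filename;
}

# Only consult the real command line when arguments were given.
print "would open: ", filename_from_args(@ARGV), "\n" if @ARGV;
```

Without such a check, running the script with no argument leaves $filename undefined and the open simply fails.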

In the example above, I also sneaked in the use of "die": if the open fails, it prints out an error message and ends the script.

The use of "or" is similar to the following,
if (! (open STUDENT, $filename) ){
    die "error opening $filename\n";
}

i.e. if whatever is on the left of "or" returns a false value (as open does when it fails), whatever is on the right-hand side of "or" is evaluated.
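A short sketch of that behaviour, using two stand-in subs instead of open so that it runs without any file. It also illustrates why "or" is used here rather than "||": "||" binds more tightly than the commas in a list operator's arguments, so `open STUDENT, $filename || die ...` would be read as `open STUDENT, ($filename || die ...)`. All sub and variable names below are made up for illustration.

```perl
use strict;
use warnings;

my @log;    # records which right-hand sides were evaluated

sub succeeds   { return 1 }   # stands in for an open that works
sub fails      { return 0 }   # stands in for an open that fails
sub takes_list { return 0 }   # a failing call that takes arguments, like open

# Left side is true: the right side is skipped.
succeeds() or push @log, "after succeeds";

# Left side is false: the right side is evaluated.
fails() or push @log, "after fails";

# "||" only guards the last argument ("y", which is true),
# so the failure of the call itself goes unnoticed.
takes_list "x", "y" || push @log, "after ||";

print "$_\n" for @log;    # prints: after fails
```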

In case you are not familiar with \n, it is a linefeed (newline) character.
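The "\n" matters to die as well: when the message ends in a newline, die prints it as is; without it, Perl appends the script name and line number. The sketch below catches both messages with eval so they can be compared. $! is Perl's built-in variable holding the system's reason for the last failure; the file name is made up and assumed not to exist.

```perl
use strict;
use warnings;

my $missing = "no-such-file-for-this-demo.txt";   # assumed not to exist

# Message ends in "\n": die reports exactly this text.
eval { open(my $fh, '<', $missing) or die "error opening $missing: $!\n" };
my $with_newline = $@;

# No trailing "\n": Perl adds " at <script> line <N>." to the message.
eval { open(my $fh, '<', $missing) or die "error opening $missing" };
my $without_newline = $@;

print $with_newline;
print $without_newline;
```

Including $! in the message tells the user why the open failed (missing file, no permission, and so on) rather than only that it failed.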

To search for a string, e.g. "Science", you can read in each line and use pattern matching. The following code opens the file records.txt, reads in each line, checks it for "Science", and prints out the line if it matches.

----- begin example -------
open FILE, "records.txt";

while ($line=<FILE>){
    # =~ tests $line against the pattern between the slashes
    if ($line=~/Science/){
        print $line;
    }
}
----- end example -------
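The same test works on any string, not just on lines read from a file. Here is a minimal sketch on a few invented lines (made up for illustration; they are not the contents of records.txt), using a hypothetical helper named lines_matching. The /i modifier, which the example above does not use, makes the match case-insensitive.

```perl
use strict;
use warnings;

# lines_matching is a made-up helper: returns the lines that match $pattern.
sub lines_matching {
    my ($pattern, @lines) = @_;
    return grep { $_ =~ $pattern } @lines;
}

# Invented sample lines in name,class,subject,score form.
my @sample = (
    "Alice,1A,Science,90\n",
    "Bob,1B,History,75\n",
    "Carol,1A,science,82\n",
);

print lines_matching(qr/Science/,  @sample);   # case-sensitive: the Alice line only
print "--\n";
print lines_matching(qr/Science/i, @sample);   # /i also matches "science"
```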

How about printing out each line, split into Name, class, subject, score?

We know the format of the file: each line contains name, class, subject and score, delimited by ",". A regexp of ^(.*?),(.*?),(.*?),(.*?)$ will allow us to extract the 4 fields from the line.

^ matches beginning of line
(.*?), matches as few characters as possible (possibly none) up to the next ",", and the parentheses save the matched string into $1, $2, $3 or $4, numbered from left to right
$ matches end of line
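To see what the "?" in (.*?) does, compare it with the greedy (.*) on one sample line. This is only a sketch: the line is invented for illustration and the variable names are arbitrary.

```perl
use strict;
use warnings;

my $line = "Alice,1A,Science,90";   # invented sample line

# Non-greedy: stops at the first ",".
my ($lazy)   = $line =~ /^(.*?),/;
# Greedy: runs on to the last ",".
my ($greedy) = $line =~ /^(.*),/;

print "lazy:   $lazy\n";     # Alice
print "greedy: $greedy\n";   # Alice,1A,Science

# .*? may match nothing at all, so a line with an empty first field still matches.
my @fields = ",1A,Science,90" =~ /^(.*?),(.*?),(.*?),(.*?)$/;
print "matched ", scalar(@fields), " fields, first one empty\n"
    if @fields == 4 && $fields[0] eq "";
```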

----- begin example -------
open FILE, "records.txt";

while ($line=<FILE>){
    # on a match, $1 to $4 hold the four captured fields
    if ($line=~/^(.*?),(.*?),(.*?),(.*?)$/){
        print "Name:$1 Class:$2 Subject:$3 Score:$4\n";
    }
}
----- end example -------
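For a simple delimiter like ",", Perl's built-in split is a common alternative to the capturing regexp. It is a different technique from the one above and is shown only as a sketch; parse_record is a made-up helper name and the sample line is invented for illustration.

```perl
use strict;
use warnings;

# parse_record is a made-up helper: splits one line into its four fields.
sub parse_record {
    my ($line) = @_;
    chomp $line;                          # drop the trailing newline, if any
    my @fields = split /,/, $line, -1;    # -1 keeps trailing empty fields
    return @fields == 4 ? @fields : ();   # empty list if the line is malformed
}

my ($name, $class, $subject, $score) = parse_record("Alice,1A,Science,90\n");
print "Name:$name Class:$class Subject:$subject Score:$score\n";
```

Unlike the regexp, this version rejects a line with more than four comma-separated fields instead of folding the extras into the last one.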

