
[MacPerl] Generic Search Script




Hi,

I wrote this search script and am submitting it here for two reasons:

   a) I'm hoping some of you MacPerl experts can give me some
      tips on how to improve this script.

   b) I hope that it will be of some benefit to the MacPerl
      community.

Even though this script is intended to be generic (run on any
platform running Perl), I have been using it in MPW.  It appears
to be faster and more flexible than the searching facilities
built into MPW.

I would appreciate suggestions for improvements.

Thanks,

Ero Brown





#!/usr/local/bin/perl

#========================================================================
#========================================================================
#
# Search.pl
#
# Author: Ero Brown <ero@fiber.net>
#
# Version: 1.3
#
#========================================================================
#========================================================================

### required package needed for this script
require 'find.pl';


#========================================================================

### define true and false.
$TRUE = 1;
$FALSE = 0;


#========================================================================

### Forward declare all sub-routines used
sub Usage;
sub GetHostOS;
sub GetDirSeperatorChar;
sub ReadArguments;
sub DoAFile;
sub wanted;


#========================================================================
# Function:	Usage
# Purpose:	Print the usage message and exit -- we call this function
#           whenever the script is invoked incorrectly.
#========================================================================
sub Usage {
	die <<END_USAGE;
usage: $0 [-r] [-p] [-c] [-i] [-m] [-f <expression>] -e <expression>
       <folder(s) and/or file(s) . . .>

	-r   recursively process encountered sub-directories
	-c   only list a total count of matching occurrences
	-i   make pattern matching case insensitive
	-f   file expression representing a file filter
	-e   the expression representing what you are searching for
	-m   mpw style output
	-p   print progress and summary information

	*Parameters can be given in any order.

	*Fully resolved pathnames are required; partial pathnames will
	 cause this script to barf.

	Version 1.3
END_USAGE
}


#========================================================================
# Function:	GetHostOS
# Purpose:	Determine what OS we are running on -- $^O is a Perl variable
#           that contains the name of the OS this Perl was built for.
#========================================================================
sub GetHostOS
{
	local($tOSString) = $^O;
	## print "Running this script on a computer running $tOSString...\n\n";
	return($tOSString)
}


#========================================================================
# Function:	GetDirSeperatorChar
# Purpose:	Each platform (OS) uses its own character to separate
#           directory names within a full path specification string.
#           Return the directory separator character that is correct
#           for the OS that we are currently running this script on.
#========================================================================
sub GetDirSeperatorChar
{
	local ($tOSString) = @_;
	local ($tDir);
	if ($tOSString =~ /mac/i) {  #it's a Macintosh!
		$tDir = ':';
	}
	elsif ($tOSString =~ /nix/i) {  #it's probably some flavor of Unix
		$tDir = '/';
	}
	else {
		#well, it could be a bunch of other things, but assume a Wintel PC
		#don't know what else to use!?! -- only the unix flavor works!
		$tDir = '/';
	}
	return ($tDir)
}


#========================================================================
# Function:	ReadArguments
# Purpose:	Parse the arguments array and separate everything into two
#           distinct arrays, one containing the "-<opt>" command options,
#           and the other containing the list of directories (and possibly
#           files) to be checked; then go through the command options
#           array and set the appropriate command flags . . . we do it
#           this way so we can put commands and regular arguments in any
#           order on the command line.
#========================================================================
sub ReadArguments
{
	local ($i,$hasArgs);

	#if no arguments are passed in, assume the user needs some help
	if (@ARGV < 1) {
		Usage;
	}

	#these are the two lists (arrays) we're going to separate our args into
	@ARG_Options = ();
	@ARG_Dirs = ();
	$expression = "";
	$fileFilter = "";

	$hasArgs = $FALSE;
	$i = 0;
	while ($i <= $#ARGV) {
		$option = $ARGV[$i];
		#if it's one of the valid command options (anchored, so that a
		#path which merely contains "-r" etc. is not taken for an option)...
		if ($option =~ /^-[rcfmpei]$/i) {
			#add it to the command options list
			push(@ARG_Options,$option);
			#the expression command...
			if (($option =~ /-e/i) && ($i+1 <= $#ARGV)) {
				$i++;
				$expression = $ARGV[$i];
			}
			#the fileFilter command...
			if (($option =~ /-f/i) && ($i+1 <= $#ARGV)) {
				$i++;
				$fileFilter = $ARGV[$i];
			}
			$hasArgs = $TRUE;
		}
		else {
			push(@ARG_Dirs,$option);  #add it to the directories list
		}
		$i++;
	}

	$recurseOpt = $FALSE;
	$progressOpt = $FALSE;
	$expressionOpt = $FALSE;
	$fileFilterOpt = $FALSE;
	$mpwOutputOpt = $FALSE;
	$countOnlyOpt = $FALSE;
	$caseInSensitive = $FALSE;
	$pad = "";

	#now that we have all the command options, go through and set the
	#appropriate flags that tell us how this script should behave.
	if ($hasArgs) {
		while (@ARG_Options) {
			$op = shift(@ARG_Options);
			if ($op =~ /-r/i) {
				$recurseOpt = $TRUE;
			}
			if ($op =~ /-p/i) {
				$progressOpt = $TRUE;
				$pad = "\t";
			}
			if ($op =~ /-e/i) {
				$expressionOpt = $TRUE;
			}
			if ($op =~ /-f/i) {
				$fileFilterOpt = $TRUE;
			}
			if ($op =~ /-m/i) {
				$mpwOutputOpt = $TRUE;
			}
			if ($op =~ /-c/i) {
				$countOnlyOpt = $TRUE;
			}
			if ($op =~ /-i/i) {
				$caseInSensitive = $TRUE;
			}
		}
	}
}


#========================================================================
# Function:	DoAFile
# Purpose:	process the file/directory
#           this is where the action is -- look at every file and
#           directory and perform the desired operations.
#========================================================================
sub DoAFile {
	local ($theFileName) = @_;
	local ($shortName, $doSearch);

	if (-f $theFileName) {
		$fileCount++;
		if ($progressOpt) {
			print $pad, "FILE: $theFileName\n";
		}
		$shortName = $theFileName;
		#break the full path name apart
		@dirNameParts = split($Dir,$shortName);
		#now we have the filename without the full path
		$shortName = $tempName = pop(@dirNameParts);

		$doSearch = $TRUE;
		if ($fileFilterOpt) {
			$doSearch = $FALSE;
			if ($caseInSensitive) {
				if ($shortName =~ /$fileFilter/oi) {
					$doSearch = $TRUE;
				}
			}
			else {
				if ($shortName =~ /$fileFilter/o) {
					$doSearch = $TRUE;
				}
			}
		}

		if ($doSearch) {
			local ($tempFoundCount) = 0;
			local ($lineCounter) = 1;
			if ($progressOpt) {
				print $pad, "Searching $theFileName...\n";
			}
			$exp = "/".$expression."/".$setting;

			open(THEFILE, $theFileName)
				|| die "Can't open file $theFileName: $!\n";
			while ($lineStr= <THEFILE>) {
				$found = $FALSE;
				if ($caseInSensitive) {
					if ($lineStr =~ /$expression/oi) {
						$found = $TRUE;
					}
				}
				else {
					if ($lineStr =~ /$expression/o) {
						$found = $TRUE;
					}
				}
				if ($found) {
					$tempFoundCount++;
					if (!$countOnlyOpt) {
						if ($mpwOutputOpt) {
							print $pad, $pad, "File \"", $theFileName,
								"\"; Line ", $lineCounter, ":¤ \t# ", $lineStr;
						}
						else {
							print $pad, $pad, $theFileName, " -- line #", $lineCounter,
								"\n", $pad, $pad, $pad, $lineStr;
						}
					}
				}
				$lineCounter++;
			}
			close(THEFILE);
			if (($progressOpt) && ($tempFoundCount > 0)) {
				print $pad, $pad, $theFileName, " -- had ", $tempFoundCount,
					" occurrences of the search expression.\n";
			}
			$foundCount += $tempFoundCount;
		}

	}
	# it must be a directory (-d $theFileName)
	else {
		$dirCount++;
		if ($progressOpt) {
			print "DIRECTORY: $theFileName\n";
		}
	}
}


#========================================================================
# Function:	wanted
# Purpose:	We need to provide this function for the &find() function in
#           the 'find.pl' package -- this is so we can control (via filter)
#           what kind of files and/or directories get processed.
#========================================================================
sub wanted
{
	#filter so we only process the files/directories we want:
	#  - all files
	#  - directories only one level deep if the recurse option is not on
	#  - all subdirectories if the recurse option is on
	#anything that passes is handed to DoAFile, our critical function
	#that actually does the checking
	((-f $_) ||
		((-d $_) && (!$recurseOpt) && ($File::Find::prune = 1)) ||
		((-d $_) && ($recurseOpt)) ) &&
		DoAFile($name);
}


#========================================================================
#============================   MAIN   ==================================
#========================================================================

### main part of the program --
### we go through every file and directory passed in on
### the command line and check it for validity.

#what OS are we running on?
$osString = GetHostOS;

#what's the right dir separator char to use?
$Dir = GetDirSeperatorChar($osString);

#parse the command line for commands and arguments
ReadArguments;

#initialize these counter variables
$fileCount = 0;
$dirCount = 0;
$unknownCount = 0;
$errorCount = 0;
$foundCount = 0;

#if no directories/files are passed in, assume the user needs some help
if (@ARG_Dirs < 1) {
	print "No search directories or files were specified.\n";
	Usage;
}
elsif (!$expressionOpt) {
	print "The required -e expression option was NOT specified.\n";
	Usage;
}
elsif (!$expression) {
	print "The search expression specified was empty.\n";
	Usage;
}
elsif (($fileFilterOpt) && (!$fileFilter)) {
	print "The file filter expression was empty.\n";
	Usage;
}

#process every directory (and/or file) provided on the command line
while (@ARG_Dirs) {
	$arg = shift(@ARG_Dirs);  #get the next directory/file to examine
	#make sure we are dealing with a valid disk entity
	if ((-f $arg) || (-d $arg)) {
		#'find' (from 'find.pl') calls our "wanted" function, which we
		#use as a filter and which in turn calls our critical function
		#"DoAFile".
		find($arg);
	}
	else {
		$unknownCount++;
		print STDERR "$0: could not read $arg: $!\n\n";
	}
}

if (($progressOpt) || ($countOnlyOpt)) {
	print "FOUND $foundCount occurrences of \"$expression\" in the specified files.\n";
}
if ($progressOpt) {
	print "Examined $fileCount files, and $dirCount directories";
}
if (($progressOpt) && ($errorCount > 0)) {
	print "-- $errorCount of which had invalid names";
}
if ($progressOpt) {
	print ".\n";
}
if (($progressOpt) && ($unknownCount > 0)) {
	print "THERE WERE $unknownCount UNKNOWN ENTITIES ENCOUNTERED.\n";
}
exit;

#========================================================================
#========================================================================
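
One possible direction for the hand-rolled parsing in ReadArguments is the
core Getopt::Std module. What follows is only a sketch under assumptions,
not part of Search.pl: the helper name ParseOptions and the sample
arguments are hypothetical. Note one behavioral difference: getopts stops
at the first non-option argument, so switches would have to come before
the paths instead of being allowed in any order.

```perl
use strict;
use warnings;
use Getopt::Std;

# Hypothetical helper (not in Search.pl): parse the same switches that
# ReadArguments handles.  -e and -f take a value; the rest are flags.
sub ParseOptions {
    my @args = @_;
    local @ARGV = @args;               # getopts works on @ARGV
    my %opt;
    getopts('rcimpe:f:', \%opt) or return;
    return (\%opt, [@ARGV]);           # options, then the remaining paths
}

# Sample arguments are made up for illustration.
my ($opt, $paths) = ParseOptions('-r', '-i', '-e', 'foo.*bar', 'HD:Src:');
print "expression: $opt->{e}\n";
print "recursive:  ", ($opt->{r} ? "yes" : "no"), "\n";
print "paths:      @$paths\n";
```

Each flag ends up as a key in the hash, which would replace the separate
$recurseOpt, $progressOpt, etc. variables if that style were adopted.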

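Similarly, the separator guessing in GetDirSeperatorChar and the
split/pop in DoAFile could perhaps be left to the standard File::Spec and
File::Basename modules, assuming a Perl that ships both. This is a
minimal sketch with a made-up path, not a drop-in replacement:

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Spec;

# Build a path using the platform's own separator (the components are
# hypothetical), then strip the directory part again.
my $path  = File::Spec->catfile('src', 'lib', 'Search.pl');
my $short = basename($path);
print "$path -> $short\n";
```

Because both modules pick their rules from the platform Perl is running
on, the script would no longer need to know the separator itself.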

