From RootdevWiki
Cheatsheet
By Randal Schwartz and Tom Phoenix
code examples - http://www.oreilly.com/catalog/lperl/
Introduction
Perl - Practical Extraction and Report Language, created by Larry Wall in the 1980s
also jokingly expanded as Pathologically Eclectic Rubbish Lister
Scalar data
double-precision floating-point - used internally by Perl; the same as a double in C
integer literals - 6123234234, or 6_123_234_234 (underscores are ignored and aid readability)
Nondecimal integer
octal - 0377, starts with 0
hex - 0xff, starts with 0x
binary - 0b111, starts with 0b
exponentiation - 2**3 is 8
single-quoted string - 'hello\n', where \n is not a newline. The backslash is special only in \\ and \'
double-quoted string - "hello world\n", \n is new line here
string concatenation - use .
string repetition - 4 x 3 gets 444
built-in warnings - perl -w my_program, or #!/usr/bin/perl -w
print "i want 3 ${food}s" - avoid confusion with $foods
string comparison - eq, ne, lt, gt, le, ge
The if condition - if the scalar value is undef, 0, '', or '0', it's false; anything else is true
get user input - $line=<STDIN>, chomp($line=<STDIN>) to remove new line
parentheses - unless it changes the meaning to remove them, they are always optional
defined function - use defined($value) to check undef values
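The scalar notes above can be tried out with a minimal sketch like the following. The variable names and sample values are made up for illustration; only the operators and literal forms come from the notes.

```perl
use strict;
use warnings;

# Nondecimal literals: all three denote the number 255.
my @same = (0377, 0xff, 0b11111111);

# Single quotes keep \n as two characters; double quotes make it a newline.
my $single = 'hello\n';
my $double = "hello\n";

# Concatenation, repetition, and exponentiation.
my $joined   = "fred" . " " . "flintstone";
my $repeated = 4 x 3;      # the number 4 is used as the string "4"
my $power    = 2 ** 3;

# Braces delimit the variable name so perl does not look for $foods.
my $food = "steak";
my $menu = "i want 3 ${food}s";

# Only undef, 0, '' and '0' are false; '0.0' and 'a' are true strings.
my @false = grep { !$_ } (undef, 0, '', '0', '0.0', 'a');

print "$menu\n";
print "length of single: ", length($single), ", double: ", length($double), "\n";
```

The single-quoted string is one character longer because its backslash and n are kept as they are.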
LISTS AND ARRAYS
Notes on exercise: read input lines into a list - much simpler than using a loop
chomp(@lines=<STDIN>);
subscript of array - a non-integer subscript is truncated to the next lower integer, so $fred[1.712] is $fred[1]
index of the last element - $#rocks; $rocks[ $#rocks ] is equivalent to $rocks[ -1 ]
list - qw/ fred barney betty / is the same as ("fred", "barney", "betty")
swap elements - ($fred, $barney)=($barney, $fred)
refer to entire array - @rocks = qw( fred barney betty ) (no commas inside qw)
pop and push - pop @fred removes and returns the last element; push @fred, 8 appends 8
shift and unshift - the same operations at the start of the array
foreach - the control variable is an alias for each element, so modifying it modifies the array
  when the loop finishes, the control variable gets back the value it had before the loop
  $_ is the default control variable (and also the default for print)
reverse - reverse @fred (doesn't affect the argument)
sort - by default sort as ASCII strings
scalar and list context -
  $number=5+@fred (scalar: 5+3 for a three-element array)
  @list=@fred (list)
  $n=@fred (scalar: 3)
  print scalar @fred (force scalar context)
  @betty=(); (empty the array)
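A short sketch of the array operations listed above, assuming a three-element array of made-up rock names. It is only meant to show the behaviour described in the notes.

```perl
use strict;
use warnings;

my @rocks = qw( bedrock slate lava );

# $#rocks is the last index; a negative subscript counts from the end.
my $last_index = $#rocks;
my $same_last  = $rocks[$#rocks] eq $rocks[-1];

# push/pop work at the end of the array, unshift/shift at the start.
push @rocks, "granite";
my $popped = pop @rocks;
unshift @rocks, "basalt";
my $shifted = shift @rocks;

# The foreach control variable aliases each element, so changes stick.
my @numbers = (1, 2, 3);
foreach my $n (@numbers) {
    $n *= 10;
}

# An array in scalar context gives its element count.
my $count = @rocks;
my $sum   = 5 + @rocks;

# reverse and sort return new lists and leave their argument alone.
my @backwards = reverse @rocks;
my @sorted    = sort @rocks;

# List assignment swaps two values without a temporary.
my ($fred, $barney) = ("f", "b");
($fred, $barney) = ($barney, $fred);

print "numbers: @numbers\n";
print "sorted: @sorted\n";
```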
SUBROUTINES
define a subroutine - sub, &sub to invoke
global variable - sub marine { $n += 1; } changes the global variable $n
  use my($n) inside the sub to make it private
  use local($n) to give the global a temporary value
return value - the value of the last expression evaluated in the subroutine; a list can be returned
argument - $_[0], $_[1], use (@_==2) to validate the arguments
private variables in subroutine - my($a, $b)=@_
use strict - enforces variable declaration rules, e.g. "my $bam_bam = 3"
omit the ampersand - &shuff, the & can be omitted if perl has already seen the declaration or can tell from the syntax that it is a call. Keep the & when your subroutine has the same name as a built-in
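The subroutine notes above can be illustrated with a minimal sketch. The names max_of_two and count_call are hypothetical, chosen only for this example.

```perl
use strict;
use warnings;

# Returns the larger of exactly two numbers.
# The value of the last expression evaluated is the return value.
sub max_of_two {
    die "max_of_two needs exactly two arguments\n" unless @_ == 2;
    my ($m, $n) = @_;          # private copies of the arguments
    $m > $n ? $m : $n;
}

my $calls = 0;                 # declared outside the sub, shared with it

sub count_call {
    $calls += 1;               # modifies the variable declared outside
}

count_call() for 1 .. 3;

my $bigger = max_of_two(3, 8);

# eval traps the die raised by the argument check, so $ok stays undef.
my $ok = eval { max_of_two(1, 2, 3); 1 };

print "bigger: $bigger, calls: $calls\n";
```

The argument names avoid $a and $b because sort uses those two variables.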
HASHES
hash element access - $hash{$some_key}
define hash - %some_hash=("foo", 35, 2.5, "hello", "betty", "bye\n") or, with the big arrow, %some_hash=("foo" => 35, 2.5 => "hello", "betty" => "bye\n")
hash to list - @any_array=%some_hash
swap key/value - %inverse_hash=reverse %any_hash
functions - my @k=keys %hash; my @v=values %hash; my $count=keys %hash (number of pairs, in scalar context)
iterate hash - while( ($key, $value)=each %hash ){ ... }
  or, in sorted key order: foreach $key (sort keys %hash){ $value=$hash{$key}; }
exists in keys - if (exists $books{"bino"})
delete the key - my $person="betty"; delete $books{$person};
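The hash operations above fit together as in the following sketch. The score data and variable names are assumptions made for the example.

```perl
use strict;
use warnings;

my %score = (barney => 195, fred => 205, dino => 30);

# keys in scalar context gives the number of key/value pairs.
my $pairs = keys %score;

# reverse turns (key, value, ...) into (value, key, ...);
# this only inverts cleanly when the values are unique.
my %name_for = reverse %score;

# Sorting the keys gives a predictable order; hash order itself is not fixed.
my @report;
foreach my $name (sort keys %score) {
    push @report, "$name=$score{$name}";
}

# each returns one (key, value) pair per call until the hash is exhausted.
my $total = 0;
while (my ($name, $points) = each %score) {
    $total += $points;
}

# exists tests for a key; delete removes the key and its value.
delete $score{dino};
my $still_there = exists $score{dino} ? 1 : 0;

print "@report\n";
print "total: $total\n";
```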
I/O BASICS
foreach and while -
  while(<STDIN>){print..} reads and prints one line at a time
  foreach (<STDIN>){print..} reads all the input first, then prints
read file - ./d file1 file2 makes <> read the files named on the command line
  while(defined($line = <>)){chomp($line);}
  while(<>){ chomp; print "$_\n" } (the same, using $_)
print out - print @array; if the elements already end in \n
  print "@array"; to print the elements separated by spaces
cat - print <>
sort - print sort <>
print - print (2+3) is a function call, so write print 4*(2+3) instead of print (2+3)*4
printf - %g picks a suitable format for general numeric values
  printf "%6d\n", 42; right justified
  printf "%-6d\n", 42; left justified
  printf "the percentage is %.2f%%\n", 2/12; (%% prints a literal percent sign)
printf array - printf "The items:\n".("%10s\n" x @items), @items;
  printf ("%*s\n", $width, $line); same as printf ("%${width}s\n", $line);
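The printf formats above can be checked with sprintf, which takes the same format strings but returns the result instead of printing it. This is a sketch with made-up item names.

```perl
use strict;
use warnings;

# A plain width pads on the left; a minus sign pads on the right.
my $right = sprintf "%6d",  42;
my $left  = sprintf "%-6d", 42;
my $pct   = sprintf "%.2f%%", 2 / 12 * 100;

# Build one %10s conversion per item by repeating the format string.
my @items  = qw( wilma dino pebbles );
my $format = "The items:\n" . ("%10s\n" x @items);
my $table  = sprintf $format, @items;

# A * in the format takes the field width from the argument list.
my $width = 8;
my $star  = sprintf "%*s", $width, "fred";

print "[$right] [$left] $pct\n";
print $table;
```

The header and the repeated conversions are joined with . so that they form a single format string; separated by a comma, the repeated part would be passed as data.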
CONCEPTS OF REGULAR EXPRESSIONS
check pattern matching - $_="some string"; then if (/pattern/) gives yes/no
wild char . - any single char except \n. Use \. for a literal dot, as in /3\.142/
MORE ABOUT REGULAR EXPRESSIONS
(EBCDIC) - Ehb-suh-dik, Extended Binary Coded Decimal Interchange Code, IBM
shortcut of [0-9] - \d, /[0-9]+/ == /\d+/
shortcut of word - [A-Za-z0-9_] can be written as \w
shortcut of space - [\f\t\n\r ] (form-feed, tab, newline, carriage return, space) can be written as \s
negating shortcut - [^\d], [^\w], [^\s]: \D, \W, \S
hex number - [\dA-Fa-f]+ to match hexadecimal numbers
any character - [\d\D], include newline while '.' excludes newline
quantifiers - /a{5,15}/, repeat 5 to 15 times. Note the curly braces
anchors - /^fred/, /rock$/
match the whole word - /\bfred\b/
nonword boundary - /\bfred\B/, matches fredfor, but not fred
backreference - /(.)/ matches the same as /./ but saves the match in regular expression memory
  /\1/ refers to the first memory
  /(.)\1/ matches two identical characters in a row
the precedence chart for regular expressions -
1. parentheses: ()
2. quantifiers: * + ? {5,15}
3. anchors, sequence: ^ $ \b \B
4. vertical bar of alternation: |
e.g. /fred|barney/ isn't /fre(d|b)arney/ because sequence takes precedence over alternation
question ch8 1. /"([^"]*)"/ against the string """ - tricky: it matches only the first two quotes, with an empty memory
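The anchors, quantifiers, and backreferences above can be tested with a small helper. The helper name matches and the sample strings are hypothetical.

```perl
use strict;
use warnings;

# Returns 1 if the pattern matches the text, 0 otherwise.
sub matches {
    my ($text, $pattern) = @_;
    return $text =~ $pattern ? 1 : 0;
}

my %result = (
    # \b anchors at word boundaries, so "fred" must stand alone.
    whole_word     => matches("fred flintstone", qr/\bfred\b/),
    inside_word    => matches("frederick",       qr/\bfred\b/),
    # \B requires that the word continues after "fred".
    word_continues => matches("frederick",       qr/\bfred\B/),
    # (.)\1 needs the same character twice in a row.
    doubled_letter => matches("barney rubble",   qr/(.)\1/),
    # {2,3} is a quantifier on the preceding item.
    two_to_three   => matches("xaaay",           qr/xa{2,3}y/),
    # Alternation binds loosest: "fred" or "barney", never "frearney".
    alternation    => matches("frearney",        qr/fred|barney/),
    # . does not match a newline, [\d\D] does.
    dot_newline    => matches("a\nb",            qr/a.b/),
    any_newline    => matches("a\nb",            qr/a[\d\D]b/),
);

print "$_: $result{$_}\n" for sort keys %result;
```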
USING REGULAR EXPRESSIONS
m// - /.../ is a shortcut for m/.../; other delimiters such as m(), m{}, m[], m<>, m,, m!!, m^^ are also accepted
case-insensitive - if(/\byes\b/i)
match '. plus newline' - /s lets . match a newline too
  $_="this is \n an \n example"; if (/\bis\b.*\ban\b/s) will match
binding operator - =~ matches against the specified variable instead of the default $_
  $op=<STDIN>=~/\byes\b/ or $op=(<STDIN>=~/.../) returns a boolean value
get argument from command line - my $what = shift @ARGV;
$4 - the fourth memory of an already completed pattern match, the same string as \4
\4 - a backreference to the fourth memory of the currently matching regular expression; works inside a regular expression only
automatic match variables - $&, $', $`
  $& - the entire matched section, within /.../
  $` - whatever comes before the matched section
  $' - whatever comes after the matched section
  if ("Hello there, neighbor"=~/\s(\w+),/) then $&=( there,), $`=(Hello), $'=( neighbor)
search and replace - if (s/fred/wilma/)
global replacement - collapse whitespace s/\s+/ /g
change delimiter - s#..#..#, s{..}#..#
search and replace on var - $var=~s/../../
change case -
  \U uppercases everything that follows
  \u uppercases only the next character
  \L lowercases everything that follows
  \l lowercases only the next character
  \u\L or \L\u - all lowercase, but capitalize the first letter, e.g. s/(fred|barney)/\u\L$1/ig;
  \E turns off case shifting
split - @f=split /:/, ":adf:dsf:df:df:"; trailing empty fields are discarded. Use -1 as the third argument to retain them
join - $s=join ",", $str1, @list1 (returns a single string)
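A minimal sketch of the match variables, substitution, split, and join described above. The sample strings are taken from the notes where possible; the variable names are assumptions.

```perl
use strict;
use warnings;

# After a successful match, $& is the match, $` the text before it,
# $' the text after it, and $1 the first memory.
my $line = "Hello there, neighbor";
my ($matched, $before, $after, $word);
if ($line =~ /\s(\w+),/) {
    ($matched, $before, $after, $word) = ($&, $`, $', $1);
}

# s///g replaces every occurrence; runs of whitespace become one space.
my $text = "too    many \t spaces";
(my $squeezed = $text) =~ s/\s+/ /g;

# \L lowercases the memory, then \u capitalizes its first character.
my $names = "fred and BARNEY";
(my $proper = $names) =~ s/(fred|barney)/\u\L$1/ig;

# split drops trailing empty fields unless the limit argument is negative.
my @short = split /:/, ":adf:dsf:df:df:";
my @full  = split /:/, ":adf:dsf:df:df:", -1;

# join glues a list into one string.
my $csv = join ",", @short;

print "[$before][$matched][$after]\n";
print "$squeezed\n$proper\n$csv\n";
```

The leading empty field is kept in both split calls; only trailing empty fields depend on the limit.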
MORE CONTROL STRUCTURES
unless - execute when condition is false, opposite to if(..)
until - execute when condition of while is false
Expression modifiers -
  print "...\n" if $n<0;
  &error("invalid") unless &valid($input);
  $i *= 2 until $i>10;
  print " ", ($n += 2) while $n<10; (note the parentheses)
  &greet($_) foreach @person;
naked block - { ... } variables declared with my inside it exist only within the curly braces
elsif clause - if (..) elsif(..) NOT elseif
a quick way to turn a list into a hash of repetition counts - $count{$_}++ foreach @people
last - same as break in C; jumps out of the loop
next - same as continue in C
redo - go back to the top of the current loop block
labelled block - LINE: while(<>){ ...}
logical operators - ||, &&, which are called "short-circuit" logical operators
logic OR operator - my $name=$last_name{$someone} || 'no such name'
The ternary operator - my $location=&is_weekend()?"home":"work";
  my $size= ($width<10)?"small": ($width<50)?"large": "extra large";
partial evaluation operators - ||, &&, ?:
other operators - and, or, not, xor
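The control structures above can be sketched as follows. The data, the label NUMBER, and the helper size_of are hypothetical names for this example.

```perl
use strict;
use warnings;

# A foreach modifier runs the statement once per element of the list.
my @people = qw( fred barney fred dino barney fred );
my %count;
$count{$_}++ foreach @people;

# until keeps looping while the condition is false.
my $i = 1;
$i *= 2 until $i > 10;

# || returns the first true operand, which makes it a handy default.
my %last_name = (fred => "flintstone");
my $known   = $last_name{fred}  || 'no such name';
my $unknown = $last_name{wilma} || 'no such name';

# Chained ternary operators act like a compact if/elsif/else.
sub size_of {
    my ($width) = @_;
    return ($width < 10) ? "small"
         : ($width < 50) ? "large"
         :                 "extra large";
}

# next skips to the following iteration, last leaves the labelled loop.
my @seen;
NUMBER: foreach my $n (1 .. 10) {
    next NUMBER if $n % 2;     # skip odd numbers
    last NUMBER if $n > 6;     # stop once past 6
    push @seen, $n;
}

print "seen: @seen, i: $i\n";
```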
FILEHANDLES AND FILE TESTS
open a filehandle - open CONFIG, "dino"; (read)
  open CONFIG, "<dino"; (same as above)
  open CONFIG, ">dino"; (create or overwrite)
  open CONFIG, ">>dino"; (append)
close filehandle - perl automatically closes a filehandle when it is reopened or when the program exits
die - open PASSWD, "/etc/passwd" or die "wrong: ($!)";
(open PASSWD, "/etc/passwd") || die ".."; must add parenthesis
write to file - print LOG "..."; note there is no comma after LOG
  printf LOG "%d...", 12;
  or select LOG; then a plain print or printf goes to LOG
default flush - by default, output to a filehandle is buffered; use select LOG; $| = 1; to flush after every write, suitable for monitoring
file test -
  -e file or dir name exists
  -z file exists and has zero size (always false for a dir)
  -s file or dir exists and has non-zero size (returns the size in bytes)
  -f plain file
  -d dir
  -T text file, by guess
  -B binary file, by guess
  -M modification age (days)
  -A access age (days)
test file size in KB - my $size=(-s $file) / 1024;
stat and lstat - my ($dev, $ino, $mode, $nlink, $uid, $gid, $rdev, $size, $atime, $mtime, $ctime, $blksize, $blocks)=stat($file);
get timestamp - $time=time
get human-readable time - $now=gmtime (Greenwich Mean Time, or Universal Time)
timestamp to human-readable - $date = localtime $timestamp; e.g. for 1085986170
  my($sec, $min, $hr, $day, $mon, $yr, $wday, $yday, $isdst)=localtime $timestamp
bitwise - 10&12, 10|12, 10^12, 6<<2, 25>>2, ~10
use underscore - ... if (-s) > 100_000 and -A _ > 90; the _ filehandle reuses the data from the previous file test instead of calling the system again
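The file tests above can be tried on a scratch file. This sketch assumes the core File::Temp module to create that file; the 2048-byte size is arbitrary.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );   # core module, used only to get a scratch file

# Create a scratch file and write 2048 bytes to it.
my ($fh, $file) = tempfile(UNLINK => 1);
print $fh "x" x 2048;
close $fh or die "cannot close $file: $!";

my $exists   = (-e $file) ? 1 : 0;
my $is_plain = (-f $file) ? 1 : 0;
my $is_dir   = (-d $file) ? 1 : 0;
my $size_kb  = (-s $file) / 1024;

# _ reuses the data from the previous test instead of asking the system again.
my $small_and_fresh = ((-s $file) < 100_000 && (-M _) < 1) ? 1 : 0;

# stat returns a 13-element list; element 7 is the size in bytes.
my $bytes = (stat $file)[7];

print "size: $size_kb KB ($bytes bytes)\n";
```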
DIRECTORY OPERATION
chdir - like cd, chdir "/etc" or die "can't $!";
globbing - perl show-args *.pm (the shell expands *.pm into @ARGV)
  my @all_files = glob ".* *"; (multiple patterns)
  my @pm_files = glob "*.pm";
  my @pm_files = <*.pm>; (same as above)
file handle and glob - my @file=<FRED/*>; (glob)
  my @file=<FRED>; (file handle)
  my @file = readline FRED; (same as above)
directory handle - opendir, foreach (readdir DH), closedir
readdir returns only the file names, so the directory path has to be prepended to each one
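A sketch of a directory handle next to glob, assuming a Unix-like system and a scratch directory from the core File::Temp module. The file names are made up, and lexical handles are used here instead of the bareword handles in the notes.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );    # core module, used only for a scratch directory
use File::Spec;

# Scratch directory with three empty files (hypothetical names).
my $dir = tempdir(CLEANUP => 1);
foreach my $name (qw( a.pm b.pm notes.txt )) {
    my $path = File::Spec->catfile($dir, $name);
    open(my $fh, ">", $path) or die "cannot create $path: $!";
    close $fh;
}

# readdir returns bare names, including . and .., in no particular order.
opendir(my $dh, $dir) or die "cannot open $dir: $!";
my @names = sort grep { $_ ne "." and $_ ne ".." } readdir $dh;
closedir $dh;

# Prepend the directory to get usable path names.
my @paths = map { File::Spec->catfile($dir, $_) } @names;

# glob expands a shell-style pattern and returns the matching paths.
my @pm_files = sort glob(File::Spec->catfile($dir, "*.pm"));

print "$_\n" for @names;
```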
MANIPULATING FILES AND DIRECTORIES
remove files - unlink "file.txt" unlink glob "*.o"
rename files - rename "old.txt", "new.txt"
links and files - link "chicken", "egg"; (hard link)
  symlink "dodgson", "carroll"; (symbolic link)
  $where = readlink "carroll"; (get what the symlink points to)
handling directories - mkdir "fred", 0755; (note the leading 0 for octal)
  my ($name, $perm) = @ARGV; mkdir $name, oct($perm);
unlink glob "fred/* fred/.*"; (remove files if any) rmdir glob "fred/*"; (remove all empty dirs under fred)
the above will fail if there are non-empty sub-dirs; use rmtree from File::Path instead
modifying permissions - chmod 0755, "fred", "barney" (the mode must be octal)
change ownership - defined($user=getpwnam "merlyn") or die "bad user";
  defined($group=getgrnam "users") or die "bad group";
  chown $user, $group, glob "home/merlyn/*";
change timestamps - my $now=time; my $ago=$now-24*60*60;
  utime $now, $ago, "abc.txt"; # access time, then modification time; can't change ctime
File::Basename module - perldoc File::Basename
  use File::Basename;
  my $name="/homes/fh240/phd/perl/abc.txt";
  my $basename = basename $name;
  my $dirname = dirname $name;
not import any function - use File::Basename qw/ /; # then use full names
  $dirname = File::Basename::dirname $name;
File::Spec module - an object-oriented module, unlike File::Basename
  my $new_name=File::Spec->catfile($dirname, $basename);
  instead of my $new_name="$dirname/$basename";
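A short sketch of File::Basename and File::Spec working together, assuming a Unix-style path (the one from the notes). On other platforms catfile would use a different separator.

```perl
use strict;
use warnings;
use File::Basename qw( basename dirname );
use File::Spec;

my $name = "/homes/fh240/phd/perl/abc.txt";

my $base = basename $name;     # the last path component
my $dir  = dirname  $name;     # everything before it

# catfile joins the pieces with the separator of the current platform.
my $rebuilt = File::Spec->catfile($dir, $base);

# A suffix pattern passed to basename is stripped from the result.
my $stem = basename($name, qr/\.txt/);

print "$dir + $base -> $rebuilt (stem: $stem)\n";
```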
PROCESS MANAGEMENT
system function - system "date";
  system 'ls -l $HOME'; (' used instead of " so that the shell, not perl, expands $HOME)
  !system "rm abc.txt" or print "failed\n"; (system returns 0 on success)
exec function - exec "date"; # no perl code should follow except die
  die "date couldn't run: $!";
environment variable - $ENV{'PATH'}="mydir:$ENV{'PATH'}";
capture output - $now=`date`; # use backquotes instead of system
  @users=`who`; # in list context, the output is broken up by lines
Each line can then be split on whitespace, like my($user, $tty, $date)=/(\S+)\s+(\S+)\s+(.*)/;
a filehandle can also represent a process, like open F, "find . -name abc.txt -print|" or die "$!";
  while (<F>){ chomp; ... }
process status - $? holds the exit status of a process or system call
  close F; die "can't close" if $?;
child process - defined(my $pid=fork) or die "can't fork $!";
  unless ($pid){ ... } # child process
  waitpid($pid,0); # parent waits for the child
send and receive signals - kill 2, 4201 or die; (send SIGINT to process 4201)
  kill 0, 4201 or die; # 0 only checks whether a signal could be sent
interrupt signal - $SIG{'INT'} = '#a sub routine#'; (the name of a subroutine to call on SIGINT)
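Backquotes, system, and $? can be tried without depending on date or who by running the perl interpreter itself as the child program. This is only a sketch: it assumes that $^X (the path of the running perl) contains no spaces and that the shell accepts double-quoted arguments.

```perl
use strict;
use warnings;

# Backquotes capture standard output as one string in scalar context...
my $answer = `$^X -e "print 6*7"`;

# ...and as one element per line in list context.
my @lines = `$^X -le "print for 1..3"`;
chomp @lines;

# system returns the wait status, 0 on success, so "!" reads as "it worked".
my $worked = !system($^X, "-e", "exit 0") ? 1 : 0;

# After a child finishes, the high byte of $? holds its exit code.
system($^X, "-e", "exit 3");
my $exit_code = $? >> 8;

print "answer: $answer, lines: @lines, exit code: $exit_code\n";
```

Passing system a list of arguments, as in the last two calls, runs the program directly without going through a shell.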
STRINGS AND SORTING
find substring - $where=index($big, $small)
  $where=index($big, $small, $position) - at that position or later
  $last=rindex($big, $small)
  $last=rindex($big, $small, $position) - at that position or earlier
substring - $part=substr($string, $ini, $length) - $ini=-1 is the last char
  $part=substr($string, $ini) - everything from the $ini position on
substitution - my $string="hello, world";
  substr($string, 0, 5)="goodbye"; changes the string above
  substr($string, 0, 5, "goodbye"); same as above
  substr($string, 0) =~ s/fred/barney/g;
substr and index are often faster than regular expression, since they don't have the overhead of the regular expression engine.
sprintf - returns the requested string instead of printing it.
  my $date_tag=sprintf "%4d/%02d...", $yr, $mo, ...;
  In %02d, the leading 0 pads the number with zeros if necessary
sort numbers - sort a_sub @number;
  sub a_sub {if ($a<$b) {-1} elsif ($a>$b) {1} else {0}}
  or simply sort {$a <=> $b} @number (spaceship operator)
sort string - sort { $a cmp $b } @string (swap $a and $b to reverse)
  sort {"\L$a" cmp "\L$b" } @string (case insensitive)
get keys whose values sorted - %score=("barney"=>195, "fred"=>205, "dion"=>30); @winners=sort {$score{$b}<=>$score{$a}} keys %score
sort by multiple keys - if the values are the same, then sort the keys alphabetically
  @winners=sort {$score{$a}<=>$score{$b} or $a cmp $b} keys %score
a compact way to read in data to an array - my @numbers; push @numbers, split while <>;
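The string functions and sort blocks above, in one runnable sketch. The score for wilma is an assumption added to show the tie-break; the other values follow the notes.

```perl
use strict;
use warnings;

my $big = "hello, world";

# index searches forward, rindex backward; both return -1 when nothing is found.
my $first_o = index($big, "o");
my $next_o  = index($big, "o", 5);
my $last_o  = rindex($big, "o");

# A negative start counts from the end of the string.
my $tail = substr($big, -5);

# The four-argument substr replaces the selected part in place.
my $string = "hello, world";
substr($string, 0, 5, "goodbye");

# %02d pads with a leading zero.
my $date_tag = sprintf "%4d/%02d/%02d", 2004, 5, 31;

# Sort names by descending score, breaking ties alphabetically.
my %score = (barney => 195, fred => 205, dino => 30, wilma => 195);
my @winners = sort { $score{$b} <=> $score{$a} or $a cmp $b } keys %score;

# The spaceship operator compares numerically; the default sort compares strings.
my @numeric = sort { $a <=> $b } (10, 9, 100);
my @stringy = sort (10, 9, 100);

print "$string\n$date_tag\n@winners\n";
```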
SIMPLE DATABASE
DBM - dbmopen(%DATA, "my_database", 0644) or die "can't: $!";
dbmclose(%DATA);
access DBM hash - foreach my $key (keys %DATA), but the keys may form a very large list; a better way is while (my ($key, $value) = each(%DATA)) { ... }
access DBM maintained by a C program - \0 needs to be added or removed when needed: my $value=$A{"merlyn\0"}; $value=~ s/\0$//;
one way to save more items under one key is to use pack
  my $buffer=pack("c s l", 31, 4159, 265359); # c7, c* if needed
  my ($char, $short, $long)=unpack("c s l", $buffer);
  if the format letter is a, "a20" is a 20-char ASCII string, null padded
another way is to store the data in a file of fixed-length records
open - open(FRED, "<fred"); open(FRED, "+<fred"); # read and write
open(WILMA, ">wilma"); open(WILMA, "+>wilma"); # read and write, new file
seek - seek(FRED, 55*$no, 0); #55-byte is user-defined
read - $number_read=read(FRED, $buf, 55); # read 55 bytes
  my($name, $age, $score1, $score2, $score3, $score4, $score5, $when)=unpack "a40 C I5 L", $buf;
write - print FRED pack("a40 C I5 L", $name, $age, $new_score, $score1, $score2, $score3, $score4, time);
no hardcode - my $pack_format="a40 C I5 L"; my $pack_length=length pack($pack_format, "abc", 0, 1, 2, 3, 4, 5, 6);
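The record format above can be exercised in memory before any file is involved. In this sketch the field values are made up, and substr on a string stands in for seek and read on a file.

```perl
use strict;
use warnings;

# One record: 40-byte name, unsigned char age, five unsigned ints, unsigned long.
my $pack_format = "a40 C I5 L";

# Compute the record length from the format instead of hardcoding it.
my $pack_length = length pack($pack_format, "", 0, (0) x 5, 0);

my $record = pack($pack_format, "fred", 30, 10, 20, 30, 40, 50, 1085986170);

my ($name, $age, @rest) = unpack($pack_format, $record);
my $when   = pop @rest;        # the trailing L field
my @scores = @rest;            # the five I fields
$name =~ s/\0+$//;             # "a" pads with NUL bytes, so strip them

# Records of a fixed length can be located by arithmetic, as with seek.
my $file_image = $record x 3;
my $second = substr($file_image, 1 * $pack_length, $pack_length);

print "$name, $age, scores: @scores, record length: $pack_length\n";
```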
search and replace in a file - $^I=".bak"; # keep a backup copy
  while(<>){ s/../../; print; }
from command line - perl -p -i.bak -w -e 's/.../.../' fred.txt
update new data but not overwrite -
unless (exists($DATA{$1})){
$DATA{$1}=$.;
}
a better alternative:
$DATA{$1}=$DATA{$1} || $.;
try and catch - eval {...}; print "An error occurred: $@" if $@;
  e.g. my $barney=eval {$fred/$dino}; gives undef if $dino is 0
get list from grep - @matched_lines=grep {/.../} <FILE>;
  @odd_num=grep {$_ % 2} 1..1000;
  a simpler form: @matched_lines=grep /.../, <FILE>; (or @list)
new list from map - map can be thought of as re-formatting a list
  @new=map {...} @data; # no comma before @data
  another syntax: @new=map "...", 0..15
use map to print a list - print map "$_\n", @list;
unquoted hash key - $score{fred} instead of $score{"fred"}; fred is a bareword
  my %score=( barney => 195, fred => 205);
non-greedy quantifier - /fred.+?barney/ matches forward, taking the shortest match
  /fred.+barney/ matches by backtracking, taking the longest match
search and replace on each line - s/../../m (with /m, ^ and $ match at the start and end of every line in the string)
slices - my $card_num=(split /:/)[1]
  my ($card_num, $count)=(split /:/)[1,5]
  my ($first,$last)=(sort @names)[0,-1]
slice of array - print "Bedrock @names[2,4,3,3]\n";
slice of hash - a slice of a hash (or array) is always a list
  @three_scores=@score{qw/ player1 player2 player3 /};
  to write to a hash: @score{qw/player1 player2 player3/}=(19,20,23)
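grep, map, eval, slices, and the non-greedy quantifier from the notes above, in one sketch. The record string and the other sample data are hypothetical.

```perl
use strict;
use warnings;

# grep keeps the elements for which the block is true.
my @odd = grep { $_ % 2 } 1 .. 10;

# map builds a new list by transforming each element.
my @squares = map { $_ * $_ } 1 .. 5;

# eval traps a fatal error; $@ holds the message afterwards.
my $zero     = 0;
my $quotient = eval { 10 / $zero };
my $error    = $@;

# A list slice picks fields out of a split result.
my $record = "fred:1234:x:y:z:7";
my ($card_num, $count) = (split /:/, $record)[1, 5];

# A hash slice reads or writes several keys at once; note the @ sigil.
my %score;
@score{qw/ player1 player2 player3 /} = (19, 20, 23);
my @two = @score{qw/ player1 player3 /};

# Non-greedy .+? stops at the first "barney", greedy .+ at the last.
my $text = "fred and barney and barney";
my ($lazy)   = $text =~ /(fred.+?barney)/;
my ($greedy) = $text =~ /(fred.+barney)/;

print "odd: @odd\nsquares: @squares\nlazy: $lazy\n";
```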
Regular expression - check "Mastering Regular Expressions", Jeffrey Friedl
modules - http://search.cpan.org/ man perlmodinstall
some important modules -
Cwd - use Cwd; $dir=cwd;
Fatal - use Fatal qw/ open chdir /;
chdir '/home'; # no need to write "or die" any more
File::Basename - use File::Basename; basename and dirname
File::Copy - use File::Copy; copy("source", "dest");
File::Spec - File::Spec->curdir to get current directory
File::Spec->catfile ($dirname, $filename)
Image::Size - ($width, $height)=imgsize("fred.jpg");
Net::SMTP - $smtp=Net::SMTP->new($smtp_host,Hello=>$site);
$smtp->mail($from); $smtp->to($to); $smtp->data(); $smtp->datasend("..."); $smtp->dataend(); $smtp->quit;
POSIX - provides asin, cosh, floor, isupper, isalpha, ...
Sys::Hostname - $host=hostname;
Text::Wrap - provides the wrap function for wrapping text into lines
Time::Local - $time=timelocal($sec, $min, $hr, $day, $mon, $year);