
Compare and contrast bash and Perl for simple tasks

Updated on March 20, 2014
Bash and Perl go head to head in today's lineup

How does the Bourne-again shell line up against the Practical Extraction and Report Language?

The stories of Perl and Bash overlap in many ways. Each is affected by the other. It's not entirely fair to pit them head-to-head, as they have different goals and strengths. In my experience, I have found that many of my Perl scripts began life as a shell script.

Shell scripting is a great way to automate complicated tasks. Sometimes I throw together a shell script just to remind myself of the syntax for a particularly complicated command line utility. Other times, a shell script grows in complexity to the point that I fall back to my old standby, Perl.

My goal in this hub is not so much a head-to-head lineup as a demonstration of where the chips fall in my particular approach to the daily grind of network and system administration.


Find and solve your pain point

Before we dive too much deeper into the topic, ask yourself: why are you here? No no no, not existentially. Why do you want to learn more about shell scripting and/or Perl? Because you are a problem solver.

So the first question is always: what is the problem? And because you are naturally a smart person, you will find a solution. The next question follows shortly after: is it worth my time to automate the solution? If you expect to spend more than a few minutes of your average work day on the problem at hand, the answer is almost always yes. If you are insane (my kind of crazy) and you love to write a script just for the joy of creating it, then why are we still asking these questions?

Here's an example of the evolution of a problem that begins with a shell script and ends up in Perl. Or as those who know me well enough might say, the story of my life.

Useful options for tcpdump

-l   buffer output (line-buffered): useful for piping output into other apps
-nn  no lookup: do not resolve hosts or ports
-e   show ethernet: display fields from ethernet frame
-i   specify which interface to listen on

Capture and analyze network traffic

There are countless reasons for capturing network traffic. Sometimes you want to verify communication between specific hosts at a strategic capture point in your network. Sometimes, you want to identify who is going crazy with broadcasts on your LAN because it disrupts everyone else's performance.

My primary workstation is a Mac. I use tcpdump from the command line to sniff packets on the LAN. Knowing that the Mac connects to the LAN on interface en0, I fire off the following from the command line:

tcpdump -lnnei en0 not ether host b9:11:57:3a:20:bd

See the sidebar for a quick rundown on what the command line options mean.

The directive "not ether host b9:11:57:3a:20:bd" leaves off my local adapter's traffic, so I only see broadcast, multicast, or the occasional unicast storm.

We're only interested in a frequency analysis of which MAC address shows up most often, so let's ignore all but the first two columns. We'll do this by piping tcpdump into a quick awk script.

tcpdump -lnnei en0 not ether host b9:11:57:3a:20:bd | awk '{print $1, $2}'

Now instead of a screen full of overwhelming information, I just get the two columns: timestamp and source MAC. But I still don't know who's sending the most broadcasts. Why not modify the awk one-liner to count how many frames are sent by each MAC?

Here's a decision point. If you're comfortable with your command line environment, then maybe you continue firing off one-liners that you modify as you go. Personally, I've got enough invested at this point that I'm willing to dedicate a shell script to the idea.

newline after pipe

Transition to shell script

A shell script is just a text file that gets interpreted by a shell. For editing shell scripts, I am a big fan of vi, but any text editor will do.

The first line needs to be a shebang: pound sign, exclamation mark, path to interpreter.
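For a bash script, that first line usually looks like the one at the top of this sketch. The path /bin/bash is an assumption about your system; the rest of the sketch just asks the shell where bash actually lives, so you can adjust the path if yours differs.

```shell
#!/bin/bash
# The line above is the shebang: "#", then "!", then the interpreter's path.
# /bin/bash is an assumption; "command -v bash" reports the real location.
interpreter=$(command -v bash)
echo "bash lives at: ${interpreter}"
```

If the path printed is not /bin/bash, use the printed path in your shebang instead.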


The next line can be a copy/paste of your most recent command line.

tcpdump -lnnei en0 not ether host b9:11:57:3a:20:bd | awk '{print $1, $2}'

Choose your own destiny, but my quirk is that I like to follow the pipe symbol | with a newline.

tcpdump -lnnei en0 not ether host b9:11:57:3a:20:bd |
awk '{print $1, $2}'

That adds some readability, especially if the pipeline gets too crazy.

Let's take another look at the awk piece. To count by MAC address, we need to tally each one we see. No problem.

tcpdump -lnnei en0 not ether host b9:11:57:3a:20:bd |
awk '{tally[$2]++} END {for(mac in tally) {print mac, tally[mac]}}'

Uh oh. Where's the output? Once you interrupt the packet capture with Ctrl-C, why doesn't the END stanza kick in? That's a by-product of the pipeline: Ctrl-C sends the interrupt to every process in the chain, not just tcpdump, so awk is killed before it ever sees the end of its input. We need to try a different approach.
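As a quick sanity check, the tally logic itself is fine: when awk's input ends normally, the END stanza does fire. Here is a minimal sketch that feeds the same awk program a few canned lines instead of live tcpdump output. The sample frames and the helper name tally_macs are made up for illustration, and the trailing sort is only there to make the output order predictable.

```shell
#!/bin/bash
# tally_macs is a hypothetical helper: it counts how often each value in
# column 2 (the source MAC in tcpdump -e output) appears on standard input.
tally_macs() {
  awk '{tally[$2]++} END {for (mac in tally) print mac, tally[mac]}' | sort
}

# Canned input ends on its own, so awk reaches END and prints the totals.
printf '%s\n' \
  '10:00:00.000001 aa:aa:aa:aa:aa:aa > ff:ff:ff:ff:ff:ff, length 60' \
  '10:00:00.200000 bb:bb:bb:bb:bb:bb > ff:ff:ff:ff:ff:ff, length 60' \
  '10:00:00.400000 aa:aa:aa:aa:aa:aa > ff:ff:ff:ff:ff:ff, length 60' |
tally_macs
```

So the trouble is the interrupt killing awk along with tcpdump, not the counting.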

Since the timestamp changes for each row, let's track how it changes to periodically print out an update. The timestamp has the format of HH:MM:SS.microseconds - if we can lop off the microseconds, the remaining timestamp uniquely identifies which second the frame was received. The substr function in awk will do the trick.
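Here is a minimal sketch of that idea. It is one way to do it, not necessarily the exact script from my screenshot. The helper name count_per_second is made up, and it assumes tcpdump -e style input with the timestamp in column 1 and the source MAC in column 2.

```shell
#!/bin/bash
# count_per_second is a hypothetical helper: it reads tcpdump -e style lines
# on standard input and prints the running per-MAC totals whenever the
# HH:MM:SS part of the timestamp changes.
count_per_second() {
  awk '{
    sec = substr($1, 1, 8)            # HH:MM:SS, microseconds lopped off
    if (prev != "" && sec != prev) {  # a new second has started
      for (mac in tally) print mac, tally[mac]
      print ""                        # blank line between updates
    }
    tally[$2]++                       # running total for this MAC
    prev = sec
  }'
}

# Live usage, with the same capture command as before:
#   tcpdump -lnnei en0 not ether host b9:11:57:3a:20:bd | count_per_second
```

Because the report is triggered by incoming frames, nothing prints during a second in which no frames arrive. The order of MACs within each update is whatever awk's for-in loop yields.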

print updated information once per second

When to transition to Perl

The output now shows blocks of MAC addresses, with total number of hits, updated once per second. As an exercise, go back and modify the awk piece to output the timestamp to separate the updates. What else would you do to improve the report? I'd like to track the frequency of each MAC per second, as well as how many packets per MAC over time, and report only on the currently active MACs.

With practice, you develop your own sense of style. As I mentioned earlier, it comes down to a matter of identifying and relieving pain points in your process. Right about now, the thought of taking this problem space into a Perl solution becomes appealing. Open a text editor, put together a few lines, save and quit. Remember to set the file's mode to executable:

chmod +x

Then pipe the output of tcpdump into the perl script to see the results.

tcpdump -lnnei en0 not ether host b9:11:57:3a:20:bd | ./

print updates on packets per second, per MAC

Sort comparison function explained

Input comparison    Output value
$a < $b             -1
$a == $b            0
$a > $b             1
Taking a simple exercise over the top

From here out, it's all showboating. There may be something below that's useful to you, but at this point, I'm having too much fun to quit.

Read up on Perl package management via CPAN and cpanminus. Install the Curses module.

The code below dives into a few topics I haven't yet covered in any of my hubs.

For example, signal handling is a method for applications to trap signals (like the one sent by hitting Ctrl-C) to define non-default behavior. In this case (the $SIG{INT} assignment below), I set up an anonymous subroutine to clean up the Curses module and exit cleanly.

In the open call, I opened a filehandle to the output of a child process to simplify the invocation of the MAC counter. Otherwise, the tcpdump utility would have to be launched externally to the perl script, and its output piped into the script's input.

I use split to break apart the input along set delimiters. In the split call at the top of the while loop, I specify the delimiter as \s+ or "one or more whitespace characters".

Perl's built-in sort routine can take a comparison subroutine to override its default behavior. It may be confusing at first blush, but the format of the comparison subroutine expects two inputs ($a and $b) and one output (-1, 0, or 1). See the table "Sort comparison function explained" for more information.
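For contrast with the shell half of this comparison: a shell script has no comparison subroutine to hand over, so the busiest-first ordering usually falls to the external sort utility. This is a minimal sketch of that counterpart, not part of the Perl script. It assumes "MAC count" lines like the ones the awk tally prints, and sort_by_count is a made-up helper name.

```shell
#!/bin/bash
# sort_by_count is a hypothetical helper: it orders "MAC count" lines with
# the largest count first.
sort_by_count() {
  # -k2,2 limits the sort key to column 2; n compares numerically;
  # r reverses the order so the biggest number comes out on top.
  sort -k2,2nr
}

printf '%s\n' 'aa:aa:aa:aa:aa:aa 3' 'bb:bb:bb:bb:bb:bb 10' 'cc:cc:cc:cc:cc:cc 1' |
sort_by_count
```

The n modifier matters here: without it, sort compares the counts as text, and 10 would land below 3.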

Many of the unfamiliar functions listed in the script below - initscr, noecho, cbreak, addstr, refresh - are defined on the CPAN page for Curses. Importing the Curses module also brings in variables such as $LINES and $COLS that describe the screen's dimensions.

Curses-based MAC counter


#!/usr/bin/perl
# Author: Jeff Wilson
# Created: 2014
# License: GPL 3.0 ... no warranty, free to re-use in any way

use warnings;
use strict;
use Curses;

# initialize Curses environment
initscr(); noecho(); cbreak();
# register interrupt handler
$SIG{INT} = sub {
  endwin();
  print "Quitting\n";
  exit(0);
};

my %tally;
my ($ts,$prev);

# open tcpdump as a process handle
open(my $ph,"tcpdump -lnnei en0 not ether host b9:11:57:3a:20:bd 2>/dev/null |") or do { endwin(); die "Cannot run tcpdump: $!\n" };
while (<$ph>) {

  # only the first two columns matter, discard the rest
  my ($ts,$mac) = split /\s+/,$_,3;

  # not interested unless there's a MAC in column 2
  next unless (defined($mac) && $mac =~ m/([0-9a-fA-F]{2}\:){5}[0-9a-fA-F]{2}/);

  # grab the first 8 off the timestamp
  $ts = substr($ts,0,8);

  # initialize $prev to current timestamp
  $prev = $ts unless (defined($prev));

  # update screen if this row isn't in the same second
  unless ($prev eq $ts) {

    # clear previous info off the screen
    for (my $row=3; $row < $LINES-2; $row++) {
      addstr($row,0,' ' x $COLS);
    }

    # update timestamp (row 0 shows the second being reported)
    addstr(0,0,$prev);

    # format header row
    addstr(1,0,sprintf("%-17s %5s %10s",qw/MAC pps total/));
    addstr(2,0,sprintf("%17s %5s %10s",'-' x 17, '-' x 5, '-' x 10));

    # keep track of which row to update onscreen
    # skip first three, since they're already updated
    my $row=3;
    # walk through each MAC, sorting by most active overall
    for my $m (sort {
                $tally{$b}{total} <=> $tally{$a}{total}
        } keys %tally) {
      my $c = 0;
      # get updates if any for this MAC this past second
      if (defined($tally{$m}{$prev})) {
        # remove the record as we read its value
        $c = delete $tally{$m}{$prev};
      }
      # report MAC's total count with each update
      addstr($row++,0,sprintf("%-17s %5d %10d",$m,$c,$tally{$m}{total}));
      # don't update past the last line onscreen
      last if ($row > $LINES-2);
    }
  }

  # track PPS per MAC
  $tally{$mac}{$ts}++;
  # track total packet count per MAC
  $tally{$mac}{total}++;
  # move previous timestamp forward
  $prev = $ts;

  # push update out to screen
  refresh();
}

# never reaches this point, unless tcpdump fails
endwin();
