Parsing Apache Logs with tail, cut, sort, and uniq

A client experienced some intermittent website downtime during the final few days of April 2021 and sent over that month’s Apache logs so I could check for anything out of the ordinary: excessive crawling, excessive probing, brute-force password attacks, things of that nature. Below are a few commands I used that I thought would be nice to keep handy for future use. I am currently using Ubuntu 20.04 LTS.

While unrelated, just to form a complete picture, my client sent the logs to me in gz-compressed format. If you are not familiar with how to uncompress it, it is fairly straightforward:

gunzip 2021-APR.gz
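Note that gunzip replaces the .gz file with the uncompressed one. If you would rather keep the archive as received, the -c option writes to standard output instead. Here is a minimal sketch on a throwaway file; sample.log and its contents are made up for illustration.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for the client's archive (hypothetical content).
printf 'line one\nline two\n' > sample.log
gzip sample.log                      # produces sample.log.gz, removes sample.log

# -c sends the decompressed data to stdout, leaving the .gz in place.
gunzip -c sample.log.gz > filename.log

# The same trick lets you peek at the end of the archive without writing a file.
last_line=$(gunzip -c sample.log.gz | tail -n 1)
echo "$last_line"
```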

Back on topic… I ended up parsing the file in three separate ways to get an overall view of things. I found that the final few days of April are represented in roughly the final 15,000 lines of the log file, so I decided to use the tail command as my main tool.

First, I ran the command below to find which IP addresses hit the server the most:

tail -n 15000 filename.log | cut -f 1 -d ' ' | sort | uniq -c | sort -nr | more

Quick explanation:

  • The tail command pulls the final 15,000 lines from the log file (final few days of the month)
  • The cut command parses each line using a space delimiter and returns the first field (the IP address)
  • The sort command sorts the results thus far, so identical IP addresses end up on adjacent lines
  • The uniq -c command collapses each run of adjacent identical lines into one and prefixes it with a count
  • The second sort command sorts numerically in reverse (-nr) so the highest count is on top
  • Finally, the more command creates screen-sized pagination so it’s easier to read
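To see the steps above in action without a real log, here is a small sketch that runs the same pipeline on a made-up file. The six log lines and the IP addresses (taken from the reserved documentation ranges) are invented for illustration, and the trailing awk is only there to trim the left padding that uniq -c adds.

```shell
# Build a tiny stand-in for filename.log (made-up requests).
log=$(mktemp)
cat > "$log" <<'EOF'
203.0.113.5 - - [30/Apr/2021:10:15:02 +0000] "GET / HTTP/1.1" 200 512
198.51.100.7 - - [30/Apr/2021:10:15:09 +0000] "GET /login HTTP/1.1" 200 734
203.0.113.5 - - [30/Apr/2021:10:15:40 +0000] "POST /login HTTP/1.1" 401 128
203.0.113.5 - - [30/Apr/2021:11:02:13 +0000] "POST /login HTTP/1.1" 401 128
198.51.100.7 - - [30/Apr/2021:11:02:45 +0000] "GET /about HTTP/1.1" 200 901
192.0.2.44 - - [30/Apr/2021:11:03:00 +0000] "GET / HTTP/1.1" 200 512
EOF

# Same pipeline as in the post, minus the pager.
top_ips=$(tail -n 15000 "$log" | cut -f 1 -d ' ' | sort | uniq -c | sort -nr \
  | awk '{print $1, $2}')
echo "$top_ips"
# 3 203.0.113.5
# 2 198.51.100.7
# 1 192.0.2.44
```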

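There is always more than one way to do something in Linux, of course. Just as an aside, the following functions very similarly, with awk picking out the first field instead of cut. Note that tail still has to come first; placed at the end of the pipeline it would trim the list of counts rather than the log itself.

tail -n 15000 filename.log | awk '{print $1}' | sort | uniq -c | sort -nr | more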

Then, I thought it would be nice to get an idea of how many requests were made per hour. This can be achieved with the command below.

tail -n 15000 filename.log | cut -f 4 -d ' ' | cut -f 1,2 -d ':' | sort | uniq -c | more

The main difference here is that I opted for the 4th (rather than 1st) field in the first cut command, which gets me the timestamp (rather than the IP address). A second cut command then splits that timestamp on the colon symbol and returns the first (date) and second (hour) pieces for grouping.
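Here is a short sketch of what each cut stage produces, again on a made-up six-line log (the requests and IP addresses are invented). It assumes the default Apache log layout, where the timestamp is the fourth space-separated field.

```shell
log=$(mktemp)
cat > "$log" <<'EOF'
203.0.113.5 - - [30/Apr/2021:10:15:02 +0000] "GET / HTTP/1.1" 200 512
198.51.100.7 - - [30/Apr/2021:10:15:09 +0000] "GET /login HTTP/1.1" 200 734
203.0.113.5 - - [30/Apr/2021:10:15:40 +0000] "POST /login HTTP/1.1" 401 128
203.0.113.5 - - [30/Apr/2021:11:02:13 +0000] "POST /login HTTP/1.1" 401 128
198.51.100.7 - - [30/Apr/2021:11:02:45 +0000] "GET /about HTTP/1.1" 200 901
192.0.2.44 - - [30/Apr/2021:11:03:00 +0000] "GET / HTTP/1.1" 200 512
EOF

# Step 1: field 4 is the timestamp, still carrying its leading bracket.
stamp=$(head -n 1 "$log" | cut -f 4 -d ' ')
echo "$stamp"        # [30/Apr/2021:10:15:02

# Step 2: keeping colon-separated pieces 1 and 2 leaves the date and the hour.
hour_key=$(echo "$stamp" | cut -f 1,2 -d ':')
echo "$hour_key"     # [30/Apr/2021:10

# Full per-hour count; awk only trims uniq's left padding.
per_hour=$(tail -n 15000 "$log" | cut -f 4 -d ' ' | cut -f 1,2 -d ':' | sort | uniq -c \
  | awk '{print $1, $2}')
echo "$per_hour"
# 3 [30/Apr/2021:10
# 3 [30/Apr/2021:11
```

One thing to keep in mind: sort orders these keys as text, not as dates. Within a single month that matches chronological order because the day and hour are zero-padded, but it would not hold across different months.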

Finally, I tweaked it a little bit more to get an idea of whether there were excessive requests within any one-minute span. This can be achieved by expanding the second cut command slightly, as per below.

tail -n 15000 filename.log | cut -f 4 -d ' ' | cut -f 1,2,3 -d ':' | sort | uniq -c | more
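If you only care about the worst offenders, one possible variation (not something I ran against the client's log) is to sort the per-minute counts numerically and keep the top few. The sketch below uses a made-up six-line log; the choice of three rows for head is arbitrary.

```shell
log=$(mktemp)
cat > "$log" <<'EOF'
203.0.113.5 - - [30/Apr/2021:10:15:02 +0000] "GET / HTTP/1.1" 200 512
198.51.100.7 - - [30/Apr/2021:10:15:09 +0000] "GET /login HTTP/1.1" 200 734
203.0.113.5 - - [30/Apr/2021:10:15:40 +0000] "POST /login HTTP/1.1" 401 128
203.0.113.5 - - [30/Apr/2021:11:02:13 +0000] "POST /login HTTP/1.1" 401 128
198.51.100.7 - - [30/Apr/2021:11:02:45 +0000] "GET /about HTTP/1.1" 200 901
192.0.2.44 - - [30/Apr/2021:11:03:00 +0000] "GET / HTTP/1.1" 200 512
EOF

# Per-minute counts, busiest first, top three only; awk trims uniq's padding.
busiest=$(tail -n 15000 "$log" | cut -f 4 -d ' ' | cut -f 1,2,3 -d ':' | sort | uniq -c \
  | sort -nr | head -n 3 | awk '{print $1, $2}')
echo "$busiest"
# 3 [30/Apr/2021:10:15
# 2 [30/Apr/2021:11:02
# 1 [30/Apr/2021:11:03
```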

Installing xRDP on Ubuntu

I installed xRDP on my Ubuntu 16.04 LTS so that I can easily connect to my Linux box from any Windows machine using the remote desktop tool that comes by default with every Windows installation. The actual installation is very simple:

sudo apt-get install xrdp

Through another post, you will see that I have the Xfce4 desktop environment installed:

https://www.dev-notes.com/blog/2018/03/30/installing-xfce4-desktop-environment/

Xfce4 plays very well with xRDP. To launch Xfce4 when an xRDP session is connected, write this line to your .xsession file (note that the command below overwrites any existing ~/.xsession):

echo xfce4-session >~/.xsession

And then edit your startwm.sh file using your favorite text editor (nano is used below as example); add “startxfce4” without the quotes to the end of that file.

sudo nano /etc/xrdp/startwm.sh
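If you prefer to script this step rather than open an editor, here is a minimal sketch of an append that is safe to run more than once. The helper name append_once is my own invention, and the demo works on a scratch file; the real target, /etc/xrdp/startwm.sh, needs root.

```shell
# Append a line to a file only if that exact line is not already present.
append_once() {
  line=$1
  file=$2
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Scratch stand-in for startwm.sh (hypothetical one-line content).
scratch=$(mktemp)
printf '#!/bin/sh\n' > "$scratch"

append_once startxfce4 "$scratch"
append_once startxfce4 "$scratch"    # second call changes nothing
tail -n 1 "$scratch"
```

Against the real file, the equivalent one-off would be something along the lines of: echo startxfce4 | sudo tee -a /etc/xrdp/startwm.sh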

At this point, you can either restart the machine, or run the two commands below to ensure xrdp is ready to accept connections.

sudo service xrdp restart
sudo /usr/sbin/xrdp-sesman

Finally, I noticed that when I am connected from a Windows machine, my tab key did not work correctly, causing me to lose the ability to autocomplete file names, among other things. As it turned out, the fix is very easy. Again, launch your favorite text editor to open up this Xfce configuration file:

nano ~/.config/xfce4/xfconf/xfce-perchannel-xml/xfce4-keyboard-shortcuts.xml

Look for this line:

<property name="&lt;Super&gt;Tab" type="string" value="switch_window_key" />

And modify it to this line below:

<property name="&lt;Super&gt;Tab" type="empty" />
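The same change can be made with sed instead of a text editor. This is only a sketch on a scratch file containing the one relevant line; the pattern accepts the angle brackets either literally or as XML entities (&lt; and &gt;), and -i.bak leaves a backup copy next to the edited file.

```shell
# Scratch file with just the relevant line; the real file lives under ~/.config/xfce4/.
xml=$(mktemp)
cat > "$xml" <<'EOF'
<property name="&lt;Super&gt;Tab" type="string" value="switch_window_key"/>
EOF

# -E enables extended regex; \1 puts the name attribute back exactly as it was found.
sed -E -i.bak \
  's#(<property name="(&lt;|<)Super(&gt;|>)Tab") type="string" value="switch_window_key" ?/>#\1 type="empty"/>#' \
  "$xml"

result=$(cat "$xml")
echo "$result"
# <property name="&lt;Super&gt;Tab" type="empty"/>
```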

Installing Xfce4 Desktop Environment

Xfce is a lightweight desktop environment for Linux that is suitable for those who prefer to not waste system resources on eye candy, for those who prefer to keep things simple, or for those who want to add a few more years of useful life to older machines. Below are the steps I followed to install Xfce4 on my machine running Ubuntu 16.04 LTS.

https://xfce.org

1. Minimally, I needed to install the main Xfce4 package.

sudo apt-get install xfce4

One of the first things I did was adjust how the clock is displayed. This guide to the format codes helped me get started:

%% a literal %
%a locale's abbreviated weekday name (e.g., Sun)
%A locale's full weekday name (e.g., Sunday)
%b locale's abbreviated month name (e.g., Jan)
%B locale's full month name (e.g., January)
%c locale's date and time (e.g., Thu Mar  3 23:05:25 2005)
%C century; like %Y, except omit last two digits (e.g., 21)
%d day of month (e.g., 01)
%D date; same as %m/%d/%y
%e day of month, space padded; same as %_d
%F full date; same as %Y-%m-%d
%g last two digits of year of ISO week number (see %G)
%G year of ISO week number (see %V); normally useful only with %V
%h same as %b
%H hour (00..23)
%I hour (01..12)
%j day of year (001..366)
%k hour ( 0..23)
%l hour ( 1..12)
%m month (01..12)
%M minute (00..59)
%n a newline
%p locale's equivalent of either AM or PM; blank if not known
%P like %p, but lower case
%r locale's 12-hour clock time (e.g., 11:11:04 PM)
%R 24-hour hour and minute; same as %H:%M
%s seconds since 1970-01-01 00:00:00 UTC
%S second (00..60)
%t a tab
%T time; same as %H:%M:%S
%u day of week (1..7); 1 is Monday
%U week number of year, with Sunday as first day of week (00..53)
%V ISO week number, with Monday as first day of week (01..53)
%w day of week (0..6); 0 is Sunday
%W week number of year, with Monday as first day of week (00..53)
%x locale's date representation (e.g., 12/31/99)
%X locale's time representation (e.g., 23:13:48)
%y last two digits of year (00..99)
%Y year
%z +hhmm numeric timezone (e.g., -0400)
%Z alphabetic time zone abbreviation (e.g., EDT)

If you are curious, my setup is:

%a, %d %b %Y, %r

Which translates to, for example, “Fri, 30 Mar 2018, 10:25:25 PM”
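The codes listed above are the same ones the date command understands, so a format can be previewed in a terminal before typing it into the clock settings. A small sketch, assuming GNU date (the -d option is GNU-specific) and pinning the locale to C so the day and month names come out in English:

```shell
# LC_ALL=C fixes the locale; -d formats the given timestamp instead of "now".
preview=$(LC_ALL=C date -d '2018-03-30 22:25:25' '+%a, %d %b %Y, %r')
echo "$preview"
# Fri, 30 Mar 2018, 10:25:25 PM
```

Dropping the -d argument shows the current time in the same format.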

Xfce follows the “do one thing, and do it well” philosophy, so it is literally just a desktop environment and nothing else. Read on to see the few additional packages I installed as add-ons for my Xfce installation.

2. Out of the box, Xfce did not come with an application menu. I opted for Whisker menu.

sudo add-apt-repository ppa:gottcode/gcppa
sudo apt-get update
sudo apt-get install xfce4-whiskermenu-plugin

You can customize various things with Whisker menu for the right look and usability that suits you.

3. I had installed this on a laptop, so it would be nice to display a battery meter. This can be done through xfce4-power-manager. As a bonus, this package also gave the ability to adjust screen brightness via a GUI tool.

sudo apt-get install xfce4-power-manager

4. There are tons of screenshot tools available for Linux, and there are actually several that are better than the Xfce one. I installed the Xfce screenshot tool nevertheless, to try it out as part of the greater Xfce offering.

sudo apt-get install xfce4-screenshooter-plugin

I set up a keyboard shortcut on the Print Screen key. The shortcut runs:

xfce4-screenshooter -w -s ~/pics/screenshots/

I had wanted to try out Xfce purely out of curiosity, but I did get a nice bonus out of it: Xfce4 plays well with xRDP, which means I could easily open a remote desktop session to my Linux machine from any Windows machine, since remote desktop comes installed by default on Windows. For details on how I installed and configured xRDP, please see:

https://www.dev-notes.com/blog/2018/03/30/installing-xrdp-on-ubuntu/

Turn your Mythbuntu box into a file server

As it turns out, because Mythbuntu already has Samba built in, it is easy to make it a file server in your home network by using the same tool. First, identify a place where you want to open up a new folder for file sharing. In this example, we’ll do “/fileserver/”. Then, let us issue the following command to edit the Samba config file.

sudo pico /etc/samba/smb.conf

Note the “sudo” command; it will require you to put in your password before editing. Once the text editor pico loads up, add the following section at the end of the configuration file.

[files]
comment = Files
path = /fileserver/
public = yes
writable = yes
create mask = 0660
directory mask = 0770
force user = mythtv
force group = mythtv
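A quick note on the two mask lines, as I understand the smb.conf documentation: Samba bitwise-ANDs the permissions it would otherwise give a new file or directory with the mask, so the mask acts as an upper limit rather than a fixed mode. The sketch below just does that arithmetic; apply_mask is a made-up helper, and the "requested" modes 0777 and 0744 are illustrative assumptions.

```shell
# Print (requested mode AND mask) as a four-digit octal number.
apply_mask() {
  printf '%04o\n' $(( $1 & $2 ))
}

file_full=$(apply_mask 0777 0660)     # everything requested, file mask
file_partial=$(apply_mask 0744 0660)  # rwxr--r-- requested, file mask
dir_full=$(apply_mask 0777 0770)      # everything requested, directory mask

echo "$file_full $file_partial $dir_full"
# 0660 0640 0770
```

In other words, with these settings nobody outside the mythtv user and group gets any access to what is written through the share.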

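Once this is done, restart the Samba daemon from your command line. If your release has no smb init script, look for one named samba or smbd in /etc/init.d/ instead.

sudo /etc/init.d/smb restart

You should now be able to access the new “files” Samba share from another computer on your network.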

Set a static IP address for Mythbuntu

The process to set a static IP address for a Mythbuntu box is very similar to Ubuntu, given that Mythbuntu is built upon Ubuntu. In Mythbuntu, choose “Utilities/Setup” from the main menu, then “Setup”, and finally “Mythbuntu”. After putting in the password, you will be presented with the Mythbuntu Control Centre. Under “Advance Management”, you will be able to “Launch Terminal”.

First, check out what your current settings are.

ifconfig -a

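Take note of your current broadcast address and subnet mask. ifconfig does not list the gateway; running route -n will show it in the Gateway column of the default (0.0.0.0) route. Also, have you decided on an IP address for your Mythbuntu box yet?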

Next, issue this command to modify the Mythbuntu box’s network settings.

sudo pico /etc/network/interfaces

In the pico text editor, you will probably only see a set of “lo”, or loopback, settings. Let’s add a new set for “eth0” so that the whole content of the file looks something like the following. In my case, I set the box’s static IP address to 192.168.1.3; your addresses may be different.

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
	address 192.168.1.3
	netmask 255.255.255.0
	network 192.168.1.0
	broadcast 192.168.1.255
	gateway 192.168.1.1
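If you are unsure what to put on the network and broadcast lines, both follow from the address and netmask: the network is the address ANDed with the mask, and the broadcast is the network with all host bits set. Here is a small sketch of that arithmetic in plain shell; the helper names ip_to_int and int_to_ip are my own, and the values are the ones from the example above.

```shell
# Convert a dotted-quad address to a 32-bit integer.
ip_to_int() {
  # Split on dots via a here-document so this also works in plain sh.
  IFS=. read -r o1 o2 o3 o4 <<EOF
$1
EOF
  echo $(( (o1 << 24) | (o2 << 16) | (o3 << 8) | o4 ))
}

# Convert a 32-bit integer back to dotted-quad form.
int_to_ip() {
  echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

address=192.168.1.3
netmask=255.255.255.0

addr_int=$(ip_to_int "$address")
mask_int=$(ip_to_int "$netmask")

# network = address AND mask; broadcast = network OR inverted mask (kept to 32 bits).
network=$(int_to_ip $(( addr_int & mask_int )))
broadcast=$(int_to_ip $(( (addr_int & mask_int) | (~mask_int & 0xFFFFFFFF) )))

echo "network $network"
echo "broadcast $broadcast"
# network 192.168.1.0
# broadcast 192.168.1.255
```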

Finally, restart networking with the following command so the new settings take effect.

sudo /etc/init.d/networking restart

That’s it! You should now be able to reach the Mythbuntu box at the same IP address every time. No more guessing what the address might have changed to whenever you turn it on!