Creating a Penetration Testing Web Server Using Gearman & Supervisor, Part 1: Installation & Basic Usage
I was tasked with building a pen test web server for my company, i.e. an easy to use and relatively attractive web interface that allows our employees to quickly and quietly launch various tools, using predefined settings and a few user supplied parameters. The following requirements formed version 0.1:
- Minimal effort required by the user for each tool, i.e. enter a few details, such as scan name and IP address, and hit scan.
- Be able to run multiple instances of a tool at once, i.e. have individual tools running in parallel.
- Be able to limit the number of instances allowed for each tool.
- Include functionality to encrypt and then email scan results to the user, if so desired.
- Scale well as the company grows.
I originally looked at using one of the several built-in PHP functions that allow shell command execution, such as shell_exec or exec, to launch the tool, and then using the shell’s nohup and &, in combination with >/dev/null and 2>&1, to quietly background the resulting processes. It quickly became clear, however, that this was a very inelegant approach: it would make the spawned processes very difficult to manage effectively and would not scale well at all. As a result, I began to look for other (open source) solutions and found Gearman and its PHP wrapper, used in conjunction with Supervisor, to best suit my needs.
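For reference, the backgrounding pattern described above looks roughly like the sketch below. It is only an illustration: launch_quietly is a hypothetical helper name, and sleep 30 stands in for a real tool. Note that the only thing the caller gets back is a PID, which hints at why managing many such processes by hand gets awkward.

```shell
# Sketch of the "nohup ... >/dev/null 2>&1 &" approach.
# launch_quietly is a hypothetical helper; 'sleep 30' stands in for a real tool.
launch_quietly() {
    # nohup: survive hangups; >/dev/null 2>&1: discard stdout and stderr; &: background.
    nohup "$@" >/dev/null 2>&1 &
    echo "$!"   # the PID is the only handle we keep on the process
}

pid=$(launch_quietly sleep 30)

# Any later management (status checks, instance limits, cleanup) has to be
# built by hand on top of raw PIDs.
if kill -0 "$pid" 2>/dev/null; then
    status="running"
else
    status="stopped"
fi
echo "process $pid is $status"

# Clean up the stand-in process.
kill "$pid"
```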
I initially struggled to install and operate Gearman, as I was new to PHP and, whilst I regularly use Linux, I am by no means a guru. I also found the Gearman documentation to be somewhat lacking, though my inexperience may well have played a large part in this. Therefore, I took it upon myself to carefully record every step I took and document it here, in the hope that it helps someone else in a similar position.
Gearman is basically a queue and load balancing system that allows multiple tasks to be run in parallel. Many large sites use it, such as Yahoo and Digg. It consists of three main components: a client, a worker and the job server. Here is a high-level description of their roles in the context of a pen test web server:
The client is the PHP page that collects the user-supplied details, such as IP range and scan name, and then submits them to the job server, together with any predefined parameters and the type of worker (i.e. which tool) that should process the data. The job server then takes this data and, depending on which worker type we chose, farms it out to an available worker on the server with the least load. The worker receives the data and launches the appropriate tool with the supplied parameters. If there are no available workers, the job server queues the task until one becomes available.
If we take Nmap as an example, we will have an HTML page where we enter our scan parameters: the type of scan (e.g. TCP or UDP), the IP range to scan and the name we want the scan to be saved as. These details will then be posted to our client PHP page, which will put them into an array, indicate that we want Nmap workers to act upon the data and then submit it to the job server. The job server will then choose the best-suited Nmap worker (based on availability and server load) and pass the data on to it. The Nmap worker will then process the data and launch an instance of Nmap with the user-supplied details and corresponding flags.
This is basically how Gearman works; however, in reality it is slightly more complicated than this and so I do recommend reading this excellent blog post and looking through this slide show for a more in-depth description.
Now that we understand how it operates, let’s get it installed. Below is a step-by-step list of instructions to get it up and running, tried and tested on the latest version of Ubuntu, currently 11.10. I installed and operate Gearman (and Supervisor) as root, so be sure to sudo su before you begin (or add sudo before each command). (Note: the version of Gearman in the apt-get repository is out of date, so it must be installed manually.)
1. Download the latest version of Gearman (currently 0.28) here
2. Download the latest version of its PHP wrapper (currently 1.0.1) here
3. Install Gearman’s dependencies:
apt-get install g++ libboost-program-options-dev libboost-thread-dev libevent-dev uuid-dev libcloog-ppl0
4. Install Gearman:
tar -xf gearmand-0.28.tar.gz
5. Install the PHP wrapper dependencies:
apt-get install php5 php5-cli php5-dev
6. Install the PHP wrapper:
tar -xf gearman-1.0.1.tgz
7. Add the Gearman module to all your php.ini files:
Use find / -name php.ini to find your php.ini file locations (mine were /etc/php5/cli/php.ini and /etc/php5/apache2/php.ini).
Add extension=gearman.so to the file and save.
8. Check to see if module is installed properly:
php --info | grep "gearman support"
Should produce: gearman support => enabled
9. Start Gearman:
gearmand -d -L 127.0.0.1 -l /path/to/logfile/gearman.log
-d starts the job server in daemon mode, so it detaches from the terminal. -L binds Gearman to an address, which is necessary for security reasons as by default Gearman binds to all interfaces. -l enables logging to the given file; remember to create the log file first: touch gearman.log && chmod 755 gearman.log
10. Check that Gearman is running properly:
lsof -i -P | grep gearmand
Should output something like:
gearmand 26501 root 8u IPv4 69854 0t0 TCP *:4730 (LISTEN)
gearmand 26501 root 9u IPv6 69855 0t0 TCP *:4730 (LISTEN)
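Step 7 above edits each php.ini by hand. If you end up repeating the install, a small helper can make that step safe to re-run. The sketch below is one way to do it: enable_gearman is a hypothetical name, and it is shown on a scratch file rather than the real php.ini files.

```shell
# enable_gearman is a hypothetical helper: it appends the extension line to a
# php.ini file only if that exact line is not already present.
enable_gearman() {
    ini_file="$1"
    if ! grep -qxF 'extension=gearman.so' "$ini_file"; then
        echo 'extension=gearman.so' >> "$ini_file"
    fi
}

# Demonstrated on a scratch file; on the system described above the real
# targets were /etc/php5/cli/php.ini and /etc/php5/apache2/php.ini.
demo_ini=$(mktemp)
echo '; scratch php.ini for demonstration' > "$demo_ini"

enable_gearman "$demo_ini"
enable_gearman "$demo_ini"   # second call is a no-op

# Count how many times the line appears (1 if the helper is idempotent).
grep -c 'extension=gearman.so' "$demo_ini"
```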
Note: if you ever need to restart Gearman (to clear the queue, for example), use this command string, which sends a shutdown command directly to the Gearman service using Netcat, after which you can start it again with the standard gearmand -d: (echo shutdown ; sleep 0.1) | netcat 127.0.0.1 4730 -w 1
Now that we have the Gearman job server installed and running, let’s look at the code behind the two key components, the client and the worker.
As described above, the client gathers up and submits the user-supplied parameters to the job server. The best way to explain how this is done is via an example, so below is the code for a TCP quick scan (i.e. the default --top-ports 1000) that forms part of my Nmap client code. Three user-supplied parameters are posted from the HTML form: the IP range to be scanned, the name of the job the tester is currently on and what they wish the scan to be saved as. If this all seems difficult to visualize at the moment, fear not, as we will walk through an example later on.
If you’re familiar with PHP, most of this should make sense but let’s take a look through it line by line just to be safe:
Line 2: We use a header here to redirect the user back to the Nmap scan page once the scan has been launched
Line 4: Here we are simply saying that if there is data in the posted IP_range_TCP_Quick variable, i.e. this scan option has been selected by the user, then run the code below.
Lines 6-8: In these lines we grab the posted data and assign it to variables for processing. We use trim to remove any leading or trailing whitespace and escapeshellarg as a security measure. escapeshellarg wraps the user-supplied input in single quotes and escapes any single quotes within it, so the data reaches the bash shell as a single argument and cannot be used to append or prepend malicious bash commands.
Lines 10-14: Here we create an array of all variables that we want to send to the job server. It is important to note that we need to serialize the data here, otherwise it will lose its type and structure when submitted to the job server, which will prevent the worker from being able to process it correctly.
Line 17: We then call the method addServer(), in which we add our job server details. We will be running Gearman on the loopback and use the default port of 4730.
Line 18: Next we add a task to be run. We use addTaskBackground() to create a background task, so we don’t have to wait for the task to complete. If we used the normal addTask(), the client PHP page would hang while it waits for the job to finish. We supply the name of our serialized array containing our data and the name of the tool we wish to use. (If you have read the two links containing a more detailed description of Gearman, you will know that it’s actually the name of the function we wish to use; if you haven’t, do not worry, as it is a relatively minor detail.)
Line 19: Finally we call the method to run the tasks, thus submitting the data to the job server.
As you can see, the client performs no action other than to gather up the data and submit it to the Gearman job server. The Gearman server will then automatically farm the task out to the best suited worker, which contains the necessary code to start the tool with the user supplied data.
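To make the escapeshellarg step a little more concrete, here is a rough shell imitation of the quoting described for lines 6-8. It is a sketch for illustration only: shell_quote is a hypothetical helper, not PHP’s implementation, and the hostile input is made up.

```shell
# shell_quote is a hypothetical helper imitating the behaviour described above:
# wrap the value in single quotes and escape any single quotes inside it.
shell_quote() {
    # Each ' becomes '\'' (close quote, escaped quote, reopen quote).
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# A made-up input that tries to smuggle in a second command.
hostile="127.0.0.1; echo pwned"
quoted=$(shell_quote "$hostile")
echo "$quoted"

# Passed back through the shell, the quoted value is one literal argument,
# so the "; echo pwned" part is never run as a command.
eval "set -- $quoted"
echo "$# argument(s): $1"
```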
The worker is the component that actually performs the processing of the data, which in our case means starting the tools with the appropriate flags and the user-supplied data passed from the client. As before, an example is the best way to illustrate how it works, and I will keep it basic here so it’s easier to understand. In my next blog post, however, I will describe how to add effective logging, add code to ensure files aren’t overwritten and include functionality to encrypt and email scan results. It’s also worth noting that, because the scan flags are added in the client code, this worker can complete any type of Nmap scan.
Lines 3-4: As in the client, we make a new worker object and add a server with the appropriate settings.
Line 5: Here is where we register the name of our function, which will be performing the scanning, with the Gearman job server. So, in this case, we are giving the function Perform_Nmap_Scan an alias of Nmap. If you scroll back up to the client code, you can see we use this alias when sending the serialized array, so that the job server knows which worker it should send the received data to.
Line 9: This is just a simple while loop that ensures the worker waits for a job.
Line 11: Here we begin our Nmap function, passing it the $job object, which will be our serialized array sent from the job server.
Line 13: We then retrieve our serialized array by calling the workload() method and assign it to a variable.
Line 14: Next we unserialize our data so we can process it.
Line 16: Now we log that a scan has been received. The $workload["Scan_Name"] is how we access the items in the array, i.e. using the names we gave them in the client code.
Line 18: Here we add the scan save directory. I’m just going to save the scans to /var/www/scans/.
Line 20: Here we create the output filename. I use the format nmap_<scan name>_<scan flags>.
Line 21: The scan flags passed to the Nmap function do contain spaces, which makes the files difficult to access in the bash shell, so here I just remove any whitespace from the filename.
Line 23: Simply putting the output directory together with the filename here.
Line 25: Lastly we use shell_exec to execute our entire command string via the shell.
Line 27: Log that the scan has finished.
Again, the code really isn’t that complicated, as the job server does most of the difficult work for us. Once we start adding extra functionality, such as preventing files from being overwritten, it does become more complex; however, the core Gearman code is relatively simple.
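The filename handling described for lines 18-23 of the worker can be imitated in shell. This is a sketch of the same steps, not the worker code itself: build_scan_path is a hypothetical helper, and the flag string is an example value I chose.

```shell
# build_scan_path is a hypothetical helper mirroring the worker steps above:
# name the file nmap_<scan name>_<scan flags>, strip all whitespace from it,
# then prepend the save directory.
build_scan_path() {
    scan_dir="$1"
    scan_name="$2"
    scan_flags="$3"
    file_name="nmap_${scan_name}_${scan_flags}"
    file_name=$(printf '%s' "$file_name" | tr -d '[:space:]')
    printf '%s%s\n' "$scan_dir" "$file_name"
}

# Example values; the flag string is an assumption made for this sketch.
out_path=$(build_scan_path "/var/www/scans/" "Test" "-sS --top-ports 1000")
echo "$out_path"   # prints /var/www/scans/nmap_Test_-sS--top-ports1000

# The kind of command string the worker would then hand to the shell
# (printed here, not executed).
echo "nmap -sS --top-ports 1000 127.0.0.1 -oA $out_path"
```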
Now we understand how Gearman and its components operate, let’s try it out using our Nmap code.
An Nmap Example
Take the above client code and put it in a file named nmap_client.php and put the worker code in nmap_worker.php (remember to change the scan save directory to somewhere you want, or simply mkdir /var/www/scans/). Then put both files in /var/www/. We also need an HTML form in which to enter our scan details, so take the below code and put it in a file named nmap.html in the /var/www/ directory as well. (The form isn’t going to look pretty, but it will get the job done!)
Now make sure your Apache web server is started (/etc/init.d/apache2 start); if you receive a warning about the ServerName, see here. Ensure Gearman is running (lsof -i -P | grep gearman) and make sure you have Nmap installed (which nmap; if it’s not there, apt-get install nmap). Then start nmap_worker.php in a terminal (make sure you are root) with php nmap_worker.php. You should then see: Nmap worker initiated. Lastly, browse to nmap.html on your localhost, which should be http://127.0.0.1/nmap.html, enter some scan details (I’d recommend just scanning 127.0.0.1) and hit submit. If the browser tries to download the PHP page, see here. You should be redirected back to the nmap.html page, and if you check back in the terminal you should see: Scan ‘Test’ of job ‘Gearman’ received and, about thirty seconds later, Scan ‘Test’ of job ‘Gearman’ finished. To confirm the scan succeeded, have a look in your scan save directory; the three Nmap files (as we used the -oA flag) should be there. If they are, congratulations! You have successfully executed your first Gearman scan!
The only downside of using Gearman in this manner is that every individual worker needs its own process. So if you wanted five Nmap scans to be run in parallel, you would need to start five instances of nmap_worker.php in five separate terminal windows. As you can imagine, this would quickly become a problem if you have multiple tools requiring multiple workers active at once. A viable solution to this issue would be using something like screen; however, a much more elegant option is employing a PHP daemon, such as Gearman Manager, or a process control system, such as Supervisor, to daemonize the workers. I decided upon Supervisor as I found the documentation to be much more comprehensive. As before, I’ve documented every step and each is tried and tested on the latest version of Ubuntu (currently 11.10).
1. Install Supervisor. The easiest way to do this is using Python’s Easy Install, so we install that first and then use it to install Supervisor:
apt-get install python-setuptools
easy_install supervisor
2. Create a configuration file. We will just use the sample one:
echo_supervisord_conf > /etc/supervisord.conf
3. Create an init.d script:
Copy the script from here https://gist.github.com/176149 and place it in /etc/init.d/supervisord
4. Change the file permissions and activate the init script:
chmod 755 /etc/init.d/supervisord
update-rc.d -f supervisord defaults
5. Now we need to update the supervisord configuration file so we can instruct it to daemonize our worker. So nano /etc/supervisord.conf and add this to the top:
command=/usr/bin/php nmap_worker.php ; php location and name of worker file
numprocs=2 ; number of processes – i.e. how many workers we want available for each tool
process_name=%(program_name)s_%(process_num)03d ; if numprocs > 1, this line ensures each process has a unique name
directory=/var/www/ ; directory containing worker file
stdout_logfile=/var/www/nmap.log ; log file location
autostart=true ; auto start program when supervisor starts
autorestart=true ; auto restart program if it exits
stopsignal=KILL ; signal used to kill the program when a stop is requested
As you can see, the configuration file is very straightforward. There are plenty more values you can set; see here for a detailed list, or just scroll down in the sample configuration file. To increase the number of workers available for each tool, just increase the numprocs value. The process_name then ensures each process is given a unique name. I’d also recommend creating a log file folder where all the log files for your tools are stored. Supervisor will then write to these files (provided you update the stdout_logfile value, of course). Remember to create the log files first.
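Supervisor expects per-program settings like these to sit under a [program:...] section header, which is not shown in the lines above. The sketch below writes a complete section to a scratch file so it can be reviewed before being pasted into /etc/supervisord.conf. The section name nmap_worker is my assumption; the values are the ones listed in step 5.

```shell
# Write a complete program section to a scratch file for review.
# The section name "nmap_worker" is an assumption; the values are those from step 5.
conf_snippet=$(mktemp)
cat > "$conf_snippet" <<'EOF'
[program:nmap_worker]
command=/usr/bin/php nmap_worker.php
numprocs=2
process_name=%(program_name)s_%(process_num)03d
directory=/var/www/
stdout_logfile=/var/www/nmap.log
autostart=true
autorestart=true
stopsignal=KILL
EOF

# With numprocs greater than 1, process_name has to include %(process_num),
# so show those two lines together as a quick sanity check.
grep -E '^(numprocs|process_name)=' "$conf_snippet"
```

With a section name like this and numprocs=2, the two processes would be named nmap_worker_000 and nmap_worker_001.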
6. Start Supervisor:
7. Check to see if it has started properly:
Nmap worker initiated.
Nmap worker initiated.
(As we started two worker instances)
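A quick way to confirm that the expected number of workers came up is to count the startup lines in the worker log and compare the total with numprocs. The sketch below assumes a fresh log and uses a scratch file in place of /var/www/nmap.log; count_started_workers is a hypothetical helper.

```shell
# count_started_workers is a hypothetical helper: it counts the startup line
# that each worker prints, so the total can be compared with numprocs.
count_started_workers() {
    grep -c 'Nmap worker initiated' "$1"
}

# Demonstrated on a scratch log; the real file in this setup is /var/www/nmap.log.
demo_log=$(mktemp)
printf 'Nmap worker initiated.\nNmap worker initiated.\n' > "$demo_log"

expected=2   # numprocs from the configuration
started=$(count_started_workers "$demo_log")
if [ "$started" -eq "$expected" ]; then
    echo "all $expected workers started"
else
    echo "only $started of $expected workers started"
fi
```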
If we go back to our HTML form, we can then give our new Gearman & Supervisor setup a go. So enter some scan details and hit start scan, then enter some new details and hit start scan again (remember to change the scan name as we haven’t introduced any code to prevent overwriting yet). You could also immediately enter details for a third scan, as even though there are only two workers available, Gearman will queue the third scan until one of the other workers has become available. If you then check your nmap.log file after about thirty seconds or so, you should see something like:
Nmap worker initiated.
Nmap worker initiated.
Scan ‘test1’ of job ‘Gearman’ received.
Scan ‘test2’ of job ‘Gearman’ received.
Scan ‘test1’ of job ‘Gearman’ finished.
Scan ‘test3’ of job ‘Gearman’ received.
Scan ‘test2’ of job ‘Gearman’ finished.
Scan ‘test3’ of job ‘Gearman’ finished.
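The ordering in that log, where the third scan is only received once a worker frees up, can be imitated without Gearman. The sketch below uses xargs -P 2 as a stand-in for two workers and sleep 1 as a stand-in for a scan. It is not how Gearman itself is implemented, and -P is an extension found in GNU and BSD xargs rather than part of POSIX.

```shell
# Three jobs, two parallel slots: the third job waits until a slot is free.
# xargs -P 2 stands in for the two workers; 'sleep 1' stands in for a scan.
queue_log=$(mktemp)
printf '%s\n' test1 test2 test3 |
    xargs -P 2 -n 1 sh -c '
        echo "Scan $1 received."
        sleep 1
        echo "Scan $1 finished."
    ' sh >> "$queue_log"

# test3 is only "received" after one of the first two jobs has finished.
cat "$queue_log"
```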
The Nmap scan files will also be in the output directory you specified in the worker code (/var/www/scans/ for me). If at any point you receive the following error while trying to restart Supervisor, you can fix it with unlink /tmp/supervisor.sock: Error: Another program is already listening on a port that one of our HTTP servers is configured to use. Shut this program down first before starting supervisord.
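One caveat with that fix: the same error also shows up when a live supervisord still owns the socket, in which case removing the file is not what you want. The sketch below only removes the socket when no supervisord process is found. clear_stale_socket is a hypothetical helper, it assumes pgrep is available, and it is shown on a scratch file rather than /tmp/supervisor.sock.

```shell
# clear_stale_socket is a hypothetical helper: it removes the socket file only
# when no supervisord process is running.
clear_stale_socket() {
    sock="$1"
    if pgrep -x supervisord >/dev/null 2>&1; then
        echo "supervisord is still running; stop it rather than removing $sock"
        return 1
    fi
    rm -f "$sock"
}

# Demonstrated on a scratch file; the real path in this setup is /tmp/supervisor.sock.
demo_sock=$(mktemp)
clear_stale_socket "$demo_sock" && echo "removed $demo_sock"
```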
That’s Gearman and Supervisor installation and basic usage covered then! In my next post we will be adding some slightly more complex code including:
- Adding authentication.
- Verbose logging.
- File checking.
- Encrypting and emailing scan results.
- Creating a worker status page.
Check back soon!