Analyzing Shellcode with Radare2


I know it has been a long time since I’ve added anything to this blog, so I decided to go through something I’ve encountered a lot of recently: shellcode.  I have also been trying to familiarize myself with radare2, so I figured why not use it to analyze shellcode instead of just relying on running it with whatever loader it was delivered in.  To add one more layer of complexity, or simplicity depending on how you think of it, I’m doing it all on Linux.

Generating Shellcode

Step one is to generate some shellcode.  In case you don’t know what shellcode is, it is basically machine code: assembly language instructions that have already been assembled, usually written out as hex bytes.  The CPU executes these bytes directly and makes things happen.  When you compile a program written in a high level language such as C++, the compiler converts the program logic into this same kind of machine instructions and values.

For this example, I’m going to use a Linux-based x64 bind shell that is xor encoded.  For those unfamiliar with a bind shell, all this code does is open a listening socket (IP/port pair) on the host that a remote system can connect to.  There are different ways to generate shellcode, like writing the assembly yourself (not advised) or writing an assembly program (.asm) and assembling it, but for ease and speed I used msfvenom to generate this shellcode:

msfvenom -p linux/x64/shell_bind_tcp LPORT=6666 -f c -e x64/xor
No platform was selected, choosing Msf::Module::Platform::Linux from the payload
No Arch selected, selecting Arch: x86_64 from the payload
Found 1 compatible encoders
Attempting to encode payload with 1 iterations of x64/xor
x64/xor succeeded with size 127 (iteration=0)
x64/xor chosen with final size 127
Payload size: 127 bytes
unsigned char buf[] =

No, I didn’t select the platform or architecture because I didn’t feel like it. I made the bind shell bind to any/all interface(s) (absence of the ‘LHOST=’ parameter).

The next step is to add the shell code into a wrapper so that we can run it.  This can be done in several different ways but for the sake of this example I will use the following wrapper in C:

unsigned char code[] =

int main(void)
{
    (*(void(*)()) code)();
}

There is a pretty good writeup for doing this in Python and the logic behind these wrappers.

All this wrapper is doing is basically creating a function pointer to the shellcode, which will then be called when main() executes.

After the program is coded and error free, it needs to be compiled.  The following command will compile the code and disable any security controls that will make debugging difficult:

gcc shellcode.c -fno-stack-protector -z execstack -o shellcode

Make it executable:

chmod +x shellcode

Make sure to test the program to make sure it works properly.  You should be able to netcat to localhost on port 6666:

nc localhost 6666

After which you should be able to execute commands (note: a prompt ‘$>’, or whatever, will not be on your terminal)

If you get a segmentation fault when executing the program, make sure to check your CPU architecture and OS because something is wrong.

Debugging with r2

Radare2 is a very extensive framework that is open source.  You can do a lot of really cool things with it that I’m not going to explain here, but if you’re interested, be sure to check out their book.

I apologize in advance if my code snippets or screenshots are awful :/

I’m going to assume a decent understanding of r2 (and assembly/debugging for that matter), so if you have any questions there are plenty of other blogs and documents that will probably answer what you need to know.  Also, I’m assuming you have r2 installed as well 🙂 (and that it’s the most recent version – they build it constantly)

Step 1

Open the program (in this case I called it shellcode) with r2 in debugging mode (-d):

r2 -d shellcode

Then type the command, aaa, to analyze the program.  This will make debugging much easier and help label and show program flow, etc.

Screen Shot 2016-07-12 at 2.28.10 PM

Step 2

When the program first starts we are not in the main function but in a system library responsible for launching the process.  You can see this with the dm command:

Screen Shot 2016-07-12 at 2.38.58 PM.png

Notice the asterisk.  That is where the debugger is currently at, and coincidentally, where it first broke in the execution of this program.

Now we need to find the shellcode.  The following snippet will show me seeking to the main function, printing the disassembly at main, and then seeking to the shellcode at obj.code.  Notice that I called the shellcode and function, code, in the wrapper program.

Screen Shot 2016-07-12 at 2.41.21 PM.png

You can see that the wrapper is working because it is calling the pointer at 0x4004b4 (in rdx – set at 0x4004aa) to the shellcode.  This will transfer program execution over to the shellcode I generated.  Remember that edx still exists on 64 bit systems, it’s just the lower half (32 bits) of rdx.

Step 3

Now to look at the shellcode.  So we know that it is encoded with an xor scheme.  How it does that exactly is unknown right now, but however it does that there must be an xor key and some sort of looping control in place.

Let’s take a look at the assembly (use the V command followed by entering p until you have a view that looks like mine):

Screen Shot 2016-07-12 at 3.00.32 PM.png

You can tell we are in the shellcode by examining the assembly instructions in the middle column.  They should line up with the shellcode from msfvenom.  There is one hiccup, however, and that is sometimes r2 decides that code is a string.  In this case, 0x6008bb is being misinterpreted as a string instead of code.  To fix this, as well as address 0x6008f4, we will use the dc keys in visual mode (d opens the define menu and c means “define as code”; this is not the same as the dc command at the r2 prompt, which continues execution).  Navigate the assembly with the arrow keys and when you get to the address you want, type dc.

You should now have something that looks like this:

Screen Shot 2016-07-12 at 3.09.59 PM.png

We can now see that there is a loop, as well as a hex constant that is most likely not an address even though it is 8 bytes and we are on a 64-bit system.  If it were an address, it would be outside the allocation for this process.  So 0xd17b3383b74ba9b6 is most likely the xor key.

Step 4

Now let’s analyze what the first part of the program is doing.  Starting at 0x6008a0, the program first clears the rcx register with xor rcx,rcx; rcx is typically used as a counter while looping.  After that it sets rcx to 0xb (11 in decimal), so it’s most likely going to loop 11 times.  This part is a little confusing since it’s using the sub operation with a negative value, but it’s basically a double negative situation here.  We’re really adding 0xb in a roundabout way.  Remember that in assembly there is always more than one way to do something.

The next operation is tricky as well: lea rax, [rip - 0x11].  What’s happening here is that 0x11 (17 in decimal) is subtracted from the value in rip and the resulting address is loaded into rax.  What makes this tricky is that when your debugger is at the lea instruction, rip is set to the address of this instruction (0x6008aa), but when the instruction actually executes, rip already holds the address of the next instruction (0x6008b1).  So rax ends up as 0x6008b1 - 0x11 = 0x6008a0, the start of the shellcode.  Luckily the debugger understands this and shows you the result of the calculation as a comment to the right of the screen.
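To double-check the arithmetic in the last two paragraphs, here is a minimal Python sketch that emulates the two instructions.  The helper name sub64 is mine, and the 7-byte length of the lea is an assumption inferred from the two addresses quoted in this post (0x6008aa and 0x6008b1), so treat it as an illustration rather than something r2 printed.

```python
MASK64 = (1 << 64) - 1  # registers wrap around at 64 bits


def sub64(value, operand):
    """Emulate a 64-bit `sub`: subtract, then wrap to 64 bits."""
    return (value - operand) & MASK64


# xor rcx, rcx  followed by a sub with a negative operand (-0xb)
rcx = 0
rcx = sub64(rcx, -0xb)  # subtracting a negative adds 0xb

# lea rax, [rip - 0x11]: rip holds the address of the NEXT instruction
LEA_ADDR = 0x6008aa  # address of the lea, taken from the walkthrough
LEA_LEN = 7          # assumed instruction length (0x6008b1 - 0x6008aa)
rip = LEA_ADDR + LEA_LEN
rax = rip - 0x11

print(rcx)       # loop counter
print(hex(rip))  # address right after the lea
print(hex(rax))  # start of the shellcode
```

The same sub64 helper also covers the sub rax, -8 inside the loop, which is why rax moves forward by 8 on every pass.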

The next instruction places the xor key in rbx and then the loop begins.  The loop instruction is at 0x6008c5 and will loop back to 0x6008bb (also expressed as str.H1X_H_).  This will loop 11 times (the rcx counter that was set to 0xb earlier).  The primary instructions in this loop, well the only ones really, are at 0x6008bb and 0x6008bf.

0x6008bb: xor qword [rax + 0x27], rbx

This instruction takes the 8 bytes of memory at the address rax + 0x27 (rax was set by the lea at 0x006008aa) and xors them with the key that was stored in rbx at 0x006008b1.

0x6008bf: sub rax, -8

This instruction is just incrementing rax by 8 bytes (case of the double negatives – 8 is a constant, not hex).

So this loop is xoring the memory right below the loop with the 8-byte xor key, advancing the section being xored by 8 bytes each iteration.  This routine could have written the decoded bytes onto the stack or to some other section of memory but instead it just overwrote the existing code (which was garbage if you examined it at first).
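If you would rather decode the bytes offline than single-step through the loop, the decoder is easy to mimic.  This is a rough Python sketch of the same idea, not the stub itself: the function name is mine, the key and the count of 11 come from the disassembly above, and the sample buffer is placeholder data rather than the real payload.

```python
import struct

KEY = 0xd17b3383b74ba9b6  # constant loaded into rbx in the walkthrough
BLOCKS = 11               # loop count taken from rcx (0xb)


def xor_qwords(data, key=KEY, blocks=BLOCKS):
    """XOR `blocks` consecutive little-endian qwords of `data` with `key`."""
    out = bytearray(data)
    for i in range(blocks):
        offset = i * 8  # mirrors the `sub rax, -8` step
        (qword,) = struct.unpack_from("<Q", out, offset)
        struct.pack_into("<Q", out, offset, qword ^ key)
    return bytes(out)


# Placeholder bytes, NOT the real payload: 11 qwords = 88 bytes
sample = bytes(range(88))
encoded = xor_qwords(sample)
decoded = xor_qwords(encoded)  # xor is its own inverse

print(encoded != sample)
print(decoded == sample)
```

To use it on the real thing you would feed it the bytes that start 0x27 past the beginning of the shellcode, since that is the offset the xor instruction uses.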

When you let the loop finish decoding, you should get something that looks like this:

Screen Shot 2016-07-12 at 3.41.28 PM.png

Notice how the code changed and the appearance of the newly created syscall instructions?

Step 5

Now that the shellcode decoded itself, we want to see what it’s doing.  The bind shell is formed from a few syscalls.  You can think of syscalls as the mechanism by which you execute certain kernel functions (I/O, processes, networking, etc) .  How the bind shell works is fairly simple:

  1. Create a socket
  2. Bind to an IP and port
  3. Listen and accept connection
  4. Redirect std pipes to client
  5. Run incoming commands with /bin/sh

Now, if you noticed, there are not a whole lot of syscalls in the decoded portion (thanks r2) so just be aware that you need to either define code or just wait for the instructions to execute.  As you might have guessed, there’s a define-code issue at 0x6008dc.  But as you step through it clears itself up:

Screen Shot 2016-07-12 at 3.51.06 PM.png

You may be wondering where an IP address, “loopback”, or 6666 are.  Well, I have to inform you that operating systems don’t really work entirely on strings, but with data structures.  In this case we have to realize that a sockaddr structure is being built on the stack and being passed into a syscall.

Step 6

Time to go over the syscalls.  Just to make things brief, on x64 Linux the syscall instruction takes its number in rax and its arguments in registers (rdi, rsi, rdx, and so on). Here is a pretty good reference for Linux syscalls.  This shellcode loads those registers by pushing values to the stack and popping them off again, so what we expect to see is some things being pushed to the stack, one of which should be a number (in hex) right before the syscall instruction.

0x6008d1: This syscall….I’ll repaste a clearer picture:

Screen Shot 2016-07-12 at 4.08.47 PM.png

This syscall is taking the hex value 0x29 (41 in decimal, at 0x6008b1) as its syscall number.  Ignore all the garbage in between.  If you look up this syscall number in the table I posted you’ll see that 41 is the call for socket creation.

0x6008e6: This one is a little more difficult. But we know that from the hex at 0x6008e6 the syscall number is probably 49, which is “bind”.  If you examine the instructions right before the syscall you’ll see that an interesting hex constant is being pushed onto the stack at 0x6008d6.  This is where the sockaddr structure is being built.

Let’s take a look at the sockaddr structure:

// IPv4 AF_INET sockets:

struct sockaddr_in {
    short int          sin_family;
    unsigned short int sin_port;
    struct in_addr     sin_addr;
    unsigned char      sin_zero[8];
};

struct in_addr {
    unsigned long      s_addr;
};

We need to push these three things onto the stack (order matters, as does endianness, so things might appear backwards on the stack):

  • AF_INET = 2 (From source code: #define AF_INET 2)
  • PORT = 6666
  • IN_ADDR = 0 (From source code: #define INADDR_ANY (u_int32_t) 0x00000000), This binds to all interfaces.

So let’s take a look at the hex constant: 0x(0)a1a0002

6666 in hex = 0x1a0a (and is an unsigned short or 2 bytes).  We know 0x0002 is equal to AF_INET (IPv4) and is a short (2 bytes).  These two values make up the four bytes seen in this hex constant.  But where is the 0 that is required for INADDR_ANY? That gets pushed onto the stack right before 0x(0)a1a0002 with push rdx (rdx = 0 at this instruction), thus completing the data structure:

0x000a1a0002 or:
*corrected endianness
02      (af_inet)
1a0a  (port)
00      (inaddr_any)
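The byte order is easier to believe when you build the structure yourself.  Here is a small Python sketch, assuming the usual layout (sin_family in little-endian host order, port and address in big-endian network order); the variable names are mine.

```python
import struct

AF_INET = 2      # from the walkthrough
PORT = 6666
INADDR_ANY = 0

# sin_family is stored in host (little-endian) order;
# sin_port and sin_addr are stored in network (big-endian) order.
family = struct.pack("<H", AF_INET)
port = struct.pack(">H", PORT)
addr = struct.pack(">I", INADDR_ANY)
sockaddr = family + port + addr

# Read the first four bytes back the way a little-endian CPU sees a dword
first_dword = int.from_bytes(sockaddr[:4], "little")

print(sockaddr.hex())    # bytes as they sit in memory
print(hex(first_dword))  # the constant seen in the disassembly
```

Reading the first four bytes back as a little-endian dword gives 0xa1a0002, which is the constant from the disassembly.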


0x6008eb: This syscall (0x32 or 50) is easy, it makes a call to ‘listen’ 🙂

0x6008f3: Another easy one.  This syscall (0x2b or 43) calls ‘accept’.

So far we have a socket being created, a sockaddr structure being created and bound to all interfaces on port 6666, and now it is listening and accepting connections.

I’ll repost the disassembly:

Screen Shot 2016-07-12 at 4.50.08 PM.png

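0x600900: 0x21 = 33.  This syscall calls ‘dup2’, which duplicates the client socket’s file descriptor onto the standard pipes (stdin, stdout, stderr), essentially redirecting command input and output to the client connected to the socket.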

You may have noticed the constant at 0x600908: 0x68732f6e69622f.  Well this in ascii = “/bin/sh”.  So the final syscall does something with this:

0x600908: 0x3b = 59 or the ‘execve’ call.  This call will execute a program and any arguments for the program you provide it with.
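If you want to confirm what that constant spells without squinting at an ASCII table, a couple of lines of Python will do it.  This is only a quick check; the one assumption is that the value is stored little-endian, as everything else on x64 is.

```python
CONSTANT = 0x68732f6e69622f  # value seen at 0x600908

# Lay the value out as 8 little-endian bytes, the way it sits in memory
raw = CONSTANT.to_bytes(8, "little")
path = raw.rstrip(b"\x00").decode("ascii")

print(raw)
print(path)
```

The trailing zero byte doubles as the string terminator, which is why a 7-character path fits neatly into one 8-byte register.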

As you might have guessed by now, this is taking whatever comes in from the socket, executing it with /bin/sh and then returning the results to the socket. So a remote shell connection 🙂
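To keep the numbers straight, here is a tiny lookup of just the syscalls covered in this post.  The numbers are the x64 Linux values quoted above; the dictionary and helper name are mine, and this is not a complete table.

```python
# x64 Linux syscall numbers mentioned in this walkthrough (not a full table)
SYSCALLS = {
    41: "socket",
    49: "bind",
    50: "listen",
    43: "accept",
    33: "dup2",
    59: "execve",
}


def syscall_name(number):
    """Return the syscall name for a number, or a marker if it is not listed."""
    return SYSCALLS.get(number, "unknown (%d)" % number)


# Values in the order they show up in the decoded shellcode
for value in (0x29, 0x31, 0x32, 0x2b, 0x21, 0x3b):
    print(hex(value), value, syscall_name(value))
```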


While I did not go into as much depth on r2 as I would have liked to it is definitely a wonderful tool.  I hope that you found this entertaining or insightful.  I’m no expert but I am always trying to learn so if there are errors or something I missed please message me 🙂


Volatility and LiME on Ubuntu 14.04

I recently decided to try out Volatility on Linux, and in general – for the first time ever. I figured it was time to step it up and actually try to figure out how to make it work since most everyone I know hasn’t ever used it before.

This tutorial will be fairly high level – I’m not going to hold your hand through everything but most of the info to get Volatility working with memory dumps will be provided. The setup can be a little confusing, so I wanted to make a more straightforward guide. That being said, keep in mind that I am doing this on Ubuntu 14.04 LTS with the specific kernel in that distro. Volatility is compatible with most kernels but not all, and just because something works on my system doesn’t mean it will work on everyone’s.

I’m using Volatility 2.2 and will be working from my home directory, FYI.

A lot of these instructions can be found on Volatility’s site:

Download Dependencies:
– Python (obviously):
apt-get install python2.7

– Dwarfdump:
apt-get install dwarfdump

– GCC/make:
apt-get install build-essential

– Headers for building kernel modules:
apt-get install linux-headers-`uname -r`

NOTE: There are other python libraries that will need to be imported in order to use some of the plugins. Comprehensive lists can be found online.

Download Volatility:

wget the latest release from
Make sure to pull the tar.gz file and not the .exe

tar -xzvf
cd volatility-2.2

You can either use the install script or just work out of the local directory.
Installing will allow you to import the libraries into other scripts you make, however it’s harder to manage as there is no uninstall script, so you have to do it manually whenever you want to upgrade. Version control is easier without installing. For example, the day after I downloaded Volatility 2.2, 2.3 was released. All I had to do was download the new version and move my kernel profile (explained later) 🙂

*I will be running the script from the directory in which it resides using python. I choose not to install it.

Make Profile/ Kernel vtypes:

For Ubuntu 13 module.c is wrong. You must change “#include <linux/net_namespace.h>” in the module.c file to “#include <net/net_namespace.h>”

cd ~/volatility-2.2/tools/linux
make
head module.dwarf

(If this didn’t work look at what I wrote before the commands ^^…)

Now you need to make a profile (it’s just a zip of your module.dwarf and System.map files):
*Make sure to name the zip file something memorable. I called mine “Ubuntu13”.

sudo zip ~/volatility-2.2/volatility/plugins/overlays/linux/ ~/volatility-2.2/tools/linux/module.dwarf /boot/
adding: home//volatility-2.2/tools/linux/module.dwarf (deflated 90%)
adding: boot/ (deflated 79%)
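If you would rather script this step, the same zip can be built with Python’s zipfile module.  This is only a sketch under a few assumptions: the function name build_profile is mine, the second file is assumed to be the kernel’s System.map (the path is cut off in the command above), and the demo uses placeholder files in a temp directory instead of touching /boot.

```python
import os
import tempfile
import zipfile


def build_profile(zip_path, dwarf_path, system_map_path):
    """Zip the two profile files together and return the archive's file list."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(dwarf_path)
        zf.write(system_map_path)
    with zipfile.ZipFile(zip_path) as zf:
        return zf.namelist()


# Demo with placeholder files; swap in your real module.dwarf and System.map
with tempfile.TemporaryDirectory() as tmp:
    dwarf = os.path.join(tmp, "module.dwarf")
    sysmap = os.path.join(tmp, "System.map-example")
    for path in (dwarf, sysmap):
        with open(path, "w") as handle:
            handle.write("placeholder\n")
    names = build_profile(os.path.join(tmp, "Ubuntu13.zip"), dwarf, sysmap)
    print(names)
```

Like the zip command, this stores the full path of each file (minus the leading slash) inside the archive.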

Check to see if your profile exists:
sudo python ./ --info | grep Linux

Volatility requires a memory dump to work with. Volatility cannot create one itself, so you must use another tool such as LiME to make the dump. LiME is a kernel module that performs this action.

Download LiME from:

Extract and compile:
wget unzip
cd /LiME-master/src

make -C /lib/modules/3.13.0-32-generic/build M=/home//Downloads/LiME-master/src modules
make[1]: Entering directory `/usr/src/linux-headers-3.13.0-32-generic’
CC [M] /home//Downloads/LiME-master/src/tcp.o
CC [M] /home//Downloads/LiME-master/src/disk.o
CC [M] /home//Downloads/LiME-master/src/main.o
LD [M] /home//Downloads/LiME-master/src/lime.o
Building modules, stage 2.
MODPOST 1 modules
CC /home//Downloads/LiME-master/src/lime.mod.o
LD [M] /home//Downloads/LiME-master/src/lime.ko
make[1]: Leaving directory `/usr/src/linux-headers-3.13.0-32-generic’
strip --strip-unneeded lime.ko
mv lime.ko lime-3.13.0-32-generic.ko

Integrity: Ignorant Trust or Cunning Deception?

If you are active in the cyber security industry then you are familiar with past and present attacks as well as emerging threat trends and attack campaigns.  It’s interesting to observe how attacks evolve and adapt to technological and economic breakthroughs and trends.  For every great idea, tool, or resource there are many attack vectors and actors willing to exploit them (whether accidentally or intentionally).  I read the news every day and I see some pretty advanced attacks happening all over the world, and it amazes me to see what extent people and organizations will go to in order to break things, spy, or make a profit.

Most people know what CIA is (not the government organization): Confidentiality, Integrity, and Availability.  Attacks generally revolve around these three domains.  DDoS or DoS attacks affect Availability; anything involving passwords, hashes, and PII (Personally Identifiable Information) involves Confidentiality; manipulation of information involves Integrity.

I believe the most destructive and costly attack sector involves Integrity – Yes, Confidentiality is very critical and can be extremely costly, but let me make my point and then you will see why I think this.  When I say Integrity I’m not just talking about man-in-the-middle attacks, but rather attacks that compromise the integrity of your system, or the TCM (Trusted Computing Model).  It’s one thing to manipulate files or data in transit, but it’s entirely another when your operating system or hardware is compromised.

We hear all sorts of security terminology on a daily basis such as “Malware”, “Virus”, “Spyware”, “Trojans”, etc., but which ones are the worst of all evils?  The ones that compromise the integrity of your system.  Traditionally these programs are categorized as rootkits and trojans.  A trojan (from the story of the Trojan Horse) is a program that either imitates another trusted program or poses as a pseudo-trusted program such as an antivirus tool or coupon printer program.  These programs (usually) perform the task that they advertise, as well as a hidden procedure (soldiers inside the horse) – And just to clarify something that irritates me, a trojan isn’t just malware that bypasses security; it pretends to be something else while hiding an ulterior motive.  This type of behavior/masquerading violates the trust of the user and is dangerous.  Sometimes these programs are obvious but other times they are not.  What happens when your vendor’s application gets trojanized by malware and you don’t notice anything?  Or what happens when the browser that you downloaded turns out to be a trojan that uses your system resources for malicious purposes?  It’s hard to detect these kinds of things sometimes!

What happens when you can’t even trust your OS?  This is where rootkits come in.

If you’re familiar with the Trusted Computing Model you will remember that ring 0 is where your System/Kernel and Drivers run and ring 3 is where user applications and other programs run.

Rootkits are a suite of programs that usually consist of a dropper, loader, and the rootkit itself.  The idea of a rootkit is that it hides what it does from the user of the system, and even from the system itself.  Rootkits can reside in either ring 3 or ring 0.  Ring 0, or kernel level, rootkits are extremely hard to detect and utilize hooking to intercept system calls and modify the data returned to the caller.  This class of rootkit is more fault-prone and can cause kernel crashes (Blue Screens).  User level rootkits are more stable and utilize hooking as well, but these rootkits must infect the memory of all running processes in order to fulfill their purposes.

What concerns me is how advanced rootkits have become.  We now hear about bootkits, and trojanized type one hypervisors, and even chipset rootkits!  There’s even a whitepaper on creating a hardware rootkit by changing the dopant polarity in transistors – leaving the visible circuit layout intact!  That’s insane!

When you go into the negative rings of the TCM (-1, -2, -3, etc.; -1 = hypervisor, -2 = boot loader/MBR, -3 = chipset, like the Northbridge or other microprocessors), things get a lot scarier and a whole lot more unsettling because of the control these components have on a system.

Let’s say you buy a NIC from China whose driver has been trojanized or intentionally installs a rootkit because it already has system level access (ring 0).  It could manipulate your network traffic and you wouldn’t know the difference.  How many of you know that most NICs strip off the CRC of an Ethernet frame?  This can be confusing at first when inspecting the size of frames in a protocol or network traffic analyzer.  It’s not malicious but it’s something done at the Data Link layer by the device driver without even notifying the user!

So we see a new threat type emerging: one that compromises the trust and integrity of your system.  It’s no longer a matter of removing malware with a tool, but rebuilding your system from the ground up, and even possibly destroying it altogether.  Over time I’ve come to realize how ineffective antivirus programs are and how little they protect.  While they are a good idea to have installed and defeat petty threats, they will not discover advanced malware, which is the kind that causes the most damage.  If you have a rootkit that has infected your BIOS then reinstalling the OS won’t work.  It’s hard to trust everything on a computer when it’s composed of disparate parts from all over the world and from different vendors.

The real question comes down to who can you trust?  Can you even trust the system you are using right now?  While these threats are very real, it’s no excuse to stop using computers and that’s not the intent of this article.  I simply want to present the idea of a violation of trusted computing.  All an attacker needs to do to cause a lot of damage is convince you that your system cannot be trusted.  Once that idea is planted it’s hard to not do anything about it.  If you suspect there is a rootkit on your system you will spend a lot of time and effort trying to find and remove it, and even then you may question whether it can still be trusted.  Think of all the money it will take to clean or replace systems to a trusted state.  When hardware trojans are involved then you are looking at more cost than just running an AV program, paying fines, and paying for credit monitoring (depending on your system).  If we’re talking about your home system then you’re looking at replacing your computer or spending a lot of time and money trying to clean it up (unless you don’t care that you’re compromised) – money that you would most likely prefer not to spend.  It’s hard to put a price on peace of mind!

Security goes beyond the physical and digital realm and into the human psyche.  Do you think Target will ever forget its data breach?  I doubt it.  They were not just physically/digitally compromised, but emotionally as well.  Their customers lost trust and confidence in the company. If you can discredit and compromise trust, then you can cause serious long-term damage.

Python SMS Messaging Server

I made a simple Python server that will send an SMS text message via a given service provider’s SMS email address. I built this server for a simple Android messaging app that I made. The socket connection is just a Python server socket object. You will have to determine how you choose to implement the client connection and input/output streams. I passed in a string representation of an object that I then just parsed with python string methods and regex. I didn’t need to do anything fancy, but ideally I would have passed a dictionary or Java Bean (if it were written in Java) to my server via JSON had I taken the time to figure out how a Python socket is able to do this (I believe there is a simple json module in Python that you can use). If the server were written in Java then it would have been easier and you could simply use GSON or an object stream to pass the data.  Anyway, feel free to use this or modify it – I’m positive it can be optimized and corrected!

*As a security aside: Data is encrypted via TLS from this server program to the service provider’s mail server (if they implement SSL/TLS), and then from the email server to the recipient’s phone (assuming the provider encrypts their SMS and all the transactions after the mail server receives the data).

Here’s what I got so far:

from socket import socket, AF_INET, SOCK_STREAM
import threading
import subprocess
import smtplib
import re
import sys

def getIP():
  #Only works in Linux - NOT MAC OSX
  #Get IP address of server
  arg = 'ip route list'
  p = subprocess.Popen(arg, shell=True, stdout=subprocess.PIPE)
  data = p.communicate()
  split_data = data[0].decode().split()
  print(split_data)
  ipAddr = split_data[split_data.index('src') + 1]

  print('[*] Your ip is %s' % ipAddr)
  return ipAddr

def sms(num, msg, car):

  #The SMTP host and account name are left blank here - fill in your own
  server = smtplib.SMTP("", 587)
  #Upgrade the connection to TLS before logging in
  server.starttls()
  server.login('', 'password')

  #The carrier SMS gateway domains are left blank here - fill in your own
  carrierEMail = ""

  if car == "T":
    carrierEMail = ""
  elif car == "S":
    carrierEMail = ""
  elif car == "V":
    carrierEMail = ""
  elif car == "A":
    carrierEMail = ""

  server.sendmail('Test', num + carrierEMail, msg)
  server.quit()
  print("[+] SMS sent to: " + num + carrierEMail + "\n")

def connectionHandler(clientSocket, addr, BUFFER):

  data = clientSocket.recv(BUFFER).decode()
  print("Received: " + data)

  #Parse data
  msg,car,num = data.split(",")

  num = num[7:-2]
  msg = msg[9:]
  car = car[9]

  #Strip out hyphens in phone number
  regex = r'-'
  num = re.sub(regex, "", num)

  print(car)
  print(num)
  print(msg)

  reply = '[+] Server Received Data!'
  clientSocket.send(reply.encode())
  print("[+] Sent reply!")

  try:
    sms(num, msg, car)
  except (smtplib.SMTPException, OSError):
    print("[-] Failed to send SMS")
  finally:
    clientSocket.close()

def main():
  HOST = getIP()
  PORT = 4444
  BUFFER = 1024

  #Create socket
  try:
    serverSocket = socket(AF_INET, SOCK_STREAM)
    serverSocket.bind((HOST, PORT))
    #Max number of queued connections per socket (5 is an arbitrary choice)
    serverSocket.listen(5)
  except OSError:
    print("[-] Socket creation failed!")
    sys.exit(1)

  while True:
    print('[*] Waiting for connection...')

    try:
      #clientsocket = read/writable object -- addr = incoming connection address
      clientSocket, addr = serverSocket.accept()
      print('[+] Connection from: ', addr)
      threading.Thread(target=connectionHandler, args=(clientSocket, addr, BUFFER)).start()
    except OSError:
      print("[-] Socket connection failed!")

if __name__ == '__main__':
  main()
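For what it’s worth, the JSON route I mentioned does not take much code.  Here is a minimal sketch of how the parsing step could look if the client sent a JSON object instead of my ad hoc string.  The function name parse_message and the keys msg, car, and num are assumptions for illustration; the Android client would have to send exactly that shape.

```python
import json
import re


def parse_message(raw):
    """Decode a JSON payload from the socket into (msg, car, num)."""
    payload = json.loads(raw.decode("utf-8"))
    # Strip out hyphens in the phone number, same as the server above
    num = re.sub(r"-", "", payload["num"])
    return payload["msg"], payload["car"], num


# Example bytes as they might arrive from clientSocket.recv()
example = json.dumps(
    {"msg": "hello", "car": "T", "num": "555-123-4567"}
).encode("utf-8")

print(parse_message(example))
```

This would replace the slicing by fixed offsets, which breaks as soon as the message itself contains a comma.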

How to Create a TOR Hidden Service

Before I go into telling you how to set up a TOR hidden service I want to clarify a grammatical issue that bugs the crap out of me.  If you didn’t know, TOR stands for The Onion Router (onion refers to the layered encryption process the protocol uses for traffic).  So when people say “the” TOR network it’s a little redundant – “the” The Onion Router.  It’s like saying the ATM “machine” – Automated Teller Machine “machine”… See my point?  I might make this mistake while I write, so I could prove to be a little hypocritical, sorry.


How many of you have heard of a TOR hidden service like the Silk Road?  It’s really not some mystical location on the “dark” side of the Internet, or “darknet”.  While it does sound a bit fantastic, a TOR hidden service is just a server that is connected to the TOR network and is only accessible by a uniquely generated domain name (.onion).  The idea is that there aren’t supposed to be any IPs associated with the server once it’s set up on TOR – making it “untraceable”.  Of course there are ways of exploiting a vulnerability that will reveal that address, but I won’t go into that.  Essentially a hidden service is just a web service on TOR.

Here’s how to set up a basic service on a Debian/Ubuntu distro:

1) Update and upgrade your system.

sudo apt-get update
sudo apt-get upgrade

2) Install a web server.  I use Apache.

sudo apt-get install apache2

3) Install TOR via apt-get.  You don’t need to download the binary or source code.

sudo apt-get install tor

4) After you install TOR, open the configuration file.

*NOTE: Your configuration file(s) will be in /etc/tor/

vim /etc/tor/torrc

You will want to change the port numbers in the configuration file.  Verify the IP and port your web server is listening on (change them if you want).  I use port 6666.  What happens in the configuration file is that the TOR service will listen on a certain port and address (accessible only via TOR) and then redirect that traffic to your webserver.

In the configuration file navigate to the first instance of this:

HiddenServiceDir /Library/Tor/var/lib/tor/hidden_service/

HiddenServicePort 80

The “HiddenServiceDir” field is a directory location (non-arbitrary) in which you specify where you want TOR to create your key and .onion address (hostname).  It will generate two files in the directory to which you point it, one for each artifact (see step 6).

The “HiddenServicePort” field is where you specify which ports you want TOR to listen on and to redirect to.

So in my case, I have TOR listening on port 7777 which then redirects to my apache webserver on 6666 (which is what I configured Apache to listen on).  So when I navigate to my .onion address on port 7777, I will be sent to the home directory of my apache server (/var/www).  Remember that whatever web server you are running, your traffic will be directed to the HOME directory of that server, wherever that may be.  Make sense?

In my case, my torrc file contains these two fields:

HiddenServiceDir  /var/lib/tor/hidden_service

In the HiddenServicePort line, 7777 is the port TOR is listening on, and 6666 is where it is redirecting (and where Apache is listening).
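To make the mapping concrete, here is a small Python sketch that prints the two torrc lines for a given pair of ports.  The helper name is mine, and the 127.0.0.1 target is an assumption for illustration (it presumes Apache is listening on localhost); use whatever address your web server is actually bound to.

```python
def hidden_service_lines(service_dir, virtual_port, target_host, target_port):
    """Build the two torrc lines for one hidden service."""
    return [
        "HiddenServiceDir %s" % service_dir,
        # virtual_port is what visitors use on the .onion address;
        # target_host:target_port is where TOR forwards the traffic
        "HiddenServicePort %d %s:%d" % (virtual_port, target_host, target_port),
    ]


lines = hidden_service_lines("/var/lib/tor/hidden_service", 7777, "127.0.0.1", 6666)
for line in lines:
    print(line)
```

For a second hidden service you would call it again with a different directory and ports and append the result to the same torrc.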

*NOTE:  It doesn’t matter if Apache is listening on a port that is public or private (localhost).  If you have Apache listening on a port on your public interface, then you should block it at the firewall so as to not leave the web server open to the world.

Again, by default the TOR service will point to your webserver’s default directory.  So in my case when I navigate to my hidden service I get dropped into /var/www.  You might be able to change this, but I haven’t done it/figured out how.

*NOTE:  You can have more than one hidden service running at once.  Just repeat this configuration for each instance and create different directories on your webserver.


So to bring this configuration into perspective, let me summarize.  You have a TOR service listening on a certain port on a uniquely generated .onion URL that redirects traffic to whatever port your webserver is listening on.  Keep in mind that you are still connected to the internet on your normal IP and your webserver can get pwned if you make whatever port it’s listening on available to the outside!  You should not open the webserver’s port in your firewall configuration; the hidden service itself is reached through TOR’s outbound connections, so it does not need an inbound port opened.

Your service will only be accessible via the .onion address so you don’t have to worry about people stumbling upon it by accident (unless they guess or steal your .onion address – which should be difficult).

Here’s my 3rd grade art showing you how it works:


5) Start the tor service and web server:

sudo tor
service apache2 start

6) When you start the tor service for the first time it will generate your key and hostname in the directory you specified in the configuration file.  Navigate there and copy your address.  This is the URL for your .onion address.


Hopefully this clarified things and provided a simple way to set up a TOR hidden service!  Let me know if I need to make changes!

Password Complexity

You hear a lot about passwords and hashes in the news, and how they are stolen from databases connected to webservers that are exploited through SQL injection or some other type of vulnerability.  If you’re like me, you constantly see articles about how to make your password stronger and the best ways to store them.  To be frank, I think password setting guides are ridiculous and fail to understand a basic mathematical concept.  It’s all about combinations!  (I’m using the word loosely: since the order of the characters matters in a password, a mathematician would call these permutations with repetition).  I’ll explain more about combinations shortly.

Let me explain how a password is stored when you sign up for something online and create a username and password.  When you submit your profile your username and password are stored in a database.  However, usually only your password is “encrypted”.  You could “encrypt” both, but that would just be ridiculous, because the system needs a plain text username to find your record.  When you log into a system your user name is matched with the user name on file, and your password is re-“encrypted” and then matched to the “encrypted” version stored in the database.  Most databases that hold user names and passwords are relational and need to match an “encrypted” password with a plain text username… Again, this is just generally speaking.

You may be curious as to why I use quotes when I say “encrypted”… Let me enlighten you for a minute (Try to pay attention because I make a lot of basic assumptions).

A hash is a ONE-WAY function that produces a fixed size output from any size input.  Imagine a black box (yes, this analogy is cliché if you’re a programmer) that takes an input and produces an output.  The input is any length of text and the output is a 128 character long text.  The box essentially scrambles or encrypts the input text by using an algorithm and then produces a fixed size output text.  The unique thing about a hash function is that it will (or at least should) always produce the same output text from the same input text.  If you change one letter in the input text sample then it will completely change the output string.

So, imagine taking the word “apple” and sending it through MD5 (a type of hash algorithm).  You will get a fixed length output (measured in bits: MD5 always gives 128 bits, normally written as 32 hexadecimal characters) of some crazy combination of letters and numbers.  The algorithm cannot be reversed, so you can’t send that fixed length output back through it to get the word “apple” again.
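To make the black box a little more concrete, here is a minimal Python sketch using the standard hashlib module.  The helper name md5_hex and the sample inputs are my own choices for illustration, and MD5 is used only because it is the algorithm mentioned above.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of text as a hexadecimal string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


first = md5_hex("apple")
again = md5_hex("apple")              # same input hashed a second time
changed = md5_hex("applf")            # one letter different
long_input = md5_hex("apple" * 1000)  # a much longer input

print(first)
print(first == again)    # same input gives the same output
print(first == changed)  # one changed letter gives a different output
print(len(first), len(long_input))  # output length does not depend on input length
```

Whatever you feed in, the digest comes back the same size, and there is no function in the module to turn it back into “apple”.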

Encryption algorithms allow for cipher text (encrypted text) to be reversed using the same algorithm, if you have the right key.  Encryption is only as good as its key.  The algorithm is publicly known (usually, unless it’s proprietary) but the key is what makes it unique.  I won’t go further on this subject.  Back to hashing.

Okay, so why use hashing if you can’t reverse it?  Well, for one, a hashing algorithm is faster than an encryption algorithm, and for two, companies don’t care about decrypting your password!  If you forget it they’ll just have you make a new one!  It’s safer for the company as well since a hacker would have to guess your password rather than finding the encryption key used in the encryption algorithm the company uses.  If the company loses that key, then all accounts could be compromised.  Starting to make sense?
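Here is a rough sketch of that sign-up and login flow in Python.  Everything in it is an assumption for illustration: the users dictionary stands in for a real database, the function names are made up, and SHA-512 is simply one hash whose hex output is 128 characters long.  Real systems usually add a per-user salt and a deliberately slow hash, which I’ve left out to keep it short.

```python
import hashlib

# Hypothetical stand-in for a database table: plain text username -> password hash.
users = {}


def hash_password(password: str) -> str:
    """Hash a password with SHA-512 and return it as hex text."""
    return hashlib.sha512(password.encode("utf-8")).hexdigest()


def sign_up(username: str, password: str) -> None:
    """Store only the hash of the password, never the password itself."""
    users[username] = hash_password(password)


def log_in(username: str, password: str) -> bool:
    """Re-hash the submitted password and compare it to the stored hash."""
    stored = users.get(username)
    return stored is not None and stored == hash_password(password)


sign_up("alice", "secureme")
print(log_in("alice", "secureme"))  # correct password
print(log_in("alice", "Secureme"))  # wrong password
print(len(users["alice"]))          # length of the stored hash text
```

Notice that nothing here ever needs to “decrypt” anything; the site only checks whether two hashes match.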

So the only way to crack a hash is to guess it (since you can’t reverse it)!  You take a word, hash it using the same algorithm as the system you are trying to breach, and see if it matches!  Simple, right?  No, this is where entropy comes in, or password complexity.
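The guessing game looks something like this in Python.  This is only a toy sketch: the word list, the “stolen” hash, and the function names are all made up for the example, and MD5 is used just to match the earlier example.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of text as a hexadecimal string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def crack(target_hash: str, guesses):
    """Hash each guess and return the first one that matches, or None."""
    for guess in guesses:
        if md5_hex(guess) == target_hash:
            return guess
    return None


stolen = md5_hex("apple")  # pretend this came out of a leaked database
wordlist = ["banana", "cherry", "apple", "grape"]

found = crack(stolen, wordlist)
missed = crack(stolen, ["banana", "cherry"])
print(found)
print(missed)
```

The attacker only wins if the right word is somewhere in the list of guesses, so the bigger the pool of possible passwords, the longer the list has to be.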

If you remember from math class or statistics, there’s a simple way to count how many different sequences you can build.  For example, think of a PIN of 4 digits.  How many different PINs can you make using only digits?  Well, the formula is n^r where n is the number of possible values to choose from and r is the number of positions you have to fill.  In our example there are 10 digits (0-9) and 4 positions in the PIN.  So the formula would be 10^4 = 10,000 different possible 4 digit PINs.  Strictly speaking these are permutations with repetition rather than combinations, because a value can be reused and ordering does matter (i.e. 1221 and 2211 are two different PINs), but I’ll keep calling them combinations.
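If you want to check the PIN math rather than take my word for it, a few lines of Python will list every 4 digit PIN and count them.  This is just a sanity check I’m adding; itertools.product yields every ordered sequence of the given length with repeats allowed.

```python
from itertools import product

digits = "0123456789"

# Build every possible 4 digit PIN as a string, e.g. "0000", "0001", ...
pins = ["".join(p) for p in product(digits, repeat=4)]

print(len(pins))           # how many PINs were generated
print(len(pins) == 10**4)  # does the count match n^r?
print(pins[0], pins[-1])   # first and last PIN in the list
```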

Now, for the real eye opener (hopefully).  You know how people tell you to make a really complex password so nobody will guess it?  They tell you to make these insane passwords that nobody can even memorize!  For example, if my password is “secureme” most people will tell you to make it complex using upper case, lower case, symbols, and numbers.  While this does increase the complexity and entropy of your password it isn’t the safest method, and it’s more frustrating than anything.  Taking my sample password, “secureme”, people will tell you to turn it into this beast: “!S3cUreM3!%@”.  Who wants to memorize that?!?!

Let’s take a look at the math.  So we used digits, upper case, lower case, and special characters.  Assuming that a system will take this form of complexity (yes, systems do have their limitations as to the length and complexity of your password – especially banking systems, go figure), let’s figure out “n” in the combinations formula.  So we will add up all values to make our pool from which to pull our password values.  There are 10 digits, 26 lowercase letters in the alphabet, 26 uppercase letters in the alphabet, and let’s say 10 special characters.  That makes n = 72.

So let’s assume that we are using all those fancy characters and numbers and our password is 8 characters long.  The formula becomes 72^8 = 722,204,136,308,736.  That’s a big number, but that can be cracked probably in a few months.  Now, let’s look at something else, something better.

Ever heard of a passphrase?  What is a phrase?  “John went home for the day”.  Simple!  A word is: “went” or “home”.  I’m not talking about programming words (DWORD/WORD) but actual grammar.  So if you have a passphrase it’s essentially a phrase you use as a password!  Let’s assume my password is now “canyoubringmechocolatetoday”.  Going back to the formula we get n = 26 (only lowercase letters) and r = 27 (that’s how many characters there are – or length of my passphrase).  If you know your exponent rules you already know this is going to be a RBN (really big number).  So, 26^27 = 1.6005910908538609008071353149841e+38.  Obviously this is a much larger number than the one produced from our crazy password earlier!  Now what if we capitalize the first letter in the passphrase and add a question mark at the end since it’s a sentence?  That brings upper case and special characters into the pool and makes the phrase 28 characters long, so 62^28… you get the idea.  Also, what if the words in our phrase don’t even make sense and are completely unrelated?  How hard would it be to guess “Sally can’t chocolate today for two potatoes”?  A hacker would spend their whole life (and then some) trying to guess it!
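Here is a small Python sketch that runs the same n^r numbers side by side.  The function names are mine, and the “bits” figure is just log2 of the number of possibilities, which is a common way of expressing entropy when every character is picked independently (an assumption that real, human-chosen phrases don’t fully meet).

```python
import math


def search_space(pool_size: int, length: int) -> int:
    """Number of possible passwords: n^r."""
    return pool_size ** length


def entropy_bits(pool_size: int, length: int) -> float:
    """log2 of the search space, i.e. length * log2(pool_size)."""
    return length * math.log2(pool_size)


complex_pw = search_space(72, 8)   # 8 characters from digits, both cases, 10 symbols
passphrase = search_space(26, 27)  # 27 lowercase letters

print(f"{complex_pw:,}")           # 722,204,136,308,736
print(f"{passphrase:.4e}")
print(passphrase // complex_pw)    # how many times larger the passphrase space is
print(round(entropy_bits(72, 8), 1), round(entropy_bits(26, 27), 1))
```

Length sits in the exponent, which is why adding characters grows the number so much faster than adding symbol types.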

So, the point I attempted to make was to help you think about your passwords and do yourself a favor by understanding them and making them easier to manage.  I hope this helped and look forward to hearing comments and input!

If you don’t believe me, have some fun on this site and see for yourself:

Nslookup: Querying Another Nameserver

For inexperienced users nslookup can be a little intimidating.  I’m going to show you how to use basic commands (in Windows) to get information from a nameserver.  This should be used for troubleshooting but it can also be used for malicious purposes.

First bring up a command prompt and type in “nslookup”.  This will give you the “>” prompt.

From there you can specify a nameserver other than yours with the “server” command:

> server NAMESERVER_IP

Where “NAMESERVER_IP” is the new nameserver’s IP

Once you have changed your server you can begin querying it.  You can either type in an IP or DNS name into the prompt and it will display the results.

Here is an example:

> server
Default server:

Non-authoritative answer:

You can also do a quick search from the command prompt like this:

nslookup NAME NAMESERVER_IP

Where “NAME” is the name you want to resolve and “NAMESERVER_IP” is the nameserver’s IP you want to use.

Hope this helps!

Erase a Hard Drive For Reuse

So it took me forever to figure out how to erase a hard drive in order to reuse it because, from what I’ve seen and read, people don’t know what it means to erase and not completely destroy the disk!  

The answer was so obvious and I feel a little ashamed for not realizing it earlier, but all you do is download Darik’s Boot and Nuke, make a bootable USB out of it, and then boot from USB on your system and select which wipe option you want to do.  One of the options is to do a “Quick Erase” which will only write over the disk once.  This option will allow you to “erase” the disk and reuse it for another install (the only completely “wiped” hard drive is the one ground to dust, thrown in a powerful magnet, burnt, and then soaked in water — as in there is always a way to get something out of your drive).

The other options in the program allow you to do more than one pass over and perform different types of passes (like Pseudo Randomly Generated Numbers and DoD spec wipes).  Be careful though because you can/will destroy your hard drive (at least in your eyes) experimenting.

GeoIP Lookup with Ubuntu

This tutorial demonstrates how to install the CLI geoiplookup tool on Ubuntu 12.10 Server and how to get data files from which you can look up geoip information.   With the CLI tool you can resolve IPv4 and IPv6 addresses to countries, cities, and organizations.

First you need to install the CLI tool:

apt-get install geoip-bin

After that navigate to the geoip folder containing your .dat files.  The .dat files are used for various geoip queries and you can get free or subscribed ones from various sources.  I download mine from MaxMind.  Make sure to download the “Binary / gzip” ones!

Once you’re in the /usr/share/GeoIP/ directory download the files from MaxMind using wget or whatever tool you want.

You will need to use “gunzip” to unzip the new data files.

Once they’re extracted make sure the only files in your directory are the .dat ones.

After you have your database files and the geoiplookup tool installed, you can now do searches.

The most basic search is this: geoiplookup

This will go to your data files and display the country and organization to which this IP address belongs.

More complex searches involve pointing the command to either a single .dat file or the entire directory with the -f and -d options, respectively, after the geoiplookup command.  You can do this to narrow or broaden your searches to cities and organizations.

For example:

geoiplookup -f /usr/share/GeoIP/GeoIP.dat

will output:

GeoIP Country Edition: US, United States


geoiplookup -d /usr/share/GeoIP/

will output:

GeoIP Country Edition: US, United States
GeoIP ASNum Edition: AS15169 Google Inc.

You can view the man page here for additional options.

If you want to get really into it you can create a CSV file from a list of IP addresses and create specific columns based on different geoiplookup command searches. Here’s something I came up with:

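output=geoips.csv  # CSV file to append to (this name is an assumption; use your own)
# One row per IP in geoips.txt: IP, country, "city, region", "AS number and organization"
while IFS= read -r line;
do echo "$line,$( geoiplookup -f /usr/share/GeoIP/GeoIP.dat "$line" | cut -d " " -f 4- ),\"$(geoiplookup -f /usr/share/GeoIP/GeoLiteCity.dat "$line" | cut -d " " -f 7-9 | cut -d "," -f -2)\",\"$(geoiplookup -f /usr/share/GeoIP/GeoIPASNum.dat "$line" | cut -d " " -f 5-)\" " >> "$output";
done < geoips.txt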
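If shell one-liners aren’t your thing, the same idea can be sketched in Python.  Treat this as a rough equivalent and not a drop-in replacement: the function names are mine, it keeps everything after the “: ” in geoiplookup’s output instead of cutting specific fields, and it assumes the geoiplookup command is installed and prints lines in the “Edition: value” form shown above.

```python
import csv
import subprocess


def field_after_colon(geoip_line: str) -> str:
    """Return the text after the first ': ' in one line of geoiplookup output."""
    _, _, value = geoip_line.partition(": ")
    return value.strip()


def lookup(ip: str, dat_file: str) -> str:
    """Run geoiplookup against one .dat file and return the first line's value."""
    result = subprocess.run(
        ["geoiplookup", "-f", dat_file, ip],
        capture_output=True, text=True, check=False,
    )
    lines = result.stdout.splitlines()
    return field_after_colon(lines[0]) if lines else ""


def write_csv(ips, out_path, dat_files):
    """Write one CSV row per IP: the IP followed by one column per .dat file."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for ip in ips:
            writer.writerow([ip] + [lookup(ip, dat) for dat in dat_files])


# Parsing the sample output lines shown earlier in this post:
print(field_after_colon("GeoIP Country Edition: US, United States"))
print(field_after_colon("GeoIP ASNum Edition: AS15169 Google Inc."))
```

To build the actual file you would call write_csv with your list of IPs, an output filename, and the .dat paths you want as columns.  The csv module takes care of quoting values that contain commas.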

Cisco VPN Client for Mac Problem Solved

For the longest time I couldn’t get Cisco’s VPN client to work on my Mac running OS X 10.7.  It would install but fail to run saying that it was no longer supported on my system.  With a lot of searching and trying to find alternatives I finally found the solution!  It turns out that newer models of the MacBook Pro, by default, run at 64 bits, which is not supported by the Cisco VPN client.  So in order to run the client you must reboot into 32 bit kernel mode.  You can do this by pressing the “3” and “2” keys together during system startup (grey screen).  This will boot into the 32 bit kernel only for the duration of the boot.  Once you restart it’ll switch back to 64 bit.

To make these changes permanent you can use these commands for 64 bit and 32 bit kernel booting, respectively:

sudo systemsetup -setkernelbootarchitecture x86_64
sudo systemsetup -setkernelbootarchitecture i386