Generalized Password Cracking, Part 2: Starting with Some Stock Password Attacks

Disclaimer: The present series of posts contains information on how to exploit security vulnerabilities in passwords. It is intended for educational and research purposes only. Neither the password cracking tools described in this series nor any of the exploits developed with these tools are to be used for gaining unauthorized access to accounts or other computer resources in the real world. Don’t hack anyone else’s account, computer, device, or network without their express permission. I am not responsible for any illegal hacking activities you decide to engage in using this information, nor am I responsible for any damages that may result from such activities.


Before I implement my password cracking language, I need some stock password attack implementations so I can combine these implementations later into a generalized pattern attack. I have chosen three standard attacks here – a simple dictionary attack that hashes words from a wordlist, a bruteforce attack that searches all passwords up to seven characters in length, and a precomputed table lookup using a file that directly maps hashes back to plaintext passwords.

All of these attacks require using the hashing algorithms we are trying to reverse. These will be supplied by the OpenSSL libraries. There are two libraries that implement the OpenSSL API – they are libssl and libcrypto. Header files for the OpenSSL API can be found in /usr/include/openssl.

To hash passwords, I’m using theSHA1() function declared in openssl/sha.h. This uses a single character for each byte of the hash, meaning if the input is in a hexadecimal format, it will need to be converted into raw byte format so that the result can be compared with the output of the SHA1() function. To do this we can use the following code:


ptr = argv[1];
for( i = 0; i < 20; i++ ){
        image[i] =
        16 *
        ( isdigit( ptr[0] ) ? ptr[0] - '0' : ptr[0] - 'a' + 10 ) +
        ( isdigit( ptr[1] ) ? ptr[1] - '0' : ptr[1] - 'a' + 10 );
        ptr += 2;
}

I also want to display each preimage and the hex code for each hash to the console so that I can see what the password cracker is doing in real time. I realize this makes the program a lot slower, but I just want to get a general idea of what the password cracking process looks like for a given implementation of the algorithm, so I can gain insight into how to improve it. I also want to get a rough idea of how efficient the algorithm is. To produce a hex string for a password hash, we can use the following code:


for( i = 0; i < 20; i++ ){
        printf"%s%x",
        (uint8_t) hash[i] < 0x10 ? "0" : "",
        (uint8_t) hash[i] );
}

Let’s look at the bruteforce algorithm first. The algorithm I have written is implemented using several layers of nested for loops that select a certain ASCII character between ! and ~ (that’s the entire range of printable ASCII characters). The program concatenates these characters together into a seven-byte string. It then hashes the string consisting of the first byte, then the first two bytes, then the first three bytes, and so on, and compares these to the hash that has been converted to raw-byte format from the argument. If a match is found, it exits, and then you know that the last password printed is a match.

Here is the complete code for the bruteforce program (it’s rather wide, but you can use the sideways scrollbar at the bottom of the code block to view the entire thing):


 1 // Bruteforce attack
 2 
 3 #include <stdio.h>
 4 #include <stdlib.h>
 5 #include <stdint.h>
 6 #include <string.h>
 7 #include <ctype.h>
 8 #include <openssl/sha.h>
 9 
10 int mainint argc, char **argv ){
11         char a, b, c, d, e, f, g; // Characters in the password
12         int i, j;
13         char *ptr;
14         char passwd[8];
15         char hash[21];
16         char image[21];
17         // Null-terminate all strings:
18         passwd[7] = '\0';
19         hash[20] = '\0';
20         image[20] = '\0';
21         // Convert from hex to raw bytes:
22         ptr = argv[1];
23         for( i = 0; i < 20; i++ ){
24                 image[i] =
25                 16 *
26                 ( isdigit( ptr[0] ) ? ptr[0] - '0' : ptr[0] - 'a' + 10 ) +
27                 ( isdigit( ptr[1] ) ? ptr[1] - '0' : ptr[1] - 'a' + 10 );
28                 ptr += 2;
29         }
30         for( a = 33; a < 127; a++ ){
31                 for( b = 33; b < 127; b++ ){
32                         for( c = 33; c < 127; c++ ){
33                                 for( d = 33; d < 127; d++ ){
34                                         for( e = 33; e < 127; e++ ){
35                                                 for( f = 33; f < 127; f++ ){
36                                                         for( g = 33; g < 127; g++ ){
37                                                                 passwd[6] = a;
38                                                                 passwd[5] = b;
39                                                                 passwd[4] = c;
40                                                                 passwd[3] = d;
41                                                                 passwd[2] = e;
42                                                                 passwd[1] = f;
43                                                                 passwd[0] = g;
44                                                                 for( i = 1; i <= 7; i++ ){
45                                                                         SHA1( passwd, i, hash );
46                                                                         printf"SHA1(%s,%d)=", passwd, i );
47                                                                         // Print hash in hex format:
48                                                                         for( j = 0; j < 20; j++ ){
49                                                                                 printf"%s%x",
50                                                                                          (uint8_t) hash[j] < 0x10 ? "0" : "",
51                                                                                          (uint8_t) hash[j] );
52                                                                         }
53                                                                         putchar'\n' );
54                                                                         if( !strcmp( hash, image ) ) exit0 );
55                                                                         // Stop when reaching the correct preimage.
56                                                                         // The user will know that whichever preimage
57                                                                         // is last is the correct one.
58                                                                 }
59                                                         }
60                                                 }
61                                         }
62                                 }
63                         }
64                 }
65         }
66         return 0;
67 }

As you can see, I had the algorithm place the characters in reverse order. Originally I had it place them in forward order, but then I found that if I did that, the program would spend the first several hours just testing different numbers of exclamation points over and over. If it does them in reverse order, it will cycle through a variety of shorter sequences more quickly, allowing it to catch very weak passwords in a decent amount of time.

The method of generating every possible 7-character string then testing substrings of each string may seem inefficient at first glance since it will end up testing shorter passwords multiple times, but the reality is the code is far less redundant since the alternative would be a long, tedious sequence consisting of one for loop, then two nested for loops, and so on, one after the other. A single nestedfor loop is also much easier to build from a pattern command given by the user. Since these algorithms are intended to be incorporated into an actual programming language implementation, I think this is fairly important.

It’s also not that much more complex, which you can see if you write out the complexity equations for the two algorithms. If we measure complexity by how many hashes we have to compute, then the time complexity of the long, tedious code would look like this:

Brute-forcing algorithm with exponential complexity

On the other hand, the complexity of a single nested loop pattern would look like this:

Brute-forcing algorithm with exponential complexity

In these equations, s represents the set of characters, n represents the maximum length of the passwords searched, and C1 and C2 represent the time complexities of the first and second algorithm respectively. In both cases, the algorithm completes in exponential time, and the complexity moving from the first to the second is merely multiplied by a constant factor. This factor will be approximately n. For example, if we calculate the complexities for |s| = 94 (the number of printing ASCII characters) and n = 7 we get the following result:


C1=65545047154954
C2=453934315934848
C2/C1=6.925532

If we hash the password “A0” using the sha1sum command and then try to reverse this hash using the bruteforce program, we get the following output:


$ ./bruteforce b8c4ed32039755356ab5eaf0878516ef63ab96f8
SHA1(!!!!!!!,1)=0ab8318acaf6e678dd02e2b5c343ed41111b393d
SHA1(!!!!!!!,2)=b4613f8681b1e26686a2e88299525a4dc89c46d5
SHA1(!!!!!!!,3)=9a7b006d203b362c8cef6da001685678fc1d463a
SHA1(!!!!!!!,4)=f6bb43b4c87f68cd820cc911a787b06060e85227
SHA1(!!!!!!!,5)=1227cb28ec9e51942b7dacc0d5453e10d975612f
SHA1(!!!!!!!,6)=bae598184569d68359358ff314765c82166f9dfd
SHA1(!!!!!!!,7)=9b8a410b57694951c5ca9405c741fcc7578af9b1
SHA1("!!!!!!,1)=2ace62c1befa19e3ea37dd52be9f6d508c5163e6
SHA1("!!!!!!,2)=b82a1cbc556a615014aee599d4da44bfcf30db50
SHA1("!!!!!!,3)=013c0f8fccac2124416e8cf9ed8e13fd0439ee50
SHA1("!!!!!!,4)=64171fac81247af7a9373e5e4093acbd3ef429a8
SHA1("!!!!!!,5)=ebf5b260eeeace31fc2d3fe65562c8b04e4d0391
SHA1("!!!!!!,6)=6314f29e8974d0fb4398d2979207f5136b95e755
SHA1("!!!!!!,7)=2b72e4e89f114cbcd4e373cdb1e4fc6eed3b0694
SHA1(#!!!!!!,1)=d08f88df745fa7950b104e4a707a31cfce7b5841
SHA1(#!!!!!!,2)=2d46bc445fe5eef1125228cce271ca63a02b246d
SHA1(#!!!!!!,3)=62835194030ec6d9038a83472408cbebb5734055
SHA1(#!!!!!!,4)=d8e16006a45475bd5dca1de8a6314a6a981cc5bb
SHA1(#!!!!!!,5)=78978e44a63d26fdeca7bdf4624858c6682357f2
SHA1(#!!!!!!,6)=cc23d43e101b56155e1ad5ac7fa61a13bf916faf
...
SHA1(>0!!!!!,4)=4cb2dca31b9c5fe1c4b7a09a89127a64a6db2404
SHA1(>0!!!!!,5)=e58c06cdb3ad914c8254d62fb789e50bc35afa6e
SHA1(>0!!!!!,6)=b6af29586e7e797eb7d5f3f801f5a54ed96bc5e5
SHA1(>0!!!!!,7)=54e3bae94042f95dd8820694fcefb83a30718072
SHA1(?0!!!!!,1)=5bab61eb53176449e25c2c82f172b82cb13ffb9d
SHA1(?0!!!!!,2)=04f46713539ee1e582054f9a5438865d91001599
SHA1(?0!!!!!,3)=b386febe2cb35532c96865158ae4de20fe3b22b8
SHA1(?0!!!!!,4)=e3cebdd92f5459245587e8fadfcf303cca1588ff
SHA1(?0!!!!!,5)=2809994d8a367c043ec3cda0f6487820f15d3eb3
SHA1(?0!!!!!,6)=1a4d1589c9e439e36ddc9446bc077cf1fb16da73
SHA1(?0!!!!!,7)=f6f1a46d0283165333dfb6813ec604abc1bf7fce
SHA1(@0!!!!!,1)=9a78211436f6d425ec38f5c4e02270801f3524f8
SHA1(@0!!!!!,2)=5d535dabfd744109cbbae1ad1b4901d1946e7548
SHA1(@0!!!!!,3)=98a2a246adf8c1f3a4091ab7cdd8a5e72d21b017
SHA1(@0!!!!!,4)=cb5a65be01d4e284003de624b89698877fcfba23
SHA1(@0!!!!!,5)=9cf399810e4e636f765ad79cd4a5eec040f61c63
SHA1(@0!!!!!,6)=bb54c38ed15fb8209c283aaa045c10bb18da89ad
SHA1(@0!!!!!,7)=033e3ebcf47522581b62c94736182c11b5c4fc9b
SHA1(A0!!!!!,1)=6dcd4ce23d88e2ee9568ba546c007c63d9131c1b
SHA1(A0!!!!!,2)=b8c4ed32039755356ab5eaf0878516ef63ab96f8

As you can see, the algorithm stopped once it reached the hash for the password we chose. The second argument in the parenthesis is 2, meaning the algorithm found a match when it tested the first two characters of the sequence.

The program for the dictionary attack has a lot of the same code, though the actual attack is structured differently. The program takes two arguments: the first is the password hash you’re trying to crack and the second is the file path for the wordlist the program is to use. The program is fairly straightforward, reading one word at a time from the wordlist, hashing it, and then comparing it to the raw byte version of the first argument.


 1 // Dictionary attack
 2 
 3 #include <stdio.h>
 4 #include <stdlib.h>
 5 #include <stdint.h>
 6 #include <string.h>
 7 #include <errno.h>
 8 #include <ctype.h>
 9 #include <openssl/sha.h>
10 
11 int mainint argc, char **argv ){
12         int c, i, len;
13         FILE *wordlist;
14         char *ptr;
15         char passwd[100];
16         char hash[21];
17         char image[21];
18         // Null-terminate strings:
19         hash[20] = '\0';
20         image[20] = '\0';
21         // Convert from hex to raw bytes:
22         ptr = argv[1];
23         for( i = 0; i < 20; i++ ){
24                 image[i] =
25                 16 *
26                 ( isdigit( ptr[0] ) ? ptr[0] - '0' : ptr[0] - 'a' + 10 ) +
27                 ( isdigit( ptr[1] ) ? ptr[1] - '0' : ptr[1] - 'a' + 10 );
28                 ptr += 2;
29         }
30         if( !(wordlist = fopen( argv[2], "r" )) ){
31                 fprintfstderr"%s: Can't open wordlist file: %s\n", argv[0], strerror( errno ) );
32                 exit( errno );
33         }
34         while( (c = fgetc( wordlist )) != EOF ){
35                 ungetc( c, wordlist );
36                 for( i = 0; i < 100; i++ ) passwd[i] = '\0'// Make sure string is null-terminated
37                 fgets( passwd, 99, wordlist );
38                 // The next three lines get rid of any newline characters at the end.
39                 len = strlen( passwd ) - 1;
40                 if( passwd[len] == '\n' ) passwd[len--] = '\0';
41                 if( passwd[len] == '\r' ) passwd[len--] = '\0';
42                 // Now for the actual attack:
43                 SHA1( passwd, len + 1, hash );
44                 printf"SHA1(%s)=", passwd );
45                 // Print hash in hex format:
46                 for( i = 0; i < 20; i++ ){
47                         printf"%s%x",
48                                 (uint8_t) hash[i] < 0x10 ? "0" : "",
49                                 (uint8_t) hash[i] );
50                 }
51                 putchar'\n' );
52                 if( !strcmp( hash, image ) ){
53                         fclose( wordlist );
54                         exit0 );
55                 }
56         }
57         fclose( wordlist );
58         return 0;
59 }

Keep in mind I haven’t completely tested this program. I’ve only tested it using wordlists in the Unix format, so I’m not 100% sure if it works for DOS-format wordlists. Also, the program segfaults if it encounters a blank line. I decided it would be easier to just delete the one blank line present in the John the Ripper wordlist than it would be to modify the program so it could process blank lines.

I’ve run this program using two wordlists: john.txt and cain.txt. john.txt has a little over 3,000 words while cain.txt has a little over 300,000. It took 9 minutes and 51.58 seconds for the program to hash every single password in cain.txt and print the result of each hash to the console. On the other hand, I let my bruteforce password cracker run overnight and it was still running the next morning. So clearly there’s a massive efficiency disparity between bruteforce and dictionary attacks. Dictionary attacks are far more efficient even with very large wordlists.

The final attack is the one using a precomputed table lookup. Originally I was going to do a rainbow table attack, but then I did some research on how rainbow tables work and realized it would be far too complicated for this stage of the project. So the lookup table we will be using is simply a one-to-one mapping of hashes onto passwords.

The lookup method consists of two programs: one to generate the lookup table for a wordlist and another to use the lookup table to quickly find the preimage of a hash. The following code generates the lookup table:


 1 // Precomputed table generator
 2 
 3 #include <stdio.h>
 4 #include <stdlib.h>
 5 #include <stdint.h>
 6 #include <string.h>
 7 #include <errno.h>
 8 #include <openssl/sha.h>
 9 
10 #define UNIX 0
11 #define DOS  1
12 
13 int mainint argc, char **argv ){
14         int c, i, len;
15         char passwd[100];
16         char hash[21];
17         hash[20] = '\0';
18         FILE *wordlist;
19         uint8_t format = UNIX;
20         if( !argv[1] ) wordlist = stdin;
21         else{
22                 if( !(wordlist = fopen( argv[1], "r" )) ){
23                         fprintfstderr"%s: Can't open wordlist file: %s\n", argv[0], strerror( errno ) );
24                         exit( errno );
25                 }
26         }
27         while( (c = fgetc( wordlist )) != EOF ){
28                 ungetc( c, wordlist );
29                 for( i = 0; i < 100; i++ ) passwd[i] = '\0'// Make sure string is null-terminated
30                 fgets( passwd, 99, wordlist );
31                 // The next three lines get rid of any newline characters at the end.
32                 len = strlen( passwd ) - 1;
33                 if( passwd[len] == '\n' ) passwd[len--] = '\0';
34                 if( passwd[len] == '\r' ){
35                         passwd[len--] = '\0';
36                         format = DOS;
37                 }
38                 // Now we generate the table entry:
39                 SHA1( passwd, len + 1, hash );
40                 // Print hash in hex format:
41                 for( i = 0; i < 20; i++ ){
42                         printf"%s%x",
43                                 (uint8_t) hash[i] < 0x10 ? "0" : "",
44                                 (uint8_t) hash[i] );
45                 }
46                 putchar':' );
47                 printf"%s", passwd );
48                 if( format == DOS ) putchar'\r' );
49                 putchar'\n' );
50         }
51         if( wordlist != stdin ) fclose( wordlist );
52         return 0;
53 }

This program can either take a wordlist as an argument or read from standard input. It does the latter automatically if no argument is provided. It then reads each line, hashes it, and then prepends the hash to the password, separated by a colon. It uses a binary value to keep track of whether the file it’s reading is in DOS format or Unix format and makes sure its output is in the same format.

Then we have the actual table lookup algorithm. Its code looks like this:


 1 // Precomputed table lookup
 2 
 3 #include <stdio.h>
 4 #include <stdlib.h>
 5 #include <string.h>
 6 #include <ctype.h>
 7 #include <errno.h>
 8 
 9 #define hex_value( c ) (isdigit( c ) ? c - '0' : tolower( c ) - 'a' + 10)
10 
11 int mainint argc, char **argv ){
12         int c, i, len;
13         char *fieldsep;
14         char buf[150];
15         char *hash;
16         char *passwd;
17         FILE *table;
18         len = 1;
19         // Set field separator string:
20         fieldsep = ":";
21         for( i = 1; i < argc; i++ ){
22                 if( argv[i][0] == '-' && argv[i][1] == 'f' ){
23                         fieldsep = argv[i+1];
24                 }
25         }
26         for( i = 0; i < 150; i++ ) buf[i] = '\0';
27         if( !(table = fopen( argv[2], "r" )) ){
28                 fprintfstderr"%s: Can't open table file: %s\n", argv[0], strerror( errno ) );
29                 exit( errno );
30         }
31         while( (c = fgetc( table )) != EOF ){
32                 ungetc( c, table );
33                 fgets( buf, 149, table );
34                 hash = strtok( buf, fieldsep );
35                 passwd = strtokNULL"\r\n" );
36                 // Test only the prefix:
37                 for( i = 0; i < len; i++ ){
38                         putchar( hash[i] );
39                         if( hex_value( hash[i] ) > hex_value( argv[1][i] ) ){
40                                 putchar'\n' );
41                                 printf"Password not found\n" );
42                                 fclose( table );
43                                 exit0 );
44                         }
45                         if( hash[i] != argv[1][i] ) break;
46                 }
47                 if( i != len ){
48                         putchar'\n' );
49                         continue;
50                 }
51                 // Extend the prefix:
52                 while( hash[len] == argv[1][len] ){
53                         if( len < 40 ) putchar( hash[len] );
54                         if( hex_value( hash[len] ) > hex_value( argv[1][len] ) ){
55                                 putchar'\n' );
56                                 printf"Password not found\n" );
57                                 fclose( table );
58                                 exit0 );
59                         }
60                         len++;
61                 }
62                 putchar'\n' );
63                 if( len == 41 ){
64                         printf"%s\n", passwd );
65                         fclose( table );
66                         exit0 );
67                 }
68         }
69         printf"Password not found.\n" );
70         fclose( table );
71         return 0;
72 }

This program is different from the other ones in that it doesn’t need to use any cryptography functions. It is designed to take advantage of a sorted lookup table. This table would be generated using the previous program and then sorted using the Unix sort command. The program keeps track of how many hex digits in the current line match the target hash. For subsequent lines it only checks that many characters. It then attempts to extend the number of matching characters. If any characters don’t match it stops checking. If it finds a digit that gives the hash a greater numerical value than the target, then it knows the target hash is not in the database, so it exits. If it manages to match every character in the hash, it prints the preimage.

Checking only the number of characters known to match is a huge time saver because for the vast majority of hashes only one character has to be checked. Here’s a sample output of this program:


$ ./lookup b892f067921d231448e8f0a591107de8b2ad3202 john.sorted
0
0
0
0
0
0
0
0
0
0
...
b
b
b
b
b
b
b8
b8
b8
b8
b892f067921d231448e8f0a591107de8b2ad3202
sunny

As you can see, only the first character was checked for all but the last five hashes, and then only the first two characters were checked for all but the last hash, which was a match.

If we run the program for a hash that doesn’t have a match, we see a similarly efficient process:


$ ./lookup 73216320b4a1186852ff9f93d7df27954aab126c john.sorted
0
0
0
0
0
0
0
0
0
0
...
7
7
7
7
7
7
7
7
73
732
733
Password not found

Now that I’ve successfully implemented these three stock password attacks, the next step would be to combine the concepts from these algorithms into a pattern attack which can be automated using a pattern like what was shown in the first part of this series. This will be the subject of Part 3, so stay tuned. Until then, farewell and happy hacking.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s