Cracking Zip Password using Python3

Get a thorough knowledge of Python's zipfile module and build your own Zip file password cracking tool in less than 3 minutes.

Hello, world! We often come across Zip files that are encrypted, which makes it impossible to retrieve the files inside them. In this post, I will share what I learned while working through the Protected Zip File Cracking lab from AttackDefense, and guide you through cracking the password with the zipfile module that ships with Python.

We are provided with a zip archive named secret.zip which is, of course, password protected. The password is generated using the following password policy:

  • Password is 6 characters long
  • Password only contains characters from the character set {a, b, c, d, e, 1, 2, 3, 4, 5}

Let's try to open the zip and list the files. You can list the file names even without the password, because only the contents of each file are encrypted, not the list of entries in the archive.

from zipfile import ZipFile

z = ZipFile("secret.zip")
z.printdir()
Getting the list of the entries in the zip file secret.zip

As you can see, there is one entry, the file LabAccessCodes-TopSecret.txt. According to the listing it was added 4 years ago, and its size shows that it contains some text.
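The details that printdir() prints can also be read programmatically. Here is a minimal sketch, assuming a helper called describe_entries (a hypothetical name made up for this post, not part of zipfile). It relies on the ZIP format convention that bit 0 of an entry's general purpose flags marks it as encrypted.

```python
from zipfile import ZipFile


def describe_entries(zip_path):
    """Return (name, modified, size, encrypted) for every entry in the archive.

    describe_entries is a hypothetical helper written for illustration.
    """
    entries = []
    with ZipFile(zip_path) as z:
        for info in z.infolist():
            # Bit 0 of the general purpose flags marks an encrypted entry
            encrypted = bool(info.flag_bits & 0x1)
            entries.append((info.filename, info.date_time, info.file_size, encrypted))
    return entries
```

Calling describe_entries("secret.zip") should give one tuple for LabAccessCodes-TopSecret.txt, and its last field tells you whether a password will be needed before you even try to open it.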

Also, when you try to open the file with the following code, it throws a RuntimeError with a message telling you that a password is required to extract this file.

z.open("LabAccessCodes-TopSecret.txt")
Try to open the file in memory using ZipFile.open
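If you would rather handle that error than let it crash the script, something like the following could work. It is only a sketch: needs_password is a hypothetical helper name, and it treats any RuntimeError raised while opening the entry as "password required".

```python
from zipfile import ZipFile


def needs_password(zip_path, member):
    """Return True if opening `member` without a password is refused."""
    with ZipFile(zip_path) as z:
        try:
            with z.open(member) as f:
                f.read(1)  # touch the data so the entry is really opened
        except RuntimeError:
            # zipfile raises RuntimeError for an encrypted entry opened without a password
            return True
    return False
```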

We have seen the password policy, so let's create a wordlist containing all the possible combinations, including those with repeated characters. Since the combinations function never repeats a character and ignores ordering, we will use the product function from the itertools module instead.

from itertools import product
from string import ascii_lowercase, digits

CHARSET = ascii_lowercase[:5] + digits[1:6]  # "abcde12345"
PASSLEN = 6

# product() yields every ordered tuple of PASSLEN characters, repeats included
combinations = product(CHARSET, repeat=PASSLEN)

with open("dict.txt", "w") as w:
    for combination in combinations:
        pwd = "".join(combination) + "\n"
        w.write(pwd)
Generating wordlist containing all the combinations of the password including repeated characters

If you look closely, the password has 6 characters in total, with 10 possible characters at each position. So the total number of possible passwords for this zip file would be \( 10 ^ 6 \) or \(1,000,000\) passwords 😳.

Wordlist generated with name dict.txt
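To see why product is the right tool and where the \( 10 ^ 6 \) comes from, here is a small sketch on a toy character set. SMALL_CHARSET, SMALL_LEN and keyspace_size are hypothetical names used only for this illustration; the set is kept tiny so every result can be listed.

```python
from itertools import combinations, combinations_with_replacement, product

SMALL_CHARSET = "ab1"  # hypothetical 3-character set, small enough to enumerate
SMALL_LEN = 2

# Every ordered string, repeated characters allowed: what a password cracker needs
with_product = ["".join(p) for p in product(SMALL_CHARSET, repeat=SMALL_LEN)]

# No repeated characters and no reordering, so "aa" and "ba" are both missing
with_combinations = ["".join(c) for c in combinations(SMALL_CHARSET, SMALL_LEN)]

# Repeats allowed, but still only one ordering of each pair, so "ba" is missing
with_replacement = ["".join(c) for c in combinations_with_replacement(SMALL_CHARSET, SMALL_LEN)]


def keyspace_size(charset, length):
    """Number of candidates product() yields: len(charset) ** length."""
    return len(charset) ** length
```

With the real policy, keyspace_size("abcde12345", 6) gives the same 1,000,000 as above.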

Since there are a lot of passwords, we will read the wordlist line by line and check whether the entry LabAccessCodes-TopSecret.txt can be extracted with the password on the current line. The following is simple code for that.

import sys
import zlib
from zipfile import ZipFile, BadZipFile
from tempfile import mkdtemp

TARGETDIR = mkdtemp()

with ZipFile("secret.zip") as z, open("dict.txt", "rb") as w:
    for pwd in w:
        pwd = pwd.strip()  # drop the trailing newline; zipfile expects bytes
        try:
            z.extract("LabAccessCodes-TopSecret.txt", path=TARGETDIR, pwd=pwd)
            print("\nPassword found: {}. Contents are extracted to {}".format(pwd.decode(), TARGETDIR))
            break
        except (RuntimeError, BadZipFile, zlib.error):
            # A wrong password usually raises RuntimeError; one that slips past
            # the quick check fails later with a CRC or decompression error
            print("Tried password: {}".format(pwd.decode()), end="\r")
        except Exception as e:
            print("\nError: {}".format(e))
            sys.exit(1)
    else:
        print("No password found in the wordlist")
Snippet to find the correct password of the zip file and extract the LabAccessCodes-TopSecret.txt entry into TARGETDIR directory
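The wordlist file is not strictly necessary. As a variation on the snippet above, and not the lab's official solution, the candidates can be generated on the fly and tested in memory with ZipFile.read. The names candidates, find_password and make_zip_checker are hypothetical helpers introduced for this sketch.

```python
import zlib
from itertools import product
from zipfile import BadZipFile, ZipFile


def candidates(charset, length):
    """Yield every password of the given length as bytes, without a wordlist file."""
    for combo in product(charset, repeat=length):
        yield "".join(combo).encode()


def find_password(passwords, try_password):
    """Return the first password accepted by try_password, or None if none is."""
    for pwd in passwords:
        if try_password(pwd):
            return pwd
    return None


def make_zip_checker(z, member):
    """Build a try_password callable for one entry of an already open ZipFile."""
    def try_password(pwd):
        try:
            z.read(member, pwd=pwd)  # decrypts in memory and verifies the CRC
            return True
        except (RuntimeError, BadZipFile, zlib.error):
            return False
    return try_password


# Usage against the lab archive (not executed here):
#   with ZipFile("secret.zip") as z:
#       check = make_zip_checker(z, "LabAccessCodes-TopSecret.txt")
#       print(find_password(candidates("abcde12345", 6), check))
```

Keeping the search loop separate from the zip check means the loop can be tried out with a fake checker first, and nothing is written to disk until you know the password.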

So basically, in the above code, when the file is successfully extracted into TARGETDIR with the password, it prints the "Password found .." message and the directory where you can find the file. How all this happens is abstracted away by the zipfile module and is out of scope for this post.

Once you get the correct password, you can see the following output and you can read the file contents from the extracted directory.

Listing the files in /tmp/tmpuw9m9wo6 and reading the extracted file in the directory
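The same check can be done from Python instead of the shell. This is a small sketch, assuming the extracted files are plain text in the default encoding; read_extracted is a hypothetical helper name.

```python
from pathlib import Path


def read_extracted(target_dir):
    """Map each file name directly under target_dir to its text contents."""
    return {
        p.name: p.read_text()
        for p in sorted(Path(target_dir).iterdir())
        if p.is_file()
    }
```

Passing it the TARGETDIR printed by the cracking script should return a dictionary with a single key, LabAccessCodes-TopSecret.txt.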