Can Phoenix Safely use the Zip Module?

Cover image of floating laptops, purple liquid filled bottles and zippers.
Image by Annie Ruygt

This post explores the OTP :zip module and tests it against two different types of zip attacks so we can learn how to safely use zip in our Elixir applications. Fly.io happens to be a great place to run Elixir applications. Check out how to get started. You could be up and running in minutes.

A lot of cool stuff is available in Erlang’s OTP Standard Library. Elixir has the ability to directly use everything in OTP and that means we have a number of built-in features available to us. One of those features is the Erlang :zip module. As the name suggests, it is a “utility for reading and creating ‘zip’ archives.”

When we consider using a new library, tool or feature in our applications, we should always consider the security impact of it. Here we’ll look deeper into how a zip file can be abused and test how we can safely use the OTP :zip module.

TL;DR

If you already know about Path Traversal attacks and Zip Bombs, then jump to the summarized Conclusion for the nine take-home lessons.

Why accept zip files from users?

Why would we want to accept zip files at all? It’s common to allow users to upload individual files like images, CSV files, documents to review, etc.

Sometimes it’s helpful to upload a large number of files at a time, especially when they are organized in nested folders. We want to make it easier for our users to upload whole directories of files and have it keep the directory organization once uploaded.

Never trust user input

The mantra we should have stuck in our heads is, “never trust user input.” Well, a zip file is a pretty complex container for a lot of different user inputs! What’s unique about a zip file is that the container format itself can be abused too!

If we’re going to allow users to upload zip files, then we should know a bit about the types of security risks. Of course, the zip file may contain files with trojans, viruses, or other malware. Antivirus software is purpose-built to detect those things. We’re also concerned about how the zip structure itself might be malicious.

Here are the two types of zip file attacks we will dig deeper into:

  • The Zip Slip attack, which is a marketing name for a form of “path traversal attack”.
  • A Zip Bomb attack, which creates tiny zip files that, when extracted, consume massive amounts of CPU and/or storage.

We’ll take a look at these two attack types to find out what happens with the :zip module and what, if anything, we should do about it.

Path traversal attack

What is a path traversal attack?

A path traversal attack is described this way by the OWASP Foundation:

A path traversal attack (also known as directory traversal) aims to access files and directories that are stored outside the web root folder. By manipulating variables that reference files with “dot-dot-slash (../)” sequences and its variations or by using absolute file paths, it may be possible to access arbitrary files and directories stored on file system including application source code or configuration and critical system files.

With regard to zip files, a file can be stored in a zip archive with a relative path. It might look like this: ../../../../../../../etc/passwd. By doing this, the zip creator is trying to escape from wherever the app is running and back out to the system root and then back down to try and replace a system configuration file.

It is worth noting that the permissions used for the extract operation are the same permissions used for running our application. This is why we don’t want to run our apps as root!
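
To make the ../ trick concrete, here is a minimal sketch of how such a path resolves and what a containment check could look like. The PathCheck module, its inside?/2 function, and the /srv/app/uploads directory are hypothetical names chosen only for illustration; Path.expand/2 is standard Elixir.

```elixir
defmodule PathCheck do
  # Hypothetical helper: true when `entry`, resolved against `dest`,
  # still lives inside `dest`.
  def inside?(entry, dest) do
    dest = Path.expand(dest)
    # Path.expand/2 collapses "." and ".." segments without touching the disk.
    target = Path.expand(entry, dest)
    target == dest or String.starts_with?(target, dest <> "/")
  end
end

# Assumed upload directory, for illustration only.
dest = "/srv/app/uploads"

# Where would the malicious entry land if it were honored?
IO.inspect(Path.expand("../../../../../../../etc/passwd", dest))

IO.inspect(PathCheck.inside?("../../../../../../../etc/passwd", dest))
#=> false

IO.inspect(PathCheck.inside?("photos/cat.png", dest))
#=> true
```

The check compares against dest plus a trailing slash so that a sibling directory such as /srv/app/uploads-evil doesn’t pass as being inside /srv/app/uploads.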

What could a malicious user do?

In general, OWASP is talking about a path traversal attack being used to “read” private data, source code, keys, etc. Our situation is potentially more dangerous! We’re accepting a user controlled zip file that, when extracted, creates new files on our server!

What kinds of mischief could a user try to do?

  • System level (if permissions allow for it)
    • Overwrite existing system files
    • Add malicious system files that add evil services
  • Application level
    • Overwrite application files like our config, ENV, JavaScript files, etc.
    • Add malicious application files that might be read and executed on restart.

How can we safely test this?

We’ve decided we want to test that our application code isn’t susceptible to this attack. To do that, we need a zip file that tries to get up to some mischief. While we could poke around the dark corners of the interwebs for malicious zip files (which I’m certain we’d find), we’d rather use something a lot safer!

Let’s create our own zip file that tries to use relative file paths to escape where the application is running. Fortunately, we can safely create our file using the :zip module.

To get started, we need a file to zip and it works best if it’s in the desired target location.

The following command creates an empty file in /etc/ named evil.txt.

NOTE: This command assumes a Linux host.

sudo touch /etc/evil.txt

The test is whether extracting the zip archive creates this file on our system.

Using an IEx shell, run your own version of the following command:

Erlang functions expect a string to be a charlist as opposed to the Unicode strings that Elixir works with by default. The ~c sigil is really handy for converting an Elixir string to a charlist for passing arguments.

:zip.create(~c"/home/mark/evil.zip", [~c"../../../../../../etc/evil.txt"])
#=> {:ok, '/home/mark/evil.zip'}
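
In a real application, the path usually arrives as a runtime Elixir string rather than a literal we can prefix with ~c. Here is a small sketch of converting it with the standard String.to_charlist/1; the upload_path value is made up for illustration.

```elixir
# Hypothetical path, for example one handed to us at runtime by an upload.
upload_path = "/home/mark/evil.zip"

# Elixir string -> charlist, ready to pass to an Erlang function like :zip.create/2.
charlist_path = String.to_charlist(upload_path)

# The result is the same value the ~c sigil produces for a literal.
IO.inspect(charlist_path == ~c"/home/mark/evil.zip")
#=> true
```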

This creates a zip file with a relative path to our malicious file. What’s interesting about this attack is you don’t have to know how deep in the tree you are, just add more ../ entries. When we are at the root directory /, then a cd ../ doesn’t error and we remain at the root directory. For our test, it just needs to be enough to get out of our application to the root.

After creating the zip file, we can inspect it with a regular archive program. This is how the Ark program handles our zip file. We see this warning:

Ark compression program showing warning message refusing to open a zip file

Yay! We’ve got our safe yet suspicious zip file.

What happens when we extract our evil zip?

The question this all leads up to is this:

If :zip can create a malicious zip file, will it extract it and be a potential vulnerability for us?

To test what happens, let’s first use the :zip.extract/1 function.

:zip.extract(~c"/home/mark/evil.zip")

13:56:32.771 [error] Illegal path: ../../../../../etc/evil.txt, extracting in ./

{:ok, ['evil.txt']}

We give it a path to the zip to extract and the “destination” is wherever our current working directory is.

Extracting evil.zip logged an error message but returned an :ok tuple because it did extract the file, but it put it in the ./ directory. For our application, this is the root of our app.

Ruh roh!

That means :zip.extract automatically prevents the file from being extracted to our system root, but our application root is still potentially vulnerable.

When I ran IEx from a new terminal window, it started out at /home/mark. I can see the file evil.txt was extracted to /home/mark/evil.txt.

This means that unless we take extra precautions, our application files could potentially be affected by the contents of a zip file.

What we want is to extract the files to a specified location and make sure they can’t escape from there.

Controlling the extraction location

The :zip.extract/2 function takes “options” as the 2nd argument. One of those options is :cwd for setting the “current working directory.” We pass the options as a keyword list and any strings need to be a charlist. Let’s test it out.

First, I’m running IEx from my home directory. We can create a temp directory and set the cwd to that.

:zip.extract(~c"/home/mark/evil.zip", cwd: ~c"./temp/")

14:46:42.468 [error] Illegal path: ./temp/../../../../../etc/evil.txt, extracting in ./temp/

#=> {:ok, ['./temp/evil.txt']}

The result was that it didn’t extract evil.txt to the system folder, and it also wasn’t able to escape our temp/ directory. We can find the file at /home/mark/temp/evil.txt.

Now, did it fail because the path to /etc was rejected? To be sure, let’s try another test.

What if the path is valid but outside of the working directory?

Let’s create a separate evil2.txt file that lives in our home directory and create evil2.zip.

touch /home/mark/evil2.txt

Our goal with this test is to see if we can escape the working directory to get back to our application root.

IEx has some handy directory functions we can use to move around. This means we don’t even have to leave IEx! Inside IEx, we can write:

cd "temp"
#=> /home/mark/temp

Now we’re in the right place, let’s add the evil2.txt file using the relative location.

:zip.create(~c"/home/mark/evil2.zip", [~c"../evil2.txt"])
#=> {:ok, '/home/mark/evil2.zip'}

With our zip file prepped, let’s ensure we delete the evil2.txt file so we can tell if it gets extracted. Also, let’s cd back to our home directory.

cd ".."
#=> /home/mark

Now, can we escape the cwd directory when we have permissions and it’s valid?

:zip.extract(~c"/home/mark/evil2.zip", cwd: ~c"./temp/")

15:08:54.878 [error] Illegal path: ./temp/../evil2.txt, extracting in ./temp/

#=> {:ok, ['./temp/evil2.txt']}

Phew! Even when everything is otherwise valid, :zip.extract/2 doesn’t allow us to escape the cwd directory. A quick check of our file system confirms the file was extracted to the temp directory. We learned that the :cwd option is successful at keeping extractions contained.
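
Since we always want an explicit :cwd, it can help to wrap the call so nobody forgets it. The following is a minimal sketch, not a hardened implementation. SafeUnzip and extract_to/2 are hypothetical names; the only :zip call used is :zip.extract/2 with the :cwd option shown above.

```elixir
defmodule SafeUnzip do
  # Hypothetical wrapper: extract `zip_path` into `dest_dir`, always
  # passing an explicit :cwd so files never land in the app's own directory.
  def extract_to(zip_path, dest_dir) do
    dest = Path.expand(dest_dir)
    File.mkdir_p!(dest)

    # :zip expects charlists for both the archive path and the :cwd option.
    case :zip.extract(String.to_charlist(zip_path), cwd: String.to_charlist(dest)) do
      # Convert the returned charlists back to Elixir strings.
      {:ok, files} -> {:ok, Enum.map(files, &to_string/1)}
      {:error, reason} -> {:error, reason}
    end
  end
end
```

With this in place, the earlier test becomes SafeUnzip.extract_to("/home/mark/evil2.zip", "/home/mark/temp"), and the extracted file names come back as regular Elixir strings.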

Early detection and aborting

The Ark archive program took the approach of detecting a relative path in the zip archive and not even letting us extract it. We can do the same!

The :zip.list_dir/1 function lists information about the archive file’s contents. Here’s what it looks like for evil2.zip.

:zip.list_dir(~c"/home/mark/evil2.zip")
{:ok,
 [
   {:zip_comment, []},
   {:zip_file, '../evil2.txt',
    {:file_info, 0, :regular, :read_write, {{2023, 3, 28}, {14, 53, 27}},
     {{2023, 3, 28}, {14, 53, 27}}, {{2023, 3, 28}, {14, 53, 27}}, 54, 1, 0, 0,
     0, 0, 0}, [], 0, 0}
 ]}

All we care about is entries that are a :zip_file and we only want the file path. With a little massaging, we can get our list of files back. Let’s see how we can get that data out.

The Kernel.to_string/1 function easily converts a returned Erlang charlist to an Elixir string. Here we see that as to_string(path).

# read the zip file's info
{:ok, zip_info} = :zip.list_dir(~c"/home/mark/evil2.zip")

# get all the file names as strings
file_names =
  Enum.reduce(zip_info, [], fn
    {:zip_file, path, _, _, _, _}, acc ->
      [to_string(path) | acc]
    _other, acc -> acc
  end)
#=> ["../evil2.txt"]

# test if any files contain "../". If so, we can error.
Enum.any?(file_names, &(String.contains?(&1, "../")))
#=> true

Using the above code, we can get the names of all the files in the zip archive and detect if any of them have a ../ in them. If so, we could reject the entire upload with an error.
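
The substring check above catches ../ but it wouldn’t flag an absolute path like /etc/passwd, which the OWASP description also mentions. A slightly stricter variant, sketched below, looks at path segments instead. ZipNames and its functions are hypothetical names; Path.type/1 and Path.split/1 are standard Elixir, and the sample data is hand-written in the shape :zip.list_dir/1 returned above.

```elixir
defmodule ZipNames do
  # Hypothetical helper: entry names from :zip.list_dir/1 output, as strings.
  def names(zip_info) do
    for {:zip_file, path, _info, _comment, _offset, _comp_size} <- zip_info do
      to_string(path)
    end
  end

  # Flags absolute paths and any ".." path segment.
  def unsafe?(name) do
    Path.type(name) != :relative or ".." in Path.split(name)
  end

  # Returns :ok, or an error listing every offending entry name.
  def check(zip_info) do
    case Enum.filter(names(zip_info), &unsafe?/1) do
      [] -> :ok
      bad -> {:error, {:unsafe_paths, bad}}
    end
  end
end

# Hand-written sample mirroring the evil2.zip listing.
time = {{2023, 3, 28}, {14, 53, 27}}
file_info = {:file_info, 0, :regular, :read_write, time, time, time, 54, 1, 0, 0, 0, 0, 0}

zip_info = [
  {:zip_comment, []},
  {:zip_file, ~c"../evil2.txt", file_info, [], 0, 0}
]

IO.inspect(ZipNames.check(zip_info))
#=> {:error, {:unsafe_paths, ["../evil2.txt"]}}
```

Checking segments also avoids a false positive on a harmless name that merely contains two dots, such as notes..txt.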

Does antivirus recognize a path traversal?

For this test, we’ll use the open source ClamAV (Linux based virus detection software). Different products and services will of course behave differently.

What happens when we scan our path traversal zip file?

$ clamscan evil2.zip
/home/mark/temp/evil2.zip: OK

The results of our scan were: “all clear!” Oops. Well, that’s good to know.

The important thing we learned here is that :zip prevents escaping the working directory and we really should be specifying an explicit working directory using :cwd.

Now let’s explore another type of zip file attack.

Zip bombs for denial of service

Compression programs use algorithms to look for ways to squeeze the extra space from a file. A zip bomb exploits the algorithm to create a very small file that expands to something enormous. It can be used as a Denial-of-Service (DoS) attack against a server.

For some more detail on the type of Zip Bomb we’re using here, refer to this excellent technical breakdown. Here’s a snippet about what it is:

[…] a non-recursive zip bomb achieves a high compression ratio by overlapping files inside the zip container. “Non-recursive” means that it does not rely on a decompressor’s recursively unpacking zip files nested within zip files: it expands fully after a single round of decompression.

Here’s the short of it. The following list shows how big 3 different zips are capable of expanding:

  • 42 kB zip → 5.5 GB
  • 10 MB zip → 281 TB
  • 46 MB zip → 4.5 PB (Zip64, less compatible)

Yikes! We certainly don’t want those expanding on our servers consuming CPU and massive disk space!

How can we test this?

For this test, we won’t try to create the zip archive ourselves. Fortunately for us, there are safe examples available online. This previously linked resource has them for us. In our case, we’ll stick with the smallest version, zbsm.zip… just in case. 🙂

What does it look like to an archive program?

Using Ark again, we can open the 42 kB file and see it the way the archive program sees it.

Ark compression application showing reported contents of a small zip bomb file

Wow! That little zip really packs it in there!

We should also note that the application doesn’t complain about this file. We can extract individual files and it happily does it.

Okay, so what does the OTP :zip module do with this?

What happens when we extract the Zip Bomb?

Using the OTP :zip.extract/1 function, what happens when we extract the zip bomb file?

iex(1)> :zip.extract(~c"/home/mark/temp/zbsm.zip")

BREAK: (a)bort (A)bort with dump (c)ontinue (p)roc info (i)nfo
       (l)oaded (v)ersion (k)ill (D)b-tables (d)istribution
^C^C

Gah! Abort! Abort! It began chugging away on the file, extracting numerous large files! After seeing the pause with the IEx prompt, I killed it and found it consumed about 800MB of disk. Not cool!

So :zip is susceptible to this attack.

What does :zip.list_dir/1 tell us?

If we run :zip.list_dir/1 on the zip, what do we see?

iex(1)> :zip.list_dir(~c"/home/mark/temp/zbsm.zip")
{:ok,
 [
   {:zip_comment, []},
   {:zip_file, '0',
    {:file_info, 21849182, :regular, :read_write, {{1982, 10, 8}, {13, 37, 0}},
     {{1982, 10, 8}, {13, 37, 0}}, {{1982, 10, 8}, {13, 37, 0}}, 54, 1, 0, 0, 0,
     0, 0}, [], 0, 30357},
   {:zip_file, '1',
    {:file_info, 21849151, :regular, :read_write, {{1982, 10, 8}, {13, 37, 0}},
     {{1982, 10, 8}, {13, 37, 0}}, {{1982, 10, 8}, {13, 37, 0}}, 54, 1, 0, 0, 0,
     0, 0}, [], 36, 30321},
...

We see a list of the files like what Ark displayed. With that, we see the decompressed file sizes, such as 21849182 bytes, which is roughly 21.8 MB. We could examine all the files contained in the zip file, determine the total disk space they would consume, and make a judgement call about what’s “too big”.
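
Here is a rough sketch of that size check. ZipSize, its functions, and the 10 MB budget are hypothetical; the sample data is the first two entries from the listing above, where the size is the second element of the :file_info tuple.

```elixir
defmodule ZipSize do
  # Hypothetical helper: sum the declared uncompressed sizes (in bytes)
  # found in :zip.list_dir/1 output.
  def total_uncompressed(zip_info) do
    zip_info
    |> Enum.map(fn
      {:zip_file, _name, info, _comment, _offset, _comp_size} -> elem(info, 1)
      _other -> 0
    end)
    |> Enum.sum()
  end

  # `limit` is an arbitrary, application-specific budget in bytes.
  def within_limit?(zip_info, limit), do: total_uncompressed(zip_info) <= limit
end

# The first two entries from the zbsm.zip listing.
time = {{1982, 10, 8}, {13, 37, 0}}
info0 = {:file_info, 21_849_182, :regular, :read_write, time, time, time, 54, 1, 0, 0, 0, 0, 0}
info1 = {:file_info, 21_849_151, :regular, :read_write, time, time, time, 54, 1, 0, 0, 0, 0, 0}

zip_info = [
  {:zip_comment, []},
  {:zip_file, ~c"0", info0, [], 0, 30_357},
  {:zip_file, ~c"1", info1, [], 36, 30_321}
]

IO.inspect(ZipSize.total_uncompressed(zip_info))
#=> 43698333

# Assumed 10 MB budget, for illustration only.
IO.inspect(ZipSize.within_limit?(zip_info, 10_000_000))
#=> false
```

Keep in mind that these sizes are read from headers inside the archive, which the uploader controls, so this works as a first filter rather than a guarantee.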

However, we should also consider that we can’t foresee all the ways a zip upload might be malicious. Will AV detect it?

Does antivirus recognize our Zip Bomb?

We’ll again turn to the open source ClamAV (Linux based virus detection software) for this test. What happens when we scan zbsm.zip, our small Zip Bomb file?

$ clamscan zbsm.zip
/home/mark/temp/zbsm.zip: Heuristics.Zip.OverlappingFiles FOUND

Well that’s good! ClamAV identified it as a malicious zip file.

With AV, at least we have a mitigation strategy.
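
If we want to run that scan from our application before extracting, one option is to shell out to clamscan. This is only a sketch under a few assumptions: clamscan is installed and on the PATH, and AvScan with its functions are hypothetical names. It relies on clamscan’s documented exit codes (0 for clean, 1 for a match, 2 for a scan error).

```elixir
defmodule AvScan do
  # Hypothetical wrapper around the clamscan CLI.
  def scan(path) do
    case System.find_executable("clamscan") do
      nil ->
        {:error, :clamscan_not_found}

      exe ->
        # --no-summary keeps the output down to the per-file result line.
        {output, status} = System.cmd(exe, ["--no-summary", path], stderr_to_stdout: true)
        interpret(status, output)
    end
  end

  # clamscan exits with 0 when nothing is found, 1 when a signature
  # matches, and 2 when the scan itself fails.
  def interpret(0, _output), do: :ok
  def interpret(1, output), do: {:error, {:infected, String.trim(output)}}
  def interpret(_status, output), do: {:error, {:scan_failed, String.trim(output)}}
end
```

Treating a failed scan as an error, rather than as “probably fine”, keeps us on the cautious side when the scanner is missing or misconfigured.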

Conclusion

After running our experiments and playing with the :zip module, what did we learn?

  1. Elixir has easy access to powerful OTP features like the :zip module.
  2. Antivirus software should be used to help protect our systems from malicious zip files.
  3. Antivirus may not detect path traversal attempts in zip files.
  4. :zip.extract/2 is susceptible to Zip Bombs.
  5. :zip.create/2 allows us to create zip files. We can even create them with relative paths.
  6. :zip.extract/2 detects ../ paths and overrides the file path, dropping the file in our current working directory.
  7. We can specify the :cwd in the :zip.extract/2 options to specify a working directory.
  8. Relative path files are safely contained inside our working directory.
  9. :zip.list_dir/1 lists all the files in the zip so we can check for the existence of relative paths and abort early. We can also see the decompressed size on disk for the files.

The answer to our question “can Phoenix safely use the zip module?” ends up being “yes, but with extra precautions.” We should isolate extractions using the :cwd option and, keeping true to the mantra of “never trust user input”, we should AV scan a zip file first.

An added benefit of AV scans is that they also check the contained files for malware.

Another precaution could be to extract the files on a dedicated machine that runs in a sandbox mode. Hey, that happens to be a great use for an immutable Fly.io machine!

Stay safe out there!
