I was recently working on a security review, and I came across an anti-pattern I’ve seen time and time again. It might seem obvious, but the solution in question was suggested by a relatively tenured developer, so it seems pervasive enough to warrant digging into. With that in mind, let’s chat about path traversal and SSRF.

The Context

I was performing an app/code review of a new “thing” (kept vague on purpose). The new “thing” integrates with an API, and user-supplied data is used to build the final URL of the API request. I’m including an example below to make up for my lackluster use of English. Sorry 😬

https://api.example.com/v2/{event}/{key}?environment=production

This URL is built from two user-supplied variables:

  • event
  • key

The initial vulnerability

In this particular case, I called out that a malicious user could perform a path traversal on the generated URL. The example below shows how that would work.

https://api.example.com/v2/../other_api_endpoint/super_malicious#/key?environment=production

By setting event to ../other_api_endpoint/super_malicious#, we can change the endpoint that the integration makes its final request to. The # starts the URL fragment, which is not sent to the server, so everything to the right of the # is effectively discarded.

This results in the following “resolved” URL:

https://api.example.com/other_api_endpoint/super_malicious

Obviously, this is a completely different location than the one the application intended to request, and it will likely result in undesirable behavior.
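To make the resolution step concrete, here is a minimal Python sketch. The build_url and resolve helpers are hypothetical names, not the code under review, and resolve only approximates the dot-segment removal that HTTP clients and servers typically perform.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

BASE = "https://api.example.com/v2"


def build_url(event: str, key: str) -> str:
    """Naive construction: user input is interpolated without any encoding."""
    return f"{BASE}/{event}/{key}?environment=production"


def resolve(url: str) -> str:
    """Approximate what is actually requested: drop the fragment and
    collapse '..' segments in the path."""
    parts = urlsplit(url)
    path = posixpath.normpath(parts.path)
    # The fragment (everything after '#') is never sent to the server.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


url = build_url("../other_api_endpoint/super_malicious#", "key")
print(url)
# https://api.example.com/v2/../other_api_endpoint/super_malicious#/key?environment=production
print(resolve(url))
# https://api.example.com/other_api_endpoint/super_malicious
```

Note that the original ?environment=production ends up inside the fragment, so it is discarded along with /key.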

How to weaponize this

Below are a few ways this could be used to cause actual damage, from most to least concerning:

  • Find an open redirect in the API and use it to steal Authorization or Authentication headers
    • Confirmed in NodeJS that the Authorization and other headers are sent when a redirect is encountered
  • Perform state changing actions against the API
    • Only applicable if the request is performing a state-changing action, and you can control the body of the request in a significant way
  • Leak data from the API by calling other endpoints and seeing what information is returned
    • Only applicable if you can see the API response

The proposed fix

The proposed fix sanitized user-supplied data with a regex allowlist. Only alphanumeric characters, periods, dashes, and underscores were allowed, so any character matching the following regex was treated as invalid: [^a-zA-Z0-9\.\-\_]

This was done because the event and key parameters, according to the API specification, could only fall in that character space.
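A filter like the one described might look like the following Python sketch. The sanitize name, and the choice to strip disallowed characters rather than reject the whole input, are assumptions made for illustration.

```python
import re

# Matches every character outside the allowed set
# (alphanumerics, '.', '-' and '_').
DISALLOWED = re.compile(r"[^a-zA-Z0-9._-]")


def sanitize(value: str) -> str:
    """Remove any character that is not alphanumeric, '.', '-' or '_'."""
    return DISALLOWED.sub("", value)


print(sanitize("../other_api_endpoint/super_malicious#"))
# ..other_api_endpointsuper_malicious
```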

On the surface, this seems like a reasonable solution. Slashes are not allowed, and neither is #, both of which we needed to perform the path traversal earlier. But let’s revisit our URL again and get creative…

Getting creative

https://api.example.com/v2/{event}/{key}?environment=production

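What if we set event to ..

and what if we also set key to other_api_endpoint?

Both values contain only allowed characters, so they pass the filter untouched. The generated URL is https://api.example.com/v2/../other_api_endpoint?environment=production, which resolves to: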

  • https://api.example.com/other_api_endpoint?environment=production

So, in this case, we can still make the integration request other endpoints. The set of reachable endpoints is now smaller, but the risk is still present.
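The bypass can be reproduced with a short Python sketch. It assumes the filter strips disallowed characters before interpolation, and the helper names are hypothetical. resolve approximates the dot-segment removal an HTTP client or server would perform.

```python
import posixpath
import re
from urllib.parse import urlsplit, urlunsplit

# Allowlist from the proposed fix, expressed as "everything else".
DISALLOWED = re.compile(r"[^a-zA-Z0-9._-]")


def build_url(event: str, key: str) -> str:
    # Strip disallowed characters, then interpolate into the path.
    event = DISALLOWED.sub("", event)
    key = DISALLOWED.sub("", key)
    return f"https://api.example.com/v2/{event}/{key}?environment=production"


def resolve(url: str) -> str:
    # Collapse '..' segments and drop any fragment.
    parts = urlsplit(url)
    path = posixpath.normpath(parts.path)
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


url = build_url("..", "other_api_endpoint")
print(url)
# https://api.example.com/v2/../other_api_endpoint?environment=production
print(resolve(url))
# https://api.example.com/other_api_endpoint?environment=production
```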

Solving path traversal issues (for good)

URL/Percent encoding - If you read nothing more, this is the answer in most cases.

Now, for this specific case, URL encoding does not actually prevent our new “lesser” path traversal issue. Dots are unreserved URL characters, so URL encoding leaves them as they are. In situations where a user controls two path segments directly next to each other, you can still have a valid path traversal issue even though you URL encode the user’s input.

URL encoding in practice

When building URL path parameters, you want to individually encode each path parameter, then construct your URL.

When building URL query parameters, you can safely encode all parameters in one swoop.
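As a rough illustration of both rules, here is a minimal Python sketch using the standard library’s urllib.parse. The build_url helper and the sample values are made up for the example. One detail worth knowing is that quote treats / as safe by default, so safe="" is needed when encoding a single path segment.

```python
from urllib.parse import quote, urlencode


def build_url(event: str, key: str, environment: str) -> str:
    # Encode each path segment on its own; safe="" makes sure '/' is encoded too.
    path = "/".join(quote(segment, safe="") for segment in (event, key))
    # Query parameters can be encoded together in a single call.
    query = urlencode({"environment": environment})
    return f"https://api.example.com/v2/{path}?{query}"


print(build_url("user/created", "a b", "prod&debug=1"))
# https://api.example.com/v2/user%2Fcreated/a%20b?environment=prod%26debug%3D1
```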

Read more about URL Encoding at: https://javascript.info/url#encoding

Applying to this vulnerability

Revisiting our original attack vector, using URL encoding the result would look like:

https://api.example.com/v2/{event}/{key}?environment=production –> https://api.example.com/v2/..%2Fother_api_endpoint%2Fsuper_malicious%23/key?environment=production

As you can see, the / and # are encoded, so they no longer act as delimiters in the path of the resulting URL. The API being called will likely not handle this input well, but it will not result in a completely different API endpoint being called.

Revisiting our more creative attack vector, using URL encoding the result would look like:

https://api.example.com/v2/{event}/{key}?environment=production –> https://api.example.com/v2/../other_api_endpoint?environment=production

As you can see, the double dots are not encoded, and thus the attack still works. This particular example stresses why it is so important to filter out .. inputs for path parameters. Otherwise, you are no better off.
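One way to combine the two defenses is sketched below in Python. The encode_segment and build_url names are hypothetical. Rejecting the empty string and a single dot along with the double dots is an extra assumption on my part, not something the API specification requires.

```python
from urllib.parse import quote


def encode_segment(value: str) -> str:
    """Percent-encode one path segment, refusing dot segments and empty input."""
    # Assumption: '', '.' and '..' are never legitimate values here.
    if value in ("", ".", ".."):
        raise ValueError(f"illegal path segment: {value!r}")
    return quote(value, safe="")


def build_url(event: str, key: str) -> str:
    return (
        "https://api.example.com/v2/"
        f"{encode_segment(event)}/{encode_segment(key)}"
        "?environment=production"
    )


print(build_url("user.created", "abc-123"))
# https://api.example.com/v2/user.created/abc-123?environment=production
```

With this in place, build_url("..", "other_api_endpoint") raises ValueError instead of producing a URL that resolves elsewhere, and the original payload is encoded as a single harmless segment.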

References and Relevant Documentation