I was recently working on a security review, and I came across an anti-pattern I’ve seen time and time again. Sure, it might be obvious, but this was a relatively tenured developer who suggested this particular solution. It’s seemingly pervasive enough that it warrants digging into. So, with that in mind, let’s chat about path traversal and SSRF.
The Context
I was performing an app/code review of a new “thing” (I’m keeping this vague on purpose). The new “thing” would integrate with an API, and user-supplied data was used to build the final URL of the API request. I’m including an example below to make up for my lackluster use of English. Sorry 😬
https://api.example.com/v2/{event}/{key}?environment=production
This is the URL that is being built based off of two user-supplied variables:
- event
- key
The initial vulnerability
In this particular case, I called out that there was an opportunity for a malicious user to perform a path traversal on the generated URL. The example below shows how that would work.
https://api.example.com/v2/../other_api_endpoint/super_malicious#/key?environment=production
By setting the event to ../other_api_endpoint/super_malicious#, we can change the endpoint that the integration makes its final request to. The # turns everything to its right into a URL fragment, which is never sent to the server, so it effectively negates that part of the URL.
This results in the following “resolved” URL:
https://api.example.com/other_api_endpoint/super_malicious
Obviously, this is a completely different location than the application intended to make a request to and likely will result in undesirable behavior.
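To make the resolution step concrete, here is a minimal Python sketch. The function names are hypothetical, and the dot-segment handling is a simplified take on what RFC 3986 describes. Whether the HTTP client, a proxy, or the API server collapses the dot segments depends on the stack, so treat this as an illustration rather than a description of any particular client.

```python
from urllib.parse import urlsplit

TEMPLATE = "https://api.example.com/v2/{event}/{key}?environment=production"


def remove_dot_segments(path: str) -> str:
    """Collapse "." and ".." segments (simplified version of RFC 3986, section 5.2.4)."""
    output = []
    for segment in path.split("/"):
        if segment == "..":
            # Never pop the leading empty segment, which represents the root.
            if len(output) > 1:
                output.pop()
        elif segment != ".":
            output.append(segment)
    return "/".join(output) or "/"


def resolve(event: str, key: str) -> str:
    """Build the URL naively, then show what would actually be requested."""
    parts = urlsplit(TEMPLATE.format(event=event, key=key))
    # The fragment (everything after "#") is dropped: it is not sent to the server.
    resolved = f"{parts.scheme}://{parts.netloc}{remove_dot_segments(parts.path)}"
    if parts.query:
        resolved += "?" + parts.query
    return resolved


print(resolve("signup", "abc123"))
print(resolve("../other_api_endpoint/super_malicious#", "key"))
```

The second call swallows the original query string as well, because it ends up inside the fragment.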
How to weaponize this
Below are a few ways this could be used to cause actual damage, ordered from most to least concerning:
- Find an open redirect in the API and use it to steal Authorization or Authentication headers
  - Confirmed in Node.js that the Authorization and other headers are sent when a redirect is encountered
- Perform state-changing actions against the API
  - Only applicable if the request is performing a state-changing action, and you can control the body of the request in a significant way
- Leak data from the API by calling other endpoints and seeing what information is returned
  - Only applicable if you can see the API response
The proposed fix
The proposed fix sanitized user-supplied data with a regex filter. Only alphanumeric characters, periods, dashes, and underscores were allowed; in other words, any character matching the following regex was disallowed: [^a-zA-Z0-9\.\-\_]
This was done because the event and key parameters, according to the API specification, could only fall in that character space.
On the surface, this seems like a reasonable solution. Neither slashes nor # are allowed, and we needed both to perform the path traversal earlier. But let’s revisit our URL again and get creative…
Getting creative
https://api.example.com/v2/{event}/{key}?environment=production
What if we set event to .. and also set key to other_api_endpoint?
Our resolved URL would then be:
https://api.example.com/other_api_endpoint?environment=production
So, in this case, we can still make the integration request other endpoints. Granted, the set of reachable endpoints is now smaller, but there is still a risk present.
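The bypass can be sketched in a few lines of Python. The filter below uses a character class equivalent to the regex from the proposed fix. The helper names, and the choice to reject rather than strip offending input, are assumptions made for illustration.

```python
import re

TEMPLATE = "https://api.example.com/v2/{event}/{key}?environment=production"

# Equivalent to [^a-zA-Z0-9\.\-\_]: matches any character outside the allowed set.
DISALLOWED = re.compile(r"[^a-zA-Z0-9._-]")


def passes_filter(value: str) -> bool:
    """Return True when the value contains only allowed characters."""
    return DISALLOWED.search(value) is None


def build_url(event: str, key: str) -> str:
    """Hypothetical builder: reject disallowed characters, then interpolate."""
    for value in (event, key):
        if not passes_filter(value):
            raise ValueError(f"disallowed characters in {value!r}")
    return TEMPLATE.format(event=event, key=key)


# The original payload is caught because it contains "/" and "#".
print(passes_filter("../other_api_endpoint/super_malicious#"))  # False

# The creative payload is made only of allowed characters, so it sails through.
print(build_url("..", "other_api_endpoint"))
```

The filter checks characters, not meaning: a lone .. is built entirely from allowed periods, yet it is still a path-traversal segment once it sits between two slashes.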
Solving path traversal issues (for good)
URL/Percent encoding - If you read nothing more, this is the answer in most cases.
Now, for this specific case, URL encoding actually does not prevent our new “lesser” path traversal issue. URL encoding leaves dots untouched because they are legal URL characters. In situations where a user controls two different path elements directly next to each other, you can still have a valid path traversal issue even though you URL encode the user’s input.
URL encoding in practice
When building URL path parameters, you want to individually encode each path parameter, then construct your URL.
When building URL query parameters, you can safely encode all parameters in one swoop.
Read more about URL Encoding at: https://javascript.info/url#encoding
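As a rough sketch of the two rules above in Python, using the standard library’s urllib.parse: the build_url helper and its parameters are hypothetical names, not part of any real integration.

```python
from typing import Mapping
from urllib.parse import quote, urlencode

BASE = "https://api.example.com/v2"


def build_url(event: str, key: str, query: Mapping[str, str]) -> str:
    """Encode each path parameter on its own, then assemble the URL."""
    # safe="" matters: by default quote() leaves "/" unencoded.
    path = "/".join(quote(segment, safe="") for segment in (event, key))
    # Query parameters can be encoded together in a single call.
    return f"{BASE}/{path}?{urlencode(query)}"


print(build_url("signup", "abc123", {"environment": "production"}))
print(build_url("../other_api_endpoint/super_malicious#", "key",
                {"environment": "production"}))
```

Encoding the segments individually is the important part. Encoding the whole path after joining would either leave the injected slashes alone or encode the legitimate separators too.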
Applying to this vulnerability
Revisiting our original attack vector, using URL encoding the result would look like:
https://api.example.com/v2/{event}/{key}?environment=production
–> https://api.example.com/v2/..%2Fother_api_endpoint%2Fsuper_malicious%23/key?environment=production
As you can see, the / and # are encoded such that they are no longer control characters in defining the path of the resulting URL. The API being called will likely not handle this input well, but it will not result in a completely different API endpoint being called.
Revisiting our more creative attack vector, using URL encoding the result would look like:
https://api.example.com/v2/{event}/{key}?environment=production
–> https://api.example.com/v2/../other_api_endpoint?environment=production
As you can see, the double dots are not encoded, and thus the attack still works. This particular example stresses why it is so important to filter out .. inputs for path parameters. Otherwise, you are no better off.
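One way to combine both defenses is to reject dot segments before encoding each path parameter. This is a minimal Python sketch with hypothetical helper names. Rejecting a single dot as well is an extra precaution I’m assuming here, and raising an error (rather than stripping the input) is just one possible policy.

```python
from urllib.parse import quote

BASE = "https://api.example.com/v2"


def encode_path_segment(value: str) -> str:
    """Reject dot segments, then percent-encode everything else."""
    # "." and ".." survive URL encoding untouched, so they need an explicit check.
    if value in (".", ".."):
        raise ValueError(f"path segment {value!r} is not allowed")
    return quote(value, safe="")


def build_url(event: str, key: str) -> str:
    """Assemble the URL from individually validated and encoded segments."""
    return (
        f"{BASE}/{encode_path_segment(event)}/{encode_path_segment(key)}"
        "?environment=production"
    )


print(build_url("signup", "abc123"))

try:
    build_url("..", "other_api_endpoint")
except ValueError as error:
    print("rejected:", error)
```

An input such as %2e%2e is not a concern for this builder, because quote() encodes the % itself and the result no longer decodes to a dot segment in a single pass.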