CVE-2023-45133: Finding an Arbitrary Code Execution Vulnerability In Babel
Introduction
On October 10th, 2023, I stumbled upon an arbitrary code execution vulnerability in Babel, which was subsequently assigned the identifier CVE-2023-45133. In this post, I’ll walk you through the journey of discovering and exploiting this intriguing flaw.
Those who use Babel for reverse engineering and code deobfuscation love it for all of the built-in functionality it provides. One of the most useful features is the ability to statically evaluate expressions using path.evaluate() and path.evaluateTruthy(). I have written about this in previous articles.
Wait, did I say statically evaluate?
The Exploit
Before delving into the details, let’s take a look at the proof of concept I came up with:
Proof of Concept
This simply outputs the result of the id command to the terminal, as can be seen below.
Of course, the payload can be adapted to do anything, such as exfiltrate data or spawn a reverse shell.
Exploit Breakdown
To understand why this vulnerability works, we need to understand the source code of the culprit function, evaluate. The source code of babel-traverse/src/path/evaluation.ts prior to the fix is archived here.
When evaluate is called on a NodePath, it goes through the evaluateCached wrapper before reaching the _evaluate function, which does all the heavy lifting. The _evaluate function is where the vulnerability lies.
This function is responsible for recursively breaking down AST nodes until it reaches an atomic operation that can be evaluated confidently. The majority of the base cases are evaluated for atomic operations only (such as for binary expressions between two literals). However, there are a few exceptions to this rule.
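To make the idea of recursive evaluation with a confidence flag concrete, here is a toy evaluator over hand-written, Babel-style AST nodes. It is only a sketch and not Babel's implementation: evaluateNode is a hypothetical name, it supports just two operators, and the result shape merely mimics the confident/value pair discussed in this post.

```javascript
// Toy evaluator over hand-written, Babel-style AST nodes.
// Hypothetical sketch; not Babel's implementation.
function evaluateNode(node) {
  switch (node.type) {
    case "NumericLiteral":
    case "StringLiteral":
      // Base case: a literal is always known.
      return { confident: true, value: node.value };
    case "BinaryExpression": {
      // Recursive case: both operands must be known first.
      const left = evaluateNode(node.left);
      const right = evaluateNode(node.right);
      if (!left.confident || !right.confident) return { confident: false };
      if (node.operator === "+") {
        return { confident: true, value: left.value + right.value };
      }
      if (node.operator === "*") {
        return { confident: true, value: left.value * right.value };
      }
      return { confident: false };
    }
    default:
      // Anything else (for example an unbound identifier) is unknown.
      return { confident: false };
  }
}

// AST for: (1 + 2) * 3
const knownExpression = {
  type: "BinaryExpression",
  operator: "*",
  left: {
    type: "BinaryExpression",
    operator: "+",
    left: { type: "NumericLiteral", value: 1 },
    right: { type: "NumericLiteral", value: 2 },
  },
  right: { type: "NumericLiteral", value: 3 },
};

// AST for: x + 1, where x is not known statically
const unknownExpression = {
  type: "BinaryExpression",
  operator: "+",
  left: { type: "Identifier", name: "x" },
  right: { type: "NumericLiteral", value: 1 },
};

console.log(evaluateNode(knownExpression));   // { confident: true, value: 9 }
console.log(evaluateNode(unknownExpression)); // { confident: false }
```

The important property is that uncertainty propagates upward: one unknown leaf makes the whole expression unknown.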
The two pieces of the source code we care about are the handling of call expressions and object expressions, as shown below:
Vulnerable Source Code
Relevant _evaluate source code
Handling of Call Expressions
The first thing to understand is that while call expressions can indeed be evaluated, they are subject to a whitelist check, relying on the VALID_OBJECT_CALLEES or VALID_IDENTIFIER_CALLEES arrays.
Additionally, there are three cases for handling call expressions:
- When the callee is an identifier, and the identifier is whitelisted in VALID_OBJECT_CALLEES or VALID_IDENTIFIER_CALLEES.
- When the callee is a member expression, the object is an identifier, the identifier is whitelisted in VALID_OBJECT_CALLEES, and the property is not blacklisted in INVALID_METHODS.
- When the callee is a member expression, the object is a literal, and the property is a string/numeric literal.
The most interesting one is the second case:
The only blacklisted method is random, which is a method of the Math object. This means that any other method of either the whitelisted Number, String, or Math objects can be directly referenced.
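The weakness of a name-based check like this can be sketched without Babel. The snippet below is a simplified stand-in, not Babel's actual source: the list contents follow the description above, and isAllowedMemberCall is a hypothetical helper that looks only at the object name and the method name.

```javascript
// Simplified stand-in for the pre-fix check; not Babel's actual source.
const VALID_OBJECT_CALLEES = ["String", "Number", "Math"];
const INVALID_METHODS = ["random"];

// Hypothetical helper: mirrors the "member expression on a whitelisted
// identifier" case by checking only the two names involved.
function isAllowedMemberCall(objectName, methodName) {
  return (
    VALID_OBJECT_CALLEES.includes(objectName) &&
    !INVALID_METHODS.includes(methodName)
  );
}

console.log(isAllowedMemberCall("Math", "max"));           // true
console.log(isAllowedMemberCall("Math", "random"));        // false
console.log(isAllowedMemberCall("process", "exit"));       // false
// Inherited properties are never considered, so this passes too:
console.log(isAllowedMemberCall("Number", "constructor")); // true
```

A blacklist with a single entry cannot anticipate every property reachable through the prototype chain, which is what the rest of this section relies on.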
In JavaScript, all classes are functions. Since Number and String are global JavaScript classes, their constructor property points to the Function constructor.
Therefore, the two expressions below are equivalent:
Passing in an arbitrary string to the Function constructor returns a function that will evaluate the provided string as JavaScript code when called.
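Both claims are easy to check in plain Node.js, with no Babel involved. The snippet below is just an illustration with a harmless function body; the variable names are my own.

```javascript
// Number and String are functions, so their inherited `constructor`
// property is the Function constructor itself.
const sameForNumber = Number.constructor === Function;
const sameForString = String.constructor === Function;
console.log(sameForNumber, sameForString); // true true

// Calling the Function constructor with a string compiles that string
// into a new function body. Nothing runs until the result is called.
const viaFunction = Function("return 6 * 7;");
const viaNumber = Number.constructor("return 6 * 7;");
console.log(typeof viaNumber);           // function
console.log(viaFunction(), viaNumber()); // 42 42
```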
The AST node generated by Number.constructor('javascript_code_here;') contains:
- A call expression, where
  - The callee is a member expression, where
    - The object is an identifier, with name whitelisted by VALID_OBJECT_CALLEES
    - The property is an identifier, not blacklisted by INVALID_METHODS
  - The arguments are a single string literal, containing the code to be executed.
Therefore, the code is considered safe to evaluate, and we have successfully crafted a malicious function.
However, it is crucial to note that this cannot call the function on its own. It only creates an anonymous function.
So, how exactly can we call the function? This is where the second piece of the puzzle comes in: object expressions.
Handling of Object Expressions
Within Babel’s _evaluate method, an ObjectExpression node undergoes recursive evaluation, producing a true JavaScript object. There’s no limitation on key names for ObjectProperty. As long as every ObjectProperty child in the ObjectExpression yields confident: true from _evaluate(), we can obtain a JavaScript object with custom keys/values.
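A rough sketch of that behaviour, again using hand-written AST nodes rather than Babel itself. evaluateLiteral and evaluateObjectExpression are hypothetical names, and only literal values are supported to keep the example short.

```javascript
// Hypothetical sketch of object-expression handling; not Babel's code.
function evaluateLiteral(node) {
  if (node.type === "StringLiteral" || node.type === "NumericLiteral") {
    return { confident: true, value: node.value };
  }
  return { confident: false };
}

function evaluateObjectExpression(node) {
  const result = {};
  for (const property of node.properties) {
    const evaluated = evaluateLiteral(property.value);
    // A single unknown value makes the whole object unknown.
    if (!evaluated.confident) return { confident: false };
    // The key name is copied as-is; nothing filters out "toString".
    result[property.key.name] = evaluated.value;
  }
  return { confident: true, value: result };
}

// AST for: ({ answer: 42, toString: "not filtered" })
const objectNode = {
  type: "ObjectExpression",
  properties: [
    {
      type: "ObjectProperty",
      key: { type: "Identifier", name: "answer" },
      value: { type: "NumericLiteral", value: 42 },
    },
    {
      type: "ObjectProperty",
      key: { type: "Identifier", name: "toString" },
      value: { type: "StringLiteral", value: "not filtered" },
    },
  ],
};

const evaluatedObject = evaluateObjectExpression(objectNode);
console.log(evaluatedObject.confident);          // true
console.log(Object.keys(evaluatedObject.value)); // [ 'answer', 'toString' ]
```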
A key property to leverage is toString (MDN Reference). Defining this property on an object to a function we control will allow us to execute arbitrary code when the object is converted to a string.
This is exactly what we do in the payload:
We’ve assigned our malicious function, crafted via the Function constructor, to the toString property of the object. Thus, when this object undergoes a string conversion, it gets triggered and executed.
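The trigger can be reproduced in plain Node.js with a harmless body. This is an illustration only: the body just bumps a counter on globalThis (a name I made up for the demo), and String() is used as one way to force the conversion.

```javascript
// Benign stand-in for a payload: it only bumps a global counter.
globalThis.payloadRuns = 0;

// Same shape as evaluating { toString: Number.constructor("...") }.
const crafted = {
  toString: Number.constructor(
    "globalThis.payloadRuns += 1; return 'payload ran';"
  ),
};

// Building the object runs nothing.
const runsBeforeConversion = globalThis.payloadRuns;
console.log(runsBeforeConversion); // 0

// String conversion calls toString(), which runs the crafted body.
const converted = String(crafted);
console.log(converted, globalThis.payloadRuns); // payload ran 1
```

Creating the function and storing it on the object has no effect by itself; the code only runs at the moment of conversion.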
In the provided example, we pass the object to the String function, given its status as a whitelisted function (referenced in case 1). Still, the String constructor isn’t mandatory. Implicit type coercion in JavaScript can also trigger our malicious function, as demonstrated in these alternative payload formats:
The first example employs type-coercion to transform the object into a string. In contrast, the second example utilizes type-coercion to convert it into a number, as detailed in Object.prototype.valueOf(). Both examples exploit the _evaluate() method’s approach to handling BinaryExpression nodes, which directly performs the operation after recursively evaluating the left and right operands.
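Both coercion paths can be observed in plain JavaScript. The shapes below are hypothetical illustrations rather than the exact payloads: the bodies only record which hook fired in a globalThis array that I introduced for the demo.

```javascript
// Records which coercion hook was invoked.
globalThis.coercionLog = [];

const viaToString = {
  toString: Number.constructor(
    "globalThis.coercionLog.push('toString'); return 'x';"
  ),
};
const viaValueOf = {
  valueOf: Number.constructor(
    "globalThis.coercionLog.push('valueOf'); return 1;"
  ),
};

// String concatenation coerces the object to a primitive via toString().
const concatenated = viaToString + "";
// Arithmetic coerces the object to a number via valueOf().
const subtracted = viaValueOf - 0;

console.log(concatenated, subtracted); // x 1
console.log(globalThis.coercionLog);   // [ 'toString', 'valueOf' ]
```

An evaluator that performs the binary operation itself therefore ends up calling attacker-controlled functions as a side effect.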
The Patch
Upon disclosing this vulnerability, I was impressed by the swift response from the Babel team, who promptly rolled out a patch. This patch was released in two parts:
The first was a workaround for all of the affected official Babel packages, which guards the calls to evaluate() with an isPure() check. isPure() inherently prevents this bug, as it returns false for all MemberExpression nodes. PR #16032: Update babel-polyfills packages
The subsequent step involved refining the evaluate() function. This adjustment ensured that all inherited methods, not only constructor, were prevented from being called. PR #16033: Only evaluate own String/Number/Math methods
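The idea behind the second fix, going by the PR title, can be sketched as an own-property check. This is my own illustration and not the code from the PR: isOwnMethod is a hypothetical helper, and the whitelist contents follow the description earlier in this post.

```javascript
// Hypothetical sketch of an "own methods only" check; not the PR's code.
const VALID_OBJECT_CALLEES = ["String", "Number", "Math"];

function isOwnMethod(objectName, methodName) {
  if (!VALID_OBJECT_CALLEES.includes(objectName)) return false;
  const target = globalThis[objectName];
  return (
    // Own properties only: anything inherited is rejected.
    Object.prototype.hasOwnProperty.call(target, methodName) &&
    typeof target[methodName] === "function"
  );
}

console.log(isOwnMethod("Math", "max"));            // true
console.log(isOwnMethod("String", "fromCharCode")); // true
// Inherited from Function.prototype, so rejected:
console.log(isOwnMethod("Number", "constructor"));  // false
console.log(isOwnMethod("Number", "toString"));     // false
```

Compared with growing the blacklist, this turns the check into an allow-by-ownership rule, so new inherited properties do not need to be enumerated.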
After the fixes were implemented, GitHub staff issued CVE-2023-45133 for the security advisory.
A side note on disclosure timing
You might have noticed that this blog post was released on the same day as the security advisory. Usually for critical vulnerabilities, it’s customary to wait a while before disclosing a proof of concept. However, I believe this disclosure timing is justifiable for a few reasons:
The vast majority of Babel users remain unaffected by this vulnerability. Babel is primarily utilized for refactoring and transpiling one’s own code, which means the typical use case doesn’t expose users to this risk. It’s improbable that many have server-side implementations that accept and process arbitrary code from users through the compilation plugins or the invocation of path.evaluate. Furthermore, there are really only a couple of real use cases for using Babel to analyze untrusted code on the server side:
- Reverse engineering bot mitigation software, etc.
- Malware analysis
In the first case, I doubt any legitimate bot mitigation entity would attempt Remote Code Execution (RCE), given the legal ramifications. Meanwhile, professionals using Babel for malware reversal possess the expertise to conduct their analyses within controlled, sandboxed environments. Thus, the risk to the community, in real-world scenarios, remains minimal.
Conclusion
Discovering and delving into this vulnerability was a fun experience. I initially stumbled upon the vulnerability during a brainstorming session for a Babel-based challenge for UofTCTF’s upcoming capture the flag competition, where I was focusing on an entirely different, non-security-related “bug”.
This vulnerability predominantly impacts those integrating untrusted code with Babel. Unfortunately, this places individuals leveraging Babel for “static deobfuscation” directly in the crosshairs of this attack vector.
There’s a touch of irony in the fact that my first credited CVE emerged from reverse engineering Babel - the very tool I often employ for reverse engineering JavaScript, and the topic of all of my previous posts 🤣.
This was a great learning experience, and hopefully this write-up was useful to you as well. Thanks for reading, and take care!