Runtime: stream serialization of state at the end of each step #1203

@josephjclark

Description

I think we've seen a case where a huge dataclip has triggered an OOMKill from Kubernetes.

Now, a Run executes in a child process with a memory limit, usually 1GB or so. The run should be OOM-killed if that child process exceeds its limit, and nothing is lost.

But that OOMKill does require memory allocation to run. A blocking call, like JSON.stringify, could prevent the node process from blowing itself up cleanly.

So what can happen is this:

  • A step completes and writes a huge dataclip to state. 1GB of JSON, why not.
  • The runtime serializes the state at the end of each step.
  • For a large object, this serialization could use a lot of memory (I think JSON.stringify needs 3-4x the size of the object it's serializing).
  • And if this serialization is blocking, it'll just chew up available memory without blowing up.
  • And if the pod uses too much memory, it'll be instantly and gracelessly killed by Kubernetes.
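To make the failure mode concrete, here's a minimal Node sketch (the 10MB payload size is arbitrary, chosen just to make the point cheaply): a single synchronous JSON.stringify call blocks the event loop until it returns, and allocates the entire output string on top of the source object.

```javascript
// A large-ish state object. Real dataclips are far bigger; 10MB is
// just enough to demonstrate the shape of the problem.
const big = { data: 'x'.repeat(10 * 1024 * 1024) };

// Fully synchronous: nothing else (including any graceful shutdown
// logic) can run until this returns, and the result string is
// allocated in one piece on top of `big` itself.
const json = JSON.stringify(big);

// The output alone is at least as large as the payload
console.log(json.length >= 10 * 1024 * 1024); // true
```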

So I think I need to look closely at that serialization code. I think I use fast-safe-stringify right now - but if it's blocking, maybe I need to borrow the streaming serializer from the worker.
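For reference, a streaming serializer can be as simple as a recursive generator that yields the JSON in small chunks, so a consumer can write each chunk out as it arrives instead of holding one giant string. This is only an illustrative sketch, not the worker's actual implementation, and it skips edge cases like cycles, toJSON, and undefined inside arrays:

```javascript
// Sketch of a chunked JSON serializer. Each yielded string is small, so
// a consumer can stream chunks to a file or socket without ever
// materializing the full document in memory.
function* streamStringify(value) {
  if (value === null || typeof value !== 'object') {
    // Primitives: defer to the built-in for correct escaping
    yield JSON.stringify(value);
  } else if (Array.isArray(value)) {
    yield '[';
    for (let i = 0; i < value.length; i++) {
      if (i > 0) yield ',';
      yield* streamStringify(value[i]);
    }
    yield ']';
  } else {
    yield '{';
    let first = true;
    for (const [key, val] of Object.entries(value)) {
      if (val === undefined) continue; // JSON.stringify drops these too
      if (!first) yield ',';
      first = false;
      yield JSON.stringify(key) + ':';
      yield* streamStringify(val);
    }
    yield '}';
  }
}
```

A real implementation would also yield back to the event loop between chunks (e.g. via setImmediate) so the process stays responsive, and killable, while serializing.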

Something else we might consider: if the state object exceeds a certain size limit, don't bother serializing it at all, just pass it through - in a similar way to the exemption I want for streams on state.
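That pass-through idea could look something like this sketch (estimateSize, MAX_SERIALIZE_BYTES, and serializeState are all hypothetical names, and the size heuristic is deliberately rough):

```javascript
// Rough upper bound for when serialization is worth attempting.
const MAX_SERIALIZE_BYTES = 512 * 1024 * 1024; // e.g. 512MB, tune as needed

// Cheap recursive size estimate: walks the object but never builds the
// JSON string, so it costs far less memory than stringifying.
function estimateSize(value) {
  if (value === null) return 4; // "null"
  switch (typeof value) {
    case 'string':
      return value.length + 2; // quotes; ignores escaping
    case 'number':
      return 16; // generous fixed guess
    case 'boolean':
      return 5;
    case 'object':
      if (Array.isArray(value)) {
        return value.reduce((n, v) => n + estimateSize(v) + 1, 2);
      }
      return Object.entries(value).reduce(
        (n, [k, v]) => n + k.length + 3 + estimateSize(v) + 1,
        2
      );
    default:
      return 0; // undefined, functions etc. are dropped by JSON anyway
  }
}

// If state looks too big to serialize safely, hand it over as-is.
function serializeState(state, limit = MAX_SERIALIZE_BYTES) {
  if (estimateSize(state) > limit) {
    return { passthrough: true, state };
  }
  return { passthrough: false, json: JSON.stringify(state) };
}
```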
