Split generated serde code from transport handling by introducing middlewares #343

@fishcakez

Currently we generate the client and server code for TFramedTransport and the two are somewhat tightly coupled. However, there are four distinct parts to this end-to-end pipeline:

  • Client, which generates a set of functions representing the service
  • Protocol serde
  • Transport
  • Server behavior, which generates the callbacks required to act as the service

To decouple, we provide a common data structure that is passed through the pipeline on each of the peers:

%Thrift.Message{name: String.t, type: :call | :oneway | :reply | :exception, seq_id: integer, payload: struct()}
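
For concreteness, a minimal sketch of the struct definition implied by that typespec:

defmodule Thrift.Message do
  @type t :: %__MODULE__{
          name: String.t(),
          type: :call | :oneway | :reply | :exception,
          seq_id: integer,
          payload: struct()
        }
  defstruct [:name, :type, :seq_id, :payload]
end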

On the client side the function for a call (or oneway) would create a %Thrift.Message{} with the method name, request type, and the request struct as payload. This would be passed to the protocol layer, which replaces the request struct in the message with a new struct (e.g. %Thrift.Binary{}) containing the serialized request struct. The protocol then passes the message to the transport layer (e.g. Thrift.Binary.Framed.Client), which converts the message to data and sends it on the socket.

The transport layer then receives the response, deserializes it into a new Thrift.Message with a serialized payload struct (e.g. %Thrift.Binary{}) and returns the message to the protocol layer. The protocol layer deserializes the payload to the response struct, puts that in the message's payload and returns the message to the generated client function. Finally the client function handles the response struct and converts it to the function's result, i.e. {:ok, result} | {:error, Exception.t}.

This means our client side control flow is Client -> Protocol -> Transport -> Protocol -> Client. For the server side we reverse this to Transport -> Protocol -> Server -> Protocol -> Transport.

On the server side the transport layer (e.g. Thrift.Binary.Framed.Server) receives data from the socket and deserializes it into a message containing the method name, request type, sequence id and serialized payload (e.g. a %Thrift.Binary{} struct). The message is passed to the protocol layer, which deserializes the payload to the request struct and passes the message to the server layer. The server layer unrolls the request struct and dispatches it to the server's callback module, which handles the request and returns a response.

The server layer turns this response into the response struct, puts it in the message struct with the new type (e.g. :reply) and returns it to the protocol layer. The protocol layer serializes the response struct into its own payload (e.g. %Thrift.Binary{}) and returns it to the transport layer. The transport layer serializes the message and sends it over the socket.

Let's walk through an example with the following schema:

namespace elixir Example

service Service
{
  string ping(),
}

The generated client code would look approximately like:

defmodule Example.Service.Client do
  def ping(stack) do
    case Thrift.Pipeline.call(stack, %Thrift.Message{name: "ping", type: :call, payload: Example.Service.PingArgs.new()}) do
      %Thrift.Message{type: :reply, payload: %Example.Service.PingResponse{success: string}} ->
        {:ok, string}
      %Thrift.Message{type: :exception, payload: %Thrift.TApplicationException{} = error} ->
        {:error, error}
    end
  end
end

However it is awkward to have to pass the stack around to all the clients, so we will allow a stack and client to be compiled into their own module:

Thrift.defclient(Example.MyClient, Example.Service.Client, stack)

This would then create Example.MyClient with a ping/0 function with the spec:

@spec ping() :: {:ok, String.t} | {:error, Thrift.TApplicationException.t}

The client stack is a list of modules and arguments:

[{Example.Service.Binary, []}, {Thrift.Binary.Framed, [pool: Example.MyClient.Pool]}]
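
Putting these together, Thrift.defclient/3 could expand to something like the following (the exact expansion is an open design detail):

defmodule Example.MyClient do
  @stack [{Example.Service.Binary, []},
          {Thrift.Binary.Framed, [pool: Example.MyClient.Pool]}]

  @spec ping() :: {:ok, String.t} | {:error, Thrift.TApplicationException.t}
  def ping(), do: Example.Service.Client.ping(@stack)
end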

With Thrift.Pipeline.call/2 defined approximately as:

def call([{mod, opts}], msg), do: mod.call(msg, opts)
def call([{mod, opts} | stack], msg), do: mod.call(msg, &call(stack, &1), opts)
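
For the two-element client stack above, this recursion unrolls approximately to:

Example.Service.Binary.call(msg, &Thrift.Binary.Framed.call(&1, [pool: Example.MyClient.Pool]), [])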

With Example.Service.Binary.call/3 defined approximately as (next being &Thrift.Binary.Framed.call(&1, opts)):

def call(%Thrift.Message{type: type, payload: req} = req_msg, next, _) when type in [:call, :oneway] do
  %Thrift.Message{payload: rep} = rep_msg = next.(%Thrift.Message{req_msg | payload: serialize(req)})
  %Thrift.Message{rep_msg | payload: deserialize(rep)}
end
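
A sketch of the serde helpers the generated Example.Service.Binary could contain for ping, assuming the generated per-struct BinaryProtocol modules keep their current serialize/1 and deserialize/1 functions and that %Thrift.Binary{} holds its bytes in a :data field (both assumptions):

# The generated middleware would have one clause per method; the ping clauses:
defp serialize(%Example.Service.PingArgs{} = args) do
  %Thrift.Binary{data: Example.Service.PingArgs.BinaryProtocol.serialize(args)}
end

defp deserialize(%Thrift.Binary{data: data}) do
  # Assumes deserialize/1 returns {struct, rest} as in the current generated code.
  {resp, ""} = Example.Service.PingResponse.BinaryProtocol.deserialize(data)
  resp
end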

With Thrift.Binary.Framed.call/2 defined approximately as:

def call(%Thrift.Message{type: fun} = req_msg, opts) when fun in [:call, :oneway] do
  pool = Keyword.fetch!(opts, :pool)
  apply(Thrift.Binary.Framed.Pool, fun, [pool, req_msg, opts])
end

The Thrift.Binary.Framed.Pool would check out a connection, send the request, receive the reply, and check the connection back in.
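
A hypothetical sketch of Thrift.Binary.Framed.Pool.call/3; the checkout/checkin and frame helpers below are illustrative names, not an existing API:

def call(pool, %Thrift.Message{} = req_msg, opts) do
  conn = checkout(pool, opts)
  try do
    :ok = send_frame(conn, req_msg)
    # oneway/3 would be the same but skip the receive.
    recv_frame(conn, opts)
  after
    checkin(pool, conn)
  end
end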

On the server side we also have a compiled module:

Thrift.Binary.Framed.defserver(Example.MyServer, stack)

It is started like:

Example.MyServer.start_link(opts)

The server stack is a similar list of modules:

[{Example.Service.Binary, []}, {Example.Service.Handler, {Example.MyHandler, opts}}]

The Thrift.Binary.Framed.Server would have a function to handle each request, approximately:

defp handle_framed_packet(data, stack) do
  msg = deserialize(data)
  try do
    stack
    |> Thrift.Pipeline.call(msg)
    |> serialize()
  rescue
    error ->
      serialize(%Thrift.Message{msg | type: :exception, payload: %Thrift.TApplicationException{message: Exception.message(error)}})
  end
end

With Example.Service.Binary.call/3 defined approximately as (next being &Example.Service.Handler.call(&1, {handler, opts})):

def call(%Thrift.Message{type: type, payload: req} = req_msg, next, _) when type in [:call, :oneway] do
  %Thrift.Message{payload: rep} = rep_msg = next.(%Thrift.Message{req_msg | payload: deserialize(req)})
  %Thrift.Message{rep_msg | payload: serialize(rep)}
end

With Example.Service.Handler.call/2 defined approximately as:

def call(%Thrift.Message{type: :call, payload: %Example.Service.PingArgs{}} = msg, {handler, opts}) do
  string = apply(handler, :ping, [opts])
  %Thrift.Message{msg | type: :reply, payload: %Example.Service.PingResponse{success: string}}
end
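
The handler module itself then contains only the business logic; a minimal (illustrative) Example.MyHandler:

defmodule Example.MyHandler do
  # Receives the opts configured in the server stack and returns the bare
  # result; the Handler middleware wraps it in %Example.Service.PingResponse{}.
  def ping(_opts), do: "pong"
end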

Once these are in place we can now support custom middleware as elements in the stack on client and server side. For example we could support Thrift.BinaryCompact (#333) as a protocol instead of our current Thrift.Binary. We could support both, or even additional custom protocols, on the same server as we could dispatch to the correct protocol module based on the "magic bytes" in the framed binary message.

Notably, the middleware contract is identical on the client and the server, so a single middleware can be reused on both sides, for example to measure per-method latency, which is awkward to do right now. There are many other common or generic middlewares that would be useful. Fortunately there is a protocol-agnostic middleware library at https://github.com/fishcakez/stack that we can build on top of, which supports retries, concurrency limits, request criticality, deadlines and distributed tracing out of the box.
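
As a concrete sketch, a per-method latency middleware could look like this (the telemetry event name is an assumption):

defmodule Example.Latency do
  # Works in both client and server stacks since both share the call/3 contract.
  def call(%Thrift.Message{name: name} = msg, next, _opts) do
    start = System.monotonic_time()
    try do
      next.(msg)
    after
      duration = System.monotonic_time() - start
      :telemetry.execute([:thrift, :call], %{duration: duration}, %{method: name})
    end
  end
end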

We don't currently support a Thrift protocol/transport that supports distributed tracing. There are open-source Thrift protocols that do (TTwitter from Finagle and THeader from fbthrift), and they could be supported via middlewares. These would require an additional headers field in the Thrift.Message struct:

%Thrift.Message{name: String.t, type: :call | :oneway | :reply | :exception, seq_id: integer, headers: %{optional(atom) => binary}, payload: struct()}
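
A tracing middleware could then propagate context through those headers; everything in this sketch (the header key, the generate_trace_id/0 helper) is hypothetical:

def call(%Thrift.Message{headers: headers} = msg, next, opts) do
  # generate_trace_id/0 is a hypothetical helper; a real implementation would
  # read the id from the current trace context.
  trace_id = Keyword.get_lazy(opts, :trace_id, &generate_trace_id/0)
  next.(%Thrift.Message{msg | headers: Map.put(headers, :trace_id, trace_id)})
end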

The middleware design also enables a convenient way to test clients and servers because we can write a client that calls through to the server side implementation without the protocol or transport:

stack = [{Example.Service.Handler, {Example.MyHandler, opts}}]
Thrift.defclient(Example.MyTestClient, Example.Service.Client, stack)
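
Calls on the test client then run straight through to the handler:

# With the illustrative Example.MyHandler above:
{:ok, "pong"} = Example.MyTestClient.ping()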

We would implement this in stages:
