
My thoughts on the Python FDK #66

@paulfelix

I want to share some thoughts on the fdk-python design and some experiences (so far) developing my own Python FDK. This is not a proposal to make any changes to the fdk-python, just thoughts.

The latest fdk-python includes two substantial changes:

  1. Homegrown async http code
  2. A change in the way the function gets executed in order to facilitate lazy loading of the function's modules.

The main driver for these changes, from my understanding, is to a) improve cold startup time up to the point when the function is listening on the unix socket, and b) support Python 3 asyncio programming in the function itself.

I feel the implemented solutions to these problems result in too much extra code that has to be developed, maintained, and fixed. Specifically, neither the homegrown async http code nor the lazy importing approach (which relies on a Python 3.7 feature) is necessary. The new execution model also sets the Python FDK programming model apart from the FDKs for other languages.

Here's my experience so far developing my own Python FDK. I think it also solves the same problems while using much less code and avoiding a change to the function programming model. The code shown here is working code, but still just proof-of-concept, as it lacks extra error handling, return types, etc.

I chose to use the eventlet wsgi package to listen to http over the unix socket. Here's my main code so far:

import os
import sys
import socket
from eventlet import wsgi, listen
from .application import Application


def run(handler):
    fn_listener = os.environ.get('FN_LISTENER')
    if not fn_listener:
        sys.exit('FN_LISTENER is not set')

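    # str.lstrip('unix:') strips a set of characters, not a prefix, so slice instead.
    prefix = 'unix:'
    socket_file = fn_listener[len(prefix):] if fn_listener.startswith(prefix) else fn_listener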
    sock = listen(socket_file, family=socket.AF_UNIX)
    wsgi.server(sock, Application(handler))
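A note on parsing the FN_LISTENER value: the sketch below shows one way to extract the socket path by removing the `unix:` prefix explicitly. The helper name `socket_path` and the example values are assumptions made for illustration; they are not part of any FDK.

```python
def socket_path(fn_listener):
    """Return the filesystem path from a value such as 'unix:/tmp/fn.sock'."""
    prefix = 'unix:'
    if not fn_listener.startswith(prefix):
        raise ValueError('expected a unix: listener, got %r' % fn_listener)
    return fn_listener[len(prefix):]


# str.lstrip treats its argument as a set of characters, so a path that
# starts with any of 'u', 'n', 'i', 'x' or ':' would lose extra characters.
print('unix:unix.sock'.lstrip('unix:'))   # prints ".sock"
print(socket_path('unix:unix.sock'))      # prints "unix.sock"
```

For absolute paths the two approaches give the same result, because `/` is not in the stripped character set; the difference only shows up with relative paths.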

Here's the basic wsgi application code. It constructs a Request object from the WSGI environ dictionary (code not shown) and passes it to the function's handler.

import sys
import json
import logging
from .request import Request

logger = logging.getLogger()


class Application:
    def __init__(self, handler):
        self._handler = handler

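    def __call__(self, environ, start_response):
        try:
            request = Request(environ)
            response = self._handler(request)
            # Serialize before start_response so a failure can still yield a 500.
            response_body = json.dumps(response).encode()
            start_response('200 OK', [('Content-Type', 'application/json')])
            return [response_body]
        except Exception:
            logger.exception('Error executing function')
            # A WSGI callable must always return an iterable of bytes.
            start_response('500 Internal Server Error',
                           [('Content-Type', 'application/json')])
            return [json.dumps({'error': 'function failed'}).encode()]
        finally:
            sys.stdout.flush()
            sys.stderr.flush()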
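The Request class itself is not shown here. As a rough idea of what it could involve, here is a minimal, hypothetical sketch that builds a request object from a WSGI environ dictionary. The attribute names, the `json()` helper, and the example header are assumptions for illustration, not the actual implementation.

```python
import io
import json


class Request:
    """Hypothetical request wrapper built from a WSGI environ dict."""

    def __init__(self, environ):
        self.method = environ.get('REQUEST_METHOD', 'GET')
        self.path = environ.get('PATH_INFO', '/')
        # WSGI exposes request headers as HTTP_* keys in environ.
        self.headers = {
            key[5:].replace('_', '-').lower(): value
            for key, value in environ.items()
            if key.startswith('HTTP_')
        }
        # CONTENT_LENGTH may be missing or an empty string.
        length = int(environ.get('CONTENT_LENGTH') or 0)
        self.body = environ['wsgi.input'].read(length) if length else b''

    def json(self):
        """Decode the body as JSON, or return None when it is empty."""
        return json.loads(self.body) if self.body else None


# Build a fake environ by hand to exercise the class without a server.
payload = b'{"name": "fn"}'
environ = {
    'REQUEST_METHOD': 'POST',
    'PATH_INFO': '/call',
    'CONTENT_LENGTH': str(len(payload)),
    'HTTP_FN_CALL_ID': 'abc123',
    'wsgi.input': io.BytesIO(payload),
}
request = Request(environ)
print(request.method, request.headers, request.json())
```

Because a WSGI application is just a callable, the same hand-built environ can be used to test the Application class without opening a socket.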

Here is the function's func.py code:

import fnpy


def handler(request):
    from myfunction import execute
    return execute(request)


if __name__ == "__main__":
    fnpy.run(handler)

Notice that the function's modules are not imported until the handler is called. Unless I'm missing something, that's all we need for lazy loading. The next time the handler is called, the myfunction module will have been loaded already, so the import will be a no-op.
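To illustrate the point about repeated imports, here is a small self-contained sketch. It writes a throwaway module (the name `lazy_demo_fn` and the `LAZY_DEMO_LOADS` counter are made up for this demo) whose top-level code counts how many times it runs, then calls a handler twice.

```python
import os
import pathlib
import sys
import tempfile

# Create a throwaway module (hypothetical name) whose body counts its own runs.
module_dir = tempfile.mkdtemp()
pathlib.Path(module_dir, 'lazy_demo_fn.py').write_text(
    "import os\n"
    "os.environ['LAZY_DEMO_LOADS'] = "
    "str(int(os.environ.get('LAZY_DEMO_LOADS', '0')) + 1)\n"
    "def execute(request):\n"
    "    return {'echo': request}\n"
)
sys.path.insert(0, module_dir)


def handler(request):
    # The module body runs on the first call only; later calls find the
    # already-loaded module in sys.modules.
    from lazy_demo_fn import execute
    return execute(request)


loaded_before = 'lazy_demo_fn' in sys.modules
first = handler('a')
second = handler('b')
load_count = os.environ['LAZY_DEMO_LOADS']
print(loaded_before, first, second, load_count)
```

Strictly speaking the second import still performs a `sys.modules` lookup and a local name binding, but the module's top-level code is not executed again.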

Using my FDK, I have been working on a function that handles GraphQL queries and makes concurrent calls to other services using Python asyncio. The time from cold-start container loading to the first handler call is under one second.
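For readers wondering how asyncio code fits behind a plain synchronous handler, here is a minimal sketch, assuming the handler is called synchronously and can drive its own event loop with `asyncio.run` (Python 3.7+). This is not the actual function: `call_service`, `execute_async`, and the service names are hypothetical stand-ins, with `asyncio.sleep` simulating network latency.

```python
import asyncio


async def call_service(name, delay):
    # Stand-in for a real network call; asyncio.sleep simulates I/O latency.
    await asyncio.sleep(delay)
    return {'service': name, 'ok': True}


async def execute_async(request):
    # Run the downstream calls concurrently; gather keeps the argument order.
    results = await asyncio.gather(
        call_service('users', 0.01),
        call_service('orders', 0.01),
    )
    return {'query': request, 'results': results}


def handler(request):
    # asyncio.run creates an event loop, runs the coroutine, and closes the loop.
    return asyncio.run(execute_async(request))


response = handler('{ me { orders } }')
print(response)
```

The handler signature stays the same as in the synchronous case, so the FDK does not need to know whether the function uses asyncio internally.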

I have been able to run my function on OSX and CentOS platforms with the same performance on both.
