BasilLimade t1_j6ykjaz wrote

I'm looking at making a Docker image to host on AWS ECR, containing some Python code and its dependencies (over 250 MB of dependencies, so I can't just zip up my modules as a Lambda layer). How does this compare to making my own Docker Lambda image?

seattleite849 OP t1_j6yt8w2 wrote

How are you wanting to trigger your function?

Also, here are some examples you can peek at: https://docs.cakework.com/examples

Under the hood, both Lambda and Cakework deploy Docker containers as microVMs running on bare metal instances. A few key differences:

- Lambda is a building block, whereas Cakework is a custom point solution for running async tasks. With Lambda, you need to wire together other cloud resources to turn your function into an application you can hit. In my experience, this mix of code and infrastructure slows down iterating on your actual logic, since you need to:

- Trigger the function, either by exposing it via API Gateway (if you'd like to invoke it with a REST call) or by hooking it up to an event (S3 PutObject, database update event).

- Hook up your function to other functions (for example, if you want to upload the final artifact to S3), which means setting up SQS queues. If you want to chain functions together, you'll set up Step Functions.

- Track failures, store input/output params and results, and view logs easily, which means setting up a database and writing some scripts to trace each request through CloudWatch logs.

- With Lambda, you create and build the container yourself, and you also handle updating the Lambda function code. Tools such as SST or serverless.com help streamline this.

- With Cakework, you write your Python functions as plain code, then run a single Cakework CLI command, `cakework deploy`, which deploys your functions and exposes a public endpoint you can hit (via REST calls, a Python SDK, or a JavaScript/TypeScript SDK). The nice thing is that you can test invoking your function directly, as if it were code running on your local machine.

- No limit on the Docker image size and no limit on how long your job can run (vs. a 10 GB image limit and a 15-minute timeout for Lambda).

- You also specify CPU and memory parameters per request! That way you don't need to spin up a bigger instance than you actually need and pay the extra cost, or under-provision CPU or memory and then 1) deal with failures and 2) re-deploy your Lambda with more compute.
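
To make the Lambda side of the "trigger the function" point concrete, here is a minimal sketch of a Python handler you might package in your own container image. It assumes two triggers, an S3 event notification and an API Gateway proxy integration, and branches on the event shape. `run_task` is a hypothetical placeholder for your actual logic, and the handler name is just an example; none of this is Cakework's API.

```python
import json
from urllib.parse import unquote_plus


def run_task(payload):
    # Hypothetical placeholder: swap in your real Python code here.
    return {"processed": payload}


def handler(event, context):
    """Entry point Lambda invokes; the event shape depends on the trigger."""
    records = event.get("Records", [])
    if records and records[0].get("eventSource") == "aws:s3":
        # S3 event notification (e.g. PutObject): one record per object.
        results = []
        for record in records:
            bucket = record["s3"]["bucket"]["name"]
            # Object keys arrive URL-encoded in S3 event notifications.
            key = unquote_plus(record["s3"]["object"]["key"])
            results.append(run_task({"bucket": bucket, "key": key}))
        return results

    # Otherwise assume an API Gateway proxy integration (REST call),
    # where the request body arrives as a JSON string.
    body = json.loads(event.get("body") or "{}")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(run_task(body)),
    }
```

This only covers the function itself. The API Gateway route, the S3 notification, any SQS queues, and the image build and push to ECR are all separate pieces you would still configure around it.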
