Unable to Execute DVC Commands in AWS Lambda

I’m working on a use case where an AWS Lambda function is triggered upon uploading data to an S3 bucket. This Lambda function is supposed to perform data versioning using DVC commands for the uploaded files and then place .dvc files in a Git repository and hashed files in a separate remote S3 bucket.

To achieve this, I created the following Dockerfile to build the Lambda container image. It installs Git with yum and the Python dependencies, including DVC, from a requirements.txt file:

FROM public.ecr.aws/lambda/python:3.11

# Install Git and its dependencies
RUN yum update -y
RUN yum install git -y

# Copy requirements.txt
COPY requirements.txt ${LAMBDA_TASK_ROOT}

# Copy function code
COPY lambda_function.py ${LAMBDA_TASK_ROOT}

# Install the specified packages
RUN pip3 install -r requirements.txt --target "${LAMBDA_TASK_ROOT}"

# Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile)
CMD [ "lambda_function.lambda_handler" ]

The requirements.txt file:

dvc
dvc[s3]
s3fs
boto3

In the Lambda function file lambda_function.py (shown below), I'm executing the dvc init command:


import json
import boto3
import os

def lambda_handler(event, context):
    # Specify the S3 bucket and file name
    bucket_name = 'mytestdata-dvc'
    file_name = 'mytextfile.txt'
    
    # Create an S3 client
    s3 = boto3.client('s3')
    
    # Read the file from S3
    response = s3.get_object(Bucket=bucket_name, Key=file_name)
    file_content = response['Body'].read().decode('utf-8')
       
    os.makedirs('/tmp/dvc-repo', exist_ok=True)
    os.chdir('/tmp/dvc-repo')
    
    # Download the file into the working directory
    try:
        s3.download_file(bucket_name, file_name, '/tmp/dvc-repo/mytextfile.txt')
    except Exception as e:
        print("<--Error:-->", e)

    os.system("git clone https://github.com/<repo-name>.git")
    os.system("dvc init")
    
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }
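
Since the function is triggered by an S3 upload, the bucket and key could be read from the event instead of being hard-coded. A minimal sketch, assuming the standard S3 event notification layout; the helper name extract_s3_objects is hypothetical and not part of the function above:

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return a list of (bucket, key) pairs found in an S3 event payload."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record.get("s3", {})
        bucket = s3_info.get("bucket", {}).get("name")
        key = s3_info.get("object", {}).get("key")
        if bucket and key:
            # Object keys arrive URL-encoded in S3 notifications
            # (for example, spaces are sent as '+').
            objects.append((bucket, unquote_plus(key)))
    return objects
```

Inside the handler, the first pair returned could then take the place of bucket_name and file_name.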

Despite trying various approaches, I'm unable to run DVC commands in AWS Lambda. I've also tried packaging the function with Lambda layers and as a zip archive, but the result is the same.
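
Because os.system only returns an exit status and I am not checking it, I don't yet know why the commands fail. A rough sketch of how the calls could be made to report their exit code and error output, assuming only the standard library; the helper name run_command and the idea of overriding HOME are illustrative assumptions, not a confirmed fix:

```python
import os
import subprocess


def run_command(args, cwd=None, extra_env=None):
    """Run a command and return (exit_code, stdout, stderr)."""
    env = dict(os.environ)
    if extra_env:
        # Assumption: some tools need variables such as HOME to point
        # at a writable location; in Lambda only /tmp is writable.
        env.update(extra_env)
    try:
        result = subprocess.run(
            args, cwd=cwd, env=env, capture_output=True, text=True
        )
    except FileNotFoundError as exc:
        # Raised when the executable (or the cwd) cannot be found,
        # for example when the command is not on PATH.
        return 127, "", str(exc)
    return result.returncode, result.stdout, result.stderr
```

For example, calling run_command(["dvc", "init"], ...) with cwd set to the cloned repository directory and extra_env={"HOME": "/tmp"}, then printing the returned values, would put the actual error message in the function's CloudWatch logs.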

Has anyone encountered this issue before, and do you have any guidance or pointers that might help resolve it?