X Tutup
Skip to content
This repository was archived by the owner on Mar 23, 2026. It is now read-only.

Fix Lambda async invocation queue namespace#10831

Merged
joe4dev merged 10 commits intomasterfrom
fix-lambda-async-queue-error
May 24, 2024
Merged

Fix Lambda async invocation queue namespace#10831
joe4dev merged 10 commits intomasterfrom
fix-lambda-async-queue-error

Conversation

@joe4dev
Copy link
Member

@joe4dev joe4dev commented May 15, 2024

Motivation

Addresses #10280

Re-creating a Lambda function immediately after deleting it, triggers a race condition and deletes the internal async invoke queue. This causes continuous error logs and breaks async invokes.
The namespace clash also causes further issues because queued invokes can get deleted.

Changes

  • Introduce a volatile (i.e., non-persisted) function instance id unique to a running LocalStack installation that can be used for the async queue namespace
  • Add test test_recreate_function to reproduce the reported function recreate issue
  • Add test test_function_update_during_invoke to validate the AWS behavior of function updates during a running invocation => Currently broken on LocalStack ⚠️
  • Add test test_async_invoke_queue_upon_function_update to validate the queuing behavior of async invokes when a function update happens => kinda works but the first invoke fails and gets retried due function updates aborting all running invokes (even for $LATEST)

Unrelated changes sneaked in:

  • Fix config default for event rule engine (align with docs)
  • Remove unnecessary lambda_service backlink in Lambda service manager (missing declaration cleanup from this PR)
  • Fix an outdated comment in the Docker runtime executor
  • Fix misleading name in test and snapshot

Testing

State pickling testing whether the instance_id attribute is persisted or not:

  1. Debug tests.aws.services.lambda_.test_lambda.TestLambdaCleanup.test_recreate_function with a breakpoint at the end of the tests
  2. Export the state using localstack-pro state export --format json lambda-state
  3. Load the pickled file in a scratch file:
import pickle
import pprint

with open("/Users/joe/Downloads/lambda-state/api_states/000000000000/lambda/us-east-1/store.state", "rb") as f:
    lambda_store = pickle.load(f)

    pprint.pp(lambda_store)
    print("attr_functions".center(80, "="))
    pprint.pp(lambda_store.attr_functions)
    print("functions".center(80, "="))
    pprint.pp(lambda_store.functions)

Discussion

  • I'm not really happy with the volatile variable hack in the lambda model. Maybe, we can find some better solution here 🤔
  • We should not abort running executions for updates of $LATEST. That seems like a common scenario.

@joe4dev joe4dev added the semver: patch Non-breaking changes which can be included in patch releases label May 15, 2024
@joe4dev joe4dev added this to the 3.5 milestone May 15, 2024
@joe4dev joe4dev self-assigned this May 15, 2024
@github-actions
Copy link

github-actions bot commented May 15, 2024

S3 Image Test Results (AMD64 / ARM64)

  2 files    2 suites   3m 12s ⏱️
399 tests 347 ✅  52 💤 0 ❌
798 runs  694 ✅ 104 💤 0 ❌

Results for commit 6445b95.

♻️ This comment has been updated with latest results.

@github-actions
Copy link

github-actions bot commented May 15, 2024

LocalStack Community integration with Pro

    2 files  ± 0      2 suites  ±0   1h 40m 16s ⏱️ + 3m 26s
2 988 tests +38  2 677 ✅ +29  311 💤 +9  0 ❌ ±0 
2 990 runs  +38  2 677 ✅ +29  313 💤 +9  0 ❌ ±0 

Results for commit 6445b95. ± Comparison against base commit 9d80628.

This pull request removes 24 and adds 62 tests. Note that renamed tests count towards both.
tests.aws.services.cloudformation.resources.test_secretsmanager ‑ test_cfn_secret_policy
tests.aws.services.events.scheduled_rules.test_events_scheduled_rules_logs ‑ test_scheduled_rule_logs
tests.aws.services.events.scheduled_rules.test_events_scheduled_rules_sqs ‑ test_scheduled_rule_sqs
tests.aws.services.events.test_events.TestEvents ‑ test_list_tags_for_resource
tests.aws.services.events.test_events_rules ‑ test_create_rule_with_one_unit_in_singular_should_succeed[rate(1 day)]
tests.aws.services.events.test_events_rules ‑ test_create_rule_with_one_unit_in_singular_should_succeed[rate(1 hour)]
tests.aws.services.events.test_events_rules ‑ test_create_rule_with_one_unit_in_singular_should_succeed[rate(1 minute)]
tests.aws.services.events.test_events_rules ‑ test_put_rule_invalid_rate_schedule_expression[ rate(10 minutes)]
tests.aws.services.events.test_events_rules ‑ test_put_rule_invalid_rate_schedule_expression[rate( 10 minutes )]
tests.aws.services.events.test_events_rules ‑ test_put_rule_invalid_rate_schedule_expression[rate()]
…
tests.aws.services.apigateway.test_apigateway_ssm ‑ test_ssm_aws_integration
tests.aws.services.cloudformation.resources.test_secretsmanager ‑ test_cfn_secret_policy[default]
tests.aws.services.cloudformation.resources.test_secretsmanager ‑ test_cfn_secret_policy[true]
tests.aws.services.cloudformation.test_template_engine.TestMacros ‑ test_pyplate_param_type_list
tests.aws.services.events.test_events_schedule.TestScheduleCron ‑ test_schedule_cron_target_sqs
tests.aws.services.events.test_events_schedule.TestScheduleCron ‑ tests_put_rule_with_schedule_cron[cron(0 10 * * ? *)]
tests.aws.services.events.test_events_schedule.TestScheduleCron ‑ tests_put_rule_with_schedule_cron[cron(0 18 ? * MON-FRI *)]
tests.aws.services.events.test_events_schedule.TestScheduleCron ‑ tests_put_rule_with_schedule_cron[cron(0 8 1 * ? *)]
tests.aws.services.events.test_events_schedule.TestScheduleCron ‑ tests_put_rule_with_schedule_cron[cron(0/10 * ? * MON-FRI *)]
tests.aws.services.events.test_events_schedule.TestScheduleCron ‑ tests_put_rule_with_schedule_cron[cron(0/15 * * * ? *)]
…
This pull request skips 4 and un-skips 3 tests.
tests.aws.services.firehose.test_firehose.TestFirehoseIntegration ‑ test_kinesis_firehose_opensearch_s3_backup[domain]
tests.aws.services.firehose.test_firehose.TestFirehoseIntegration ‑ test_kinesis_firehose_opensearch_s3_backup[path]
tests.aws.services.firehose.test_firehose.TestFirehoseIntegration ‑ test_kinesis_firehose_opensearch_s3_backup[port]
tests.aws.services.lambda_.test_lambda.TestLambdaConcurrency ‑ test_reserved_concurrency_async_queue
tests.aws.services.sns.test_sns_filter_policy.TestSNSFilterPolicyBody ‑ test_filter_policy_on_message_body_or_attribute
tests.aws.services.sqs.test_sqs.TestSqsProvider ‑ test_set_unsupported_attribute_fifo[sqs]
tests.aws.services.sqs.test_sqs.TestSqsProvider ‑ test_set_unsupported_attribute_fifo[sqs_query]

♻️ This comment has been updated with latest results.

@joe4dev joe4dev marked this pull request as ready for review May 16, 2024 08:51
@joe4dev joe4dev requested a review from giograno May 16, 2024 08:52
@thrau thrau removed their request for review May 23, 2024 11:38
@thrau
Copy link
Member

thrau commented May 23, 2024

removing myself as reviewer as i was added only because of the config.py change

Copy link
Member

@dfangl dfangl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good solution for now.

We should definitely tackle the lambda function deletion locks and the aborted invocations soon!

I just had some minor questions regarding some comments, otherwise approved from my side :)

def latest(self) -> FunctionVersion:
return self.versions["$LATEST"]

# HACK to model a volatile variable that should be ignored for persistence
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also not 100% satisfied with this hack, but I think for now it is the best solution. Let's hope we can find a way to get by without this hack in the future :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, pretty ugly hack 😢
We currently don't have a Python object such as the VersionManager at function level. Alternative hacks:

  • Use a dict to track them in the lambda service, but passing it down into the event managers is gonna be awkward
  • Add it to the version manager of a specific version (e.g., v1, $LATEST is annoying because then it needs to migrate)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

semver: patch Non-breaking changes which can be included in patch releases

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

X Tutup