Tag AWS

Account deletion @ bkmrkd.it

I’ve been struggling with the account deletion feature at bkmrkd.it, specifically for users with large numbers of bookmarks. The Lambda execution time limit makes it trickier than a normal program: you have to limit the amount of work per invocation. So the simple method I tried first was to paginate through the list of bookmarks (like browsing through them) and delete them in batches, using SQS to trigger each batch. While testing this I ran into another “hidden” feature I didn’t know about: recursive loop detection and automatic remediation.

At first I thought I had missed something in my code, as the deletion process didn’t finish as expected but stopped halfway. It all became clear the next day when I received an email notification from AWS:

Hello,

AWS Lambda has detected that one or more Lambda functions in your AWS Account: ########## are being invoked in a recursive loop with other AWS resources. To prevent unexpected charges from being billed to your AWS account, Lambda has stopped the recursive invocations listed in the ‘Affected resources’ tab of your AWS Health Dashboard.

This scared me at first, but my trusty AI companion walked me through some of the options. The difficulty is that, unlike relational databases, there is no way to do a “DELETE FROM TABLE WHERE COLUMN = X”; you have to do a scan or query and then iterate through the result set, which is in itself time-consuming on large datasets. As I stored two records for each bookmark, one for quick lookup and one with all the details, this made it increasingly difficult. So I rethought the data model, which resulted in a single record per bookmark.

The records now all have the same PK, which makes deleting everything from the same account easier. Another design choice was to move the bulk deletion out of Lambda and process it locally. I now only delete the account details directly and leave the bulk deletion for a later moment outside of Lambda.
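A minimal sketch of what that locally-run bulk deletion can look like, assuming a boto3 DynamoDB Table resource and illustrative PK/SK attribute names (the real key schema may differ):

```python
def delete_account_items(table, account_pk):
    """Delete every record that shares one account's partition key.

    `table` is expected to behave like a boto3 DynamoDB Table resource.
    Because all of an account's records now share the same PK, a single
    Query (paginated via LastEvaluatedKey) finds everything, and
    batch_writer handles the batched deletes. Returns the item count.
    """
    deleted = 0
    kwargs = {
        "KeyConditionExpression": "PK = :pk",
        "ExpressionAttributeValues": {":pk": account_pk},
        "ProjectionExpression": "PK, SK",  # keys only; cheaper reads
    }
    while True:
        page = table.query(**kwargs)
        with table.batch_writer() as batch:
            for item in page["Items"]:
                batch.delete_item(Key={"PK": item["PK"], "SK": item["SK"]})
                deleted += 1
        last = page.get("LastEvaluatedKey")
        if not last:
            return deleted
        kwargs["ExclusiveStartKey"] = last
```

Run locally there is no invocation time limit, so the loop can simply keep paginating until the account is empty.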

By the way, the export functionality was quite the opposite: easy, without significant problems. It is basically the same idea because of the time limit: I chunk through all the bookmarks outside the main program, build a set of JSON files which I zip together at the end, and send an email with a download link. Works like a charm.
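The chunk-and-zip part of that export can be sketched with the standard library alone; the function and file names below are illustrative, not the production values:

```python
import json
import zipfile

def export_bookmarks(bookmarks, zip_path, chunk_size=500):
    """Write bookmarks to numbered JSON files inside one zip archive.

    Chunks through the bookmark list, serialises each chunk to its own
    JSON file, and bundles everything into a single zip for download.
    Returns the path of the finished archive.
    """
    bookmarks = list(bookmarks)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for n, start in enumerate(range(0, len(bookmarks), chunk_size), 1):
            chunk = bookmarks[start:start + chunk_size]
            archive.writestr(f"bookmarks-{n:04d}.json",
                             json.dumps(chunk, indent=2))
    return zip_path
```

The remaining steps (fetching the bookmarks page by page, uploading the zip, mailing the link) wrap around this core.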

Restarting bkmrkd.it

It is finally in a state that is usable and presentable; I’ve been working on it on and off in my spare time since the end of summer last year. I wanted to learn and experience the benefits of using a serverless platform. The simplest way to start that journey was with Chalice, a ready-made framework for developing on AWS Lambda.

The project I wanted to reboot was my bookmarking website bkmrkd.it, which I developed some time ago using the PHP Laravel framework but never quite finished. It was usable for me, but I never completed user registration and other features that were still on the todo list.

The current release is working, but not completely finished or polished. All the information is there and the basics work: you can register and add, edit and delete bookmarks with tags. It is even possible to import bookmark files from your browser to give you a head start. There might be some minor bugs, and I’m still testing and debugging, but it is good enough to release publicly.

Still to do:

  • export functionality (will time out because of the time limit for Lambda functions; need to implement a background process for this)
  • deleting user accounts (same problem: the time limit, if you have a large number of bookmarks)

Chalice development and deployments

I’ve been playing with AWS Chalice, rebuilding a bookmarking web app I had earlier written in PHP Laravel. I wanted to learn serverless development, and this looked like the simplest route to get started using a language I already know: Python. I also threw in some LLM support, using the free version of ChatGPT as an assistant in learning the new environment.

Developing and running it locally was easy: there is a Docker image available for running your own DynamoDB as a backend database, and installing Chalice is simple. The hard part, once development was finished, was getting it running on the actual AWS infrastructure. It all looked fairly easy, but then the errors started appearing. I’m documenting my findings here so others will find the working solution more easily than I did.

The first error I ran into is that, because I develop on an M1 Mac, everything was based on the ARM architecture, and there is no feature yet in Chalice to deploy to the ARM AWS infrastructure; it just defaults everything to X86. There is a feature request to handle this, but it looks like development on Chalice has slowed down. My quick thinking was that my Mac supports X86 with Rosetta 2, so I would just need to spin up an X86 Ubuntu VM to build the Chalice package. Next issue: Multipass does not support this yet. Luckily I found OrbStack, where the architecture of your VM is selectable; the downside is that you can’t set the amount of CPU or memory per VM. It does support cloud-init, just like Multipass, so my setup scripts are re-usable.

The second error: permissions. Chalice, by default, creates the roles it needs to execute the Lambda functions. However, it leaves out the access rights for the DynamoDB table I needed. ChatGPT went off the rails here, proposing a different suggestion each time I responded that the previous one didn’t work. First I had to include a policy.json in the root of the project; the next solution was to include the policy in the config.json, which also didn’t work. A classic search showed me this and this article, which showed me how it is done: first, in your config.json, add "autogen_policy": false in the required stage, and then create a policy-stage.json with the details in the .chalice directory. Later I found the official documentation, which described the same solution.
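For reference, a minimal sketch of what such a policy-stage.json could contain (table name, region and action list are illustrative, so check them against your own setup; note that with "autogen_policy": false you also have to grant the CloudWatch Logs permissions Chalice would otherwise generate for you):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:DeleteItem",
        "dynamodb:Query",
        "dynamodb:BatchWriteItem"
      ],
      "Resource": "arn:aws:dynamodb:eu-west-1:*:table/bookmarks"
    }
  ]
}
```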

A related problem was using different AWS access keys for each stage of development, i.e. different identities for dev, acceptance and production. Again ChatGPT gave different answers depending on the question, but eventually I found a solution that worked. You store your credentials in ~/.aws, where you have two files: config for your region and credentials for your access and secret key combos. For credentials it is straightforward: just use [stage] to define the applicable stage. But config is different, and there you should use [profile stage]. See the following example for credentials:

[default]
aws_access_key_id = yourkey
aws_secret_access_key = yoursecret
[acceptance]
aws_access_key_id = yourkey
aws_secret_access_key = yoursecret

and this for config:

[default]
region = eu-west-1
[profile acceptance]
region = eu-west-1

The third and last error: packaging static files. As I’m using Jinja templates to create the webpages in Python, they need to be included in the packaging for deployment. Again ChatGPT gave non-working solutions, and the web had no definite answer either. Some indicate using the vendor directory, where you can store specific non-standard Python packages; others specify using the chalicelib directory. The official documentation is not quite clear on the route. I’m currently using the vendor route but will try the chalicelib option as well.