Spawn a simple Document Management System at your own cost

Photo by Cytonn Photography on Unsplash

Recently I started working as a contract developer. It’s a challenge to take care of your actual work along with personal marketing, bookkeeping, social insurance, reports, and keeping all your essential papers in order for legal compliance.

Today, I want to tell you about a project where I tried to tackle the problem of keeping essential documents (for instance, invoices) organized in durable, secure, and inexpensive storage.

That sounds trivial at first glance. There are plenty of places to put them, like Google Drive, Dropbox, iCloud, or even directly in Gmail :) But I wanted something that would organize my documents for me. For example, group them on a monthly or quarterly basis and delete them automatically as soon as I don’t need them anymore.

A typical scenario could be:

  • A supplier sends an invoice to my regular email address, and I forward that email, with the invoice as an attachment or in the mail body, to a dedicated address like invoice@invoices.mydomain.com
  • I give suppliers a dedicated address for invoices, like invoice@invoices.mydomain.com, so they arrive there directly
  • I pay for something in a shop, scan the invoice (I enjoy using this app for that), and forward it to invoice@invoices.mydomain.com

In the end, I want to see all my invoices organized in one place.

When it comes to secure and durable storage, I automatically think of AWS S3. It is designed for 99.999999999% (eleven nines) durability for the Standard storage class. AWS achieves that by replicating data across a minimum of 3 different Availability Zones. On top of that, you can set up Cross-Region Replication for your objects. In other words, your file will be stored in 3 different data centers in, for example, Ireland, and also somewhere in … Sydney, to be on the safe side :)

What sold me is the ability to delete an object after a particular time (in my case, 10 years) using Object Lifecycle Management. It also lets me change the storage class after a certain period (for example, after 6 years, when I most likely won’t need those invoices but am still obliged to keep them) to S3 Standard-IA (Infrequent Access). This storage class comes with some constraints, such as a minimum storage duration and retrieval fees, but costs roughly half as much as Standard.
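As a rough illustration, the lifecycle policy described above could be expressed as the dictionary that boto3's `put_bucket_lifecycle_configuration` expects. This is a minimal sketch, not the configuration from my project: the `invoices/` prefix, the rule ID, the bucket name in the comment, and the simplification of a year to 365 days are all assumptions made for the example.

```python
def build_lifecycle_rule(prefix="invoices/", ia_after_years=6, delete_after_years=10):
    """Build one S3 lifecycle rule: move objects to Standard-IA, then expire them."""
    days_per_year = 365  # simplification: leap days are ignored
    return {
        "ID": "archive-then-expire-invoices",  # hypothetical rule name
        "Filter": {"Prefix": prefix},          # hypothetical key prefix
        "Status": "Enabled",
        "Transitions": [
            {
                "Days": ia_after_years * days_per_year,
                "StorageClass": "STANDARD_IA",
            }
        ],
        "Expiration": {"Days": delete_after_years * days_per_year},
    }


lifecycle_configuration = {"Rules": [build_lifecycle_rule()]}

# Applying it would look roughly like this (needs boto3 and AWS credentials):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-invoices-bucket",  # hypothetical bucket name
#       LifecycleConfiguration=lifecycle_configuration,
#   )
```

In a real setup the same rule would more likely live in the CloudFormation template next to the bucket definition, so that it is versioned with the rest of the infrastructure.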

To organize my documents, I need to program my “business logic”. I’m still not sure what the best way of doing it would be, but so far, I’ve decided to keep invoices sorted by year and month.
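To make the year/month idea concrete, here is a small sketch of how an S3 object key could be derived from the date an invoice was received. The function name `invoice_key`, the `invoices` prefix, and the exact key layout are assumptions for illustration; the layout in the actual project may differ.

```python
from datetime import date
import posixpath


def invoice_key(filename, received_on, prefix="invoices"):
    """Return an S3 key of the form <prefix>/<YYYY>/<MM>/<filename>."""
    # Keep only the last path component so that a file name taken from an
    # email cannot introduce extra "folders" into the key.
    safe_name = posixpath.basename(filename.replace("\\", "/")) or "unnamed"
    return f"{prefix}/{received_on:%Y}/{received_on:%m}/{safe_name}"


example_key = invoice_key("acme-invoice.pdf", date(2019, 3, 14))
print(example_key)
```

Because S3 lists keys by prefix, a layout like this makes it cheap to list everything for a given year or month, and switching to a quarterly grouping would only mean changing how the key is built.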

To do that, I use AWS Lambda (since I want to pay for compute power only when I use it) with AWS Simple Email Service (SES) as a trigger.

I probably wouldn’t pick SES as a tool for sending out marketing emails, even though AWS promotes that feature. But it’s a lovely service for intercepting, filtering, and reacting to emails.

For my setup, I needed to verify my custom domain and set up MX records to point to AWS. I needed to have the following DNS records:

The rest of the setup can be automated with a CloudFormation stack.

There are many ways of describing your AWS infrastructure. You can go with tools like Terraform, Heat, or CloudFormation. I prefer the last one, but not as plain JSON or YAML. It’s much more readable and easier to maintain when you use an abstraction layer on top, like Serverless Framework, Troposphere, AWS SAM, etc.

I decided to go with AWS SAM. You can check the project on GitHub here.

In the end, the project infrastructure looks like this:

SES configuration. There are multiple ways to configure SES email receiving.

I created 2 receipt rules. In the first one, the incoming email, attachments included, is stored in a temporary S3 bucket, and in the second, my Lambda function is invoked.

Another option would be to create one rule with several actions inside. Separate rules make more sense when you want different behavior depending on the recipient email, since recipient filters are configured at the rule level.
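The Lambda side of this can be sketched as follows. It is a simplified illustration under a few assumptions, not the code from the repository: it assumes SES has already written the raw MIME message to the temporary bucket under its message ID (which is how the SES S3 action names objects), and the helper names `message_id_from_ses_event` and `extract_attachments` are made up for this example. Downloading the raw message with boto3 is left out so that the snippet runs without AWS access.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def message_id_from_ses_event(event):
    """Read the SES message ID from the event that SES passes to Lambda."""
    return event["Records"][0]["ses"]["mail"]["messageId"]


def extract_attachments(raw_email):
    """Return (filename, content_bytes) for each named attachment in a raw email."""
    message = message_from_bytes(raw_email, policy=policy.default)
    attachments = []
    for part in message.iter_attachments():
        filename = part.get_filename()
        content = part.get_payload(decode=True)  # None for nested messages
        if filename and content is not None:
            attachments.append((filename, content))
    return attachments


# Local demo with a hand-built email instead of a real S3 download.
sample = EmailMessage()
sample["From"] = "supplier@example.com"
sample["To"] = "invoice@invoices.mydomain.com"
sample["Subject"] = "Invoice 42"
sample.set_content("Please find the invoice attached.")
sample.add_attachment(
    b"%PDF-1.4 demo",
    maintype="application",
    subtype="pdf",
    filename="invoice-42.pdf",
)

sample_event = {"Records": [{"ses": {"mail": {"messageId": "abc123"}}}]}
found = extract_attachments(sample.as_bytes())
print(message_id_from_ses_event(sample_event), [name for name, _ in found])
```

In the real function, the message ID would be used to fetch the raw email from the temporary bucket, and each extracted attachment would then be written to the long-term bucket.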

I created a small Document Management System (DMS) that I actively use, and building it took a reasonable amount of effort. It’s also relatively easy to spawn the same infrastructure yourself. You’ll need to fork the repo and follow the setup instructions.

Last year (2018), AWS launched the Serverless Application Repository (SAR). It’s a repository for CloudFormation stacks. As with any package registry (npm, Packagist, PyPI, etc.), you can easily reuse a CloudFormation stack from someone else or publish your own. That’s what I wanted to do. Unfortunately, AWS doesn’t allow every CloudFormation resource type in that repository. For instance, SES receipt rule sets are not permitted. I hope they’ll add them soon, and then it will be possible to install my DMS from AWS SAR easily.

Please feel free to report issues, contribute to the project, or just use it. I’ll be glad to answer your questions in the comments.

Software engineer from Hamburg (Germany) www.zeleniuk.com