Earlier this week, I “completed” a very long journey. I will make no representation to be an “expert” in the subjects I am going to touch on here in the blog. This is just my experience — hopefully formatted in a way that is of some use to other novices trying to explore these concepts.
I have spent nearly 80 days in a row, first retraining in these technologies, then building out a working REST API that ran on AWS's serverless platform, and finally scrapping that work in favor of Node/Express. Along the way, I had many dozens of moments of despair (“maybe I can't make this work after all”), but ultimately emerged with a major success. This was/is the hardest project I have taken on in my 25-year career.
Begin with training
Don't waste a perfectly good pandemic
As I related to my good friend and backer, Keith Garcia, the plan that I came up with in mid-March was to take the training, and then develop a series of projects. The economy had just crashed, business activity was near zero, and I had a rare chance to study and focus.
One truth for all
All three of these platforms had independent tables for customers and for user accounts meaning, for example, a client had to be replicated to (a) quote/sell in stockd and (b) bill/receive payment in accountd. This overlap was the goal of the first project: to move the entire interface for customers (and, due to the same issues of commonality, users) into a shared information space. This was a perfect candidate for an external database that would be accessed by a REST API. There was a need to have a “single source of truth” about a customer or a user, so a change made from one platform would be seen by the others.
DynamoDon't — a.k.a. death by database
I spent a long time fighting to understand the nuances of DynamoDB, including its enormous list of reserved keywords, and how to correct syntax errors that reported little more than that a failure had occurred. At one point, I spent an enormous amount of time trying to get a simple update to work that would have taken moments to construct in an SQL database.
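To make the reserved-keyword problem concrete, here is a minimal sketch of the workaround. The table and attribute names are my own illustrations, not the ones from this project: `status` and `name` are both DynamoDB reserved words, so the update expression has to alias them through `ExpressionAttributeNames`.

```javascript
// Hypothetical params for a DocumentClient update. "status" and "name"
// are DynamoDB reserved words, so they must be aliased with #-placeholders;
// values are always bound through :-placeholders.
function buildUpdateParams(companyId, accountNo, status, name) {
  return {
    TableName: 'customers', // assumed table name
    Key: { companyId, accountNo },
    UpdateExpression: 'SET #st = :st, #nm = :nm',
    ExpressionAttributeNames: { '#st': 'status', '#nm': 'name' },
    ExpressionAttributeValues: { ':st': status, ':nm': name },
  };
}

// These params would then be handed to the AWS SDK, e.g.
// docClient.send(new UpdateCommand(params)).
const params = buildUpdateParams('companyA', '1001', 'active', 'Acme Corp');
```

Without the `#st`/`#nm` aliases, the same expression written as `SET status = :st, name = :nm` fails with exactly the kind of opaque error described above.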
My advice is this:
- If you're doing any kind of update, use a POST method and a JSON-formatted request.
- Test it first with Postman or a similar application.
- Before writing application code, document your API.
- Only then write your application code.
This way you separate the infuriating experience of getting your query to work (routes and methods) from the implementation of your application's logic.
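As a sketch of what “document your API first” can look like in practice, here is a tiny, hypothetical route table; the paths and fields are invented for illustration. Writing something like this down first gives you exactly what you need to exercise in Postman before any application logic exists.

```javascript
// Hypothetical API documentation as data: what each route accepts,
// written down before any implementation. All names are illustrative.
const apiSpec = [
  { method: 'POST', path: '/customers/:companyId/:accountNo',
    body: 'JSON object of customer fields to update' },
  { method: 'GET',  path: '/customers/:companyId/:accountNo', body: null },
  { method: 'GET',  path: '/customers/:companyId',            body: null },
];

// Per the advice above, every mutating route is a POST with a JSON body.
const mutating = apiSpec.filter((r) => r.method === 'POST');
```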
Is it cold-start or what???
I was able to make a functional API for customer management using AWS Lambda and AWS API Gateway.4 It worked, but there was a noticeable lag. I had already experienced this same phenomenon with the excellent api2pdf.com5 which suffers a fairly significant lag when generating PDFs.
After a while, I just could not tolerate the latency any longer, and the API Gateway was scrapped.
Having decided to capitalize on my training in Node.js and Express.js, I went to install them on our AWS server and completely hit the wall. At the time of writing, I was attempting to install Node 12 according to instructions for CentOS 7, but the build kept crashing. I found very little useful information, but I think what was happening was that the build was specifying a specific (and older) version of libstdc++.so.6, and because ours is newer, the install failed every time. I had to revert to Node.js 10, but everything worked after that.
I was able to keep most of the internal code from the AWS Lambda scripts, and at present have three routers running APIs served by Express.js. Once this was deployed, the performance was dramatically better than AWS serverless — zero lag and beautiful execution.
So...why bother with all this?
Ok, going back to that point of intersection. We'd determined that we needed to move “Customers” and “Users” off these local platforms and onto a global service. What we ended up doing was slightly different in each case, so let's begin with customers.
We created a DynamoDB table with a partition key for the company and a sort key for the account number. This allowed us to safely commingle clients from multiple companies in a single table-space, because the company ID acts as a separator. This also allows N > 1 companies to use the same account number, because the unique “record index” is the combination of the company ID and account number together.6
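A sketch of that key design, with invented names: the partition key separates companies, the (partition, sort) pair is the unique record index, and querying one partition returns one company's customers.

```javascript
// Composite key: companyId is the partition key, accountNo the sort key.
const keyFor = (companyId, accountNo) => ({ companyId, accountNo });

// Hypothetical query params: all customers in one company's partition.
function customersByCompany(companyId) {
  return {
    TableName: 'customers', // assumed table name
    KeyConditionExpression: 'companyId = :c',
    ExpressionAttributeValues: { ':c': companyId },
  };
}

// Same account number under two companies: two distinct records,
// because the full key is the combination of both attributes.
const a = keyFor('companyA', '1001');
const b = keyFor('companyB', '1001');
```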
To migrate customers, we followed this process:
- We copied all existing customer records into DynamoDB. Where a client existed in that overlap space, we picked one customer ID, and made a patch file for all of the databases to change any internal pointers to the one “true” customer ID in the new space.
- Internal to the new customer records, we created spaces for contacts, locations, and other settings.7
- Once the clients were all loaded, we patched all customer references in all platforms to refer to the one “true” customer ID.
- All existing customer logic was rebuilt to use the REST APIs and get customer information from the new server.
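The de-duplication step in that list can be pictured as a simple mapping from every legacy, per-platform ID to the one “true” ID; the patch files were generated from a table conceptually like this one (the IDs here are invented).

```javascript
// Hypothetical map: each legacy customer ID that referred to the same real
// client points at the single canonical ID chosen in the new shared space.
const canonical = {
  'stockd:CUST-17':   'companyA#1001',
  'accountd:AC-0042': 'companyA#1001',
};

// Rewrite one internal pointer; IDs with no duplicate pass through unchanged.
function canonicalId(legacyId) {
  return canonical[legacyId] ?? legacyId;
}
```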
Now, this was not a small job, because in our systems the customer is a critical data entity, and we had to make hundreds of changes. If you're contemplating doing something like this, it's totally worth it, but you're in for a lot of work. That said, the REST API logic is universal, so we can reuse it on every platform that needs customer information.
Users are different, and the table has to be structured differently
The reason is that we wanted to support a single user with a potentially variable degree of access depending on the platform, and also, we have accountd which runs accounting on multiple companies in one toolset.8 We needed to define a group of users (the partition), the users inside that partition, the companies each user has access to, and the privileges accorded to each user. This table goes down five levels, but, thanks to DynamoDB, it's fast and easily handled.
We declare a user inside the partition and locate their record to determine anything the application logic needs. Again, the REST API logic is universal, so we can reuse it on every platform that needs user information.
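The nesting described above can be sketched like this (all names invented): the group is the partition, and walking down through user, company, and privileges answers the question the application logic actually asks.

```javascript
// Illustrative record shape: group (partition) > user > company > privileges.
const userGroup = {
  groupId: 'groupA', // partition key (assumed name)
  users: {
    jdoe: {
      companies: {
        companyA: { privileges: ['quote', 'sell'] },
        companyB: { privileges: ['view'] },
      },
    },
  },
};

// What may this user do in this company? Unknown users or companies
// simply yield no privileges.
function privilegesFor(group, userId, companyId) {
  return group.users?.[userId]?.companies?.[companyId]?.privileges ?? [];
}
```

This is what lets one user carry a different degree of access on each platform and in each company, from a single record.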
To migrate users, we followed this process:
- We copied all existing user records into DynamoDB. Where a user existed in that overlap space, we picked one user ID, and made a patch file for all of the databases to change any internal pointers to the one “true” user ID in the new space.
- Internal to the new user records, we created spaces for platform-specific settings, as well as the standard user metadata.7
- Once the users were all loaded, we patched all user references in all platforms to refer to the one “true” user ID.
- All existing user logic was rebuilt to use the REST APIs and get user information from the new server.
The truth is single-sourced
At this point, the primary goals of this epic project were accomplished. We've since made some other useful adaptations which I will blog about later. This is a good place to wrap a long piece with a few take-away points.
First, this was not easy, but it's worth it. Our clients have one true customer record (and one true user as well). The information is always synchronized, because it lives outside the application services.
Second, the “rules” only have to be changed in one place. If anything is modified in these structures, everyone gets the change.9
Third, by virtue of well-defined REST APIs and services, we can add another platform and “bolt on” these services very quickly.
This was an enormous project, but necessary to prepare us for the customer growth we are looking to achieve in the future. Hopefully, some of this is useful to you as well.
- I ended up using DynamoDB as the backend for this project, as is discussed later in this blog.
- Third was the utterly brilliant Mastering React which has no impact on this project.
- This is very much in use, but only internally. We have not made a public v1.0 release yet.
- Here is a really excellent startup/tutorial article.
- We still use it, despite the lag, because it works very well otherwise, and when we ran wkhtmltopdf natively, it was disastrously erratic. These guys have tamed it, it works well, and the service cost is very reasonable.
- Uniqueness is guaranteed across the combined value. Some clients have a lot of accounts under their company ID.
- Maps where the attribute was a sub-array, or lists where there was no sub-array. Lists are much easier to work with due to the ability to append them.
- This is really useful when you run N > 1 companies and move funds between them. You can do things like post a journal entry taking money out of company A (loan to B), and another in company B showing a loan received from A.
- This is a separation between the rules as they apply to a client (global) and how that client is involved in a transaction (local).