Asynchronous Synchronization For Decoupling

You, development

Problem Statement

Beanworks builds Accounts Payable automation software that integrates with various ERPs. A major part of any integration software is data synchronization. At Beanworks, it's a one-way content synchronization in which the destination always remains the source of truth. Beanworks imports lists (e.g. vendors, GL accounts) and supported types of documents from the ERP (e.g. purchase orders or payments created in the ERP), and exports supported types of documents to the ERP (e.g. fully approved POs, invoices, or payments created in Beanworks).

As the number of customers grew and each customer got bigger, we started to notice a scaling problem with the existing export (data sync to ERPs) workflow. Export requests were handled synchronously, which meant the client had to wait for the server's response until the export was complete. There were 3 main problems with this approach:

The synchronous API blocked the client until it received a response from the server. As a result, the user waited an indeterminate amount of time until every selected invoice was posted to the ERP. This greatly disrupted the user's workflow, as they could not navigate away from the page.

The blocking operation is not as bad if the response time is somewhat reasonable. However, we started running into browser timeouts as responses took too long, especially for large-volume customers. The affected customers were exporting huge batches of invoices at a time (up to a couple hundred invoices). Export was coupled with multiple expensive processing steps (serialization, PDF/CSV generation, payment matching, etc.), which greatly increased the execution time.

Most ERPs allow only a limited number of concurrent connections at a time. To avoid hitting the API request limits set by different ERPs, we had introduced a Lock table in the database that keeps track of locks created during sync operations; by default, a lock expires in an hour. Any subsequent sync request made during this time would encounter the lock and throw an exception, alerting the customer to try again later. This placed the burden of retrying in the hands of customers.
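
To make the locking behaviour concrete, here is a minimal sketch of an expiring lock table. It is not the actual Beanworks schema: the table name `sync_lock`, the column names, the `SyncLockedError` exception and the use of SQLite are all assumptions made for illustration. Only the one-hour default expiry and the "throw an exception on a second request" behaviour come from the description above.

```python
import sqlite3
import time

LOCK_TTL_SECONDS = 3600  # locks default to expiring in an hour


class SyncLockedError(Exception):
    """Raised when a sync is already in progress for the same resource."""


def create_lock_table(conn):
    # Hypothetical schema: one row per locked resource (e.g. an ERP connection).
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sync_lock ("
        " resource TEXT PRIMARY KEY,"
        " expires_at REAL NOT NULL)"
    )


def acquire_lock(conn, resource, now=None, ttl=LOCK_TTL_SECONDS):
    """Take the lock for `resource`, or raise SyncLockedError if it is held."""
    now = time.time() if now is None else now
    with conn:  # single transaction: drop an expired lock, then try to insert
        conn.execute(
            "DELETE FROM sync_lock WHERE resource = ? AND expires_at <= ?",
            (resource, now),
        )
        try:
            conn.execute(
                "INSERT INTO sync_lock (resource, expires_at) VALUES (?, ?)",
                (resource, now + ttl),
            )
        except sqlite3.IntegrityError:
            # The primary key already exists, so an unexpired lock is held.
            raise SyncLockedError(
                f"sync already running for {resource}; try again later"
            ) from None


def release_lock(conn, resource):
    """Remove the lock once the sync has finished."""
    with conn:
        conn.execute("DELETE FROM sync_lock WHERE resource = ?", (resource,))
```

With this shape, a second export request for the same resource fails immediately instead of waiting, which is exactly what pushes the retry onto the customer.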

Solution - Asynchronous syncs using AWS SQS

The team quickly agreed to use message queues to process sync operations asynchronously. RabbitMQ and AWS SQS were the top contenders, and we decided to go with AWS SQS [1].
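
The general shape of the asynchronous approach can be sketched as follows. This is not the Beanworks implementation: to keep the example self-contained, Python's in-process `queue.Queue` stands in for an SQS queue, and `request_export`, `export_to_erp`, `worker` and the `job_status` dictionary are hypothetical names. With real SQS, the producer would send a message to the queue and a separate worker process would receive and delete messages instead.

```python
import queue
import threading
import uuid

export_queue = queue.Queue()    # in-process stand-in for an SQS queue
job_status = {}                 # hypothetical status store the client can poll
status_lock = threading.Lock()


def request_export(invoice_ids):
    """Handle the API request: enqueue the job and return right away."""
    job_id = str(uuid.uuid4())
    with status_lock:
        job_status[job_id] = "queued"
    export_queue.put({"job_id": job_id, "invoice_ids": list(invoice_ids)})
    return job_id


def export_to_erp(invoice_ids):
    """Placeholder for the expensive work (serialization, PDF/CSV generation, ...)."""
    return len(invoice_ids)


def worker():
    """Background consumer: process one message at a time until told to stop."""
    while True:
        message = export_queue.get()
        if message is None:     # sentinel used to shut the worker down
            export_queue.task_done()
            break
        try:
            export_to_erp(message["invoice_ids"])
            result = "done"
        except Exception:
            result = "failed"
        with status_lock:
            job_status[message["job_id"]] = result
        export_queue.task_done()
```

The request handler only records the job and returns an identifier, so the client is never blocked on the export itself. Processing messages one at a time in the worker is also one way to stay under an ERP's concurrent connection limit without asking the customer to retry.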

Challenges

There were a number of challenges in implementing this change.

Result

Asynchronous syncs greatly improved the user experience. We purposely did not set up any benchmark for this refactor, as it's not meaningful to measure the difference in performance. After all, making import and export operations asynchronous does not improve performance by itself; it simply provides a mechanism that lets us defer the operation and carry it out in the background. It offers a completely different (and better) user experience in that we free up the client as soon as the request is made, and the results of syncs are shared asynchronously. The client now receives the response almost instantaneously!


  1. Discussed in more detail in this post!
© Hannah Kim.