The Case of the Free Products
This is a post-mortem of an issue that arose with a former client. The bug was not so much difficult to diagnose as it was comical.
Introduction
It all started one morning when we received a ticket from the client stating that some orders were not totaling correctly. Upon further investigation, we realized that some products were not being charged for at all. Customers were receiving them completely free. What made it worse was that these items ranged in price from $400 to well over $1,500. Naturally, we tried recreating the issue by adding the same products to our cart, but no matter what we did, we could not reproduce the error.
In the old version of the website, getting any prices or information for the store meant making API calls to the pricing software, Siriusware. This is quite literally ancient legacy software that slows to a crawl during peak business hours. To remedy this, we created a synchronization job that runs once a day during off-peak hours. It synchronizes everything, including products, prices, and availabilities, from Siriusware into our own database. What I have come to realize is that a lot of business logic is simply synchronizing data from one system to another.
This meant that the prices in our store lived in our database, even though the source of truth was Siriusware. The next step, then, was to verify the prices stored in the database. Were any of them incorrect or missing? To understand how prices are saved in the DB, I should explain that they are stored not per product but per variant, so the same product can have several prices. For example, the same pair of headphones can come in different colors, and each color could have a different price. What was silly was that, in rare cases, a variant could also be broken into variant items, and each variant item would have its own price.
For the most part, prices stay the same each day. However, there are times when a product will be on sale for a short period of time. To account for this, we store a price for each product, variant, and variant item for each day. Now, that might not seem so bad at first, but if you have 100 products, and each product has five variants, and each variant has another three variant items, and you save a whole year’s worth of prices, we are looking at a cool 100 × 5 × 3 × 365 = 547,500 entries in the DB. While roughly half a million entries isn't the worst, it is not the only table that needs to be accounted for. So, in order to reduce the size of the prices table, we would periodically prune prices from older days. This made sense because we don't need to check the prices of products from three weeks ago if we are making a purchase today, right? … RIGHT?
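To make the pruning concrete, here is a minimal sketch in Python. It is not the original code: the in-memory dictionary standing in for the prices table, the function names, the item IDs, and the 21-day retention window are all assumptions for illustration.

```python
from datetime import date, timedelta

# Hypothetical stand-in for the prices table:
# (variant_item_id, day) -> price in cents.
PriceTable = dict[tuple[str, date], int]


def build_prices(item_ids: list[str], start: date, days: int, price_cents: int) -> PriceTable:
    """Create one price row per item per day, starting at `start`."""
    return {
        (item_id, start + timedelta(days=offset)): price_cents
        for item_id in item_ids
        for offset in range(days)
    }


def prune_old_prices(prices: PriceTable, today: date, keep_days: int) -> PriceTable:
    """Drop every row dated more than `keep_days` days before `today`."""
    cutoff = today - timedelta(days=keep_days)
    return {key: price for key, price in prices.items() if key[1] >= cutoff}


today = date(2024, 3, 22)
# Two variant items with 30 days of history each (today included).
prices = build_prices(["headphones-red", "headphones-blue"], today - timedelta(days=29), 30, 40_000)
pruned = prune_old_prices(prices, today, keep_days=21)  # assumed retention window

print(len(prices), len(pruned))  # 60 44
```

After pruning, any row dated 22 or more days before today is simply gone, which looks harmless as long as nothing ever asks for an old date.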
Well, that is where we were wrong. When you add an item to the cart, the cart saves the date as well. This means that when we do the price lookup for an item in your cart, we request the price of the product, variant, or variant item for the date on which it was added. This was by design: the site administrators wanted users to be able to add products to their cart and have the price locked in until they made the payment. However, since we were pruning older prices, the lookup for an old enough cart date would fail.
Instead of throwing an exception when the price lookup fails, the code simply replaces the price with a null value, which conveniently gets converted to a big, fat zero. This meant that customers who left items in their cart for three weeks or more could automatically check out those items for free. And, of course, Siriusware, being the great software it is, has no guards in place to check whether the prices we send are in sync with the prices in its own system, and simply blindly approves whatever we send it.
Conclusion
Ultimately, the issue was fixed by falling back to the product's price for today whenever the price lookup fails for the date stored in the cart. I recall needing to update the pricing information in the cart logic. Unsurprisingly, although the fix was seemingly simple, the checkout and cart flow was quite difficult to follow. Regardless, we got the job done.
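In spirit, the fix looked something like the sketch below. Again, this is an illustration in Python rather than the real code: the names and data are hypothetical, and raising an exception when even today's price is missing is my own assumption about what a safer version would do, not something the original fix necessarily did.

```python
from datetime import date

# Hypothetical prices table after pruning: only today's row remains.
prices = {("headphones-red", date(2024, 3, 22)): 40_000}


class PriceNotFound(Exception):
    """Raised when no price exists for the cart date or for today."""


def lookup_price(item_id: str, added_on: date, today: date) -> int:
    """Prefer the price from the day the item was added; fall back to today's."""
    price = prices.get((item_id, added_on))
    if price is None:
        # The row for the cart date was pruned, so use the current price.
        price = prices.get((item_id, today))
    if price is None:
        # Assumption: fail loudly instead of letting a null become zero.
        raise PriceNotFound(f"no price for {item_id} on {added_on} or {today}")
    return price


today = date(2024, 3, 22)
print(lookup_price("headphones-red", date(2024, 2, 28), today))  # 40000
```

The trade-off is that the locked-in price only holds for as long as the old row survives pruning; after that, the customer pays today's price.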
Perhaps I am wiser now, but at the time, I do not recall being intimidated or scared, even though I was updating the checkout logic, which could have adversely affected every purchase made on the site. While this was technically a small change, in hindsight, it could easily have caused a much larger problem if done incorrectly. In addition, I think I was always a little trigger-happy when it came to deploying to production. You could say that I was younger and less experienced. There is nothing quite like the boldness of a junior developer.