Latest Posts
Automating 24/7 Monitoring for a Weekend
This is the story of how I was tasked with monitoring our website all weekend but wrote a script to do it instead. I did not realize it at the time, but one Friday morning, a storm was brewing. We received an innocuous ticket from a client reporting that orders appeared to be stuck. At that point, the ticket was assigned to my colleague, and I didn't poke around much further.

Introduction

At around 3:00 p.m., however, the ticket was transferred to me, as my colleague was planning on logging off early. Naturally, the ticket was to be prioritized, but there was no cause for alarm. During the knowledge transfer, we noticed that the scheduled job that was supposed to be processing orders appeared to be stuck. A job that usually takes only a few seconds had been running for over an hour and still had not completed. Checking the queue, we saw an abnormal number of orders. What we later found out was that the client was running a special promotion where certain products were free with the sign-up of a subscription, which resulted in an absurd number of new orders coming in. I double-checked that orders were still being processed, albeit a little slowly. I still wasn't sure there was a bigger problem at hand, and besides, a senior dev was already looking into the root cause.

One particularly unfortunate behavior was that when the scheduled job ran, it would first grab all the orders and then process them first in, last out (FILO), effectively treating the queue as a stack. Orders added after the scheduled job started would not be processed until the job ran again. Since thousands of orders were coming in, some unlucky orders had to wait longer and longer, especially because the scheduled job had been restarted several times (we were unsure whether it was stuck or not).

Come 5:00 p.m., I sync with the senior developer along with the project manager to ensure that we are all on the same page. The job is running and processing as expected, but it is a little slow.
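The FILO draining described above can be sketched roughly like this (illustrative code only, not the actual job; the snapshot-and-batch structure here is an assumption to show why old orders starve):

```javascript
// Illustrative sketch: the scheduled job snapshots the queue, then drains
// it from the newest end first, i.e. first in, last out.
function processBatchFILO(queue, batchSize) {
  // Snapshot the queue: orders that arrive after this point are invisible
  // to this run and must wait for the next one.
  const snapshot = [...queue];
  const processed = [];
  // Pop from the end first, so the oldest orders are handled last.
  while (snapshot.length > 0 && processed.length < batchSize) {
    processed.push(snapshot.pop());
  }
  // Whatever we did not get to stays queued, with the oldest orders
  // still stuck at the front.
  return { processed, remaining: snapshot };
}
```

With new orders constantly arriving and the job being restarted, the earliest orders keep losing their turn; swapping `snapshot.pop()` for `snapshot.shift()` is the FIFO behavior we later wanted.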
Doing the math, it will take over 1.5 days to finish processing, even if no more new orders come in. We are all on the same page, and the PM lets us know that he will talk to the client and share what we have found. The senior developer works in a different timezone, so his shift had actually ended more than four hours earlier. With that in mind, I told him to log off and reiterated that if there were any further problems, I would handle them. Of course, at this point there was nothing to do but wait, so I monitored for maybe another 10 minutes before stepping away from my computer to take a break.

I had also made dinner plans with my significant other, and at around 6:00 p.m., I began getting ready to leave. I reminded myself to check my messages before leaving, but I was not too worried. Besides, I had my phone with me, so they could always reach me that way.

Where the Trouble Begins

I picked up my significant other and was on my way to the restaurant when I got a message from the Big Boss (my boss's boss). He was trying to figure out what was going on and put out a fire that had been raging. I realized I also had a missed call. I was still driving at this point but pulled into a parking lot to see what was going on, and ended up calling the Big Boss back. It turns out that after I had left, the PM was not able to explain to the client what had happened or provide assurance that orders were still being processed. I realized at this point that I had forgotten to check my computer before leaving for dinner. Once home, I jumped on a call with the senior developer and the Big Boss and saw that I had missed about a dozen messages. I was told that the client was having a meltdown and was not convinced that orders were going through.
They even suggested manually processing the entire backlog themselves (which would have taken more than a week and been prone to errors). The Big Boss was barely able to talk the client down and assured them that we would handle the situation. We again confirmed that the orders were being processed, even if slowly. To keep the client happy, the Big Boss told me that we would have to take turns monitoring the orders to make sure they were still being processed, giving status updates every hour until the backlog was completely cleared. Long term, we needed to speed up order processing and ensure the queue was handled FIFO instead of FILO. We had never had an issue before because the client would have at most 10 orders in a day; this particular lucky Friday, we had received well into the thousands. In the short term, however, this meant taking turns staying up all night to ensure that the client was happy. At this point, I was just glad I was not getting fired (as it was mostly my fault that the issue escalated this far), so I offered to take the graveyard shift.

The Automation

In the middle of the night, I realized that it was somewhat stupid to stay up all night just to count the remaining orders by hand. I noticed that logging in to the back office/admin section of the website only required basic auth. Naturally, I started working on a Puppeteer script that logs in, goes to the right page/tab, counts how many orders are remaining, and logs the number to a Google Sheets document. From there, I used the timestamp and order count to graph the progress in a Sheets chart. To automate the script runs, I created a Windows scheduled task that runs a batch script, which in turn runs Puppeteer. At the end of the day, I probably should have just checked my messages before I left for dinner.
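For the curious, here is a minimal sketch of what such a Puppeteer script could look like. The URL, CSS selector, badge format, and environment variable names are all illustrative assumptions, not the real admin page:

```javascript
// Sketch of the order-counting script (requires: npm install puppeteer).
// Assumes the admin page shows the count in a badge like "Orders (1234)".
function parseOrderCount(text) {
  const match = text.match(/\((\d+)\)/);
  return match ? parseInt(match[1], 10) : null;
}

async function countRemainingOrders() {
  // Lazy require so the parser above stays usable without Puppeteer installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();
  // The admin section only uses basic auth, so credentials go straight in.
  await page.authenticate({
    username: process.env.ADMIN_USER,
    password: process.env.ADMIN_PASS,
  });
  await page.goto('https://example.com/admin/orders', { waitUntil: 'networkidle2' });
  const badgeText = await page.$eval('#pending-orders-tab', el => el.textContent);
  await browser.close();
  // The real script then appended { timestamp, count } to a Google Sheet.
  return parseOrderCount(badgeText);
}
```

Scheduling was then just a Windows scheduled task (e.g. `schtasks /Create /SC HOURLY /TN "OrderCount" /TR "C:\scripts\run-count.bat"`) pointing at a batch file that runs the script with Node.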
Writing the script honestly took the better part of a working day and was totally not worth it, but at the very least, it was fun.

Lessons Learned

- Always have your phone with you when you are on call.
- Always double-check that an issue is completely resolved from the client's perspective before assuming all work is done.
Jan 29, 2026
Zendesk Discord Chatbot using Webhooks
Two years ago, the support team at Yaksa (before Verndale purchased the company) used a Microsoft Teams bot that was integrated with our Zendesk ticketing platform. The chatbot would message you in real time any time one of your tickets was updated. Although we do get emails in near real time, I always found their frequency too noisy, and Outlook/Windows would not reliably surface notifications when a new email came in. I really appreciated the Zendesk Teams bot, as the instant notifications improved my response time and productivity.

About a year ago, around the time our company was bought by Verndale (although the two events were unrelated), the Teams bot began misbehaving. It would double-message and phantom-message the team about tickets that had not been updated at all. It also sent messages much later than the tickets were actually updated, defeating the original purpose of the bot. It got so bad that everyone on the support team muted the Zendesk bot and completely stopped using it. Later that year, we migrated our messaging platform from Teams to Slack, and the Zendesk bot became a memory of the past.

Six months ago, Verndale hosted an in-house AI hackathon. The support team decided to create a bot that integrated with Zendesk using webhooks. Ultimately, we didn't spend enough time on the hackathon, which led to a presentation that was less than stellar. It did, however, give us access to various Zendesk APIs and webhooks. Funnily enough, in the back of my mind, I did consider the possibility of creating my own bot that used the webhooks, but the hackathon was over and our access, as well as any enabled webhooks, would soon be disabled. Or so I thought.

Just last month, I began playing with Docker, and one of the new applications I discovered was Uptime Kuma. It is a site monitoring tool that can integrate with various chat platforms, notifying you when a site becomes unavailable.
This was quite useful for checking the uptime of my Plex and Immich sites. Naturally, I integrated it with Discord, but it also got me thinking again about integrating Zendesk with Discord. I checked whether the webhooks were still active, and to my excitement, they were!

I immediately knew that I wanted to recreate the chatbot, but I also had to ensure that the uptime would be reasonable. I initially thought of hosting the bot on Vercel because of the generous free tier and the ease of integrating with Next.js, but I quickly became afraid that it would violate their terms of service: the bot would no longer qualify as a "hobby" project, as it is an internal tool for "commercial use." What I ended up going with was a website built with Pocketpages on Pockethost. In the end, my solution was quite hacky, as the Pocketpages framework is not very well known, but the documentation was "good enough." The most important part was that it was free and the uptime was not something I had to manage myself.

I won't bore you with the details, but effectively, the webhook sent by Zendesk includes all the ticket information, such as ticket number, assignee ID, actor ID, message, ticket type, and custom statuses. With this information, I can identify who updated the ticket, who the ticket is assigned to, whether the ticket is about to breach, the ticket number, and the client organization, and then send the essential details to the appropriate Discord user. I had a small issue with duplicate messages because multiple webhooks would come in for the same ticket (sometimes within the span of 3 ms), but I finally sorted it out by grabbing all the records from the last ~10 seconds and only sending a message if the incoming one is the oldest in that 10-second window. Now I can relax and not worry about whether I have missed an update on any of the tickets assigned to me.
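The dedup check boils down to something like the following sketch (the field names `ticketId` and `receivedAt` are illustrative, not the real schema):

```javascript
// Decide whether an incoming webhook record should trigger a Discord
// message: only if no earlier record for the same ticket exists within
// the trailing ~10-second window.
function shouldSend(records, incoming, windowMs = 10000) {
  const recent = records.filter(
    r => r.ticketId === incoming.ticketId &&
         incoming.receivedAt - r.receivedAt <= windowMs
  );
  // The incoming record wins only if it is the oldest in the window;
  // any near-simultaneous duplicate arriving later is suppressed.
  return recent.every(r => r.receivedAt >= incoming.receivedAt);
}
```

Storing every webhook as a record and letting only the oldest one in the window send means that a burst of duplicates a few milliseconds apart collapses into a single Discord message.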
Nov 08, 2025
Saying Goodbye to Windows (Windows 10 EOL)
With the end of life of Windows 10 and the stringent hardware requirements of Windows 11, users are left stranded without a secure path forward.

Introduction

With the new online account requirements during Windows 11 installation, forced ads within the operating system, Windows Recall, and the ever-increasing telemetry data mining, Microsoft is doing its best to alienate its user base. Ironically, StatCounter reported a spike in Windows 7 devices across the various websites it tracks (I'm unsure how accurate this really is, however). Some users are opting to bypass the TPM 2.0 requirements of Windows 11, while others are turning to Linux as an alternative operating system. With the steady progress Valve has been making on Wine and Proton, Linux has become a more viable option for gamers as well as the average Windows user.

Personally, on my Plex server, out of sheer laziness, I have opted to extend my Windows 10 support for another year, as I do not want to:

- Upgrade my hardware to support Windows 11
- Completely reinstall a Linux distro and reformat the 100+ TB of data

Despite my laziness to change the OS of my media server (my mentality at this point is to not "fix" anything that is not broken), I have tried out a few distros as my daily driver to see what switching off Windows would look like. On said journey to replace Windows, it turns out that I am fairly unopinionated about my desktop experience as long as I am able to run these few applications:

- Visual Studio Code
- Postman
- Plex media player
- Parsec
- Brave
- Steam
- Discord

Even prior to the Windows 10 EOL date, I had been using Linux Mint, which is a Debian-based distro that is very user friendly. Since then, I have also installed Fedora with the KDE Plasma desktop environment, and aside from using a different package manager to install applications, my user experience has been mostly the same.
Honestly, my workflow does not require a specific distro or operating system, as I am effectively just working with Visual Studio Code and the browser (the soy boy development environment). My active personal projects use Next.js and Pocketpages with Turbopack, which have had no issues on Windows or on Linux. Frankly speaking, they may even run better on Linux.

My biggest issue right now is not having a great remote desktop application for accessing my personal machine. On Windows, Parsec has been a great way for me to remote into various machines, but Parsec unfortunately does not offer hosting on Linux. As such, I have been looking into RustDesk to see whether it would adequately support my use cases, and tentatively, it appears that it will. I will still need additional configuration to be able to access my machines from outside the network, but it is a decent start.

Aside from my remote desktop woes, Fedora so far seems to be stable and working well. Although I have been using it for less than a week, I am finding it more stable than Linux Mint, which is somewhat surprising. Another distro I am planning on checking out is Arch, but for now, Fedora and Linux Mint are both serving me well. I would honestly recommend looking into Linux as a viable alternative (assuming your daily workflow does not depend on proprietary software that is unavailable outside the Windows/Mac ecosystem).

Lessons Learned

- Fuck Microsoft
Oct 10, 2025