Latest Posts
Automating 24/7 Monitoring for a Weekend
This is the story of how I was tasked with monitoring our website all weekend, but wrote a script to do it instead.

I did not realize it at the time, but one Friday morning, a storm was brewing. We received an innocuous ticket from a client reporting that orders appeared to be stuck. At that point, the ticket was assigned to my colleague, and I didn't poke around much further.

Introduction

At around 3:00 p.m., however, the ticket was transferred to me, as my colleague was planning on logging off early. Naturally, the ticket was to be prioritized, but there was no cause for alarm. During the knowledge transfer, we noticed that the scheduled job that was supposed to be processing orders appeared to be stuck. The job, which usually takes only a few seconds, had been running for over an hour and still had not completed. Checking the queue, we noticed an abnormal number of orders. What we later found out was that the client was running a special promotion where certain products were free with the sign-up of a subscription, which resulted in an absurd number of new orders coming in.

I double-checked that orders were still being processed, albeit a little slowly. I still wasn't sure there was a bigger problem at hand, and besides, a senior dev was looking into the root cause. One particularly unfortunate behavior was that when the scheduled job ran, it would first grab all the orders and then process them first in, last out (FILO). Orders added after the scheduled job started would not be processed until the job was run again. Since thousands of orders were coming in, some unlucky orders had to wait longer and longer, especially because the scheduled job had been restarted several times (we were unsure whether it was stuck or not).

Come 5:00 p.m., I synced with the senior developer and the project manager to make sure we were all on the same page. The job was running and processing as expected, just slowly. Doing the math, it would take over 1.5 days to finish processing, even if no more new orders came in. We were all in agreement, and the PM let us know that he would talk to the client and explain what we had found. The senior developer works in a different timezone, so his shift had actually ended more than four hours earlier. With that in mind, I told him to log off and reiterated that if there were any further problems, I would handle them.

Of course, at this point there was nothing to do but wait, so I monitored for maybe another 10 minutes before stepping away from my computer to take a break. I had also made dinner plans with my significant other, and at around 6:00 p.m., I began getting ready to leave. I reminded myself that I should check my messages before leaving, but I was not too worried. Besides, I had my phone with me, so they could always reach me that way.

Where the Trouble Begins

I picked up my significant other and was on my way to the restaurant when I got a message from the Big Boss (my boss's boss). He was trying to figure out what was going on and put out a fire that had been raging. I realized I also had a missed call. I was still driving at this point, so I pulled into a parking lot and called the Big Boss back. It turns out that after I had left, the PM was not able to explain to the client what had happened or provide assurance that the orders were still being processed.
I realized at this point that I had forgotten to check my computer before I left for dinner. Once home, I jumped on a call with the senior developer and the Big Boss and saw that I had missed about a dozen messages. I was told that the client was having a meltdown and was not convinced that orders were going through. They had even suggested manually processing the entire backlog themselves (which would have taken more than a week and been prone to errors). The Big Boss was barely able to talk the client down and assured them that we would handle the situation. We again confirmed that the orders were being processed, even if slowly.

To keep the client happy, the Big Boss told me that we would have to take turns monitoring the orders to make sure they were still being processed, giving status updates every hour until the backlog was completely cleared. Long term, we needed to speed up order processing and make the queue FIFO instead of FILO. We had never had an issue before because the client would have at most 10 orders in a day; this particular lucky Friday, we had received well into the thousands. In the short term, however, this meant taking turns staying up all night to keep the client happy. At this point, I was just glad I was not getting fired (as it was mostly my fault that the issue escalated this far), so I offered to take the graveyard shift.

The Automation

In the middle of the night, I realized that it was somewhat stupid to stay up all night just to count the remaining orders by hand. I noticed that logging in to the backoffice/admin section of the website only required basic auth. Naturally, I started working on a Puppeteer script that logs in, navigates to the right page/tab, counts how many orders are remaining, and logs the number to a Google Sheets document. From there, I used the timestamps and order counts to build a chart in Sheets. To automate the script runs, I created a scheduled task in Windows that runs a batch script, which in turn runs Puppeteer. (A rough sketch of the script is at the end of this post.)

At the end of the day, I probably should have just checked my messages before I left for dinner. Writing the script honestly took the better part of a working day and was totally not worth it, but at the very least, it was fun.

Lessons Learned

Always have your phone with you when you are on call.
Always double-check that an issue is completely resolved from the client's perspective before assuming all work is done.
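For the curious, here is a minimal sketch of what the monitoring script looked like. The URL, table selector, and environment variable names are placeholders rather than the client's actual admin panel, and it appends to a local CSV to keep the example self-contained (the real script pushed the numbers into Google Sheets and charted them there).

```ts
// monitor-orders.ts - minimal sketch; the URL, selector, and env var names are placeholders.
import puppeteer from "puppeteer";
import { appendFileSync } from "fs";

async function countPendingOrders(): Promise<number> {
  const browser = await puppeteer.launch({ headless: true });
  try {
    const page = await browser.newPage();
    // The backoffice/admin section only required HTTP basic auth.
    await page.authenticate({
      username: process.env.ADMIN_USER ?? "",
      password: process.env.ADMIN_PASS ?? "",
    });
    await page.goto("https://example.com/admin/orders?status=pending", {
      waitUntil: "networkidle2",
    });
    // Count the rows in the pending-orders table (selector is hypothetical).
    return await page.$$eval("table.orders tbody tr", (rows) => rows.length);
  } finally {
    await browser.close();
  }
}

(async () => {
  const remaining = await countPendingOrders();
  // Append a timestamped count; the real script logged this to Google Sheets instead.
  appendFileSync("orders.csv", `${new Date().toISOString()},${remaining}\n`);
  console.log(`Orders remaining: ${remaining}`);
})();
```

The Windows scheduled task then only needed to call a one-line batch file that ran this script (for example, `npx ts-node monitor-orders.ts`) on a fixed interval.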
Jan 29, 2026
Automating My Plex Server
I have been running my Plex server since approximately 2015, and I have always been too lazy to automate the workflow. It is often joked that developers spend countless hours automating workflows that take a few minutes to complete by hand. Ironically, this is probably one of the few cases where I could have benefited from setting up the automation much sooner. I am mildly ashamed to say that I have wasted countless hours over the past 10 years by not automating requests and downloads.

Introduction

Initially, I installed Plex on a Chinese Windows 2-in-1 tablet in order to share my downloaded content for online movie nights. If I remember correctly, it must have been running Windows 8. Even from the start, the storage was mounted remotely from Google Drive with rclone. In those days, rclone could not mount on Windows because WinFSP had not been developed yet. This meant that we ran a Linux virtual machine in VMware just so that we could take advantage of the unlimited Google Drive storage provided by G Suite.

Eventually, the unlimited storage came to an end, but I was able to pool my local drives with DrivePool for Windows. Luckily, the desktop tower case that I had at the time (the Fractal R2) could fit something like 14 3.5" hard drives as well as 2 2.5" drives. With that realization, I added a PCIe Mini-SAS card so that I could add an additional 8 hard drives. As Google wised up and closed their unlimited storage plan, I migrated my content locally. Unfortunately, I didn't have enough local storage and ended up losing more than half my library at the time. I knew the day would come, but it was still a sad day. That brings me to my current setup, which is still running on Windows 10 and is still just a ghetto JBOD (just a bunch of drives). What makes it great, however, is the recently added power of automation through Docker.

Motivation

So what finally got me off my lazy ass and down this rabbit hole of Sonarr, Radarr, and Overseerr? It was a combination of several factors. Earlier this year, I discovered Coolify and finally figured out how Nginx works. I recently discovered Portainer, and Docker Desktop for Windows finally fixed several known memory leaks. Unironically, discovering Portainer changed the way I run apps and kicked off my self-hosting journey. Prior to using Portainer, I had a lot of concerns about data loss and recovery when running Docker.

Putting it all together

Finally, after figuring out a consistent way to run Docker containers without the risk of losing data, I was ready to start the setup. In the past, I had tried setting up my homelab on a separate machine using Proxmox, but found that it was incredibly hard to change its IP. Since we change ISPs fairly frequently, I figured I would skip Proxmox this time. What I do miss, however, is how easy it was to remote desktop into a Proxmox environment out of the box. So naturally, you would think that I decided to host it on a simple distro like Ubuntu Server or just plain Ubuntu. What I ended up doing was going the lazy route, which was to host Sonarr and Radarr on my existing Plex server. In hindsight, it would have made more sense to install the *arr applications on a separate Linux machine and map the existing network drive, but don't fix what isn't broken. So I finally set up my instance following the TechHutTv guides. The guides are a fantastic resource and very flexible. In his particular example, he tunnels all his traffic through WireGuard, whereas I do not.
Overall, there were a few hiccups, but once I got it set up, it worked pretty flawlessly (hopefully I am not jinxing myself). I was able to set up:

Sonarr - for shows
Radarr - for movies
Prowlarr - for indexing torrents
Overseerr - for making requests
Flaresolverr - for bypassing Cloudflare
qBittorrent - as the torrent client

The largest issue that I ran into was due to pathing, as I was originally trying to use the Windows qBittorrent client instead of the Docker image. With that out of the way, all that was left was some tweaking of the quality settings and preferred codec types. For the most part, I am able to sit back and let Sonarr and Radarr handle incoming requests with little fuss or intervention. Of course, I still occasionally check other sources in the event a show or movie is not available, but for now, I am quite pleased that the setup is working after a weekend's worth of effort.

Lessons Learned

Don't be afraid to automate things; it will save you time (sometimes).
Oct 15, 2025