My first production deployment didn’t start with confidence. It started with a tight deadline and a system that suddenly stopped working one day before release.
The project itself sounded simple enough, with an expected transaction volume of around 15 million rupiah in the first week. Not huge, but definitely not something you want to break in production.
What I underestimated was how quickly things would become real.
On the first day alone, the system processed over 2 million rupiah in transactions. The next day, it reached about 4 million. The only thought in my mind was: this wasn’t just a school project anymore. Every request hitting the backend carried actual value.
And naturally, things started breaking.
The biggest issue appeared at the worst possible time: the day before launch. The original implementation relied on dynamic QRIS generation through a payment gateway. It worked flawlessly in the sandbox environment, until it didn’t. Without any clear changes, the API suddenly refused to generate QRIS due to permission issues, even though dynamic QRIS was enabled in the dashboard.
With no time to wait for support, I had to think quickly.
The solution was to use static QRIS and simulate dynamic behavior by adding a unique value into each transaction amount. This value acted as an identifier, allowing the backend to map incoming payments to the correct user. Once a transaction was completed, the value could be reused for future payments with the same nominal amount.
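The matching logic can be sketched roughly like this. This is a minimal, in-memory illustration of the idea; the class and method names are hypothetical, not the actual implementation:

```typescript
// Sketch of the static-QRIS workaround: each pending payment gets a small
// unique value added to its amount, so an incoming transfer can be matched
// back to a user by its exact total. All names here are illustrative.

type PendingPayment = { userId: string; amount: number; createdAt: number };

class PaymentMatcher {
  private pending = new Map<number, PendingPayment>(); // key: exact total amount

  // Reserve a unique total for this user by adding 1..999 rupiah
  // until we find an amount no other pending payment is using.
  reserve(userId: string, baseAmount: number, now = Date.now()): number {
    for (let suffix = 1; suffix <= 999; suffix++) {
      const total = baseAmount + suffix;
      if (!this.pending.has(total)) {
        this.pending.set(total, { userId, amount: total, createdAt: now });
        return total;
      }
    }
    throw new Error("no free suffix left for this base amount");
  }

  // Match an incoming payment by its exact amount.
  // Deleting the entry frees the suffix for reuse, as described above.
  match(amount: number): string | undefined {
    const p = this.pending.get(amount);
    if (!p) return undefined;
    this.pending.delete(amount);
    return p.userId;
  }
}
```

Because the suffix is released once the payment completes, the same nominal amount can serve many users over time, as long as not too many of them are paying the same base amount simultaneously.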
To improve reliability, I implemented webhook-based verification with hash validation. Combined with a 15-minute expiration window, this approach created a lightweight but functional payment tracking system.
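A hedged sketch of that verification step, assuming an HMAC-SHA256 signature over the raw body (the header format, secret handling, and payload shape are assumptions, not the gateway’s actual contract):

```typescript
import { createHmac, timingSafeEqual } from "crypto";

const EXPIRY_MS = 15 * 60 * 1000; // the 15-minute expiration window

// Compute the expected signature for a raw webhook body.
function signPayload(rawBody: string, secret: string): string {
  return createHmac("sha256", secret).update(rawBody).digest("hex");
}

// Verify the webhook signature, then enforce the expiration window
// against the pending payment's creation time.
function verifyWebhook(
  rawBody: string,
  signatureHeader: string,
  secret: string,
  paymentCreatedAt: number,
  now = Date.now()
): boolean {
  const expected = Buffer.from(signPayload(rawBody, secret), "hex");
  const received = Buffer.from(signatureHeader, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  if (expected.length !== received.length || !timingSafeEqual(expected, received)) {
    return false;
  }
  return now - paymentCreatedAt <= EXPIRY_MS;
}
```

Using a timing-safe comparison for the hash is a small detail, but it avoids leaking signature bytes through response-time differences.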
Of course, the first deployment didn’t go smoothly.
On that day, I deployed changes directly to production without a proper understanding of the environment. This led to a critical issue where webhook callbacks were received, but failed hash validation. As a result, legitimate transactions were not recognized by the system.
There’s a unique kind of pressure when your system fails to acknowledge real payments.
The immediate fix was manual intervention. I updated the affected records directly through MongoDB Compass to ensure users weren’t impacted. After stabilizing the situation, I corrected the validation logic and properly tested the flow before deploying again.
That incident alone was enough to change how I approached development.
By day 4, I introduced a separate development environment using a different subdomain. While still sharing the same infrastructure, this separation allowed me to test changes without risking production stability. I also implemented CORS policies to better control frontend access.
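The CORS side of that separation can be sketched as a simple origin allowlist. The subdomain names below are placeholders, not the real ones:

```typescript
// Minimal origin-allowlist sketch for CORS, assuming production and
// development frontends live on separate subdomains (hypothetical names).
const ALLOWED_ORIGINS = new Set([
  "https://app.example.sch.id", // production frontend (assumed name)
  "https://dev.example.sch.id", // development frontend (assumed name)
]);

// Returns the CORS headers to attach to a response,
// or null if the request's origin should be rejected.
function corsHeaders(origin: string | undefined): Record<string, string> | null {
  if (!origin || !ALLOWED_ORIGINS.has(origin)) return null;
  return {
    "Access-Control-Allow-Origin": origin,
    "Access-Control-Allow-Methods": "GET,POST,OPTIONS",
    "Access-Control-Allow-Headers": "Content-Type,Authorization",
  };
}
```

Echoing back only recognized origins keeps the production API usable from both frontends without opening it to arbitrary sites.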
Around the same time, another issue surfaced: rate limiting.
The initial implementation used IP-based limits, which seemed reasonable at first. However, in a school environment where most users share the same Wi-Fi network, this quickly became a problem. Multiple users were unintentionally blocked because they appeared to come from the same public IP.
The solution was straightforward but important: key the rate limiter on the authenticated user rather than the IP address. This ensured both usability and basic protection against abuse.
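A per-user limiter along those lines can be sketched as a fixed-window counter. The window size and limit below are illustrative values, not the ones actually used:

```typescript
// Rate limiter keyed per user instead of per IP, so students sharing one
// school Wi-Fi address aren't blocked together. Fixed-window counting is
// the simplest variant; limit and window are illustrative.
class UserRateLimiter {
  private hits = new Map<string, { count: number; windowStart: number }>();

  constructor(
    private limit = 30,        // max requests per window
    private windowMs = 60_000  // 1-minute fixed window
  ) {}

  allow(userId: string, now = Date.now()): boolean {
    const entry = this.hits.get(userId);
    // No entry yet, or the previous window has elapsed: start a fresh window.
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.hits.set(userId, { count: 1, windowStart: now });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count++;
    return true;
  }
}
```

Because the key is the user rather than the connection, a hundred students behind one NAT each get their own quota.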
To improve visibility, I also added basic observability tools. Using Grafana and Loki, I tracked request logs per endpoint, making it easier to identify and debug issues. Additionally, I implemented a simple audit log system to record important actions and transactions within the same database.
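The audit log itself needs very little machinery. A minimal sketch, with an in-memory array standing in for the MongoDB collection the real entries went into (field names are assumptions):

```typescript
// Minimal audit-log sketch: record who did what, and when, alongside the
// transaction data. An array stands in here for the database collection.
type AuditEntry = {
  at: string;                        // ISO timestamp
  actor: string;                     // user ID or system component
  action: string;                    // e.g. "payment.confirmed" (illustrative)
  details: Record<string, unknown>;  // action-specific context
};

const auditLog: AuditEntry[] = [];

function audit(
  actor: string,
  action: string,
  details: Record<string, unknown>
): AuditEntry {
  const entry = { at: new Date().toISOString(), actor, action, details };
  auditLog.push(entry);
  return entry;
}
```

Even this much structure pays off during debugging: when a webhook fails validation, the surrounding audit entries show exactly what the system believed at the time.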
These additions didn’t eliminate problems, but they made bug tracking much easier.
Looking back, the system was far from perfect. It was built under pressure, adjusted in real-time, and improved through trial and error. But it worked. It handled real users, real transactions, and real constraints.
It highlighted a key difference that often goes unnoticed: the difference between code that works and a system that survives real use.
This experience forced me to think beyond code to consider edge cases, user behavior, infrastructure limitations, and failure scenarios. It wasn’t just about solving problems, but solving them fast enough while the system was already in use.
For a first deployment, it was chaotic.
But it was exactly the kind of chaos that teaches the right lessons and makes me love programming.