AmiProbashi MoEWOE–BMET
Reliability & Stabilization Case Study
AmiProbashi • 2025
GovTech • National-Scale SaaS • ~7M Records

Stabilizing and Scaling the AmiProbashi MoEWOE–BMET GovTech Platform

Delivered high-impact bug fixes and feature releases on the AmiProbashi national migrant database platform (PHP/Laravel/Vue/MySQL), improving reliability and protecting a strategic, multi-year government revenue channel.

Role: Senior Software Engineer
Org: AmiProbashi (BanglaTrac Group), Dhaka, Bangladesh
Domain: GovTech • National Migrant Systems

The AmiProbashi platform is used by the Ministry of Expatriates' Welfare and Overseas Employment (MoEWOE) and the Bureau of Manpower, Employment and Training (BMET) to manage migrant registration, clearance, and fee collection for millions of workers. My focus: make the platform calm, predictable, and safe to evolve.

Problem → Solution → Impact

Problem

  • Peak-load errors on core registration/clearance flows.
  • Slow, complex MySQL queries against ~7M-row tables.
  • Manual deployments with limited observability and higher regression risk.

Solution

  • Production-grade fixes and validation guardrails on critical paths.
  • Query/index tuning plus clearer logging to pinpoint hotspots quickly.
  • Smaller, safer releases with realistic test data and rollback awareness.

Impact

  • Fewer user-visible errors during peak activity windows.
  • Snappier search/report screens when traffic spikes.
  • Reduced post-deploy firefighting; calmer releases.

Before → After, at a glance

  • User-facing errors: spiky at deadline peaks → trending down and stable
  • Search / reporting latency: slow on heavy tables → tighter P95 during spikes
  • Post-deploy hotfixes: frequent emergency fixes → calmer, predictable releases

Peak-load incidents

Triage noisy error paths and stabilize registration/clearance flows with targeted fixes and validation.

Query + logging work

Profile bottlenecks, tune queries and indexes on high-volume tables, and add logging to shorten root-cause time.

Safer releases

Ship smaller increments with production-like test data and clearer rollback plans for low-stress deployments.

Impact spotlight
  • Calmer day-to-day operations for government offices handling migrant workflows.
  • Higher confidence in fee collection reliability across a multi-year revenue channel.
  • Engineering team can evolve the product without destabilizing core services.

Overview

Introduction

AmiProbashi is a GovTech SaaS platform used by the Ministry of Expatriates' Welfare and Overseas Employment (MoEWOE) and the Bureau of Manpower, Employment and Training (BMET) to manage a national migrant worker database of approximately seven million records. The platform underpins critical workflows such as migrant worker registration, clearance, and fee collection, forming a strategic multi-year government revenue channel.

As a Senior Software Engineer on the AmiProbashi team, I focused on improving the reliability and maintainability of the core PHP/Laravel/Vue/MySQL platform. My work centered on production-grade bug fixes, high-impact feature releases, and incremental architectural improvements that reduced incidents and helped safeguard uninterrupted revenue collection for government stakeholders.

Background

Context

The platform serves a diverse group of stakeholders: migrant workers, recruiting agencies, government officials, and internal support teams. Peak traffic aligns with policy deadlines or international hiring cycles, which generate sharp load spikes on the core APIs and database.

When I joined, the platform was already in production but exhibited several reliability issues:

  • Intermittent application errors during peak load.
  • Slow or timing-out operations on high-volume database tables (~7M records).
  • Fragile release processes that occasionally caused short service disruptions.

The mandate was clear: stabilize day-to-day operations, reduce production issues, and enable the team to deliver new features with confidence.

Challenge

Problem

The core problem was maintaining uninterrupted, reliable service for a mission-critical government system while continuing to evolve the product. Any prolonged outage or data integrity issue could directly affect migrant workers and disrupt government revenue collection.

Engineering challenges included:

  • Legacy code paths in PHP/Laravel that were difficult to change safely.
  • Complex SQL queries running against large MySQL tables without optimal indexing.
  • Limited observability, making it hard to quickly pinpoint the root cause of production incidents.
  • A deployment pipeline that relied heavily on manual steps, increasing the risk of regression.

Operating Environment

Constraints & Requirements

Key constraints and requirements:

  • High availability: core workflows must remain online during office hours and deadline periods.
  • Data integrity: migrant and financial records must remain consistent and auditable.
  • Regulatory sensitivity: changes must respect government policies and approval processes.
  • Incremental change: large rewrites are not feasible; improvements must be incremental and low risk.

Execution

Implementation Highlights

1) Production bug fixes on core workflows

  • Investigated and resolved intermittent errors affecting migrant registration and clearance flows.
  • Improved validation logic to reduce data entry errors that previously caused downstream failures.
  • Added clearer error messages and logging around failure points, enabling faster diagnosis.
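
As a rough illustration of the guardrail pattern above (a sketch only: the ClearanceRequest and ClearanceController classes, the field names, and the clearance.submit_failed log key are hypothetical, not the production code), the fixes combined explicit request validation with structured logging around the failure point:

    <?php

    use App\Http\Controllers\Controller;
    use Illuminate\Foundation\Http\FormRequest;
    use Illuminate\Support\Facades\Log;

    // Hypothetical guard on a clearance submission endpoint.
    class ClearanceRequest extends FormRequest
    {
        public function rules(): array
        {
            return [
                'passport_no' => ['required', 'string', 'max:20'],
                'bmet_reg_no' => ['required', 'digits_between:6,12'],
                'destination' => ['required', 'string', 'size:2'], // ISO 3166-1 alpha-2
            ];
        }
    }

    class ClearanceController extends Controller
    {
        // ClearanceService is an illustrative domain service, not a real class name.
        public function store(ClearanceRequest $request, ClearanceService $service)
        {
            try {
                $clearance = $service->submit($request->validated());
            } catch (\Throwable $e) {
                // Structured context makes the failure searchable in the logs.
                Log::error('clearance.submit_failed', [
                    'bmet_reg_no' => $request->input('bmet_reg_no'),
                    'error'       => $e->getMessage(),
                ]);

                return response()->json(['message' => 'Clearance could not be submitted.'], 422);
            }

            return response()->json($clearance, 201);
        }
    }

Validating early keeps bad input from reaching the database, and the structured log entry turns a vague "clearance failed" report into a query against a single log key.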

2) Performance optimization on high-volume queries

  • Profiled slow endpoints using Laravel debug tools and database logs.
  • Simplified and restructured complex Eloquent queries hitting large tables (~7M rows).
  • Worked with the database team to introduce targeted indexes and remove unused ones.
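
A simplified sketch of the kind of restructuring involved (illustrative only: the Registration model, column names, and index name are assumptions, not the real schema): push filtering, ordering, and pagination into MySQL, select only the columns the screen needs, and back the query with a matching composite index.

    <?php

    use App\Models\Registration; // hypothetical Eloquent model
    use Illuminate\Database\Schema\Blueprint;
    use Illuminate\Support\Facades\Schema;

    $district = 'Dhaka';

    // Before: loading far too many rows and filtering in PHP.
    $pending = Registration::all()
        ->where('district', $district)
        ->where('status', 'pending');

    // After: let MySQL do the filtering, ordering, and paging,
    // and fetch only the columns the screen actually renders.
    $pending = Registration::query()
        ->select(['id', 'worker_name', 'district', 'status', 'created_at'])
        ->where('district', $district)
        ->where('status', 'pending')
        ->orderByDesc('created_at')
        ->paginate(50);

    // Composite index matching the query shape, added via a migration.
    Schema::table('registrations', function (Blueprint $table) {
        $table->index(['district', 'status', 'created_at'], 'registrations_district_status_created_idx');
    });

The index columns mirror the WHERE and ORDER BY clauses, so MySQL can satisfy the query from the index rather than scanning millions of rows.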

3) Safer feature releases

  • Scoped larger feature requests into smaller, deployable increments.
  • Collaborated with QA to create realistic test data sets that matched production patterns.
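
One way this can look in practice (a hedged sketch: RegistrationFactory, the field names, and the status skew are illustrative assumptions rather than the actual seeding code) is shaping factory data to follow production-like distributions, so heavy or skewed paths surface in QA before they surface in production:

    <?php

    use App\Models\Registration; // hypothetical model using the HasFactory trait
    use Illuminate\Database\Eloquent\Factories\Factory;

    class RegistrationFactory extends Factory
    {
        protected $model = Registration::class;

        public function definition(): array
        {
            return [
                'passport_no' => strtoupper($this->faker->bothify('??#######')),
                'district'    => $this->faker->randomElement(['Dhaka', 'Chattogram', 'Sylhet', 'Khulna']),
                // Skew statuses roughly the way production skews them, so the
                // heavily used "pending" paths dominate test data as they do in reality.
                'status'      => $this->faker->randomElement(
                    array_merge(array_fill(0, 8, 'pending'), ['approved', 'rejected'])
                ),
                'created_at'  => $this->faker->dateTimeBetween('-2 years'),
            ];
        }
    }

    // Seed a volume large enough to expose pagination and index behavior.
    Registration::factory()->count(100_000)->create();
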
The emphasis was on calm, predictable change: resolving the most painful incidents first, then gradually tightening performance and release practices without disruptive rewrites.

Outcomes

Impact & Outcomes

Over time, these improvements contributed to a more stable and predictable platform. While precise figures are confidential, the overall impact can be summarized as:

  • Reduction in user-facing errors during peak periods.
  • Faster response times on frequently used search and reporting screens.
  • Fewer emergency fixes required during or immediately after deployments.
  • Increased confidence from stakeholders in the platform's ability to support ongoing revenue collection.

The engineering team also benefited from clearer code paths, improved logging, and a more structured approach to releasing changes in a sensitive production environment.

[Chart] Illustrative reduction in user-facing errors after stabilization work; values use partial/representative data only.
[Chart] Illustrative increase in confident, low-stress deployments per month as reliability and release processes improved.

Charts intentionally avoid exposing full confidential production data; they highlight directional impact using partial datasets.

Reflection

Key Learnings

  • In government and other high-stakes domains, reliability improvements and well-executed bug fixes can be more valuable than flashy new features.
  • Incremental refactoring, combined with better observability, can materially improve stability without requiring a full rewrite.
  • Close collaboration with non-technical stakeholders (operations staff, government officials) helps ensure that engineering work aligns with real-world pain points and policy constraints.
  • Investing in safer deployment practices (smaller changes, validation, better rollback paths) pays for itself by reducing incidents and preserving trust.