Launching indonesia-civic-stack: 40 MCP Tools for Indonesian Government Data
How I built an open-source Python SDK and MCP server that connects AI agents to 11 Indonesian government portals — and what I learned along the way.
Indonesia has over a dozen government portals publishing public data — business registrations, halal certifications, drug safety, earthquake alerts, election results, wealth declarations. The data is there, but accessing it programmatically is a nightmare. Every portal has its own quirks: different HTML structures, inconsistent APIs, geo-blocking, rate limits, and the occasional reCAPTCHA.
I’ve been scraping these portals for various projects over the past few months. HalalKah needed BPJPH data. LegalKah needed OJK data. Each time, I was writing the same boilerplate — HTTP clients, error handling, response normalisation. So I extracted the common patterns into a single package.
The result is indonesia-civic-stack: a Python SDK that wraps 11 Indonesian government portals into a unified interface, plus 40 MCP tools that let AI agents query them directly.
🚀 What It Does
The package covers 11 portals:
| Module | Portal | What it does |
|---|---|---|
| BPOM | pom.go.id | Drug & food safety registry |
| BPJPH | halal.go.id | Halal product certification |
| AHU | ahu.go.id | Company registration lookup |
| OJK | ojk.go.id | Financial institution legality |
| OSS | oss.go.id | Business licensing (NIB) |
| LPSE | lpse.go.id | Government procurement |
| KPU | kpu.go.id | Election data & candidates |
| BPS | bps.go.id | National statistics API |
| BMKG | bmkg.go.id | Earthquakes & weather |
| LHKPN | kpk.go.id | Official wealth declarations |
| SIMBG | simbg.pu.go.id | Building permits |
Every module returns a consistent CivicStackResponse object — same structure whether you’re querying drug registrations or earthquake data. No more parsing HTML soup differently for each portal.
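To make the "same structure" idea concrete, here is a minimal sketch of how two very different payloads could be mapped onto one shape. `ResponseSketch` is a stand-in for `CivicStackResponse`, and its field names, the two helper functions, and the sample payloads are all assumptions for illustration, not the package's actual API.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ResponseSketch:
    """Stand-in for CivicStackResponse; every field name here is an assumption."""

    module: str
    status: str
    result: dict[str, Any] = field(default_factory=dict)


def normalise_registry_row(module: str, row: Optional[dict[str, Any]]) -> ResponseSketch:
    """Map a scraped registry row (or None when nothing matched) to the shared shape."""
    if row is None:
        return ResponseSketch(module=module, status="not_found")
    return ResponseSketch(module=module, status="found", result=dict(row))


def normalise_event_feed(module: str, events: list[dict[str, Any]]) -> ResponseSketch:
    """Map a JSON feed of events to the same shape, keeping only the first event."""
    if not events:
        return ResponseSketch(module=module, status="not_found")
    return ResponseSketch(module=module, status="found", result=dict(events[0]))


# Hypothetical payloads: a registry hit and an empty earthquake feed.
registry = normalise_registry_row("bpom", {"name": "Example Product"})
quakes = normalise_event_feed("bmkg", [])
```

Whatever the portal returns, the caller only ever inspects `status` and `result`, which is what keeps downstream code (and agents) portal-agnostic.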
🤖 AI-Agent-First Design
The interesting part isn’t the scraping — it’s the MCP integration. Model Context Protocol lets AI assistants call external tools. Instead of asking a human to look something up on a government website, an AI agent can query the data directly.
```shell
# Zero-install remote server
claude mcp add civic-stack --transport http \
  https://mcp-server-production-d1a2.up.railway.app/mcp

# Or install locally
pip install "indonesia-civic-stack[mcp]"
claude mcp add civic-stack -- civic-stack-mcp
```
Once connected, you can ask Claude things like:
- “Is this BPOM registration number still active?”
- “Search for companies named ‘Maju Bersama’ in the AHU registry”
- “What was the latest earthquake in Indonesia?”
- “Cross-reference this business across OJK, AHU, and OSS”
The agent figures out which tools to call, chains them together, and synthesises the results. Multi-portal queries that would take a human 30 minutes of tab-switching take seconds.
⚙️ Architecture Decisions
A few choices worth noting:
Unified server over per-module servers. Early versions had separate MCP servers for each portal. Managing 11 server processes was painful. The unified server loads all 40 tools lazily — you pay the import cost only when a tool is actually called.
Consistent error envelopes. Government portals go down. A lot. Every response wraps the result in a status envelope (found, not_found, error, degraded) so agents can handle failures gracefully instead of crashing on unexpected HTML.
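As an illustration of the envelope idea, here is a minimal sketch that turns any fetcher into one of the four statuses. The `Envelope` class, its `detail` field, the `safe_lookup` helper, and the reading of `degraded` as "module is known to be partly broken" are assumptions, not the SDK's actual types.

```python
from dataclasses import dataclass
from typing import Any, Callable, Literal, Optional

Status = Literal["found", "not_found", "error", "degraded"]


@dataclass
class Envelope:
    """Hypothetical status envelope; field names are assumptions."""

    status: Status
    result: Optional[dict[str, Any]] = None
    detail: Optional[str] = None


def safe_lookup(
    fetch: Callable[[], Optional[dict[str, Any]]],
    degraded: bool = False,
) -> Envelope:
    """Run a portal fetcher and always return an envelope, never raise."""
    if degraded:
        # Assumed meaning: the module is known to be partly broken, so skip the call.
        return Envelope(status="degraded", detail="module is known to be degraded")
    try:
        record = fetch()
    except Exception as exc:  # portal down, timeout, unexpected HTML, ...
        return Envelope(status="error", detail=f"{type(exc).__name__}: {exc}")
    if record is None:
        return Envelope(status="not_found")
    return Envelope(status="found", result=record)


# A fetcher that blows up still yields a well-formed envelope.
failed = safe_lookup(lambda: 1 / 0)
```

An agent can branch on `status` alone and decide whether to retry, try another portal, or report the failure, without ever seeing a stack trace.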
Proxy support for geo-blocking. Most .go.id portals are only reliably accessible from Indonesian IPs. The SDK supports a PROXY_URL environment variable for Cloudflare Workers or similar proxies. Not ideal, but practical.
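One way a `PROXY_URL` setting could be honoured is sketched below. It assumes the proxy is a relay that takes the target address in a `url` query parameter, which is a common pattern for Worker-style proxies but is purely an assumption here; the `resolve_url` helper and the example hostnames are hypothetical, and the SDK may route requests differently.

```python
import os
from typing import Mapping, Optional
from urllib.parse import quote


def resolve_url(target: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the URL to request: the target itself, or the relay URL if PROXY_URL is set.

    Assumption: the relay accepts the real target in a `url` query parameter.
    """
    env = os.environ if env is None else env
    proxy = env.get("PROXY_URL", "").strip()
    if not proxy:
        return target
    return f"{proxy.rstrip('/')}/?url={quote(target, safe='')}"


# With no proxy configured, the portal is requested directly.
direct = resolve_url("https://bmkg.go.id/x", env={})
# With a (hypothetical) relay configured, the request goes through it.
relayed = resolve_url("https://bmkg.go.id/x", env={"PROXY_URL": "https://relay.example/"})
```

Keeping the decision in one small function means individual portal modules never need to know whether a proxy is in play.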
Hatchling build system. Modern Python packaging with pyproject.toml, optional dependency groups ([mcp], [api], [browser], [all]), and CLI entry points. No setup.py archaeology.
🔧 The Hard Parts
Portal instability. Government websites change without notice. URLs shift, auth requirements appear, entire endpoints vanish. The LHKPN module (wealth declarations from the anti-corruption commission) went from a public search API to requiring reCAPTCHA v3 — discovered mid-development. VCR cassettes for tests help, but you’re always one portal update away from a broken module.
Geo-restrictions. Testing from Sydney means most portals return 403s or timeouts. I deployed a Cloudflare Worker as a proxy, but CF-to-CF routing (many .go.id sites use Cloudflare too) creates its own problems. The test suite uses VCR cassettes recorded from Indonesian IPs.
Inconsistent data formats. One portal returns JSON, another returns server-rendered HTML, another requires Playwright for client-side rendering. The OSS and SIMBG modules need a real browser. The SDK abstracts this away, but each module’s scraper is genuinely different code.
📊 Current Status
- ✅ 40 MCP tools across 11 modules
- ✅ Published on PyPI (pip install indonesia-civic-stack)
- ✅ Listed on the MCP Registry
- ✅ Hosted MCP server on Railway (zero-install remote access)
- ✅ 63 tests passing, CI green
- 🔴 LHKPN module degraded (reCAPTCHA v3)
- ⚠️ Most portals need Indonesian IP or proxy for reliable access
The landing page at datarakyat.id has full documentation, module-by-module API references, and example prompts.
💭 Why Civic Tech + MCP Matters
Government data should be easy to access. These portals exist because Indonesian law mandates transparency — business registrations, halal certifications, official wealth declarations are all public record. But “public” often means “technically available if you know which website to visit and how to navigate it.”
MCP bridges that gap. An AI agent with civic-stack tools can answer questions about Indonesian public data as naturally as it answers questions about the weather. That’s not a technical achievement — it’s an accessibility one.
The code is MIT-licensed. If you’re building something for Indonesian civic data, I hope it saves you the weeks of portal-spelunking it took me.
Links:
- 📦 PyPI
- 💻 GitHub
- 🌐 Documentation
- 🤖 MCP Registry