About BTSE:
BTSE Group is a global leader in fintech and blockchain technology, anchored by three core business pillars: Exchange, Payments, and Infrastructure Development. Serving over 100 corporate clients worldwide, we provide white-label exchange and payment solutions spanning exchange infrastructure hosting and development, custody, wallets, payments, blockchain integration, trading, and more. We are looking for talented professionals across marketing, operations, customer support, and other departments. Roles may be on-site, remote, or hybrid, offered in collaboration with our local partner.
About the opportunity:
We're building an AI-powered research platform for institutional investors. Our platform turns vast amounts of market, alternative, and proprietary data into actionable intelligence — powered by AI agents that depend on clean, reliable, real-time data to do their job.
We need someone to own data. Not manage a team that does data. Own it — from finding the right sources, to getting them flowing, to making sure they stay healthy at scale.
Today we ingest from hundreds of sources. That number is growing fast. The sources are diverse: real-time market feeds, regulatory filings, on-chain blockchain data, news, social sentiment, alternative datasets, and proprietary client data. Some are free APIs. Some are $10K/month enterprise contracts. Some are clients pushing their own data into our platform. Every one of them is different, and most of them will break in ways you don't expect.
You'll evaluate vendors, negotiate deals, build integrations, monitor quality, track costs, and make the call on what's worth paying for. When something breaks at 2 AM, you'll know why before the alert finishes firing.
This is an end-to-end ownership role. No handoffs.
What you'll do:
- Build and maintain integrations with a large and growing number of external data sources: APIs, WebSockets, file drops, streams, scrapers, and formats you haven't seen yet
- Evaluate and compare data vendors across quality, reliability, coverage, cost, and terms of service
- Negotiate contracts and manage commercial relationships with data providers
- Design and operate high-throughput ingestion pipelines handling mixed workloads (real-time, near-real-time, batch, event-driven)
- Build monitoring that tells you, before anyone else, when data is late, wrong, incomplete, or drifting
- Manage data quality at scale: anomaly detection, cross-source validation, schema drift detection, gap filling
- Handle both structured data (time-series, tabular) and unstructured data (documents, text, images) with appropriate extraction and storage
- Track costs per source, usage per consumer, and ROI; recommend what to keep, upgrade, or cancel
- Build tooling that makes adding the next data source faster than the last one
- Use AI tools aggressively in your daily work: code generation, testing, documentation, anomaly analysis, and anything else that makes you faster
What we're looking for:
**You've done this before:**
- 5+ years building data pipelines that run in production, 24/7, with real SLAs
- Deep hands-on experience with SQL databases and time-series data
- Python as your primary language, comfortable with async programming
- You've integrated with dozens of external APIs and dealt with the reality of unreliable vendors, changing schemas, rate limits, and bad documentation
- You've built monitoring and alerting for data systems — not as an afterthought but as part of how you work
**You think about the whole picture:**
- You don't just connect to an API. You think about what happens when it goes down, when the schema changes, when the data is wrong, when the bill doubles
- You understand that data has a cost and a value, and not every source is worth keeping
- You've worked with data vendors commercially — contracts, pricing tiers, usage negotiations
**You use AI daily:**
- AI coding tools are part of your workflow today, not something you're curious about
- You can articulate specifically how AI makes you faster and where it doesn't help
- You'd be frustrated if you couldn't use AI in your work
Nice to have:
- Experience with financial or crypto market data
- Experience with streaming systems (Kafka or similar) at scale
- Experience with vector databases or embedding pipelines
- Experience with unstructured data extraction (PDFs, documents, NLP)
What we offer:
- A senior individual contributor role with full ownership of the data domain
- Direct access to leadership: no bureaucracy, fast decisions
- AI tools provided and encouraged across all work
- Remote-friendly, async-first
- Compensation commensurate with experience