Applied AI engineering studio

Ship GenAI that actually does the job.

We design and deliver fine-tuned models, retrieval pipelines, agentic workflows and computer vision systems for B2B teams that need outcomes, not slideware.

Discuss a project · See services

Nha Trang, Vietnam · Since Nov 2023 · EN · VI · RU

What we build

Five disciplines, one delivery team.

We pick the smallest set of tools that solves your problem. No buzzword-driven architecture.

Fine-tuning language models

Adapt open-weight models to your domain, data and terms of use. LoRA / QLoRA for fast iteration, full fine-tunes when warranted, preference tuning (DPO) where the data supports it.

Domain-specific instruction tuning
Evaluation harness with your tasks
Self-hosted or managed inference

RAG & MCP infrastructure

Turn internal knowledge into a queryable system. Hybrid retrieval, evaluation pipelines, observability, and MCP servers your agents and IDEs can call directly.

Hybrid BM25 + dense retrieval
Eval pipelines, not vibes
MCP servers for agents & IDEs
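
For readers new to hybrid retrieval, here is a minimal sketch of one common way to merge a BM25 ranking with a dense (embedding) ranking: reciprocal rank fusion. It is an illustration under assumptions, not a description of any production pipeline. The function name, the document ids and the constant k = 60 are hypothetical choices made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in;
    documents ranked well by both retrievers accumulate the highest score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of a keyword (BM25) retriever and a dense retriever.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
dense_ranking = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(fused)
```

Rank-based fusion is convenient because BM25 scores and embedding similarities live on different scales, so combining positions avoids having to normalise raw scores. Weighted score blending is an equally valid alternative when calibrated scores are available.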

Corporate multi-agent systems

Coordinated agents that plan, act, recover and stay in scope. Tool use, memory, guardrails and human-in-the-loop hand-offs — production, not demos.

Plan / act / reflect loops
Guardrails & permission boundaries
Observable, replayable runs

Security audits & pentests

Adversarial review of your stack: web, network and AI-specific. We test for prompt injection, jailbreaks, model and prompt extraction, and supply-chain risks across the LLM lifecycle.

OWASP Top 10 & OWASP Top 10 for LLM Applications
AI red-teaming & jailbreak suites
Reproducible reports & remediations

Computer vision research

Custom models for measurement, detection and broadcast analytics. From edge devices to broadcast-grade pipelines, designed for the metrics that matter to your operation.

Detection, tracking, pose & geometry
Real-time ONNX / TensorRT pipelines
Edge to broadcast deployment

How we work

Short loops. No vapor.

A predictable engagement model that lets your team see traction in weeks, not quarters.

01 · Discovery

Outcome before architecture

A 60–90 minute working call. We map the outcome, constraints, available data and what "shipped" looks like for your team.

02 · Pilot

Smallest proof, real data

A scoped 2–6 week pilot that proves the approach on your data. Fixed scope, fixed timeline, written hypothesis.

03 · Delivery

Production, your infrastructure

Ship into your cloud or on-prem with your guardrails. Documented, observable, owned by your team, not by us.

04 · Support

Drift, evals, evolution

Optional retainer for model updates, eval suites and monitoring, so the system keeps pace as models and your data drift.

Selected work

Things we have shipped.

Public case studies are in progress. Below is a short selection; we are happy to discuss specifics under NDA.

01 · Sports

Sports broadcast computer vision

Per-frame player and ball tracking, event detection and broadcast overlays running at production latency for live sports.

Role: CV models, real-time pipeline, broadcast integration
Stack: PyTorch · ONNX · TensorRT · NVDEC · WebRTC

02 · Locker

Smart locker control software

End-to-end control software for mobile parcel lockers — device firmware bridge, operator dashboard, customer flow and analytics.

Role: System architecture, backend, operator UX
Stack: Go · PostgreSQL · MQTT · Embedded Linux

03 · Voice

Simultaneous voice translation

Real-time speech-to-speech translation from video, preserving speaker characteristics. Built for long-form content and live streams.

Role: ASR, MT, voice cloning, latency budget
Stack: Whisper · NLLB · Coqui · Custom alignment

Talk through your project

About

A small, senior team in Nha Trang.

DeQuzzy was founded on 2 November 2023 in Nha Trang, Vietnam. We are a deliberately small team of senior engineers who have shipped applied AI, classical computer vision and secure systems for B2B clients across Asia and Europe.

We pick projects where modern AI is the unfair advantage, not the marketing layer. We work in Vietnamese, English and Russian. Remote-first, but happy to fly to you when it matters.

5+: core engineers
3: working languages
0: sub-contracted code

Contact

Tell us what you are building.

The form is being rebuilt. In the meantime, the fastest channel is e-mail, or you can reach us through the channels below.