Intelligent touchscreens integrated with a locally hosted VLM for a low-latency, high-privacy AIoT experience. Expansion-ready to fit your needs.
Existing HMI solutions haven't kept pace with the latest innovations in multimodal LLMs. Users expect intelligent, conversational interfaces — not static button grids.
Adding LLM or AI features means ongoing cloud API costs or expensive library licenses. Per-query pricing makes scaling unpredictable and eats into margins.
Facial recognition and voice commands processed in the cloud mean your biometric data leaves your premises. For many applications, this is simply not acceptable.
Non-real-time operating systems can't prioritize critical events. When an alarm triggers, Android and Linux panels offer "best effort" — not guaranteed response times.
Finding the right hardware is a challenge. Long lead times, minimum order quantities, and component shortages make it difficult to get what you need, when you need it.
Modern applications require integrating new sensors — radar, multiple cameras, environmental monitors. Most HMI panels lack the I/O flexibility to accommodate them.
Our solution addresses all of these challenges.
Hardware and software designed together, optimized for local AI and real-time performance. Cost-effective design leveraging open-source platforms.
Thin client architecture — powerful enough for AI vision, light enough to stay responsive. Acts as an intelligent interface to your hub, panel, or automation system.
Dual-core RISC-V processor running at up to 400 MHz, designed for HMI applications and edge computing with rich I/O capabilities.
2.4 and 5 GHz dual-band Wi-Fi 6, Bluetooth 5 (LE), and 802.15.4 for Thread, Zigbee, Matter, HomeKit, and MQTT support.
Facial recognition for secure operations, processed entirely on the device. No cloud, no external servers, no privacy compromise.
Hardware video encoding for WebRTC streaming. Display IP camera feeds or use built-in camera for video intercom applications.
Available I/O for adding 60/77 GHz radar for 3D person sensing, dual cameras, co-processors for Z-Wave, proprietary wireless, or RS-485 bus protocols.
Real-time operating system delivers consistent, predictable performance — faster and more reliable than Linux or Android-based alternatives.
An edge AI platform for developing and deploying intelligent agents — completely offline with enterprise-grade capabilities.
40 TOPS AI acceleration with the Ara-2 Runtime SDK. Run 8B VLMs locally with an OpenAI-compatible REST API — no internet required.
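As a rough sketch of what that looks like in practice, the snippet below sends a camera frame and a question to the box using the standard OpenAI Python client. The hostname, port, and model name are placeholders; substitute the values from your own deployment.

```python
# Minimal sketch: query the VLM Box over its OpenAI-compatible REST API.
# Hostname, port, and model id are placeholders for your own deployment.
import base64
from openai import OpenAI

client = OpenAI(
    base_url="http://vlm-box.local:8000/v1",  # local endpoint, no cloud involved
    api_key="not-needed",                     # offline server; the key is unused
)

# Encode a camera frame so it can be sent inline with the prompt.
with open("doorbell_frame.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="local-vlm",  # whatever model id the box exposes
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Who is at the door and what are they carrying?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
    max_tokens=200,
)
print(response.choices[0].message.content)
```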
Built on the Google ADK orchestrator for multi-agent workflows. Seamlessly integrate AI agents via the Agent-to-Agent (A2A) protocol for complex task automation.
Extend capabilities with Model Context Protocol (MCP) tools. Connect to Home Assistant, time series databases, and custom backends with plug-and-play ease.
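For illustration, here is a minimal custom MCP tool server, built with the official Python SDK, that an orchestrator could call to switch a Home Assistant light. The Home Assistant URL, token, and entity ID are placeholders, and this sketch stands in for whichever integration you actually deploy.

```python
# Sketch of a custom MCP tool server exposing one Home Assistant action.
# The Home Assistant URL, token, and entity id are placeholders.
import os
import requests
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("home-assistant-tools")

HA_URL = os.environ.get("HA_URL", "http://homeassistant.local:8123")
HA_TOKEN = os.environ["HA_TOKEN"]  # long-lived access token

@mcp.tool()
def turn_on_light(entity_id: str) -> str:
    """Turn on a Home Assistant light, e.g. 'light.kitchen'."""
    resp = requests.post(
        f"{HA_URL}/api/services/light/turn_on",
        headers={"Authorization": f"Bearer {HA_TOKEN}"},
        json={"entity_id": entity_id},
        timeout=5,
    )
    resp.raise_for_status()
    return f"{entity_id} turned on"

if __name__ == "__main__":
    mcp.run(transport="stdio")  # the agent framework connects over stdio
```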
Deploy at scale via Docker containers. Gateway server architecture enables easy monitoring and management of ecosystem components.
Bring any supported LLM to work within the orchestrator. Fine-tune with your RAG data for domain-specific responses.
Built-in safety mechanisms ensure appropriate, reliable responses for consumer and enterprise deployments.
Pre-built client agent and meeting apps for Android, iOS, Desktop, Web, and Embedded Linux — get started instantly on any platform.
From compact interfaces to full tablet experiences — same platform, same software stack.
We deliver complete hardware and software solutions tailored to your requirements.
All core functionality runs on your premises. No cloud accounts, no subscriptions, no data leaving your network.
Touchscreen and VLM are one-time costs. Engineering hours available for customization, porting, and expansion.
No heavy processes on the display. Clean messaging interface with your hub, panel, or automation system.
Connect to local AI agents like Qwen3 VL or cloud-based services. MCP support for extensible AI workflows.
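The thin-client pattern boils down to simple message passing. The Python sketch below models the idea with MQTT; the production touchscreen firmware is RTOS-based, so treat this as an illustration of the message flow rather than device code, and note that the broker address and topic names are examples only.

```python
# Illustrative sketch of the thin-client messaging pattern: the display
# publishes user intents and subscribes to state updates from the hub.
import json
import paho.mqtt.client as mqtt

def on_connect(client, userdata, flags, reason_code, properties):
    # Listen for state the hub wants the panel to display.
    client.subscribe("panel/kitchen/state")

def on_message(client, userdata, msg):
    state = json.loads(msg.payload)
    print("update UI with:", state)

client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
client.on_connect = on_connect
client.on_message = on_message
client.connect("hub.local", 1883)

# Publish an intent instead of running any heavy logic on the display.
client.publish("panel/kitchen/intent",
               json.dumps({"action": "lights_off", "room": "kitchen"}))
client.loop_forever()
```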
Intelligent display paired with your hub or panel. Facial recognition for secure disarm, natural language control, and seamless integration with existing automation systems.
Rich connectivity options — Thread, Zigbee, Matter, HomeKit, MQTT — make this the ideal interface for IoT deployments. Display sensor data, control devices, monitor systems.
Real-time operating system delivers consistent performance for machine interfaces. Expansion options for RS-485 bus protocols and industrial sensors. Rugged and reliable.
Expressive face display with local AI vision and VLM connectivity. Perfect for social robots, assistants, and interactive installations requiring natural interaction.
Safe, offline AI with built-in guardrails. No inappropriate content, no data collection, no cloud dependency. Parents can trust what their children interact with.
More information about our team and mission will be available here.
Common questions about our platform and technology.
No, ARM64 apps cannot run directly on the ESP32-P4. The ESP32-P4 uses a RISC-V 32-bit architecture and runs FreeRTOS (a real-time operating system), not Linux. ARM64 binaries are incompatible — it's a completely different instruction set and operating environment.
However, the VLM Box runs on ARM64 Linux (i.MX8M Plus with Cortex-A53), so existing ARM64 Linux applications can potentially run there with minimal modifications. This architecture gives you the best of both worlds: a responsive RTOS-based touchscreen for HMI, and a Linux-based edge computer for complex processing tasks.
The ESP32-P4 can perform basic face detection using Espressif's ESP-WHO framework. However, for secure face recognition with liveness detection (anti-spoofing), the processing requirements exceed what the ESP32-P4 can handle in real time.
For advanced biometric applications, we use a two-step approach: the ESP32-P4 handles initial face detection, then streams the camera frames along with Time-of-Flight (ToF) sensor data to the i.MX8M Plus edge computer. The i.MX8M Plus runs the liveness detection algorithm using ToF depth data to distinguish real faces from photos, videos, or masks — all processed locally without cloud dependency.
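To make the idea concrete, here is a toy sketch of the depth cue that liveness detection relies on: a printed photo or screen is nearly flat, while a real face shows centimetres of relief in the ToF data. The threshold, region handling, and function names are illustrative assumptions, not the shipped algorithm.

```python
# Toy illustration of depth-based liveness: a flat photo has almost no
# depth variation across the face region, a real face does.
import numpy as np

def looks_live(depth_map: np.ndarray, face_box: tuple[int, int, int, int],
               min_relief_mm: float = 15.0) -> bool:
    """depth_map: ToF depth in millimetres; face_box: (x, y, w, h) from the detector."""
    x, y, w, h = face_box
    face_depth = depth_map[y:y + h, x:x + w].astype(float)
    face_depth = face_depth[face_depth > 0]          # drop invalid ToF pixels
    if face_depth.size == 0:
        return False
    # Peak-to-peak relief inside the face region, robust to outliers.
    relief = np.percentile(face_depth, 95) - np.percentile(face_depth, 5)
    return relief >= min_relief_mm

# Example: a synthetic "flat photo" fails the check.
flat = np.full((240, 320), 600.0)                    # every pixel at 600 mm
print(looks_live(flat, (100, 60, 80, 100)))          # False
```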
The ESP32-P4 is purpose-built for Human-Machine Interface (HMI) applications. Its RTOS foundation delivers consistent, predictable response times — critical for touch interfaces. It handles display rendering, touch input, camera preview, audio I/O, and network communication with sub-millisecond latency.
More powerful Linux-based processors introduce unpredictable latency, higher power consumption, longer boot times, and greater complexity. By separating the UI (ESP32-P4) from heavy compute (VLM Box), we optimize each component for its specific role.
No, all core functionality runs offline. The touchscreen and VLM Box communicate over your local network. AI processing, face recognition, voice commands, and VLM inference all happen on-premises — no cloud accounts, no subscriptions, no data leaving your network.
Internet connectivity is optional and only needed if you want features like remote access, OTA updates, or integration with external services.
The touchscreen (ESP32-P4) is the user interface — it handles display, touch, camera, microphone, speaker, and real-time interactions. It runs an RTOS for instant responsiveness and can perform basic on-device AI like face detection.
The VLM Box is an edge AI computer with a 40 TOPS NPU. It handles compute-intensive tasks: running 7B Vision Language Models, advanced face recognition, natural language understanding, and complex reasoning — all locally. Think of it as a local AI server that your touchscreen can query.
Yes. The VLM can be fine-tuned with your own RAG (Retrieval-Augmented Generation) data to provide domain-specific responses. You can customize the knowledge base, personality, and response style for your specific application — whether that's a smart home assistant, industrial support system, or customer service kiosk.
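As a simplified illustration of the RAG flow, the sketch below retrieves the most relevant snippets from a small knowledge base and passes them to the local endpoint as context. Retrieval here is naive keyword overlap purely for clarity; a real deployment would use embeddings, and the endpoint and model name are placeholders.

```python
# Minimal sketch of grounding answers in your own documents before
# calling the VLM Box's local OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://vlm-box.local:8000/v1", api_key="not-needed")

knowledge_base = [
    "The boiler service hatch is behind the utility-room panel.",
    "Filter cartridges should be replaced every six months.",
    "Error E42 on the ventilation unit means the intake is blocked.",
]

def retrieve(question: str, k: int = 2) -> list[str]:
    # Naive keyword-overlap ranking, for illustration only.
    words = set(question.lower().split())
    scored = sorted(knowledge_base,
                    key=lambda doc: len(words & set(doc.lower().split())),
                    reverse=True)
    return scored[:k]

question = "What does error E42 mean?"
context = "\n".join(retrieve(question))

reply = client.chat.completions.create(
    model="local-vlm",
    messages=[
        {"role": "system",
         "content": f"Answer using only this site documentation:\n{context}"},
        {"role": "user", "content": question},
    ],
)
print(reply.choices[0].message.content)
```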
The touchscreen UI is also fully customizable with your branding, color schemes, and interface layouts.
Thank you for your interest in Local VLM Touch! Join our waitlist to receive updates on our development progress and be among the first to access our platform.
We respect your privacy. No spam, ever.
Tell us about your application. We'll respond within 24 hours to schedule a conversation.