This repository contains Docker Compose configurations for deploying the OM1 robot system.
For a fresh Thor (JetPack 7.0) system, follow these steps to set up OM1:
Use curl to download and install uv:
curl -LsSf https://astral.sh/uv/install.sh | sh
Docker is pre-installed on JetPack 7.0 systems, but your user needs permission to run it:
sudo usermod -aG docker $USER
newgrp docker
groups
You should see docker in the list of groups. If not, log out and log back in, then check again.
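The group check above can also be scripted. The following is a minimal sketch, not part of the OM1 tooling; `has_group` is a hypothetical helper name, and the check only reflects the groups of the current shell session.

```shell
# Succeeds if group name $1 appears in the space-separated list $2.
has_group() {
  for g in $2; do
    [ "$g" = "$1" ] && return 0
  done
  return 1
}

# id -nG prints the group names of the current process.
if has_group docker "$(id -nG)"; then
  echo "docker group is active in this shell"
else
  echo "docker group not active yet; log out and back in"
fi
```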
Download and install Docker Compose with the following commands:
sudo curl -L "https://github.com/docker/compose/releases/download/v2.34.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
Set the executable permissions:
sudo chmod +x /usr/local/bin/docker-compose
Verify the installation:
docker-compose --version
Install Poetry using the official installation script:
curl -sSL https://install.python-poetry.org | python3 -
Install the Poetry shell plugin for environment management:
poetry self add poetry-plugin-shell
Install the required packages:
sudo apt install portaudio19-dev python3-pyaudio
Install FFmpeg using the following command:
sudo apt install ffmpeg
Install the Chromium browser:
sudo snap install chromium
Pin snapd to a fixed revision and hold it to prevent automatic updates:
snap download snapd --revision=24724
sudo snap ack snapd_24724.assert
sudo snap install snapd_24724.snap
sudo snap refresh --hold snapd
Follow the official ROS2 installation guide for Ubuntu: ROS2 Installation.
After installing ROS2, source the ROS2 setup script:
source /opt/ros/jazzy/setup.bash
You can add this line to your ~/.bashrc file to source it automatically on terminal startup.
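Several steps in this guide ask you to add a line to ~/.bashrc. A minimal sketch of doing that without creating duplicate entries on repeated runs follows; `append_once` is a hypothetical helper name, not part of ROS2 or OM1.

```shell
# Append line $2 to file $1 unless an identical line is already present.
append_once() {
  touch "$1"
  grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Example (modifies ~/.bashrc, so it is left commented out here):
# append_once "$HOME/.bashrc" "source /opt/ros/jazzy/setup.bash"
```

The same helper would work for the export lines introduced in later steps.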
Install CycloneDDS for ROS2 communication:
sudo apt install ros-jazzy-rmw-cyclonedds-cpp
sudo apt install ros-jazzy-rosidl-generator-dds-idl
Now, set CycloneDDS as the default RMW implementation by adding the following line to your ~/.bashrc file:
export RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
You can restart your ROS2 daemon with the following commands:
ros2 daemon stop
ros2 daemon start
If you prefer to build CycloneDDS from source, use the following commands:
cd ~/Documents
mkdir -p GitHub && cd GitHub
git clone https://github.com/eclipse-cyclonedds/cyclonedds -b releases/0.10.x
cd cyclonedds && mkdir build install && cd build
cmake .. -DCMAKE_INSTALL_PREFIX=../install -DBUILD_EXAMPLES=ON
cmake --build . --target install
Then set the following environment variable in your ~/.bashrc file:
export CYCLONEDDS_HOME=$HOME/Documents/GitHub/cyclonedds/install
Open the network settings and find the network interface that the robot is connected to. In the IPv4 settings, set the method to Manual and add the following IP address:
192.168.123.xxx
and set the subnet mask to
255.255.255.0
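If you configure the address from the command line instead of the settings dialog, tools such as nmcli and ip take the mask as a prefix length (for example /24) rather than in dotted form. The sketch below shows the conversion with a hypothetical helper, `mask_to_prefix`; it assumes a well-formed mask and does not check that the set bits are contiguous.

```shell
# Convert a dotted-decimal subnet mask to a CIDR prefix length.
mask_to_prefix() {
  bits=0
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the mask into its four octets
  IFS=$old_ifs
  for octet in "$@"; do
    case "$octet" in
      255) bits=$((bits + 8)) ;;
      254) bits=$((bits + 7)) ;;
      252) bits=$((bits + 6)) ;;
      248) bits=$((bits + 5)) ;;
      240) bits=$((bits + 4)) ;;
      224) bits=$((bits + 3)) ;;
      192) bits=$((bits + 2)) ;;
      128) bits=$((bits + 1)) ;;
      0) ;;
      *) echo "invalid mask octet: $octet" >&2; return 1 ;;
    esac
  done
  echo "$bits"
}

mask_to_prefix 255.255.255.0   # prints 24
```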
You can create a CycloneDDS configuration file to customize its behavior. Create a file named cyclonedds.xml in your home directory, replacing enP2p1s0 with the name of the network interface connected to the robot:
<CycloneDDS>
  <Domain>
    <General>
      <Interfaces>
        <NetworkInterface name="enP2p1s0" priority="default" multicast="default" />
      </Interfaces>
    </General>
    <Discovery>
      <EnableTopicDiscoveryEndpoints>true</EnableTopicDiscoveryEndpoints>
    </Discovery>
  </Domain>
</CycloneDDS>
Then, set the CYCLONEDDS_URI environment variable in your ~/.bashrc file:
export CYCLONEDDS_URI=file://$HOME/cyclonedds.xml
Install v4l2-ctl using the following command:
sudo apt install v4l-utils
We assume you have bought the brain pack. If you don't have it, you can skip this section based on your needs.
To enable the screen animation service, install unclutter first to hide the mouse cursor:
sudo apt install unclutter
Then, add the following script to /usr/local/bin/start-kiosk.sh:
#!/bin/bash
unclutter -display :0 -idle 0.1 -root &
HOST=localhost
PORT=4173
# Wait for Docker service to listen
while ! nc -z $HOST $PORT; do
  echo "Waiting for $HOST:$PORT..."
  sleep 0.1
done
# Launch with autoplay permissions
exec chromium \
  --kiosk http://$HOST:$PORT \
  --start-fullscreen \
  --disable-infobars \
  --noerrdialogs \
  --autoplay-policy=no-user-gesture-required \
  --disable-features=PreloadMediaEngagementData,MediaEngagementBypassAutoplayPolicies \
  --no-first-run \
  --disable-session-crashed-bubble \
  --disable-translate \
  --window-position=0,0
Make it executable:
sudo chmod +x /usr/local/bin/start-kiosk.sh
Add the following unit to /etc/systemd/system/kiosk.service to launch the kiosk mode automatically on boot:
# /etc/systemd/system/kiosk.service
[Unit]
Description=Kiosk Browser
After=docker.service
Requires=docker.service
[Service]
Environment=DISPLAY=:0
ExecStart=/usr/local/bin/start-kiosk.sh
Restart=always
User=openmind
[Install]
WantedBy=graphical.target
Enable and start the service:
sudo systemctl daemon-reload
sudo systemctl enable kiosk.service
sudo systemctl start kiosk.service
Note
To stop the kiosk service, use sudo systemctl stop kiosk.service.
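The wait loop in start-kiosk.sh polls until the port answers and never gives up. As a variation, the sketch below bounds the wait so the script can exit and let systemd's Restart=always retry. `wait_until` is a hypothetical helper and the 30-second limit is an arbitrary assumption.

```shell
# Run the given command once per second until it succeeds
# or $1 seconds have elapsed. Returns 1 on timeout.
wait_until() {
  timeout_s=$1
  shift
  elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}

# Example use inside start-kiosk.sh (needs nc, as in the script above):
# wait_until 30 nc -z localhost 4173 || exit 1
```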
To enable the Acoustic Echo Cancellation (AEC) service, uninstall PipeWire if it's installed and install PulseAudio:
sudo apt remove --purge pipewire-audio-client-libraries pipewire-pulse wireplumber
Then install PulseAudio:
sudo apt install pulseaudio pulseaudio-module-bluetooth pulseaudio-utils pavucontrol
Next, stop the PipeWire daemon and start the PulseAudio daemon if it's not already running:
systemctl --user mask pipewire.service
systemctl --user mask pipewire.socket
systemctl --user mask pipewire-pulse.service
systemctl --user mask pipewire-pulse.socket
systemctl --user mask wireplumber.service
systemctl --user stop pipewire-pulse.service
systemctl --user stop pipewire.service wireplumber.service
systemctl --user disable pipewire.service wireplumber.service
systemctl --user enable --now pulseaudio.service
Next, add the following configuration to keep PulseAudio from exiting when idle:
mkdir -p ~/.config/pulse
cat > ~/.config/pulse/client.conf << 'EOF'
autospawn = yes
daemon-binary = /usr/bin/pulseaudio
EOF
# Create daemon config to disable idle timeout
cat > ~/.config/pulse/daemon.conf << 'EOF'
exit-idle-time = -1
EOF
Now, you can restart the system to ensure PulseAudio is running properly.
sudo reboot
Note
After reboot, if the audio devices are not automatically detected, you may need to manually restart PulseAudio with the command:
systemctl --user restart pulseaudio
Now, add the following script to /usr/local/bin/set-audio-defaults.sh:
#!/bin/bash
set -e
sleep 5
# First, set the master source volume to 200%
pactl set-source-volume "alsa_input.usb-R__DE_R__DE_VideoMic_GO_II_FEB0C614-00.mono-fallback" 131072
pactl set-source-mute "alsa_input.usb-R__DE_R__DE_VideoMic_GO_II_FEB0C614-00.mono-fallback" 0
# Unload then load AEC module
pactl unload-module module-echo-cancel || true
pactl load-module module-echo-cancel \
  use_master_format=1 \
  aec_method=webrtc \
  source_master="alsa_input.usb-R__DE_R__DE_VideoMic_GO_II_FEB0C614-00.mono-fallback" \
  sink_master="alsa_output.platform-88090b0000.hda.hdmi-stereo" \
  source_name="default_mic_aec" \
  sink_name="default_output_aec" \
  source_properties="device.description=Microphone_with_AEC" \
  sink_properties="device.description=Speaker_with_AEC"
# Wait a moment for the module to fully initialize
sleep 2
# Set defaults
pactl set-default-source default_mic_aec
pactl set-default-sink default_output_aec
# Retry volume setting until device appears and volume is set correctly
for i in {1..15}; do
  if pactl list short sources | grep -q default_mic_aec; then
    # Set volume to 200% (131072)
    pactl set-source-volume default_mic_aec 131072
    pactl set-source-mute default_mic_aec 0
    # Verify the volume was set
    CURRENT_VOL=$(pactl list sources | grep -A 7 "Name: default_mic_aec" | grep "Volume:" | awk '{print $3}')
    if [ "$CURRENT_VOL" = "131072" ]; then
      echo "Microphone volume successfully set to 200%"
      break
    else
      echo "Volume is $CURRENT_VOL, retrying... ($i/15)"
    fi
  else
    echo "Waiting for AEC source to appear... ($i/15)"
  fi
  sleep 1
done
# Final verification
pactl list sources | grep -A 7 "Name: default_mic_aec" | grep -E "Name:|Volume:"
Use the following command to get the list of audio sources and sinks:
pactl list short
Note
Replace alsa_output.platform-88090b0000.hda.hdmi-stereo with your speaker sink and alsa_input.usb-R__DE_R__DE_VideoMic_GO_II_FEB0C614-00.mono-fallback with your microphone source.
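The value 131072 in the script is PulseAudio's raw volume scale, in which 65536 corresponds to 100%. If you want a different gain, the arithmetic can be sketched as below; `pa_volume` is a hypothetical helper, not a pactl command. The script compares against the raw number, so change the 131072 occurrences consistently if you pick another level.

```shell
# Convert a percentage to PulseAudio's raw volume scale (65536 = 100%).
pa_volume() {
  echo $(( $1 * 65536 / 100 ))
}

pa_volume 200   # prints 131072
pa_volume 150   # prints 98304
```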
Make it executable:
sudo chmod +x /usr/local/bin/set-audio-defaults.sh
Create a systemd user service to run the script on login:
mkdir -p ~/.config/systemd/user
vim ~/.config/systemd/user/audio-defaults.service
Add the following content:
[Unit]
Description=Set Default Audio Devices
After=pulseaudio.service
Wants=pulseaudio.service
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/local/bin/set-audio-defaults.sh
[Install]
WantedBy=default.target
Enable and start the service:
systemctl --user daemon-reload
systemctl --user enable audio-defaults.service
systemctl --user start audio-defaults.service
Now, export your user ID as an environment variable in your ~/.bashrc file to allow the Docker containers to access the PulseAudio server properly:
export HOST_USER_ID=$(id -u)
Then, reload your Bash profile to apply the changes:
source ~/.bashrc
The cloud docker management service allows remote management of Docker containers via a web interface. To enable this service, follow these steps:
- Sign up for an account on OpenMind Portal.
- Create your OpenMind API key from the Dashboard page.
- Set the API key as an environment variable in your Bash profile:
vim ~/.bashrc
export OM_API_KEY="your_api_key_here"
- Get the API Key ID from the Dashboard page. The API Key ID is a 16-character string, such as om1_live_<16 characters>. Now, export the API Key ID as an environment variable:
vim ~/.bashrc
export OM_API_KEY_ID="your_api_key_id_here"
- Set the robot type that you are using:
vim ~/.bashrc
export ROBOT_TYPE="go2" # or "go1", "tron"
Now, reload your Bash profile to apply the changes:
source ~/.bashrc
To enable the Over-The-Air (OTA) update service for Docker containers, you need to set up two Docker services: ota_agent and ota_updater. These services will allow you to manage and update your Docker containers remotely via the OpenMind Portal.
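Before starting the services, it can help to confirm that the variables exported in the previous steps are visible in the current shell. The following is a minimal sketch; `require_env` is a hypothetical helper, and the variable list is taken from the exports above.

```shell
# Print the name of each unset or empty variable; fail if any is missing.
require_env() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}

require_env OM_API_KEY OM_API_KEY_ID ROBOT_TYPE HOST_USER_ID \
  || echo "Add the missing exports to ~/.bashrc, then run: source ~/.bashrc"
```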
To create an ota_updater.yml file, follow these steps:
cd ~
vim ota_updater.yml
Add the following content from this ota_updater.yml to the ota_updater.yml file.
Note
You can use the stable version as well. The example file linked above is the latest version.
Save and close the file. Now, you can start the OTA updater service using Docker Compose:
docker-compose -f ota_updater.yml up -d
A .ota directory will be created in your home directory to store the OTA configuration files.
Now, you can set up the ota_agent service. Create an ota_agent.yml file:
cd ~/.ota
vim ota_agent.yml
Add the following content from this ota_agent.yml to the ota_agent.yml file.
Note
You can use the stable version as well. The example file linked above is the latest version.
Save and close the file. Now, you can start the OTA agent service using Docker Compose:
docker-compose -f ota_agent.yml up -d
Now, both the OTA updater and agent services should be running. You can verify their status using the following commands:
docker ps | grep ota_updater
docker ps | grep ota_agent
You can now manage and update your Docker containers remotely via the OpenMind Portal.
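The two grep checks above can be combined into one script that reports which container is absent. This is a sketch only: `missing_containers` is a hypothetical helper, and it assumes the container names contain ota_updater and ota_agent, as the grep commands imply.

```shell
# Read container names on stdin; report each expected name with no match.
missing_containers() {
  names=$(cat)
  rc=0
  for want in "$@"; do
    case "$names" in
      *"$want"*) ;;
      *) echo "not running: $want"; rc=1 ;;
    esac
  done
  return "$rc"
}

# Only query Docker when the CLI is available.
if command -v docker >/dev/null 2>&1; then
  if docker ps --format '{{.Names}}' | missing_containers ota_updater ota_agent; then
    echo "both OTA containers are running"
  fi
fi
```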
Riva models are encrypted and require authentication to download. To download Riva models, you need to set up the NVIDIA NGC CLI tool.
Warning
Please run the following command in your root directory. Otherwise, the docker-compose file we provide for Riva services may not work properly.
To generate your own NGC api key, check this video.
wget --content-disposition https://ngc.nvidia.com/downloads/ngccli_arm64.zip && unzip ngccli_arm64.zip && chmod u+x ngc-cli/ngc
find ngc-cli/ -type f -exec md5sum {} + | LC_ALL=C sort | md5sum -c ngc-cli.md5
echo export PATH=\"\$PATH:$(pwd)/ngc-cli\" >> ~/.bash_profile
source ~/.bash_profile
ngc config set
This will ask several questions during the install. Choose these values:
Enter API key [no-apikey]. Choices: [<VALID_APIKEY>, 'no-apikey']: <YOUR_API_KEY>
Enter CLI output format type [ascii]. Choices: ['ascii', 'csv', 'json']: ascii
Enter org [no-org]. Choices: ['<YOUR_ORG>']: <YOUR_ORG>
Enter team [no-team]. Choices: ['<YOUR_TEAM>', 'no-team']: <YOUR_TEAM>
Enter ace [no-ace]. Choices: ['no-ace']: no-ace
Warning
The commands above create a ~/.bash_profile file if it does not exist. If you already have a ~/.bashrc file, make sure to merge the two files properly. Otherwise, your Bash environment may not work as expected.
Download the Riva Embedded models for JetPack 7.0:
ngc registry resource download-version nvidia/riva/riva_quickstart_arm64:2.24.0
cd riva_quickstart_arm64_v2.24.0
sudo bash riva_init.sh
# riva_init.sh initializes the Riva models locally
# it asks for the NGC API key to download the models, use <YOUR_API_KEY>
# the download will take a while
Note
The following command is for testing.
Run Riva locally:
cd riva_quickstart_arm64_v2.24.0
bash riva_start.sh
Now, please expose these environment variables in your ~/.bashrc file to use Riva service:
export RIVA_API_KEY=<YOUR_API_KEY>
export RIVA_API_NGC_ORG=<YOUR_ORG>
export RIVA_EULA=accept
source ~/.bashrc
We provide an openmindagi/riva-speech-server:2.24.0-l4t-aarch64 Docker image that has Riva ASR and TTS endpoints with the example code to run Riva services on Jetson devices. You can pull the image directly without downloading the models from NGC:
docker pull openmindagi/riva-speech-server:2.24.0-l4t-aarch64
The dockerfile can be found here and the docker-compose file can be found here.
Note
Once you download the models from NGC and export the environment variables, you can use OpenMind Portal to download Riva dockerfile and run Riva services.
Once you have Riva services running, you can use the following script to test the ASR and TTS endpoints:
git clone https://github.com/OpenMind/OM1-modules.git
cd OM1-modules
# Activate poetry shell
poetry shell
# Install dependencies
poetry install
# Test ASR
python3 -m om1_speech.main --remote-url=ws://localhost:6790
# Test TTS
poetry run om1_tts --tts-url=https://api-dev.openmind.org/api/core/tts --device=<optional> --rate=<optional>
The services use the following ports:
- 1935: MediaMTX RTMP Server
- 6790: OM Riva ASR Websocket Server API
- 6791: OM Riva TTS HTTP Server API
- 8000: MediaMTX RTMP Server API
- 8001: MediaMTX HLS Server API
- 8554: MediaMTX RTSP Server API
- 8860: Qwen 30B Quantized API
- 8880: Kokoro TTS API
- 8888: MediaMTX Streaming Server API
- 50000: Riva Server API
- 50051: Riva NMT Remote TTS/ASR API
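To see which of these ports currently answer on a host, a loop like the following may be useful. It is a sketch under assumptions: `report_ports` is a hypothetical helper, the probe command is supplied by the caller, and a TCP probe such as nc -z will not detect UDP-only listeners.

```shell
# Ports copied from the list above.
PORTS="1935 6790 6791 8000 8001 8554 8860 8880 8888 50000 50051"

# $1 is the host; the remaining arguments form a probe command that is
# called with the host and port appended and succeeds when the port is open.
report_ports() {
  host=$1
  shift
  for port in $PORTS; do
    if "$@" "$host" "$port" >/dev/null 2>&1; then
      echo "$port open"
    else
      echo "$port closed"
    fi
  done
}

# Example (assumes nc is installed, as in the kiosk script):
# report_ports localhost nc -z
```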