Get started
This guide gives step-by-step instructions for setting up the project environment (GPU drivers and CUDA, Docker, and dependencies) and for running the system.
Host PC Installation
GPU Driver Installation
Prerequisites: Ubuntu 20.04 or later (22.04 recommended)
Steps to Install CUDA Drivers:
Add the official graphics drivers repository:
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
reboot
After rebooting, open Software & Updates → Additional Drivers and select the appropriate driver for your GPU.
Apply the changes and reboot the system:
reboot
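Once the machine is back up, it can help to confirm that the driver actually loaded before moving on. The sketch below is one way to do that; `check_gpu_driver` is a hypothetical helper name, and it assumes only that the installed driver package puts `nvidia-smi` on the PATH.

```shell
# Hypothetical helper: print the GPU name and driver version if the
# NVIDIA driver tools are visible, otherwise explain what is missing.
check_gpu_driver() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
    else
        echo "nvidia-smi not found: the driver is not installed or not on PATH" >&2
        return 1
    fi
}

# Usage: check_gpu_driver
```

A non-zero return usually means the driver selection in Additional Drivers was not applied or the reboot was skipped.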
Dependencies Installation
sudo apt update
sudo apt install terminator htop -y
sudo apt install python3-dev python3-venv python3-pip -y
Docker and NVIDIA Container Toolkit Installation
https://docs.docker.com/engine/install/ubuntu/
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
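After both installs, a common smoke test is to run `nvidia-smi` inside a throwaway container, in the spirit of the sample workload in NVIDIA's guide. A minimal sketch follows; `verify_gpu_container` is a hypothetical name, and the `DOCKER` variable is an assumption added here so the command can be dry-run (for example with `DOCKER=echo`). It is not part of Docker itself.

```shell
# Hypothetical helper: run nvidia-smi in a temporary Ubuntu container.
# Set DOCKER=echo to print the arguments instead of running them.
verify_gpu_container() {
    # The default is left unquoted on purpose so it splits into "sudo" and "docker".
    ${DOCKER:-sudo docker} run --rm --gpus all ubuntu nvidia-smi
}

# Usage: verify_gpu_container
```

If the container prints the same GPU table as the host, the toolkit is wired into Docker correctly.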
Add some useful custom aliases (Optional)
# utils
alias chown_keti='sudo chown -R keti .'
alias sb='source ~/.bashrc'
alias eb='sudo gedit ~/.bashrc'
alias nb='nano ~/.bashrc'
alias about_pc='lsb_release -a'
alias create_py_simple_pkg='cookiecutter https://github.com/mtbui2010/python_pkg_simple_template.git'
alias vscode='sudo code --no-sandbox --user-data-dir ~/.vscode_cache'
alias run_pyvir='source ~/.pyvir/bin/activate'
#docker
alias docker_start='sudo docker start'
alias docker_stop='sudo docker stop'
alias docker_run='sudo docker run -it'
alias docker_watch='sudo watch docker ps -a'
alias image_watch='sudo watch docker image ls -a'
alias image_rm='sudo docker image rm'
#functions
dockerexec() {
xhost +local: && sudo docker start "$@" && sudo docker exec -it "$@" /bin/bash
}
dockerrm() {
sudo docker stop "$@" && sudo docker rm "$@"
}
gitpush() {
local date_str=$(date +%Y%m%d)
git add . .gitignore && git commit -m "$date_str - $*" && git push
}
rund() {
local container_name="$1"
local ros_domain_id="$2"
shift 2
# Allow local GUI access
xhost +local:
# Start the container
sudo docker start "$container_name"
# Execute the command inside the container
sudo docker exec -it "$container_name" bash -c "
source ~/.bashrc
source /opt/ros/humble/setup.bash
source ~/ros2_ws/install/setup.bash
export ROS_DOMAIN_ID=$ros_domain_id
$*"
}
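`rund` takes the container name and the ROS domain ID as its first two arguments and treats everything after them as the command to run inside the container. The sketch below mirrors only that argument handling, so it can be tried without Docker; `rund_preview` is a hypothetical name, and `u22cu12` and `10` are example values.

```shell
# Hypothetical preview of how rund splits its arguments.
# It prints what would be used instead of touching Docker.
rund_preview() {
    local container_name="$1"
    local ros_domain_id="$2"
    shift 2
    printf 'container: %s\n' "$container_name"
    printf 'export ROS_DOMAIN_ID=%s\n' "$ros_domain_id"
    # "$*" joins the remaining arguments into the command line.
    printf 'command: %s\n' "$*"
}

# Usage: rund_preview u22cu12 10 ros2 topic list
```

With the real function, the equivalent call would be `rund u22cu12 10 ros2 topic list`.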
Install and configure SSH and FTP server (Optional)
Configure FTP
#!/bin/bash
set -e
FTP_DIR="/media/keti/workdir"
FTP_USER="keti"
echo "🛠 Backing up original config..."
sudo cp /etc/vsftpd.conf /etc/vsftpd.conf.bak
echo "📝 Writing new vsftpd config..."
sudo tee /etc/vsftpd.conf > /dev/null <<EOL
listen=YES
listen_ipv6=NO
anonymous_enable=NO
local_enable=YES
write_enable=YES
local_umask=022
chroot_local_user=YES
allow_writeable_chroot=YES
user_sub_token=\$USER
local_root=${FTP_DIR}
pasv_enable=YES
pasv_min_port=10000
pasv_max_port=10100
EOL
echo "📁 Setting permissions for $FTP_DIR..."
sudo chown -R "$FTP_USER":"$FTP_USER" "$FTP_DIR"
sudo chmod -R 755 "$FTP_DIR"
echo "🔁 Restarting vsftpd..."
sudo systemctl restart vsftpd
sudo systemctl enable vsftpd
echo "✅ FTP setup complete. You can now FTP into this machine with your user account."
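After the script has run, a quick sanity check on the written file can catch a failed or partial write. The sketch below checks a few of the settings from the script above; `check_vsftpd_conf` is a hypothetical helper, and the choice of which settings to check is an assumption.

```shell
# Hypothetical helper: verify that a vsftpd config file contains each
# expected key=value line exactly. Returns 1 if any line is missing.
check_vsftpd_conf() {
    conf="$1"
    status=0
    for setting in anonymous_enable=NO local_enable=YES write_enable=YES \
                   pasv_min_port=10000 pasv_max_port=10100; do
        # -x matches the whole line, -F treats the pattern as a fixed string.
        if ! grep -qxF "$setting" "$conf"; then
            echo "missing: $setting" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage: check_vsftpd_conf /etc/vsftpd.conf
```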
Test SSH and FTP servers
ssh keti@0.0.0.0
ftp 0.0.0.0
Server PC Installation
Clone the Docker configuration repository:
git clone https://github.com/keti-ai/dockers.git
cd dockers
Build container
cd dockers
./build_recognition_container.sh --share-dir=<SHARE_DIR>
<SHARE_DIR>: Directory containing the source code to share with the container [default: None, no shared folder]
Clone the following repositories to set up the necessary dependencies:
git clone https://github.com/keti-ai/pyrecognition.git
git clone https://github.com/keti-ai/pyconnect.git
git clone https://github.com/keti-ai/pyinterfaces.git
Install the repositories as editable Python packages:
pip install -e pyrecognition
pip install -e pyconnect
pip install -e pyinterfaces
Install the SSH server and sshfs (optional)
sudo apt update
sudo apt install openssh-server
sudo systemctl start ssh
sudo systemctl enable ssh
sudo apt install sshfs
Edge and Control PCs Container Installation
Git credential:
git config --global credential.helper store
To set up a Docker containerized environment for your project, follow these steps:
Build containers
Build the Docker image with the required specifications:
./build_image.sh --ubuntu=<UBUNTU_VERSION> --cuda=<CUDA_VERSION> --ros=<ROS_DISTRO>
Replace <UBUNTU_VERSION>, <CUDA_VERSION>, and <ROS_DISTRO> with your specific environment settings.
Supported versions:
Ubuntu: 20.04, 22.04 (default)
CUDA: 11.1.1, 11.7.1, 12.1.0, 12.4.1, 12.6.3 (default)
ROS2: foxy, humble (default)
Create and run a Docker container:
./build_container.sh --ubuntu=<UBUNTU_VERSION> --cuda=<CUDA_VERSION> --ros=<ROS_DISTRO> --name=<CONTAINER_NAME> --share-dir=<SHARE_DIR>
<CONTAINER_NAME>: Name of the container [default: derived from the Ubuntu and CUDA versions, e.g. u22cu12]
<SHARE_DIR>: Directory containing the source code to share with the container [default: None, no shared folder]
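Since only certain versions are supported, it can be worth checking a value before starting a long image build. The sketch below encodes the supported list from this guide; `is_supported` is a hypothetical helper, and the exact value format that `build_image.sh` accepts for each flag is an assumption.

```shell
# Hypothetical helper: return 0 if the value appears in the supported
# list given in this guide, 1 otherwise.
# Usage: is_supported <ubuntu|cuda|ros> <value>
is_supported() {
    kind="$1"
    value="$2"
    case "$kind:$value" in
        ubuntu:20.04|ubuntu:22.04) return 0 ;;
        cuda:11.1.1|cuda:11.7.1|cuda:12.1.0|cuda:12.4.1|cuda:12.6.3) return 0 ;;
        ros:foxy|ros:humble) return 0 ;;
        *) return 1 ;;
    esac
}

# Example (flag value format is assumed, not confirmed by the script):
# is_supported cuda 12.6.3 && ./build_image.sh --ubuntu=22.04 --cuda=12.6.3 --ros=humble
```

This checks each value on its own; whether every Ubuntu, CUDA, and ROS combination builds is not stated in this guide.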
Clone the following repositories to set up the necessary dependencies:
git clone https://github.com/keti-ai/carerobotapp.git
git clone https://github.com/keti-ai/pyconnect.git
git clone https://github.com/keti-ai/pyinterfaces.git
git clone https://github.com/keti-ai/pydevice.git
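Or mount the source directory from the server PC over sshfs:
sudo nano /etc/fstab
Append the following line, replacing each $VARIABLE with its actual value (/etc/fstab does not expand shell variables):
sshfs#$SERVER_USER@$SERVER_IP:$SERVER_DIR $CLIENT_MOUNT_DIR fuse defaults,_netdev,allow_other,IdentityFile=/home/$CLIENT_USER/.ssh/id_rsa 0 0
Save and close the file, then mount it: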
sudo mount -a
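Because the fstab entry has to be written with literal values, generating it from variables avoids typos. The sketch below prints the entry in the same format as the line above; `make_sshfs_fstab_line` is a hypothetical helper, and the values in the usage line are examples only.

```shell
# Hypothetical helper: print an sshfs fstab entry built from its arguments.
# Arguments: server user, server IP, server directory,
#            client mount directory, client user (owner of the SSH key).
make_sshfs_fstab_line() {
    server_user="$1"
    server_ip="$2"
    server_dir="$3"
    mount_dir="$4"
    client_user="$5"
    printf 'sshfs#%s@%s:%s %s fuse defaults,_netdev,allow_other,IdentityFile=/home/%s/.ssh/id_rsa 0 0\n' \
        "$server_user" "$server_ip" "$server_dir" "$mount_dir" "$client_user"
}

# Usage (example values):
# make_sshfs_fstab_line keti 192.168.0.10 /home/keti/src /mnt/server robot
```

The printed line can be reviewed and then appended with `sudo tee -a /etc/fstab`.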
Install the repositories as editable Python packages:
pip install -e carerobotapp
pip install -e pyconnect
pip install -e pyinterfaces
pip install -e pydevice
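A quick import check confirms that the editable installs are visible to the interpreter. In the sketch below, `check_imports` is a hypothetical helper and the `PYTHON` variable is an assumption added to make the interpreter configurable. It also assumes each package's import name matches its repository name, which this guide confirms only for `carerobotapp` and `pyrecognition`.

```shell
# Hypothetical helper: try to import each named module and report the result.
# Returns 1 if any import fails. PYTHON defaults to python3.
check_imports() {
    failed=0
    for module in "$@"; do
        if ${PYTHON:-python3} -c "import $module" 2>/dev/null; then
            echo "ok: $module"
        else
            echo "FAILED: $module"
            failed=1
        fi
    done
    return "$failed"
}

# Usage: check_imports carerobotapp pyconnect pyinterfaces pydevice
```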
Install ROS Interfaces
Create a symbolic link to rosinterfaces inside the ROS2 workspace:
cd ~/ros2_ws/src
ln -s $ROSINTERFACES_PATH .
Build the ROS package:
cd ~/ros2_ws
colcon build --packages-select rosinterfaces
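If `$ROSINTERFACES_PATH` is unset or wrong, `ln -s` still succeeds but leaves a dangling link, and `colcon` then finds no package. A small check like the one below can catch that; `check_symlink` is a hypothetical helper, and the link name in the usage line assumes the link is named after the last component of `$ROSINTERFACES_PATH`.

```shell
# Hypothetical helper: succeed only if the path is a symlink whose target exists.
check_symlink() {
    link="$1"
    if [ -L "$link" ] && [ -e "$link" ]; then
        echo "ok: $link -> $(readlink "$link")"
    else
        echo "broken or missing symlink: $link" >&2
        return 1
    fi
}

# Usage: check_symlink ~/ros2_ws/src/"$(basename "$ROSINTERFACES_PATH")"
```

After a successful build, run `source ~/ros2_ws/install/setup.bash` so the new interfaces are visible in the current shell.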
System Execution
Step 1. Edge PC: Run Robot and Device Server
robot # Initializes robot arm, elevator, head, etc.
femto # Runs the Femto camera
hand # Runs the wrist camera
Step 2. Server PC: Run LLM/VLM Servers
Ollama Server Execution:
sudo docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
sudo docker exec -it ollama /bin/bash
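The container can take a few seconds to accept connections after `docker run -d` returns. The sketch below polls the published port until the server answers; `wait_for_ollama` is a hypothetical helper, the `CURL` variable is an assumption added so the HTTP client can be swapped out, and `/api/tags` is the Ollama endpoint that lists local models.

```shell
# Hypothetical helper: poll the Ollama HTTP API until it responds.
# Arguments: URL (default http://localhost:11434/api/tags), attempts (default 10).
wait_for_ollama() {
    url="${1:-http://localhost:11434/api/tags}"
    attempts="${2:-10}"
    i=1
    while :; do
        # -f makes curl return non-zero on HTTP errors; -sS hides progress only.
        if ${CURL:-curl} -fsS "$url" >/dev/null 2>&1; then
            echo "Ollama is reachable at $url"
            return 0
        fi
        [ "$i" -ge "$attempts" ] && break
        i=$((i + 1))
        sleep 1
    done
    echo "Ollama did not respond at $url" >&2
    return 1
}

# Usage: wait_for_ollama
```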
VLM Server Execution:
python3 -m pyrecognition.run_server server_type=tcp port=8805 detector=groundedsam_grasp,fastsam,groundingdino,audio,mask2grasps,groundedsam,fastsam_grasp
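The detector list is long enough that it is easier to maintain with one name per line. The sketch below builds the same comma-separated value and prints the resulting command for inspection; `join_by_comma` is a hypothetical helper, and the detector names are the ones from the command above.

```shell
# Hypothetical helper: join all arguments with commas.
# "$*" uses the first character of IFS as the separator.
join_by_comma() {
    local IFS=,
    echo "$*"
}

detectors=$(join_by_comma \
    groundedsam_grasp \
    fastsam \
    groundingdino \
    audio \
    mask2grasps \
    groundedsam \
    fastsam_grasp)

# Print the command rather than running it, so it can be checked first.
echo "python3 -m pyrecognition.run_server server_type=tcp port=8805 detector=$detectors"
```

Removing a line from the list drops that detector from the server without editing the long argument by hand.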
Step 3. Control PC: Run Each Control Node in a Separate Terminal Window
Terminal 1 - Skill Server
python -m carerobotapp.node_skill_servers
Terminal 2 - Task Manager
python3 -m carerobotapp.node_taskmanager
Terminal 3 - WebRTC Server (remote browser control)
python -m carerobotapp.node_prompt_webrtc_server
# Open http://<robot-ip>:<port> in a browser
Terminal 4 - Extra Device Server (wrist camera, optional)
python -m carerobotapp.node_extra_device_server
Configuration Files
Configuration files are in the carerobotapp.configs package (the configs directory).
Create the configuration file
Copy configs/robot_10.py to configs/robot_$ROBOT_DOMAIN_ID.py and edit the copy.
In configs/tasks.py, change the import from carerobotapp.configs.robot_10 to carerobotapp.configs.robot_$ROBOT_DOMAIN_ID.
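The copy and the import change can be scripted so that a new robot ID is applied consistently. The sketch below is one way to do it; `make_robot_config` is a hypothetical helper, and it assumes the module path `carerobotapp.configs.robot_10` appears literally in `tasks.py`. The robot-specific values inside the copied file still need to be edited by hand.

```shell
# Hypothetical helper: copy robot_10.py for a new robot ID and point
# tasks.py at the new module.
# Arguments: path to the configs directory, new robot domain ID.
make_robot_config() {
    configs_dir="$1"
    robot_id="$2"
    cp "$configs_dir/robot_10.py" "$configs_dir/robot_${robot_id}.py"
    # Write to a temporary file first, then replace tasks.py.
    tmp="$configs_dir/tasks.py.tmp"
    sed "s/carerobotapp\.configs\.robot_10/carerobotapp.configs.robot_${robot_id}/g" \
        "$configs_dir/tasks.py" > "$tmp" && mv "$tmp" "$configs_dir/tasks.py"
}

# Usage (example ID): make_robot_config path/to/carerobotapp/configs 23
```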
Configure HTML
Copy html/text_entries_pangyo.py to configs/text_entries_$ROBOT_DOMAIN_ID.py and edit the copy.
Set offerUrl to https://$ROBOT_IP:8443/offer