1. Registration and Login
If you do not yet have a Hugging Face access token (HUGGINGFACE_TOKEN), register or log in at:
Hugging Face official website: https://huggingface.co/login?next=%2Fsettings%2Ftokens

After successful registration and login:
Create a new dataset token

Check all available options

Click to create token

Save your Token securely for future use

2. Data Collection
Before collecting data, log in with the Hugging Face CLI.
Replace ${HUGGINGFACE_TOKEN} below with your own token:
huggingface-cli login --token ${HUGGINGFACE_TOKEN} --add-to-git-credential
For example, if your token is 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:
huggingface-cli login --token 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 --add-to-git-credential
Possible error: FileNotFoundError: [Errno 2] No such file or directory: 'git'

Solution:
sudo apt install git

After installing git, re-run the previous huggingface-cli command. Red warning messages are normal as long as the command completes successfully.
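To catch the missing-git error before logging in, a check along the following lines can help. This is a minimal sketch: need_cmd is a hypothetical helper name (not part of huggingface-cli), and the install hint assumes a Debian/Ubuntu system, as in the apt solution.

```shell
# need_cmd (hypothetical helper): report whether a command is on the PATH.
need_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1 found at $(command -v "$1")"
    else
        # Assumption: Debian/Ubuntu, where the package has the same name.
        echo "missing: $1 (try: sudo apt install $1)" >&2
        return 1
    fi
}

# huggingface-cli login --add-to-git-credential needs git to be installed.
need_cmd git || echo "Install git, then re-run the huggingface-cli login command."
```

The helper only reports the problem; it does not install anything itself.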

The command may hang at the login page.
This is caused by network issues. Solution: use a VPN.
Store your Hugging Face username in a variable. The commands below use it to build the repository name:
HF_USER=$(huggingface-cli whoami | head -n 1)
echo $HF_USER
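If the login did not succeed, HF_USER may end up empty or may contain a message instead of a username, and the later repo_id values would be wrong. The sketch below guards against that. is_valid_hf_user is a hypothetical helper, and the accepted character set (letters, digits, ".", "_", "-") is an assumption about what a username looks like, not an official rule.

```shell
# is_valid_hf_user (hypothetical helper): accept only a non-empty string made of
# letters, digits, '.', '_' and '-'. The character set is an assumption.
is_valid_hf_user() {
    case "$1" in
        ""|*[!A-Za-z0-9._-]*) return 1 ;;
        *) return 0 ;;
    esac
}

HF_USER=$(huggingface-cli whoami 2>/dev/null | head -n 1)
if is_valid_hf_user "$HF_USER"; then
    echo "Datasets will use the repository ${HF_USER}/so100_test"
else
    echo "Could not read a username; run huggingface-cli login first." >&2
fi
```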

3. Data Collection Process
The preparation phase begins with a prompt tone, and the first recording starts after the tone. Press the right arrow key "→" to end the recording early. The rest phase then begins (the camera images freeze): restore the objects in the scene, then press "→" to skip the remaining rest time. Data compression and saving start automatically. Once saving completes, the second recording starts automatically.
Record 2 episodes and upload your dataset to the hub:
python lerobot/scripts/control_robot.py \
  --robot.type=so100 \
  --control.type=record \
  --control.fps=30 \
  --control.single_task="Grasp a lego block and put it in the bin." \
  --control.repo_id=${HF_USER}/so100_test \
  --control.tags='["so100","tutorial"]' \
  --control.warmup_time_s=5 \
  --control.episode_time_s=30 \
  --control.reset_time_s=30 \
  --control.num_episodes=2 \
  --control.push_to_hub=true

The same command with an explicit repository name (hgmtest/so100_test) and uploading disabled:
python lerobot/scripts/control_robot.py \
  --robot.type=so100 \
  --control.type=record \
  --control.fps=30 \
  --control.single_task="Grasp a lego block and put it in the bin." \
  --control.repo_id=hgmtest/so100_test \
  --control.tags='["so100","tutorial"]' \
  --control.warmup_time_s=5 \
  --control.episode_time_s=30 \
  --control.reset_time_s=30 \
  --control.num_episodes=2 \
  --control.push_to_hub=false
Parameter explanation for the recording command:
python lerobot/scripts/control_robot.py: run the LeRobot control script
--robot.type=so100: robot type (SO-100)
--control.type=record: operation mode (record)
--control.fps=30: recording frame rate (30 fps)
--control.single_task="Grasp a lego block and put it in the bin.": task description
--control.repo_id=${HF_USER}/so100_test: Hugging Face repository ID for dataset storage
--control.tags='["so100","tutorial"]': dataset tags
--control.warmup_time_s=5: preparation time before each recording (5 s)
--control.episode_time_s=30: duration of each episode (30 s)
--control.reset_time_s=30: robot position reset time (30 s)
--control.num_episodes=2: record 2 episodes
--control.push_to_hub=true: upload to the Hugging Face Hub automatically after recording
To keep the dataset local, change these two parameters:
--control.repo_id=/home/vsu/so100_test: local path for dataset storage
--control.push_to_hub=false: do not upload to the Hugging Face Hub after recording
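As a quick sanity check on the recording parameters, the frame rate and episode duration determine how many frames one camera can produce. The sketch below does this arithmetic with the values from the command (--control.fps=30, --control.episode_time_s=30, --control.num_episodes=2). The variable names are illustrative only, and an episode ended early with "→" will contain fewer frames.

```shell
# Values copied from the recording command; variable names are illustrative.
FPS=30
EPISODE_TIME_S=30
NUM_EPISODES=2

# Upper bound per camera: one frame per tick for the full episode duration.
frames_per_episode=$((FPS * EPISODE_TIME_S))
total_frames=$((frames_per_episode * NUM_EPISODES))

echo "At most ${frames_per_episode} frames per episode, ${total_frames} frames in total per camera"
```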
Possible error:

subprocess.CalledProcessError: Command '['ffmpeg', '-f', 'image2', '-r', '30', '-i', '/home/skl/.cache/huggingface/lerobot/pdd46465/so100_test/images/observation.images.laptop/episode_000000/frame_%06d.png', '-vcodec', 'libsvtav1', '-pix_fmt', 'yuv420p', '-g', '2', '-crf', '30', '-loglevel', 'error', '-y', '/home/skl/.cache/huggingface/lerobot/pdd46465/so100_test/videos/chunk-000/observation.images.laptop/episode_000000.mp4']' returned non-zero exit status 1.
Solution:
If you use miniconda and ffmpeg is not installed in your environment, install it:
conda install ffmpeg
Check whether the ffmpeg in your conda environment supports H.264:
ffmpeg -codecs | grep 264
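The failing command in the error message requests the libsvtav1 encoder (-vcodec libsvtav1), so it can also help to check which encoders your ffmpeg build provides. This is a minimal sketch: has_encoder is a hypothetical helper, and libx264 is listed only as an assumed H.264 encoder name to match the grep for 264.

```shell
# has_encoder (hypothetical helper): succeed only if ffmpeg lists the encoder.
# If ffmpeg is not installed at all, the check simply fails.
has_encoder() {
    ffmpeg -hide_banner -encoders 2>/dev/null | grep -qw -- "$1"
}

# libsvtav1 comes from the error message; libx264 is an assumed H.264 encoder name.
for enc in libsvtav1 libx264; do
    if has_encoder "$enc"; then
        echo "$enc: available"
    else
        echo "$enc: not available in this ffmpeg build"
    fi
done
```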

Locate the file and modify as shown:

Re-running the recording may fail with this error:
FileExistsError: [Errno 17] File exists: '/home/skl/.cache/huggingface/lerobot/pdd46465/so100_test'
Solution: locate and delete this directory.

After deleting it, re-run the recording command.
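If you would rather not delete recorded data outright, the directory can be renamed instead, which also clears the FileExistsError. The sketch below assumes the cache layout seen in the error message (~/.cache/huggingface/lerobot/<user>/<dataset>). backup_dataset_dir is a hypothetical helper, and your_username is a placeholder used when HF_USER is not set.

```shell
# backup_dataset_dir (hypothetical helper): rename the directory with a
# timestamp suffix instead of deleting it.
backup_dataset_dir() {
    dir=$1
    if [ -d "$dir" ]; then
        backup="${dir}.bak.$(date +%Y%m%d%H%M%S)"
        mv -- "$dir" "$backup" && echo "moved $dir to $backup"
    else
        echo "nothing to do: $dir does not exist"
    fi
}

# Assumed location, taken from the path in the error message.
backup_dataset_dir "${HOME}/.cache/huggingface/lerobot/${HF_USER:-your_username}/so100_test"
```

Once you are sure the old episodes are no longer needed, the .bak directory can be removed by hand.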

At this point the dataset has been uploaded successfully. To add more data to an existing dataset, use the following command:
python lerobot/scripts/control_robot.py \
  --robot.type=so100 \
  --control.type=record \
  --control.fps=30 \
  --control.resume=true \
  --control.single_task="Grasp a lego block and put it in the bin." \
  --control.repo_id=${HF_USER}/so100_test \
  --control.tags='["so100","tutorial"]' \
  --control.warmup_time_s=5 \
  --control.episode_time_s=30 \
  --control.reset_time_s=30 \
  --control.num_episodes=2 \
  --control.push_to_hub=true
Here --control.resume=true continues recording into the existing dataset. The other parameters have the same meaning as in the first recording command.
To avoid uploading to the Hugging Face Hub, change --control.push_to_hub=true to --control.push_to_hub=false:
python lerobot/scripts/control_robot.py \
  --robot.type=so100 \
  --control.type=record \
  --control.fps=30 \
  --control.resume=true \
  --control.single_task="Grasp a lego block and put it in the bin." \
  --control.repo_id=${HF_USER}/so100_test \
  --control.tags='["so100","tutorial"]' \
  --control.warmup_time_s=5 \
  --control.episode_time_s=30 \
  --control.reset_time_s=30 \
  --control.num_episodes=2 \
  --control.push_to_hub=false
Start training on the local dataset with the following command:
python lerobot/scripts/train.py \
  --dataset.repo_id=${HF_USER}/so100_test \
  --policy.type=act \
  --output_dir=outputs/train/act_so100_test \
  --job_name=act_so100_test \
  --policy.device=cuda \
  --wandb.enable=false

Example with an explicit repository name (hgmtest/so100_test). This variant passes --device=cuda rather than --policy.device=cuda. Which form is accepted may depend on your LeRobot version:
python lerobot/scripts/train.py \
  --dataset.repo_id=hgmtest/so100_test \
  --policy.type=act \
  --output_dir=outputs/train/act_so100_test \
  --job_name=act_so100_test \
  --device=cuda \
  --wandb.enable=false
Parameter explanation for the training command:
python lerobot/scripts/train.py: run the LeRobot training script
--dataset.repo_id=${HF_USER}/so100_test: training dataset (Hugging Face repository ID)
--policy.type=act: use the ACT (Action Chunking with Transformers) policy
--output_dir=outputs/train/act_so100_test: training output directory
--job_name=act_so100_test: training job name
--policy.device=cuda: use CUDA (NVIDIA GPU) for training
--wandb.enable=true or --wandb.enable=false: enable or disable Weights & Biases logging
The following images indicate training has started:


If you encounter the following error (CUDA out of memory), the GPU does not have enough memory:

Solution: Modify the train.py configuration file at lerobot_main/lerobot/config/, around line 54 as shown:

num_workers: number of data-loading worker processes (affects data preloading parallelism)
batch_size: number of samples per training iteration (preferably an even number, typically a power of two such as 2, 4, 16, 32)
Reducing these values, batch_size in particular, lowers memory use and can resolve the out-of-memory error.