ICRA Competition - LLM, VLM, Jetson

Max Neuwinger
5 months ago · Evolonic, Fine-tuning, VLM, Jetson

An Unexpected Journey to ICRA 2024: Building Autonomous Robots in Japan

Bots & Bento Competition

The Unexpected Invitation

In April 2024, I received an unexpected message from one of my teammates at Evolonic. The question was simple yet surprising: "Want to come to Japan with us?" Naturally, I was confused. Why Japan? Why on such short notice? And why as soon as May, just one month away?

It turned out my teammate had applied to the Bots & Bento Competition, a robotics competition held as part of ICRA 2024, and our team of 5 had been accepted. Despite having other plans, I found the opportunity too exciting to pass up and decided to join this spontaneous adventure.

What is ICRA?

ICRA (the IEEE International Conference on Robotics and Automation) is a prestigious international robotics conference; the 2024 edition featured talks, demonstrations, and various competitions. Our team was selected to participate in one of these competitions, partnering with Olive, a Munich-based robotics startup.

The Challenge: Bots & Bento

Our team was one of just six international teams selected for the inaugural "Bots & Bento" Robotic Pallet Handling Competition. The challenge required us to build an autonomous robot using materials provided by Olive that could handle KLTs (small trolleys) and transport them to specific positions. While this might sound straightforward, the complexity of autonomous robotics made it a very challenging project.

The competition was co-organized by Olive Robotics, UTokyoIPC, and TUM Venture Lab Robotics/AI, aiming to bridge European and Japanese robotics innovation. Each team had 15 minutes to demonstrate their robot's ability to sort five KLTs to predefined parking positions, mimicking real factory automation scenarios.

In the weeks leading up to the competition, while I was tied up with other commitments, my teammates were already deep into prototyping. One interesting aspect of our project was that the robotics hardware from Olive was designed to run ROS 2 natively.

Journey to Japan

View from the Flight to Japan

In May, we flew from Munich to Japan and made our way to Yokohama, where ICRA 2024 took place. The jet lag was brutal, but Tokyo opened up a whole new world I had never experienced before. The sheer scale of the competition venue was overwhelming - multiple massive halls filled with cutting-edge robotics technology and brilliant minds from around the world.

Luckily, we could see Mount Fuji from our hotel room when the weather was good!

Mount Fuji View from Hotel

The Competition: Bots & Bento Challenge

Bots & Bento Competition Banner

Our challenge, officially named "Bots & Bento," was more than just a robotics competition - it was a vision of the future of warehouse automation. The task? Build and program a robot that could handle standardized KLTs (small wheeled containers) in a way that mimicked real-world warehouse operations.

The Arena and Rules

We competed in a ~5x6 meter arena with some interesting constraints:

Competition Arena Layout

Hardware Kit

Olive Robotics provided each team with a comprehensive set of components:

The catch? The robot had to stay within 65x65x90 cm while handling KLTs measuring 40x30x12 cm. While we could 3D print passive parts and use additional compute units, the core hardware had to come from the provided kit.

Olive Robotics Hardware Components

Competition Structure

The competition was structured as follows:

  1. Build Phase (2 hours): Assemble an "easy plug-and-play" robot using the Olive hardware
  2. Development Phase (3 days): Create the software solution
  3. Competition Runs:
    • KLT Transport Challenge (15 minutes)
    • Technical Challenge (10 minutes)
    • Final Presentation (10 minutes)

However, reality had other plans. What was meant to be a straightforward assembly turned into a significant challenge for all teams. Because this was Olive's first large-scale hardware test, we encountered various issues that, while understandable for a startup, consumed much of our development time.

Scoring System

The competition used a comprehensive scoring system:

We were allowed up to three resets without penalty, but each additional reset would cost us 25 points. A critical rule was that robots couldn't fully cross the yellow/black boundary tape, though partial crossing was permitted.
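The reset rule boils down to simple arithmetic, which a small sketch makes concrete. The function and constant names below are illustrative and not taken from the official rulebook; only the three free resets and the 25-point penalty come from the rules described above.

```python
# Illustrative sketch of the reset penalty rule (names are hypothetical).
FREE_RESETS = 3              # resets allowed without penalty
PENALTY_PER_EXTRA_RESET = 25  # points deducted for each additional reset


def reset_penalty(resets: int) -> int:
    """Return the points deducted for a given number of resets."""
    if resets < 0:
        raise ValueError("resets must be non-negative")
    extra = max(0, resets - FREE_RESETS)
    return extra * PENALTY_PER_EXTRA_RESET


if __name__ == "__main__":
    for n in (0, 3, 4, 6):
        print(f"{n} resets -> {reset_penalty(n)} points deducted")
```

In other words, a fourth reset is the first one that costs anything, and every reset after that costs the same amount again.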

Technical Requirements

One interesting aspect that would prove crucial was the AprilTag system. Each KLT was equipped with 8 AprilTags (2 on each side), all sharing the same ID (1-5). This was meant to simplify detection, but as we'd soon discover, it came with its own challenges when using the provided camera.
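To illustrate what the shared IDs mean in practice, here is a minimal sketch of grouping tag detections by KLT. It assumes some AprilTag detector has already produced a tag ID and a pixel center for each detection; the TagDetection type and the group_by_klt function are hypothetical names for this example, not our actual competition code.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TagDetection:
    """One detected tag (hypothetical structure for this sketch)."""
    tag_id: int  # 1-5: identifies the KLT, not the side of the KLT
    cx: float    # tag center in the image, x pixel coordinate
    cy: float    # tag center in the image, y pixel coordinate


def group_by_klt(detections):
    """Map each KLT ID to (mean_cx, mean_cy, number_of_tags_seen)."""
    groups = defaultdict(list)
    for det in detections:
        if 1 <= det.tag_id <= 5:  # ignore IDs outside the competition range
            groups[det.tag_id].append(det)

    result = {}
    for klt_id, dets in groups.items():
        n = len(dets)
        mean_cx = sum(d.cx for d in dets) / n
        mean_cy = sum(d.cy for d in dets) / n
        result[klt_id] = (mean_cx, mean_cy, n)
    return result


if __name__ == "__main__":
    seen = [
        TagDetection(2, 100.0, 200.0),
        TagDetection(2, 140.0, 200.0),
        TagDetection(5, 300.0, 50.0),
    ]
    print(group_by_klt(seen))
```

The upside of shared IDs is that any visible tag identifies the KLT. The downside is that the ID alone cannot tell you which side of the KLT the camera is looking at, so orientation has to come from the tag's pose rather than its ID.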

For more detailed information about the competition rules, you can refer to the official rulebook.

Our Team Strategy

We divided our team of 5 strategically:

My Technical Contributions

As the technical challenge lead, I focused on developing advanced AI-driven features to enhance our robot's capabilities. Here's a detailed breakdown of each contribution:

1. Line Detection System

Initially, I approached the line detection problem using traditional computer vision methods but quickly discovered their limitations:

First Attempt: Classical CV

Final Solution: YOLOv8

KLT Line Detection Example

While the system performed well in testing, we ultimately relied on AprilTag-based orientation due to time constraints and hardware challenges.

2. KLT Detection System

As a backup to the AprilTag system, we developed a robust KLT detection system:

KLT Detection System Results

3. Human Safety Integration with Vision Language Models

One of our most innovative features was the integration of VLMs for intelligent human detection:

Technical Setup:

Functionality:

The system successfully identified human interactions with KLTs across our very small test dataset, providing clear safety commands with explanations.
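As a rough idea of how a "safety command with explanation" can be turned into something a robot can act on, here is a minimal sketch. Everything in it is an assumption made for illustration: the prompt wording, the JSON reply format, the three command names, and the fail-safe default are hypothetical and do not describe our exact setup.

```python
import json
from dataclasses import dataclass

# Hypothetical command set; the real system may have used different commands.
ALLOWED_COMMANDS = {"STOP", "SLOW", "CONTINUE"}

# Hypothetical prompt that would be sent to the VLM together with a camera frame.
PROMPT = (
    "Look at the image. Is a person touching or reaching for a KLT? "
    'Reply with JSON only: {"command": "STOP|SLOW|CONTINUE", "reason": "..."}'
)


@dataclass(frozen=True)
class SafetyDecision:
    command: str
    reason: str


def parse_vlm_reply(reply: str) -> SafetyDecision:
    """Parse the model's text reply; fall back to STOP if it cannot be trusted."""
    try:
        data = json.loads(reply)
        command = str(data["command"]).upper()
        reason = str(data.get("reason", ""))
    except (json.JSONDecodeError, KeyError, TypeError, AttributeError):
        return SafetyDecision("STOP", "unparseable model reply")
    if command not in ALLOWED_COMMANDS:
        return SafetyDecision("STOP", f"unknown command: {command}")
    return SafetyDecision(command, reason)


if __name__ == "__main__":
    print(parse_vlm_reply('{"command": "stop", "reason": "person reaching for KLT"}'))
    print(parse_vlm_reply("I think everything is fine"))
```

The design choice worth noting is the fail-safe: whenever the reply is malformed or unexpected, the sketch returns STOP rather than guessing, since a language model's output format is never guaranteed.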

4. ROS 2 Humble-Specialized LLM

To address the common problem of general-purpose LLMs mixing up ROS versions in their answers, we created a specialized model:

Data Preparation:

  1. Web scraped comprehensive documentation:
    • ROS 2 Humble documentation
    • Olive Robotics documentation
  2. Data cleaning and preprocessing
  3. Used a local Llama 3.1 8B model on the Jetson to generate Q&A pairs
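The data preparation steps above can be sketched roughly as follows. This is a simplified illustration, not our actual pipeline: the chunk size, the instruction/output record format, and all function names are assumptions, and the call to the local Llama model is replaced by a stub (fake_generate_qa) so the example runs on its own.

```python
import json
from typing import Callable, Iterable, Iterator, List, Tuple

QAPairs = List[Tuple[str, str]]


def chunk_text(text: str, max_chars: int = 1500) -> Iterator[str]:
    """Split cleaned documentation into paragraph-aligned chunks.

    A single paragraph longer than max_chars is emitted whole.
    """
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        if current and len(current) + len(para) + 2 > max_chars:
            yield current
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        yield current


def build_records(chunks: Iterable[str],
                  generate_qa: Callable[[str], QAPairs]) -> Iterator[dict]:
    """Turn each chunk into fine-tuning records via a Q&A generator."""
    for chunk in chunks:
        for question, answer in generate_qa(chunk):
            yield {"instruction": question, "output": answer}


def to_jsonl(records: Iterable[dict]) -> str:
    """Serialize records as JSON Lines, one record per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


def fake_generate_qa(chunk: str) -> QAPairs:
    """Stand-in for the local LLM call that would write real Q&A pairs."""
    first_line = chunk.splitlines()[0]
    return [(f"What does the documentation say about: {first_line}?", chunk)]


if __name__ == "__main__":
    docs = "Creating a node\n\nUse rclpy to create a node.\n\nParameters\n\nNodes can declare parameters."
    chunks = chunk_text(docs, max_chars=60)
    print(to_jsonl(build_records(chunks, fake_generate_qa)))
```

Passing the generator in as a function keeps the chunking and serialization independent of whichever model produces the questions and answers.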

Fine-tuning Process:

Results:

The combination of these technical elements created a comprehensive AI-driven system that, while not fully utilized in the final competition due to hardware constraints, demonstrated the potential for advanced robotics applications.

The Final Sprint

The days leading up to the final presentation were intense. We often worked late into the night, pushing our technical limits. I faced significant challenges with the fine-tuning process and Jetson containers, barely getting everything operational on presentation day. The pressure was real, but the experience was invaluable.

Hardware Hurdles

Like other teams, we encountered numerous hardware issues. Olive is a young startup, its hardware was still in its early stages, and real-world testing revealed various problems. While these issues provided valuable lessons in hardware debugging and adaptation, they unfortunately consumed more of our development time than we'd anticipated. Instead of focusing on new features, we found ourselves troubleshooting hardware problems.

Technical Challenge Success

When presentation day arrived, my teammates presented the software functionality of our robot, and I showcased the additional technical innovations to the judges. Our efforts paid off spectacularly - we were the only team to achieve maximum points in the technical challenge category.

The Final Competition

However, the competition ended with mixed emotions. Just before our robot's final run, we encountered critical hardware issues. Despite our best efforts to fix them, the robot couldn't perform as intended, resulting in minimal points for the actual competition run.

A Bittersweet Victory

While we didn't win the overall competition, what happened next was unexpected and heartening. The organizers acknowledged that the hardware issues were beyond our control, and the CEO of Olive personally approached our team afterward. He congratulated us, noting that we had "by far the most advanced tech stack and solution." He believed we would have won if not for the hardware complications.

Even more encouraging was what followed - Olive extended several opportunities to our team:

Beyond the Competition

The competition might have ended, but our Japanese adventure didn't stop there. We spent the next week and a half exploring Japan, and as a European, I was captivated by the striking cultural differences and unique experiences the country offered.

Final Thoughts

Looking back, while we didn't secure the victory we'd hoped for, we achieved something perhaps more valuable:

I'm immensely grateful to my teammates and the other participants who made this experience so enriching. The competition taught us that success isn't always measured by winning - sometimes it's about the journey, the learning, and the doors that open along the way.

ICRA 2024 Bots & Bento Teams Group Photo