Product Concept: Autonomous Storage Droid

Project Concept

A household robot which packs and unpacks your things in standard size boxes and can also take out the trash and put it back.

Every once in a while, a concept gets stuck in my brain. In this case, I have wanted to create standardized storage in my workspace and be able to request my tools, parts, USB cables or clothes and have them delivered to my desk, or taken away and cleanly put away. Ideally, it could even take out the trash, or at least bring it to the door and put it away.

I decided to actually build it as a project for 2021, to catch up on improvements to ROS2, Navigation2 and Deep Learning, and to have some fun building a full robotic concept.

Robot Design

The BOXBOT can autonomously navigate, then pick up a box, slide it over the platform and deliver it to a different location.

Project Background 

I have been thinking about building a box-carrying robot for a while as an interesting project to fuse navigation, motion, CV and other topics. It has rattled around in my brain for over a year and I have built a few different sketches of it. Basically, this is a small indoor autonomous forklift that identifies storage boxes by their size and shape, reads a QR code to identify their contents, and then uses a standard lift system to move them either out of storage to a desk or back into storage.

Standardized Storage

To simplify this, I have been thinking about what the most standard box size is. With IKEA and other manufacturers making a roughly 12x12x12" box in many different types, this seems to be the optimal size (limited weight, standard size, small enough to move indoors easily, etc.). They come in wood, cloth, plastic and many other designs.

Standardized storage cube boxes come in many different flavors.

Additionally, these boxes fit standard "cube storage" units in 1x4, 2x4, 2x6, 3x6 and similar sizes, allowing for a wall of storage.


Many manufacturers make standard storage systems for these sized boxes.

Each box type is slightly different, but from an overall dimensional size, they all can be treated equally. Some of the floppier versions may not be as easy to use and will probably have to be skipped.

Platform Limitations

  • Box Weight - The robot needs to hold a 20+ lb box cantilevered 3-4 feet in the air off its back without tipping over. Therefore, it needs to be fairly heavy, with its weight on the side opposite the lift.
  • Lift Height - The lift needs to be able to pick up a box from the ground and deposit it on a desk. Desks are roughly 32" tall, so it needs a minimum of roughly 36" of lift height plus the height of the box from the ground to the handle.
  • Robot Size - The robot needs to be small enough overall that it can drive both forward and sideways through a room, between beds, etc., so it should be at most about 24" in its largest dimension. As a result, the box can't hang off the back; it has to sit over the robot itself during movement. It also needs to be short enough that it isn't sticking many feet in the air where it can get caught on things. The closer it is to the height of the known lidar safe area, the better.
  • Motion Safety - Especially for the lift, it needs depth sensors so that it does not run into any objects which are above the lidar.
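The box-weight constraint can be sanity-checked with a simple moment balance about the wheel contact line nearest the box. The sketch below is static only (it ignores bumps, braking and lift height, which make things worse), and the overhang, center-of-gravity offset and safety factor are illustrative assumptions, not measurements from this robot.

```python
def tipping_margin(robot_weight_lb, cg_offset_in, box_weight_lb, box_overhang_in):
    """Ratio of restoring moment to tipping moment.

    Moments are taken about the wheel contact line nearest the box.
    cg_offset_in:    horizontal distance from that line to the robot's center
                     of gravity, measured toward the robot (assumed value).
    box_overhang_in: horizontal distance from that line out to the box's
                     center of gravity (assumed value).
    A result above 1.0 means the robot is statically stable.
    """
    if box_weight_lb <= 0 or box_overhang_in <= 0:
        raise ValueError("box weight and overhang must be positive")
    return (robot_weight_lb * cg_offset_in) / (box_weight_lb * box_overhang_in)


def min_robot_weight_lb(box_weight_lb, box_overhang_in, cg_offset_in, safety_factor=2.0):
    """Smallest robot weight that keeps the tipping margin at safety_factor."""
    if cg_offset_in <= 0:
        raise ValueError("center of gravity must sit inside the wheelbase")
    return safety_factor * box_weight_lb * box_overhang_in / cg_offset_in


if __name__ == "__main__":
    # Hypothetical numbers: 20 lb box hanging 8" past the wheels,
    # robot center of gravity 6" inside them.
    print(round(min_robot_weight_lb(20, 8, 6), 1))
```

Moving the center of gravity further from the lift side reduces the required weight in direct proportion, which is why the weight placement matters as much as the total weight.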

Key Device Elements

Movement

To simplify storage alignment, mecanum wheels allow a two-stage process: first, standard global navigation (SLAM, ROS Nav2, etc.) brings the robot to an approximate location and orientation; then a much more accurate local alignment based on fiducial markers gives the sub-centimeter accuracy needed to pick or place each box.

Mecanum wheels allow a robot to strafe left and right, in addition to forward and rotational motion.

To enable accurate pinpointing of the box, the mecanum wheels allow the robot to strafe, in addition to turning or driving. This enables fine-tuned left/right motion to center the robot on the fiducial marker accurately.
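As a minimal sketch of how strafing maps to wheel commands, here is the textbook mecanum inverse kinematics plus a clamped proportional controller for centering on the marker. The wheel radius, base dimensions, gain and speed limit are placeholder assumptions, and the class and function names are hypothetical. It assumes the common roller arrangement in which strafing left spins the front-left and rear-right wheels backward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MecanumBase:
    # Placeholder geometry, not measured from the actual robot.
    wheel_radius_m: float = 0.05
    half_length_m: float = 0.15  # center to axle, front-back
    half_width_m: float = 0.15   # center to wheel, left-right

    def wheel_speeds(self, vx, vy, wz):
        """Convert a body velocity into wheel angular velocities (rad/s).

        vx: forward (m/s), vy: left (m/s), wz: counter-clockwise (rad/s).
        Returns (front_left, front_right, rear_left, rear_right).
        """
        k = self.half_length_m + self.half_width_m
        r = self.wheel_radius_m
        return (
            (vx - vy - k * wz) / r,
            (vx + vy + k * wz) / r,
            (vx + vy - k * wz) / r,
            (vx - vy + k * wz) / r,
        )


def strafe_command(lateral_error_m, gain=1.0, max_speed_mps=0.1):
    """Proportional sideways velocity toward the marker, clamped to a limit.

    lateral_error_m is positive when the marker is to the robot's left.
    """
    v = gain * lateral_error_m
    return max(-max_speed_mps, min(max_speed_mps, v))


if __name__ == "__main__":
    base = MecanumBase()
    vy = strafe_command(0.02)  # marker seen 2 cm to the left
    print(base.wheel_speeds(0.0, vy, 0.0))
```

Because mecanum rollers slip, the commanded wheel speeds only approximate the real motion, so the lateral error would need to be re-measured from the marker on every control cycle.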

Motor Controller (RoboteQ XDC2460)

The RoboteQ motor controller is more expensive than many others, but it is necessary if you need accurate control of the position of the motors rather than just the power applied to them. Because the mecanum wheels slide and this robot will carry varying weight loads, having positional and velocity control of the motors is invaluable.
RoboteQ makes solid motor controllers.

Standard Box Handle

Each box will have a handle which both allows for the box to be stably lifted and also which encodes the size, orientation and other details for the box.
  • Fiducial Marker - A QR code encoding the size of the box and its ID. It is centered on the box and a known distance below the standard handle.
  • Handle - A human and robot-grabbable handle
  • Shims - As many boxes have a slight slope, shims of 2, 5 and 10 degrees to compensate
  • Adhesive - Double-sided sticky adhesive

Box Localization Recognition

Using a standard fiducial marker aligned with the standard box handle allows for dead reckoning. The fiducial marker can encode the box size/type/contents or look it up in a database. By encoding the height and width of the box relative to the center of the QR code, we can create a simplified view of the physical object.
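One way the QR payload could carry the box geometry is sketched below. The JSON layout and every field name (`id`, `w`, `above`, `below`, `handle`) are hypothetical; the text only says the code encodes the size and ID, not the format.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class BoxTag:
    box_id: str
    width_in: float      # full box width; the QR code is horizontally centered
    above_in: float      # QR center up to the top edge of the box
    below_in: float      # QR center down to the bottom edge of the box
    handle_up_in: float  # QR center up to the handle

    def outline(self):
        """Corners of the box face as (x, y) inches relative to the QR center.

        x points right and y points up. Order: bottom-left, bottom-right,
        top-right, top-left.
        """
        half = self.width_in / 2
        return [
            (-half, -self.below_in),
            (half, -self.below_in),
            (half, self.above_in),
            (-half, self.above_in),
        ]


def parse_payload(payload: str) -> BoxTag:
    """Decode the assumed JSON payload into a BoxTag."""
    data = json.loads(payload)
    return BoxTag(
        box_id=str(data["id"]),
        width_in=float(data["w"]),
        above_in=float(data["above"]),
        below_in=float(data["below"]),
        handle_up_in=float(data["handle"]),
    )


if __name__ == "__main__":
    tag = parse_payload('{"id": "BX-0042", "w": 12, "above": 8, "below": 4, "handle": 3}')
    print(tag.box_id, tag.outline())
```

Keeping only an ID in the code and looking up the dimensions in a database, as mentioned above, would shrink the QR code at the cost of requiring the lookup to be available.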

Linear Stage

If a 20 lb box is cantilevered off the back of the robot at top extension, it could tip the robot backwards if the robot hits a bump while driving. Additionally, holding the box off the back enlarges the footprint of the robot, so it might catch on objects around it.
A ball screw linear stage can accurately align the box forward and backwards.

By adding a linear stage to pull the lift back on to the robot, the box can then be within the footprint of the robot. Additionally, this can allow the lift to lower until the box is supported by the stage, rather than the hooks, which saves significant power.

Box Lift

The lift needs the ability to:
  • Place a box on 1 of 3 levels, 1 foot apart. (This is about 24" of lift necessary from level 1 to 3.)
  • Place a box on the floor, going down to about 8".
  • Place a box on a roughly 32" tall table. (This requires the handles to be at roughly 40" of height.)
Therefore, a multi-stage lift with at least 34" of elevation is necessary so that the robot does not have to be 40"+ tall. There are multiple ways to implement it: the FIRST Robotics Competition has 1" aluminum rail versions, and there are also heavy-duty drawer sliders and dedicated (more expensive) kits.
Standard aluminum extrusion lifts are a feasible implementation at this scale.
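A rough way to size a multi-stage lift is to count how many moving stages are needed for the travel. In this sketch each stage contributes its length minus the overlap it must keep with the previous stage for stiffness. The 34" travel comes from the text above; the stage lengths and overlaps are illustrative assumptions.

```python
import math


def stages_needed(travel_in, stage_length_in, overlap_in):
    """Number of moving stages required to reach travel_in.

    Each stage is assumed to extend by (stage_length_in - overlap_in).
    """
    usable_in = stage_length_in - overlap_in
    if usable_in <= 0:
        raise ValueError("overlap must be shorter than the stage")
    return math.ceil(travel_in / usable_in)


if __name__ == "__main__":
    # Hypothetical 24" rails with 6" of overlap kept at full extension.
    print(stages_needed(34, 24, 6))
    # Hypothetical shorter 16" rails with 4" of overlap.
    print(stages_needed(34, 16, 4))
```

Shorter rails keep the collapsed lift low but add stages, and every extra stage adds another sliding joint that can contribute slop.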

The key with these lifts is to minimize the slop/rattle in them and also to allow for accurate control. Therefore, the cable-driven versions do not work and a cascading chain or belt-driven version is necessary.

Lift Hooks

The standard box handle has a fixed width, which allows the hooks on the front of the lift to slot into the same 3D-printed handle on every box.

Navigation

For navigation, a combination of a RealSense camera for visual SLAM and a 2D lidar will be used. The RealSense camera will allow for localization of the robot within the environment with high accuracy, but is not good at detecting collisions and changes in objects around it.

The 2D lidar is less accurate for localization, but is able to create 2D occupancy maps and has a 360-degree view of the world.
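To illustrate how a 2D scan becomes occupancy information, the sketch below converts one sweep of ranges into the set of grid cells that contain a return. In practice Navigation2's costmap does this work (plus ray clearing and inflation); this is only a simplified stand-in. The parameter names mirror the `ranges`, `angle_min` and `angle_increment` fields of the ROS `sensor_msgs/LaserScan` message, while the grid resolution and range limit are assumptions.

```python
import math


def scan_to_cells(ranges, angle_min, angle_increment, resolution_m, max_range_m):
    """Return the set of (ix, iy) grid cells hit by valid lidar returns.

    Cells are indexed in the lidar's own frame (x forward, y left), with
    cell (0, 0) covering [0, resolution_m) on both axes.
    """
    cells = set()
    for i, r in enumerate(ranges):
        # Skip missing returns (inf/NaN) and anything beyond the trusted range.
        if not math.isfinite(r) or r <= 0.0 or r > max_range_m:
            continue
        angle = angle_min + i * angle_increment
        x = r * math.cos(angle)
        y = r * math.sin(angle)
        cells.add((math.floor(x / resolution_m), math.floor(y / resolution_m)))
    return cells


if __name__ == "__main__":
    # Three beams: straight ahead, left (no return), and straight behind.
    beams = [1.2, float("inf"), 2.2]
    print(scan_to_cells(beams, 0.0, math.pi / 2, 0.5, 10.0))
```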

Parts

Compute (NVIDIA Jetson NX)

From a computational side, the system needs to be able to run multiple vision-based algorithms simultaneously, so having an NVIDIA-based, GPU-accelerated, compute module is a necessity. That said, the actual computation is not massive so it does not need the highest compute versions of Xavier.
  • Fiducial marker detection
  • Stereo pair SLAM localization
  • Storage location detection (3D template matching)
  • ROS Navigation2 + 2D Lidar

The NVIDIA Jetson NX has a good balance of compute and low power.

Lidar (RPLidar)

The 2D circular lidar allows for standard navigation using libraries such as the ROS2 Navigation2 library.


The RPLidar A3 gives accurate distance data at a good price point.

3D Vision (RealSense D435i)

A forward-facing depth camera enables more robust localization and obstacle avoidance in cluttered environments than a 2D lidar alone, which can miss objects outside of its scan plane.


The inertial sensor on the D435i allows for better localization.

NOTE: Make sure to get the 'i' version, as it has an inertial measurement unit (accelerometer and gyroscope), which enables much better SLAM.

QR Camera

To sense the position and skew of QR codes accurately, a backwards-facing medium-angle 1080p camera will work. The robot will drive up to the storage unit or box handle using the RealSense camera to get close and then rotate around to face backwards, where the QR code will be visible.
A simple board camera faces backwards for QR code alignment.
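A simplified sketch of how position and skew could be recovered from the four detected corners of the QR code, using only a pinhole camera model. It assumes the camera is level with the marker (rotation about the vertical axis only), no lens distortion, and a known focal length in pixels; the function name and all numbers are hypothetical. A full implementation would more likely use a perspective-n-point solver such as OpenCV's `solvePnP`.

```python
import math


def marker_pose(corners_px, marker_size_m, focal_px, cx_px):
    """Estimate (distance_m, lateral_offset_m, yaw_rad) of a square marker.

    corners_px: (top_left, top_right, bottom_right, bottom_left) as (u, v)
                pixel coordinates, with v increasing downward.
    cx_px:      horizontal principal point of the camera (assumed known).
    """
    tl, tr, br, bl = corners_px
    # Each vertical edge has the same physical height, so its pixel height
    # gives the depth of that edge: Z = f * H / h.
    z_left = focal_px * marker_size_m / (bl[1] - tl[1])
    z_right = focal_px * marker_size_m / (br[1] - tr[1])
    distance = (z_left + z_right) / 2

    # Lateral offset of the marker center from the optical axis.
    u_center = (tl[0] + tr[0] + br[0] + bl[0]) / 4
    lateral = (u_center - cx_px) * distance / focal_px

    # Depth difference between the two edges equals width * sin(yaw).
    s = (z_right - z_left) / marker_size_m
    yaw = math.asin(max(-1.0, min(1.0, s)))
    return distance, lateral, yaw


if __name__ == "__main__":
    # Hypothetical 10 cm marker seen head-on, 50 px right of image center.
    corners = ((960, 490), (1060, 490), (1060, 590), (960, 590))
    print(marker_pose(corners, 0.10, 1000.0, 960.0))
```

The lateral offset is what the strafing motion would drive to zero, while the yaw estimate indicates how far the robot needs to rotate to square up to the box.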

A Polished Design

Many robots leave their wires hanging everywhere. A key challenge here is to enclose all the wires in a safe and clean way, while allowing linear travel and the lift to function cleanly without snags.