Title:
LOCATING AND ATTACHING INTERCHANGEABLE TOOLS IN-SITU
Document Type and Number:
WIPO Patent Application WO/2020/056380
Kind Code:
A1
Abstract:
Current technologies allow a robot to acquire a tool only if the tool is in a set known location, such as in a rack. In an embodiment, a method, and corresponding system, can determine the previously unknown pose of a tool freely placed in an environment. The method can then calculate a trajectory that allows a robot to move from its current position to the tool and attach to the tool. In such a way, tools can be located and used by a robot when placed at any location in an environment.

Inventors:
TAYOUN ANTHONY (US)
JOHNSON DAVID M S (US)
WAGNER SYLER (US)
ROONEY JUSTIN (US)
LINES STEVEN (US)
Application Number:
PCT/US2019/051183
Publication Date:
March 19, 2020
Filing Date:
September 13, 2019
Assignee:
CHARLES STARK DRAPER LABORATORY INC (US)
TAYOUN ANTHONY (US)
International Classes:
B25J9/16; B25J11/00
Foreign References:
US20180147718A12018-05-31
US20170326728A12017-11-16
US20180147723A12018-05-31
Attorney, Agent or Firm:
MEAGHER, Timothy J. et al. (US)
Claims:
CLAIMS

What is claimed is:

1. A method comprising:

determining, by a computer, a pose of at least one tool, configured to mate with an interface of a robot, within an environment based upon at least one image of the environment;

determining, by the computer, an attachment pose of the robot interface, based on the determined tool pose; and

computing, by the computer, a trajectory of the robot that includes moving the robot interface from a first position to the attachment pose and mating the robot interface with the at least one tool.

2. The method of Claim 1, wherein determining the tool pose includes: labeling at least one pixel of the at least one image with an object class or a probability distribution of the object class; and locating the at least one tool in the segmented image.

3. The method of Claim 2, wherein locating the at least one tool includes applying a neural network to the at least one image.

4. The method of Claim 2, wherein the at least one image includes a color image and a depth image, and determining the tool pose further includes:

mapping the located at least one tool to a point cloud based on the depth image, to produce a partial 3D representation of the located at least one tool;

comparing the partial 3D representation to a set of models from a model library to find a best matching model;

generating multiple hypotheses, each hypothesis including the best matching model at an estimated tool pose;

selecting a hypothesis based on a comparison of the multiple hypotheses to the partial 3D representation; and outputting the tool pose of the at least one tool based on the estimated tool pose of the hypothesis selected.

5. The method of Claim 1, wherein determining the tool pose of the at least one tool includes applying a scale-invariant transform to the at least one image.

6. The method of Claim 1, wherein determining the tool pose includes: identifying a 2D visual code on the at least one tool in the at least one image; and determining the tool pose based on an orientation of the 2D visual code.

7. The method of Claim 1, wherein the computed trajectory is at least one of: free of a collision between the robot and at least one object in the environment; free of causing the at least one tool to move by contacting the at least one tool with the robot prior to reaching the attachment pose; free of the robot contacting an object that applies a force or torque to the at least one tool prior to reaching the attachment pose; causes the robot to interact with the tool to move the tool to a new tool pose that facilitates mating; and

aligns the robot interface in the attachment pose where a locking mechanism can be engaged to secure the at least one tool to the robot interface.

8. The method of Claim 1, further comprising: determining, by the computer, an updated tool pose of the at least one tool at a time when determining the pose of at least one tool has been completed;

determining, by the computer, an updated attachment pose of the robot interface, based on the updated tool pose; and

computing, by the computer, an updated trajectory based on the updated attachment pose.

9. The method of Claim 1, where the computed trajectory additionally includes: moving the robot interface to a drop-off location prior to moving the robot interface to the attachment pose; and removing an attached tool from the robot interface at the drop-off location.

10. The method of Claim 1, wherein mating the robot interface with the at least one tool includes performing a peg-in-hole insertion, by a contact-based controller, using at least one of a force and a torque sensor, the insertion employing the determined tool pose as an initial pose estimate of the at least one tool.

11. The method of Claim 10, wherein mating the robot interface with the at least one tool further includes employing, by the contact-based controller, a neural network that provides at least one trajectory adjustment while performing the peg-in-hole insertion.

12. The method of Claim 1, further comprising executing, by the robot as an open loop controller, the trajectory.

13. The method of Claim 12, further comprising determining if the mating was successful based on at least one of a proximity sensor mounted on at least one tool and the robot, visual feedback, force feedback, and torque feedback.

14. The method of Claim 1, wherein the attachment pose is within a 1.0 mm distance from a target location, within a 0.8 degree clocking offset, and within a 2.0 degree twisting offset of a target rotational orientation.

15. A system comprising:

a processor; and

a memory with computer code instructions stored thereon, the processor and the memory, with the computer code instructions, being configured to cause the system to: determining a pose of at least one tool, configured to mate with an interface of a robot, within an environment based upon at least one image of the environment; determining an attachment pose of the robot interface, based on the determined tool pose; and

computing a trajectory of the robot that includes moving the robot interface from a first position to the attachment pose and mating the robot interface with the at least one tool.

16. The system of Claim 15, wherein the at least one image includes an RGB image, and determining the tool pose includes: segmenting the RGB image; labeling at least one pixel of the segmented RGB image with an object class or a probability distribution of the object class; and locating the at least one tool in the segmented image.

17. The system of Claim 16, wherein determining the tool pose further includes applying a neural network to the located tool in the segmented image.

18. The system of Claim 16, wherein the at least one image includes the RGB image and a depth image, and determining the pose further includes:

mapping the located at least one tool to a point cloud based on the depth image, to produce a partial 3D representation of the located at least one tool;

comparing the partial 3D representation to a set of models from a model library to find a best matching model;

generating multiple hypotheses, each hypothesis including the best matching model at an estimated tool pose;

selecting a hypothesis based on a comparison of the multiple hypotheses to the partial 3D representation; and

outputting the tool pose of the at least one tool based on the estimated tool pose of the hypothesis selected.

19. The system of Claim 15, wherein determining the pose of the at least one tool includes applying a scale-invariant feature transform to the at least one image.

20. The system of Claim 15, wherein determining the pose includes: identifying a 2D visual code on the at least one tool in the at least one image; and determining the pose based on an orientation of the 2D visual code.

Description:
Locating And Attaching Interchangeable Tools In-Situ

RELATED APPLICATIONS

[0001] This application claims the benefit of U.S. Provisional Application No. 62/730,933, filed on September 13, 2018, U.S. Provisional Application No. 62/730,703, filed on September 13, 2018, U.S. Provisional Application No. 62/730,947, filed on September 13, 2018, U.S. Provisional Application No. 62/730,918, filed on September 13, 2018, U.S. Provisional Application No. 62/730,934, filed on September 13, 2018 and U.S. Provisional Application No. 62/731,398, filed on September 14, 2018.

[0002] This application is related to U.S. Patent Application titled “Manipulating Fracturable And Deformable Materials Using Articulated Manipulators”, Attorney Docket No. 5000.1049-001; U.S. Patent Application titled “Food-Safe, Washable, Thermally-Conductive Robot Cover”, Attorney Docket No. 5000.1050-000; U.S. Patent Application titled “Food-Safe, Washable Interface For Exchanging Tools”, Attorney Docket No. 5000.1051-000; U.S. Patent Application titled “An Adaptor for Food-Safe, Bin-Compatible, Washable, Tool-Changer Utensils”, Attorney Docket No. 5000.1052-001; U.S. Patent Application titled “Determining How To Assemble A Meal”, Attorney Docket No. 5000.1054-001; U.S. Patent Application titled “Controlling Robot Torque And Velocity Based On Context”, Attorney Docket No. 5000.1055-001; U.S. Patent Application titled “Stopping Robot Motion Based On Sound Cues”, Attorney Docket No. 5000.1056-000; U.S. Patent Application titled “Robot Interaction With Human Co-Workers”, Attorney Docket No. 5000.1057-001; U.S. Patent Application titled “Voice Modification To Robot Motion Plans”, Attorney Docket No. 5000.1058-000; and U.S. Patent Application titled “One-Click Robot Order”, Attorney Docket No. 5000.1059-000, all of the above U.S. Patent Applications having a first named inventor David M.S. Johnson and all being filed on the same day, September 13, 2019.

[0003] The entire teachings of the above application(s) are incorporated herein by reference.

BACKGROUND

[0004] Traditionally, the food industry employs human labor to manipulate food ingredients with the purpose of either assembling a meal such as a salad or a bowl, or packing a box of ingredients such as those used in grocery shopping, or preparing the raw ingredients. Robots have not yet been able to assemble complete meals from prepared ingredients in a food-service setting such as a restaurant, largely because the ingredients change shape in difficult-to-predict ways rendering traditional methods to move material ineffective.

SUMMARY

[0005] Current technologies allow a robot to acquire a tool with its end effector if it is in a set location, such as in a rack. Applicant’s disclosure allows a robot to acquire a tool no matter its pose, such that it can be left in the worksite (e.g., in a food container) by determining its pose (e.g., position and rotation with respect to the tool-changer attached to the robot) from an RGB image, a depth image, or a combination of an RGB image and a depth image and calculating a trajectory of the robot from its current pose to an attachment pose.

[0006] In an embodiment, a method includes determining, by a computer, a pose of at least one tool, configured to mate with an interface of a robot, within an environment based upon at least one image of the environment. The method also includes determining, by the computer, an attachment pose of the robot interface, based on the determined tool pose and computing, by the computer, a trajectory of the robot that includes moving the robot interface from a first position to the attachment pose and mating the robot interface with the at least one tool.

[0007] In some embodiments, determining the tool pose further includes labeling at least one pixel of the at least one image with an object class or a probability distribution of the object class and locating the at least one tool in the segmented image. In such embodiments, locating the at least one tool may further include applying a neural network to the at least one image. Determining the tool pose may further include mapping the located at least one tool to a point cloud based on the depth image, to produce a partial 3D representation of the located at least one tool, comparing the partial 3D representation to a set of models from a model library to find a best matching model, generating multiple hypotheses, each hypothesis including the best matching model at an estimated tool pose, selecting a hypothesis based on a comparison of the multiple hypotheses to the partial 3D representation, and outputting the tool pose of the at least one tool based on the estimated tool pose of the hypothesis selected.

[0008] In an embodiment of the method, determining the tool pose of the at least one tool includes applying a scale-invariant transform to the at least one image. In an alternative embodiment, determining the tool pose includes identifying a 2D visual code on the at least one tool in the at least one image and determining the tool pose based on an orientation of the 2D visual code.

[0009] In an embodiment of the method, the computed trajectory is at least one of free of a collision between the robot and at least one object in the environment, free of causing the at least one tool to move by contacting the at least one tool with the robot prior to reaching the attachment pose, free of the robot contacting an object that applies a force or torque to the at least one tool prior to reaching the attachment pose, causes the robot to interact with the tool to move the tool to a new tool pose that facilitates mating, and aligns the robot interface in the attachment pose where a locking mechanism can be engaged to secure the at least one tool to the robot interface. In an embodiment, the computed trajectory additionally includes moving the robot interface to a drop-off location prior to moving the robot interface to the attachment pose and removing an attached tool from the robot interface at the drop-off location.

[0010] In some embodiments, the method further comprises determining, by the computer, an updated tool pose of the at least one tool at a time when determining the pose of at least one tool has been completed, determining, by the computer, an updated attachment pose of the robot interface, based on the updated tool pose and computing, by the computer, an updated trajectory based on the updated attachment pose.

[0011] In an embodiment of the method, mating the robot interface with the at least one tool includes performing a peg-in-hole insertion, by a contact-based controller, using at least one of a force and a torque sensor, the insertion employing the determined tool pose as an initial pose estimate of the at least one tool. In such an embodiment, mating the robot interface with the at least one tool may further include employing, by the contact-based controller, a neural network that provides at least one trajectory adjustment while performing the peg-in-hole insertion.

[0012] An embodiment of the method further comprises executing, by the robot as an open loop controller, the trajectory. That embodiment may further include determining if the mating was successful based on at least one of a proximity sensor mounted on at least one tool and the robot, visual feedback, force feedback, and torque feedback.

[0013] In some embodiments of the method, the attachment pose is within a 1.0 mm distance from a target location, within a 0.8 degree clocking offset, and within a 2.0 degree twisting offset of a target rotational orientation.

[0014] An embodiment includes a system capable of carrying out the above-described methods. The system includes a processor and a memory with computer code instructions stored thereon. The processor and the memory, with the computer code instructions, are configured to cause the system to determine a pose of at least one tool, configured to mate with an interface of a robot, within an environment based upon at least one image of the environment, determine an attachment pose of the robot interface, based on the determined tool pose, and compute a trajectory of the robot that includes moving the robot interface from a first position to the attachment pose and mating the robot interface with the at least one tool.

[0015] In an embodiment of the system, determining the tool pose includes segmenting the RGB image, labeling at least one pixel of the segmented RGB image with an object class or a probability distribution of the object class, and locating the at least one tool in the segmented image. In such an embodiment, determining the tool pose further includes applying a neural network to the located tool in the segmented image. In such an embodiment, determining the pose may further include mapping the located at least one tool to a point cloud based on the depth image, to produce a partial 3D representation of the located at least one tool, comparing the partial 3D representation to a set of models from a model library to find a best matching model, generating multiple hypotheses, each hypothesis including the best matching model at an estimated tool pose, selecting a hypothesis based on a comparison of the multiple hypotheses to the partial 3D representation, and outputting the tool pose of the at least one tool based on the estimated tool pose of the hypothesis selected.

[0016] In an embodiment of the system, determining the pose of the at least one tool includes applying a scale-invariant feature transform to the at least one image. In an embodiment of the system, determining the pose of the at least one tool includes identifying a 2D visual code on the at least one tool in the at least one image and determining the pose based on an orientation of the 2D visual code.

BRIEF DESCRIPTION OF THE DRAWINGS

[0017] The foregoing will be apparent from the following more particular description of example embodiments, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating embodiments.

[0018] Fig. 1 is a block diagram illustrating an example embodiment of a quick service food environment of embodiments of the present disclosure.

[0019] Fig. 2 is a flow diagram illustrating an example embodiment of the disclosed method that utilizes Segmentation and Iterative Closest Point (ICP) (SegICP) to determine the pose of a tool.

[0020] Fig. 3A is a diagram illustrating a tool with a visual tag 301 for use in an example embodiment of the disclosed method.

[0021] Fig. 3B is a picture of a tool with a visual tag in a food service environment for use in an example embodiment of the disclosed method.

[0022] Fig. 4 is a flow diagram illustrating an example embodiment of the disclosed method that uses a pose interpreter neural network to determine the pose of a tool.

[0023] Fig. 5 is a diagram illustrating an example embodiment of a system implementing an example embodiment of the disclosed method for attaching to a tool.

DETAILED DESCRIPTION

[0024] A description of example embodiments follows.

[0025] Operating a robot in a quick service food environment, such as a fast food restaurant, can be challenging for a number of reasons. First, the end effectors e.g., utensils, that the robot uses need to remain clean from contamination. Contamination can include allergens (e.g., peanuts), food preferences (e.g., contamination from pork for a vegetarian or kosher customer), dirt/bacteria/viruses, or other non-ingestible materials (e.g., oil, plastic, or particles from the robot itself). Second, the temperature of the robot should be controlled without sacrificing cleanliness. Third, the robot should be able to manipulate fracturable and deformable materials, and further be able to measure an amount of material controlled by its utensil. Fourth, the robot should be able to automatically and seamlessly switch utensils (e.g., switch between a ladle and tongs). Fifth, the utensils should be adapted to be left in an assigned food container and interchanged with the robot as needed, in situ. Sixth, the interchangeable parts (e.g., utensils) should be washable or dishwasher safe. Seventh, the robot should be able to autonomously generate a task plan and motion plan(s) to assemble all ingredients in a recipe, and execute that plan. Eighth, the robot should be able to modify or stop a motion plan based on detected interference or voice commands to stop or modify the robot’s plan. Ninth, the robot should be able to apply a correct amount of torque based on parameters (e.g., density and viscosity) of the foodstuff to be gathered. Tenth, the system should be able to receive an electronic order from a user, assemble the meal for the user, and place the meal for the user in a designated area for pickup automatically.

[0026] Tool changers for robots require the tool and robot interface to be accurately aligned in both position (~+/- 1 mm) and rotation (~+/- 0.8 deg clocking offset, and +/- 2 deg twisting offset) before the locking mechanism can be successfully actuated. In unstructured environments, it is a challenging task to both recognize and localize an object, even if that object is known to the system beforehand. In this context, ‘to recognize’ means to identify which pixels in at least one sensor image correspond to the desired object. In some embodiments, the sensor image includes a color RGB image and/or a depth image. In this context, ‘to localize’ means to identify a linear transformation which moves the known object from the origin in the camera frame to its current observed position. The pose of the object in the camera frame is sufficient to describe the object’s location in any other frame, if there is a known transformation between that frame and the camera frame. Finally, in this context, ‘known’ refers to having an existing 3D model or other representation which enables the user to assign a specific pose to an orientation of the object. In other words, a known transformation includes a pre-defined origin and axes alignment with respect to the physical object.
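The alignment requirement above can be illustrated with a small check. The following is a minimal sketch only, not the disclosed implementation: it assumes that "clocking" means rotation about the mating (z) axis of the target frame and that "twisting" means the tilt between the mating axes of the interface and the target. Those axis conventions, and all function names, are hypothetical and are not defined in the text.

```python
import numpy as np

# Tolerances quoted in the text; how they map to axes is an assumption.
MAX_POSITION_MM = 1.0
MAX_CLOCKING_DEG = 0.8
MAX_TWISTING_DEG = 2.0

def rot_z(deg):
    """Rotation matrix for a rotation of `deg` degrees about z."""
    a = np.radians(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_x(deg):
    """Rotation matrix for a rotation of `deg` degrees about x."""
    a = np.radians(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def alignment_errors(p_interface_mm, R_interface, p_target_mm, R_target):
    """Return (position error in mm, clocking error in deg, twisting error in deg)."""
    diff = np.asarray(p_interface_mm, float) - np.asarray(p_target_mm, float)
    position_mm = float(np.linalg.norm(diff))
    # Interface orientation expressed in the target frame.
    R_rel = R_target.T @ R_interface
    # Assumed "twisting": angle between the two mating (z) axes.
    twisting_deg = float(np.degrees(np.arccos(np.clip(R_rel[2, 2], -1.0, 1.0))))
    # Assumed "clocking": yaw of the relative rotation about the mating axis.
    clocking_deg = float(abs(np.degrees(np.arctan2(R_rel[1, 0], R_rel[0, 0]))))
    return position_mm, clocking_deg, twisting_deg

def ready_to_lock(p_interface_mm, R_interface, p_target_mm, R_target):
    """True when all three errors are inside the quoted tolerances."""
    pos, clock, twist = alignment_errors(
        p_interface_mm, R_interface, p_target_mm, R_target)
    return (pos <= MAX_POSITION_MM
            and clock <= MAX_CLOCKING_DEG
            and twist <= MAX_TWISTING_DEG)
```

For example, an interface that is 0.5 mm off target and rotated 0.5 degrees about the mating axis passes this check, while one rotated 1.0 degree about that axis does not.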

[0027] To use a tool-changer with objects in an unstructured environment, a robot should be able to either: (a) observe the tool, (b) compute its pose with sufficient accuracy that an open-loop trajectory can be created to maneuver the arm from its current location to the mating position with the tool, and/or (c) make an observation of the tool’s location that is insufficiently accurate for an open loop mating trajectory, but accurate enough to maneuver close enough to the tool where force/torque feedback can be used to begin engaging in a closed-loop capture behavior where the mating position is achieved by moving in response to force/torque feedback from contact between the interface and the tool.

[0028] This disclosure includes a system to recognize, localize, and attach a tool to a robot interface in an environment where the location of the desired tool is not known beforehand. In some embodiments, the robot interface may be a master tool changer configured to attach to multiple tools. The system consists of a tool/end effector localization and pose determination mechanism, a trajectory module, and optionally a closed-loop tactile feedback system used to assist in mating the interface with the tool. The localization and pose determination mechanism and/or the trajectory module may be computer-implemented methods implemented by a memory with computer code instructions stored thereon and a processor.

[0029] The localization mechanism determines the pose of the tool from information gathered by an external sensor or from other input which provides an estimate of the pose of the tool. While the pose of the tool is unknown beforehand, in some embodiments, the type, size, and visual characteristics of the tool are known. In such embodiments, the type, size, and visual characteristics of the tool can be stored in a database accessible by the localization mechanism.

[0030] Based on the pose estimate of the tool, the system computes a trajectory or chooses from a set of pre-computed trajectories to move the robot from its current position and pose to an attachment position and pose. At the attachment position and pose, a mating action is executed connecting the robot’s interface to the tool. In some embodiments the mating motion can be incorporated into the trajectory. Alternatively, the mating/attachment motion can be performed as a peg-in-hole insertion process.

[0031] The computed trajectory may be computed to avoid collision with any other objects in the environment. The computed trajectory may also approach the tool so as not to shift the position of the tool such that the mating position cannot be achieved. Alternatively, the computed trajectory may be chosen so that it interacts with the tool to facilitate achieving the attachment position (for example, by pressing the tool against a container so as to allow the robot to exert a torque or force on the tool that would not be possible in free space without interacting with the environment).

[0032] The system may also be equipped with a closed-loop, contact (or tactile) based controller which, based on: (a) a signal from either torque sensors in the robot joints, (b) a force/torque sensor mounted at the wrist of the robot, or (c) an externally mounted force/torque sensor on the tool, adjusts the trajectory of the robot, after the robot has reached the attachment position/pose to successfully mate the robot interface with the tool.
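One simple way to picture such a contact-based adjustment is an admittance-style rule that nudges the interface sideways in proportion to the measured lateral force. The following is a minimal sketch under stated assumptions, not the disclosed controller: the sign convention (the sensor reports the force the tool exerts on the interface, in the interface frame), the compliance gain, and the step limit are all hypothetical values chosen for illustration.

```python
import numpy as np

def lateral_correction_mm(force_on_interface_n,
                          compliance_mm_per_n=0.05,
                          max_step_mm=0.2):
    """Return a small (dx, dy) adjustment, in mm, from one force reading.

    force_on_interface_n: (fx, fy, fz) in newtons, assumed to be the force
    the tool exerts on the interface, expressed in the interface frame.
    The interface yields in the direction it is being pushed, which tends
    to reduce lateral contact force during insertion.
    """
    lateral = np.asarray(force_on_interface_n, float)[:2]
    step = compliance_mm_per_n * lateral
    size = np.linalg.norm(step)
    # Limit the step so a force spike cannot command a large jump.
    if size > max_step_mm:
        step = step * (max_step_mm / size)
    return step
```

In a loop, a controller of this kind would read the sensor, apply the returned correction to the commanded pose, and repeat until the lateral force falls below a threshold or the insertion completes.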

[0033] Fig. 1 is a block diagram illustrating an example embodiment of a quick service food environment 100 of embodiments of the present invention. The quick service food environment 100 includes a food preparation area 102 and a patron area 120.

[0034] The food preparation area 102 includes a plurality of ingredient containers 106a-d each having a particular foodstuff (e.g., lettuce, cheese, guacamole, beans, ice cream, various sauces or dressings, etc.). Each ingredient container 106a-d stores in situ its corresponding utensil 108a-d. The utensils 108a-d can be spoons, ladles, tongs, dishers (scoopers), or other utensils. Each utensil 108a-d is configured to mate with and disconnect from an interface such as a master end effector connector 112 of a robot arm 110. While the term utensil is used throughout this application, a person having ordinary skill in the art can recognize that the principles described in relation to utensils can apply in general to end effectors in other contexts (e.g., end effectors for moving fracturable or deformable materials in construction with an excavator or backhoe, etc.); and a robot arm can be replaced with any computer controlled actuatable system which can interact with its environment to manipulate a deformable material. The robot arm 110 includes sensor elements/modules such as stereo vision systems (SVS), 3D vision sensors (e.g., Microsoft Kinect™), LIDAR sensors, audio sensors (e.g., microphones), inertial sensors (e.g., inertial measurement unit (IMU), torque sensor, weight sensor, etc.) for sensing aspects of the environment, including pose (i.e., X, Y, Z coordinates and pitch, roll, yaw) of tools for the robot to mate, shape and volume of foodstuffs in ingredient containers, shape and volume of foodstuffs deposited into a food assembly container, moving or static obstacles in the environment, etc.

[0035] To initiate an order, a patron in the patron area 120 enters an order 124 in an ordering station 112a-b, which is forwarded to a network 126. Alternatively, a patron on a mobile device 128 can, within or outside of the patron area 120, generate an optional order 132. Regardless of the source of the order, the network 126 forwards the order to a controller 114 of the robot arm 110. The controller generates a task plan 130 for the robot arm 110 to execute.

[0036] The task plan 130 includes a list of motion plans 132a-d for the robot arm 110 to execute. Each motion plan 132a-d is a plan for the robot arm 110 to engage with a respective utensil 108a-d, gather ingredients from the respective ingredient container 106a-d, and empty the utensil 108a-d in an appropriate location of a food assembly container 104 for the consumer, which can be a plate, bowl, or other container. The robot arm 110 then returns the utensil 108a-d to its respective ingredient container 106a-d, or other location as determined by the task plan 130 or motion plan 132a-d, and releases the utensil 108a-d. The robot arm executes each motion plan 132a-d in a specified order, causing the food to be assembled within the food assembly container 104 in a planned and aesthetic manner.

[0037] Within the above environment, various of the above described problems can be solved. The environment 100 illustrated by Fig. 1 can improve food service to patrons by assembling meals faster, more accurately, and more sanitarily than a human can assemble a meal. Some of the problems described above can be solved in accordance with the disclosure below.

[0038] A robot operating with a variety of interchangeable end-effector tools classically uses a dedicated physical tool changer interface, where each tool has a fixed position relative to the robot when not in use. For example, tools are stored in a rack, where each tool has a designated position on the rack so that the robot knows the tool is in that position when not in use. The system and method of this disclosure locate the tools during robot operation and therefore bypass the need for such a dedicated and cumbersome physical tool changer interface. In an embodiment, such a perception system enables a robot to autonomously locate a set of end effector tools in its workspace and attach them without the set of end effector tools having a fixed or designated position.

[0039] A vision system captures at least one image of the workspace containing the tools. The at least one image can include a color image and a depth image. In some embodiments, the color image is an RGB image.

[0040] A perception pipeline, or other mechanism, extracts the pose of each tool in the image. A person of ordinary skill in the art can understand that pose is the position (e.g., X-, Y-, and Z-coordinates) and orientation (e.g., roll, pitch, and yaw) of the object. If the pose is determined in the coordinates of the image, the coordinates can be transformed into world coordinates. The system and method of the disclosure can determine the pose of the tools with a variety of methods. For example, some embodiments of the method determine the pose of the tools by applying a scale-invariant transform to the image and matching outlines and edges in the image with known values of tools. Once the pose of a tool of interest is determined, the robot arm can calculate and execute a trajectory to move an interface located at the end of the robot arm to an attachment pose aligned with the tool. When the interface is located at the attachment pose, the interface can be mated with the tool.
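The transformation from camera coordinates into world coordinates mentioned above can be written as a product of 4x4 homogeneous transforms. The following is a minimal sketch, assuming that the camera pose in the world frame is already known (for example, from calibration); the function names and the example numbers are hypothetical.

```python
import numpy as np

def make_transform(rotation, translation):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

def tool_pose_in_world(T_world_camera, T_camera_tool):
    """Chain camera-in-world and tool-in-camera to get tool-in-world."""
    return T_world_camera @ T_camera_tool

# Hypothetical setup: a camera 1 m above the world origin, looking straight
# down (a 180 degree rotation about x), observing a tool 0.4 m along its
# optical (z) axis.
T_world_camera = make_transform(np.diag([1.0, -1.0, -1.0]), [0.0, 0.0, 1.0])
T_camera_tool = make_transform(np.eye(3), [0.0, 0.0, 0.4])
T_world_tool = tool_pose_in_world(T_world_camera, T_camera_tool)
```

In this example the tool ends up 0.6 m above the world origin, since the camera's optical axis points downward along the world z axis.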

[0041] Fig. 2 is a flow diagram illustrating an example embodiment 200 of the disclosed method that utilizes Segmentation and Iterative Closest Point (ICP) (SegICP) to determine the pose of a tool. Fig. 2 shows an example embodiment of a method that segments an image using a convolutional neural network and estimates the pose of an object identified in the segmented image using an Iterative Closest Point method. However, any combination of segmentation and pose estimation procedures can be utilized.

[0042] The first step includes capturing a color image 201 of an environment containing a set of tools including first tool 201a, second tool 201b, and third tool 201c. The image may be captured by a camera that produces an RGB image. The camera may be a red, green, blue, depth (RGBD) camera. The image 201 is segmented to produce segmented image 203. To produce segmented image 203, the system labels the pixels of the image 201 with a category. Each category corresponds to an object type (e.g., bench, floor, spoon, lettuce, counter-top, unknown, etc.). Alternatively, the pixels may be labeled with a category corresponding to a probability distribution of the object type. The number, type, and identity of the categories are determined by the number of objects that the system needs to perform its assigned manipulation task. The SegICP method performs pixel-level semantic segmentation on the image 201 using a convolutional neural network 202 to produce the segmented image 203. A person of ordinary skill in the art can understand that a variety of methods and neural networks can produce segmented image 203. The convolutional neural network may be a SegNet, Atrous Convolution, or DRN neural network. Alternatively, image 201 may be segmented without the use of a convolutional neural network 202.
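The pixel labeling step can be illustrated independently of any particular network. The following is a minimal sketch, assuming the segmentation network has already produced a per-pixel probability distribution over categories; the category list and function names are hypothetical, and the network itself is not shown.

```python
import numpy as np

# Hypothetical category list; the real set depends on the manipulation task.
CLASS_NAMES = ["unknown", "counter-top", "spoon", "lettuce"]

def label_pixels(class_probabilities):
    """Assign each pixel its most probable category.

    class_probabilities: array of shape (H, W, C), one distribution per pixel.
    Returns an (H, W) integer label image.
    """
    return np.argmax(class_probabilities, axis=-1)

def pixels_for_class(label_image, class_name):
    """Boolean (H, W) mask of the pixels labeled with `class_name`."""
    return label_image == CLASS_NAMES.index(class_name)
```

The resulting boolean mask plays the role of a set of pixels such as 204c: it marks where one category of tool appears in the image.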

[0043] In segmented image 203, sets of pixels 204a, 204b, 204c are each labeled with their own category. When segmentation is performed correctly, the sets of pixels 204a, 204b, 204c in segmented image 203 should correspond to the tools 201a, 201b, and 201c in original image 201. For example, set of pixels 204c corresponds to the location of third tool 201c.

[0044] After segmentation is performed, the pose of tools 201a, 201b, and 201c can be determined using the segmented image 203 and sets of pixels 204a, 204b, 204c. The following description focuses on determining the pose of third tool 201c using a corresponding set of pixels 204c. However, it is evident to a person skilled in the art that the method can be applied to any tool within the image 201. The set of pixels 204c labeled with a category corresponding to the desired tool, third tool 201c, is passed to a pose-extraction method/module. The SegICP method uses an Iterative Closest Point method for pose estimation.

[0045] Depth image 205 provides a point cloud. The point cloud from the depth image 205 is cropped using the segmented image 203. Models 206a, 206b, 206c are retrieved from the object library 206. The best fitting model 206c for third tool 201c is identified using the category labeling the corresponding set of pixels 204c. A partial 3D representation of the desired third tool 201c can be generated from the cropped point cloud from depth image 205.
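The cropping step can be sketched as follows, assuming a pinhole camera model. The intrinsic parameters (`fx`, `fy`, `cx`, `cy`), the tiny example images, and the function name are hypothetical and serve only to show how labeled pixels select points from the depth data.

```python
def crop_point_cloud(depth, labels, target_label, fx, fy, cx, cy):
    """Back-project pixels labeled target_label into 3D camera-frame points.

    depth  : 2D list of depth values in metres (0.0 marks an invalid reading)
    labels : 2D list of category labels, same shape as depth
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if labels[v][u] != target_label or z <= 0.0:
                continue  # skip other categories and invalid depth readings
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points

# Hypothetical 2x3 depth image and matching segmentation labels.
depth = [[1.0, 1.0, 0.0],
         [1.0, 2.0, 2.0]]
labels = [["counter", "spoon", "spoon"],
          ["counter", "spoon", "counter"]]

tool_points = crop_point_cloud(depth, labels, "spoon",
                               fx=100.0, fy=100.0, cx=1.0, cy=0.0)
```

The resulting points form the partial 3D representation against which a library model can be compared.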

[0046] The system generates multiple hypotheses of the model 206c at a range of estimated poses. Fig. 2 illustrates three hypotheses 207, overlaid on image 201. The system compares each hypothesis 207 of model 206c to the partial 3D representation of third tool 201c using a regression analysis. A person skilled in the art would know that any number of hypotheses 207 can be generated and compared to the partial 3D representation of third tool 201c. The hypothesis that places model 206c at a pose that best matches the actual pose of the third tool 201c has the best regression score and is selected as the best hypothesis. The actual pose of third tool 201c can be determined from the estimated pose of model 206c in the best hypothesis.
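The hypothesis selection can be illustrated with a simplified planar sketch. The score used here, the mean squared distance from each observed point to its nearest model point, is an assumed stand-in for the regression analysis and not the scoring of the disclosure. The model, candidate poses, and function names are hypothetical, and no iterative refinement is performed.

```python
import math

def transform_model(model, x, y, yaw):
    """Place planar model points at pose (x, y, yaw)."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [(c * px - s * py + x, s * px + c * py + y) for px, py in model]

def fit_error(hypothesis_points, observed_points):
    """Mean squared distance from each observed point to its nearest model point."""
    total = 0.0
    for ox, oy in observed_points:
        total += min((ox - hx) ** 2 + (oy - hy) ** 2
                     for hx, hy in hypothesis_points)
    return total / len(observed_points)

def best_hypothesis(model, observed_points, candidate_poses):
    """Return the candidate pose whose placed model best matches the observation."""
    return min(candidate_poses,
               key=lambda pose: fit_error(transform_model(model, *pose),
                                          observed_points))

# Hypothetical L-shaped planar model standing in for a library model.
model = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
true_pose = (0.5, -0.2, math.pi / 2)
# Partial observation: only three of the four model points are visible.
observed = transform_model(model, *true_pose)[:3]

candidates = [(0.5, -0.2, 0.0), (0.5, -0.2, math.pi / 2), (0.0, 0.0, math.pi)]
best = best_hypothesis(model, observed, candidates)
```

In a full pipeline each hypothesis would typically seed an iterative alignment before scoring; the sketch scores the hypotheses as given.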

[0047] Tools 201a, 201b, and 201c in original image 201 may also be identified using a bounding box, either in combination with pixel labeling or without any other complementary methods. A bounding box method may utilize any methods known to those skilled in the art, such as YOLO and Mask R-CNN.

[0048] The disclosure includes a method to determine the pose of tools in an environment using unique visual tags mounted on each of the tools. In some embodiments, the visual tags are ArUco tags, and the pose can be extracted from an image of the tools and tags using the ArUco library. The visual tag method can further be combined with the methods above, as understood by a person of ordinary skill in the art.

[0049] Fig. 3A is a diagram illustrating a tool 300 with a visual tag 301 for use in an example embodiment of the disclosed method. Fig. 3B is a picture of the tool 300 with visual tag 301 in a food service environment. The serving tool includes a visual tag 301. Visual tag 301 can be recognized by a robot’s vision systems. The pose of visual tag 301 can be determined using the known size of the visual tag 301, the intrinsic and extrinsic parameters of the camera that took the image of the tag, and the measured corner pixel locations in the visual tag 301. The pose of tool 300 is extrapolated from the pose of visual tag 301. In addition, tool 300 includes a bin clocking feature 303 that allows the tool 300 to rest in a consistent location in the food container, even if the food container stores varying levels of food. This allows a system to more easily determine the pose of tool 300 and determine a trajectory to attach a robot interface to tool 300 at tool interface component 302.

[0050] Fig. 4 is a flow diagram of an example embodiment 400 of the disclosed method that uses a pose interpreter neural network to determine the pose of a tool. The first step includes capturing a color image 401 of an environment containing a set of tools 401a-d. The image may be captured by a camera that produces an RGB image, or by another visual system. The camera may be a red, green, blue, depth (RGBD) camera. The image 401 is segmented to produce the segmented image 403. The segmented image 403 may be created using neural network 402 in a similar manner as described for example embodiment 200 in relation to Fig. 2. In relation to Fig. 4, the neural network 402 may be trained using real data from color images. The segmented image 403 includes segmentation masks 403a-d that are composed of pixels that correspond to the pixels containing tools 401a-d in the image 401.

[0051] Each segmentation mask 403a-d can be analyzed using a pose interpreter neural network 404. The pose interpreter neural network 404 can be trained entirely using synthetic data to determine the pose of an object with known parameters using the segmentation mask of that object. The pose interpreter neural network may apply one of the following methods: Dense Object Nets, Pose Interpreter Networks, SE3-Nets, or other similar methods. The pose interpreter neural network 404 provides object poses 405a-d for tools 401a-d using segmentation masks 403a-d. Object poses can include at least six degrees of freedom (X, Y, and Z coordinates, and pitch, roll, and yaw). Picture 405 shows the outputted object poses 405a-d overlaid upon the original image 401. The actual poses of tools 401a-d can be determined from the object poses 405a-d.

[0052] One or more of the pose determination methods can be utilized simultaneously. Additionally, one or more of the pose determination methods can be applied to successive observations to allow for repeated or even continuous monitoring. One or more of the pose estimation methods and/or pose determination outputs can be combined with a model of the robot motion, assuming that the utensil is stationary, to improve the estimate of the tool pose using a least-squares estimator.
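For a stationary tool, a least-squares combination of repeated position observations reduces to an inverse-variance weighted mean. The sketch below shows that special case only. It assumes every observation has already been expressed in a common world frame, which is where a model of the robot motion would enter for a robot-mounted camera, and it ignores orientation. The variances and function name are hypothetical.

```python
def fuse_position_estimates(observations):
    """Weighted least-squares estimate of a stationary position.

    observations: list of ((x, y, z), variance) pairs, all in one world frame.
    Minimizing the sum of squared residuals weighted by 1/variance yields
    the inverse-variance weighted mean computed here.
    """
    weights = [1.0 / variance for _, variance in observations]
    total = sum(weights)
    return tuple(
        sum(w * position[axis]
            for w, (position, _) in zip(weights, observations)) / total
        for axis in range(3)
    )

# Hypothetical observations; the third is noisier and so counts for less.
observations = [
    ((0.50, 0.20, 0.10), 0.01),
    ((0.54, 0.18, 0.10), 0.01),
    ((0.40, 0.30, 0.10), 0.04),
]
fused = fuse_position_estimates(observations)
```

Less certain observations pull the estimate less, so a single poor detection does not dominate the fused tool position.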

[0053] Once the pose of a tool or a set of tools is determined, the system calculates a trajectory that allows the robot to move its interface to an attachment position and mate with a target tool. Using inputs including the determined pose of a targeted tool, the current pose of the robot, and a set of constraints based on how the attachment mechanism works, a trajectory planning method determines an attachment pose of the robot interface (e.g., from which the attachment mechanism can be triggered to mate the interface with the tool) and calculates a trajectory of the robot that moves the robot interface to the attachment pose. The calculation may use inverse kinematics to determine the robot trajectory for the robot interface to mate with the tool. The trajectory planning method may be executed by a computer or other similar computation device.

[0054] For each type of tool, the system pre-computes an allowed trajectory (in the reference frame of the tool), which causes the robot interface to mate with the tool. Based on the measured tool pose, inverse kinematics is used to determine the required robot position trajectory for the robot to mate with the tool. If the robot is a robotic arm, the required joint-space positions of the arm are also determined.
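A simplified planar sketch of this step follows. Pre-computed approach waypoints given in the tool frame are mapped into the world frame using the measured tool pose, and each waypoint is then converted to joint angles with the closed-form inverse kinematics of a two-link arm. The two-link arm, link lengths, waypoints, and function names are assumptions made for illustration; a real manipulator would need a full kinematic model.

```python
import math

def tool_to_world(waypoint, tool_pose):
    """Map a planar (x, y) waypoint from the tool frame into the world frame."""
    tx, ty, yaw = tool_pose
    c, s = math.cos(yaw), math.sin(yaw)
    wx, wy = waypoint
    return (tx + c * wx - s * wy, ty + s * wx + c * wy)

def two_link_ik(x, y, l1, l2):
    """Closed-form IK for a planar two-link arm (one of the two elbow solutions).

    Returns (shoulder, elbow) in radians, or None if the target is unreachable.
    """
    cos_elbow = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(cos_elbow) > 1.0:
        return None
    elbow = math.acos(cos_elbow)
    shoulder = math.atan2(y, x) - math.atan2(l2 * math.sin(elbow),
                                             l1 + l2 * math.cos(elbow))
    return shoulder, elbow

# Hypothetical pre-computed approach, expressed in the tool frame (metres).
approach_in_tool_frame = [(-0.20, 0.0), (-0.10, 0.0), (0.0, 0.0)]
# Hypothetical measured tool pose (x, y, yaw) in the world frame.
measured_tool_pose = (0.9, 0.4, math.pi / 2)
L1, L2 = 0.7, 0.6  # hypothetical link lengths

world_waypoints = [tool_to_world(w, measured_tool_pose)
                   for w in approach_in_tool_frame]
joint_trajectory = [two_link_ik(x, y, L1, L2) for x, y in world_waypoints]
```

Because the approach is stored in the tool frame, the same pre-computed waypoints can be reused wherever the tool is found; only the measured pose changes.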

[0055] For every portion of the calculated trajectory, the system verifies the robot does not collide with objects in the environment. The location and pose of objects in the environment can be determined using a model of the environment. The model may be generated a priori or generated by applying the disclosed pose determination methods to all objects in the environment. Alternatively, collisions can be checked directly against a point cloud or octomap provided by a depth sensor that is observing the environment. In one embodiment, small collisions may be allowed to remain in the computed trajectory if a compliant torque controller is used to control the robot. In such an embodiment, the robot can be allowed to make light contact with its environment. In such embodiments, a collision tolerance and a collision angle can be set, and trajectories that have the surface normal below the collision angle and penetration depths less than the collision tolerance remain in the computed trajectory. The system can also verify that the calculated trajectory does not violate any joint, or other physical, limits of the robot.
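The tolerance test can be sketched as a filter over predicted contacts combined with a joint-limit check. How the contact angle is measured relative to the surface normal is not specified in the text, so the `normal_angle` field below is an assumed interpretation. The thresholds, data structure, and function name are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Contact:
    penetration_depth: float  # metres the robot would penetrate the object
    normal_angle: float       # radians; assumed angle compared with the collision angle

def trajectory_is_acceptable(contacts, joint_waypoints, joint_limits,
                             collision_tolerance, collision_angle):
    """Accept a trajectory only if every contact is light and joints stay in range."""
    for contact in contacts:
        if contact.penetration_depth >= collision_tolerance:
            return False  # penetrates too deeply for compliant control
        if contact.normal_angle >= collision_angle:
            return False  # contact angle exceeds the permitted collision angle
    for waypoint in joint_waypoints:
        for angle, (low, high) in zip(waypoint, joint_limits):
            if not low <= angle <= high:
                return False  # joint limit violated
    return True

# Hypothetical limits for a two-joint arm and a two-waypoint trajectory.
limits = [(-3.0, 3.0), (0.0, 2.5)]
waypoints = [(0.1, 1.0), (0.2, 1.2)]

light = [Contact(penetration_depth=0.001, normal_angle=0.1)]
deep = [Contact(penetration_depth=0.02, normal_angle=0.1)]

ok_with_light_contact = trajectory_is_acceptable(light, waypoints, limits, 0.005, 0.3)
ok_with_deep_contact = trajectory_is_acceptable(deep, waypoints, limits, 0.005, 0.3)
```

Setting the collision tolerance to zero rejects any predicted contact, which corresponds to the strictly collision-free case.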

[0056] If a feasible trajectory cannot be calculated, the system may notify the user or operator that the tool is no longer in a reachable position. Alternatively, a trajectory can be calculated that uses the robot to manipulate the tool by pushing with part of the robot or with another tool to move the tool into a position where a feasible mating trajectory can be calculated.

[0057] The trajectory planning method can consider additional sets of constraints to ensure the calculated trajectory meets a wide range of potentially desired criteria and properties. The constraints can include but are not limited to:

a) ensuring the computed trajectory does not cause a collision between the robot and at least one object in the environment;

b) ensuring the computed trajectory does not cause the targeted tool to move through contact between the robot and the targeted tool prior to reaching the attachment pose;

c) ensuring the trajectory does not cause the robot to contact an object that applies a force or torque to the target tool prior to reaching the attachment pose;

d) the trajectory causes the robot to interact with the tool to move the tool to a new tool pose that facilitates mating; and

e) the trajectory aligns the robot interface in the attachment pose where a locking mechanism can be engaged to secure the at least one tool to the robot interface.

[0058] The trajectory planning method is also capable of determining a trajectory that includes moving the robot interface to a drop-off location, and while at the drop-off location, removing any tool that is currently attached to the robot interface before moving the robot interface to the attachment position.

[0059] The method and associated system include several alternative manners of planning and performing the mating between the robot’s interface and the target tool. In one embodiment, the attachment pose of the robot interface is sufficiently aligned with the target tool such that the attachment mechanism can be employed at the attachment pose, connecting the interface and the tool without any additional calculations or movements of the robot required. In such embodiments, the trajectory of the robot includes moving the robot interface from its current position to the attachment position and mating the robot interface with the tool. This trajectory can be executed by the robot under open-loop control.

[0060] In alternative embodiments, the attachment pose may only be accurate enough to place the robot interface close enough to the tool where force/torque feedback can be used to begin engaging in a closed-loop capture behavior. In such a situation, the mating is achieved by moving the robotic arm in response to force/torque feedback from contact between the interface and the tool measured by sensors located on the tool and/or robot. This may be accomplished by performing a peg-in-hole insertion by a contact-based controller. A neural network may be utilized by the contact-based controller to provide trajectory adjustment while performing the peg-in-hole insertion.

[0061] The disclosed method can also employ other methods of closed-loop mating. In one such method, once the robot interface is within a certain distance of the tool, a proximity sensor located on the robot, robot interface, or tool can be queried to see if the robot is at a desired mating location, and the robot can proceed along the trajectory until the proximity sensor is triggered. Alternatively, a visual system that provides visual feedback to continuously detect the pose of the tool can be used while the robot interface is approaching the tool.
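The proximity-gated approach can be sketched as a loop that advances along the trajectory and stops as soon as the sensor triggers. The sensor and motion callbacks are simulated here with hypothetical stand-ins; an actual system would query hardware in their place.

```python
def approach_until_triggered(waypoints, sensor_triggered, move_to):
    """Step along waypoints until the proximity sensor triggers.

    Returns the index of the waypoint at which the sensor triggered,
    or None if the trajectory ends without a trigger.
    """
    for index, waypoint in enumerate(waypoints):
        move_to(waypoint)
        if sensor_triggered(waypoint):
            return index
    return None

# Simulated one-dimensional approach (all values hypothetical, in metres).
tool_position = 0.30
trigger_range = 0.02
visited = []  # records every commanded waypoint

waypoints = [i * 0.05 for i in range(10)]
stop_index = approach_until_triggered(
    waypoints,
    sensor_triggered=lambda p: abs(tool_position - p) <= trigger_range,
    move_to=visited.append,
)
```

A return value of `None` would indicate that the end of the trajectory was reached without a trigger, which could prompt a new pose estimate.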

[0062] The disclosed method, including determining the pose of the target tool, determining the attachment pose of the robot’s interface, and calculating the trajectory, can be repeated at any time after these steps have initially been carried out. This allows the system to determine an updated tool pose, attachment pose, and trajectory in case the tool is pushed, its position is changed, the pose estimate was incorrect, or a sensor error occurred. The pose of the targeted tool can be monitored from the initial determination of its pose until the interface mates with the tool. In this way, the disclosed method and corresponding system can be made to be self-correcting and able to react to unplanned events and disturbances. The pose of the tool may be continually determined, using the disclosed methods, all the way to the mating position, and this can be used for the entire closed-loop operation. Pose monitoring can be discontinued at any time, and then the approach and/or mating can be run open-loop or closed-loop by using another strategy.

[0063] When the tool is successfully mated to the robot interface, a second trajectory can be computed, using the same methods, to remove the tool from its surrounding environment, such as a food container, in a way that does not violate the robot's and/or tool's physical limits and is collision free. While the tool is attached to the robot interface, sensors can continue to be queried during motion for changes that would indicate whether the tool remains attached or is no longer attached. For example, if proximity sensors located on the robot and/or tool fail to trigger, the pose of the tool changes, and/or the mass attached to the robot decreases, it may be a signal that the tool has become detached.
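The attachment monitoring can be sketched as a check that combines the three signals mentioned above. The thresholds, parameter names, and the idea of measuring pose change as a shift of the tool relative to the interface are assumptions for illustration only.

```python
def tool_seems_detached(proximity_triggered, pose_shift, payload_mass,
                        expected_mass, max_pose_shift=0.01, mass_fraction=0.8):
    """Return True if any monitored signal suggests the tool is no longer attached.

    proximity_triggered : whether the mating proximity sensor currently triggers
    pose_shift          : tool displacement relative to the interface (metres)
    payload_mass        : mass currently sensed at the interface (kg)
    expected_mass       : mass expected with the tool attached (kg)
    """
    if not proximity_triggered:
        return True  # the sensor that confirmed mating no longer triggers
    if pose_shift > max_pose_shift:
        return True  # the tool has moved relative to the interface
    if payload_mass < mass_fraction * expected_mass:
        return True  # the sensed mass has dropped
    return False

# Hypothetical readings taken while the robot is moving.
still_attached = tool_seems_detached(True, 0.002, 0.48, 0.50)
dropped = tool_seems_detached(True, 0.002, 0.05, 0.50)
```

Any single signal is enough to flag a possible detachment in this sketch; a stricter policy could require agreement between signals before halting the motion.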

[0064] Once the robot has performed a task using the tool, the tool can be returned to its original location using the same trajectory determination methods disclosed above. A trajectory is computed from the tool's current location to the drop-off location that does not violate collision or joint constraints, or other physical constraints of the system such as acceleration, velocity, or jerk limits. The drop-off location can be located at a pose slightly displaced from the desired resting pose of the tool. Therefore, in the case of placing the tool back into a food container, the tool's food-contact zone is located at a certain depth below the measured surface of the food material in the container. The robot executes the calculated trajectory to move the tool to the drop-off location and triggers a release mechanism to release the tool. Gravity may be used to assist in placing the tool at a desired location. A new trajectory may be computed that moves the robot away from the released tool without disturbing the tool. This may be accomplished by a closed-loop control process that minimizes torque. Alternatively, the released tool’s pose can be determined and a collision-free trajectory computed. In other embodiments, a neural network can determine a physics-based-model trajectory of the tool after release, and a collision-free trajectory of the robot can be computed based on this assumed tool trajectory. In another embodiment, the tool is assumed to remain in a static location after release, and a collision-free trajectory of the robot is computed based on this assumption.

[0065] Fig. 5 is a diagram illustrating an example embodiment of a system 500 implementing an example embodiment of the disclosed method for attaching to a tool. Set of tools 501 includes target tool 501a. A controller 502 includes memory and a processor. The memory and processor of the controller 502 can receive, store, and execute computer code instructions. The computer code instructions include at least one of the methods disclosed for determining the pose of a tool and calculating a trajectory. The memory and processor that store and execute the computer code instructions are not necessarily part of the controller 502 and, in some embodiments, are located on a separate component.

[0066] The controller 502 is attached to and controls articulated robot arm 503. In other embodiments, the controller 502 is remote from the robot arm 503. Robot arm 503 includes interface 504 that is configured to removably mate with set of tools 501 including target tool 501a. System 500 includes vision system 505. Vision system 505 may be a camera combined with a depth sensor, such as a Microsoft Kinect.

[0067] The vision system 505 is configured to take at least one image of an environment containing the set of tools 501 including the target tool 501a. The pose of target tool 501a can be determined from the images taken by vision system 505. Based on the determined pose of the target tool 501a, an attachment pose 507b for the interface 504 can be determined. The attachment pose 507b is the location and orientation where the mating mechanism of interface 504 can successfully attach to target tool 501a. Both the pose of target tool 501a and the attachment pose 507b of the interface 504 are determined by the controller 502. However, one skilled in the art will understand that the pose determination methods disclosed herein can be executed by a range of computer types and similar components.

[0068] After the attachment pose 507b is determined, it is input, along with current pose 507a of interface 504 and the mechanical requirements for attaching interface 504 to target tool 501a, to a trajectory planning method, which calculates a trajectory 506 for robot arm 503. Additional constraints can be used as inputs for the trajectory planning method to ensure trajectory 506 has specific traits or properties and/or accomplishes additional tasks. When robot arm 503 executes trajectory 506, it will move robot interface 504 from its current position 507a to the attachment position 507b and mate the interface with target tool 501a. In some embodiments, proximity sensors are mounted on tools 501, robot interface 504, and robot arm 503 and provide feedback to show that the mating was successful. The robot arm 503, robot interface 504, and/or tools 501 can also include force, torque, and/or visual sensors to provide feedback regarding mating success and to assist in the mating process.

[0069] A novel aspect of the approach of the present disclosure is that the method locates the tools before the tools are attached to the arm, and serves as a replacement for a physical tool changer interface.

[0070] The teachings of all patents, published applications and references cited herein are incorporated by reference in their entirety.

[0071] While example embodiments have been particularly shown and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the embodiments encompassed by the appended claims.