SYSTEMS AND METHODS FOR FEW-SHOT TRANSFER LEARNING

Title:

SYSTEMS AND METHODS FOR FEW-SHOT TRANSFER LEARNING

Document Type and Number:

WIPO Patent Application WO/2020/091871

Kind Code:

Abstract:

A method for training a controller to control a robotic system includes: receiving a neural network of an original controller for the robotic system based on origin data samples from an origin domain and labels in a label space, the neural network including encoder and classifier parameters, the neural network being trained to: map an input data sample from the origin domain to a feature vector in a feature space using the encoder parameters; and assign a label of the label space to the input data sample using the feature vector based on the classifier parameters; updating the encoder parameters to minimize a dissimilarity, in the feature space, between: origin feature vectors computed from the origin data samples; and target feature vectors computed from target data samples from a target domain; and updating the controller with the updated encoder parameters to control the robotic system in the target domain.

Inventors:

KOLOURI SOHEIL (US)
ROSTAMI MOHAMMAD (US)
KIM KYUNGNAM (US)

Application Number:

PCT/US2019/045169

Publication Date:

May 07, 2020

Filing Date:

August 05, 2019

Export Citation:

Click for automatic bibliography generation Help

Assignee:

HRL LAB LLC (US)

International Classes:

G06N3/08; G06N3/04

Other References:

MUREZ ZAK ET AL: "Image to Image Translation for Domain Adaptation", 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, IEEE, 18 June 2018 (2018-06-18), pages 4500 - 4509, XP033473360, DOI: 10.1109/CVPR.2018.00473
JIAN SHEN ET AL: "Wasserstein Distance Guided Representation Learning for Domain Adaptation", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 5 July 2017 (2017-07-05), XP081309302
ISHAN DESHPANDE ET AL: "Generative Modeling using the Sliced Wasserstein Distance", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 29 March 2018 (2018-03-29), XP080860446
JIQING WU ET AL: "Sliced Wasserstein Generative Models", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 8 June 2017 (2017-06-08), XP081151094
ZHANG WEICHEN ET AL: "Collaborative and Adversarial Network for Unsupervised Domain Adaptation", 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, IEEE, 18 June 2018 (2018-06-18), pages 3801 - 3809, XP033476351, DOI: 10.1109/CVPR.2018.00400
ZOU YANG ET AL: "Unsupervised Domain Adaptation for Semantic Segmentation via Class-Balanced Self-training", 7 October 2018, INTELLIGENT VIRTUAL AGENT. IVA 2015. LNCS; [LECTURE NOTES IN COMPUTER SCIENCE; LECT.NOTES COMPUTER], SPRINGER, BERLIN, HEIDELBERG, PAGE(S) 297 - 313, ISBN: 978-3-642-17318-9, XP047488865
KOLOURI, SOHEILYANG ZOUGUSTAVO K. ROHDE: "Sliced Wasserstein kernels for probability distributions", PROCEEDINGS OF THE IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, 2016
HAGAN, M.T.MENHAJ, M.B.: "Training feedforward networks with the Marquardt algorithm", IEEE TRANSACTIONS ON NEURAL NETWORKS, vol. 5, no. 6, 1994, pages 989 - 993, XP000476812, doi:10.1109/72.329697
LECUN, YANN ET AL.: "Backpropagation applied to handwritten zip code recognition", NEURAL COMPUTATION, vol. 1.4, 1989, pages 541 - 551, XP000789854
KOLOURI, S.MARTIN, C. E.ROHDE, G. K.: "Sliced-Wasserstein Autoencoder: An embarrassingly simple generative model", ARXIV PREPRINT ARXIV:1804.01947, 2018
Y. LECUNL. BOTTOUY. BENGIOP. HAFFNER: "Gradient-based learning applied to document recognition", PROCEEDINGS OF THE IEEE, vol. 86, no. 11, November 1998 (1998-11-01), pages 2278 - 2324, XP000875095, doi:10.1109/5.726791
Y. NETZERT. WANGA. COATESA. BISSACCOB. WUA. Y. NG: "Reading Digits in Natural Images with Unsupervised Feature Learning", NIPS WORKSHOP ON DEEP LEARNING AND UNSUPERVISED FEATURE LEARNING, 2011
SCHWEGMANN, C.KLEYNHANS, W.SALMON, B.MDAKANE, L.MEYER, R.: "Very deep learning for ship discrimination in synthetic aperture radar imagery", IEEE INTERNATIONAL GEO. AND REMOTE SENSING SYMPOSIUM, 2016, pages 104 - 107, XP032990252, doi:10.1109/IGARSS.2016.7729017
GHIFARY, M.KLEIJN, W. B.ZHANG, M.BALDUZZI, D.LI, W.: "European Conference on Computer Vision", 2016, SPRINGER, article "Deep reconstruction-classification networks for unsupervised domain adaptation", pages: 597 - 613
HULL, JONATHAN J.: "A database for handwritten text recognition research", IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 16.5, 1994, pages 550 - 554
SIMARD, P. Y.STEINKRAUS, D.PLATT, J. C.: "Best practices for convolutional neural networks applied to visual document analysis", SEVENTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, 2003. PROCEEDINGS, 2003, pages 958 - 963, XP010656898, doi:10.1109/ICDAR.2003.1227801

Attorney, Agent or Firm:

LEE, Shaun P. (US)

Download PDF:

View/Download PDF PDF Help

Claims:

WHAT IS CLAIMED IS:

1. A method for training a controller to control a robotic system in a target domain, the method comprising:

receiving a neural network of an original controller for controlling the robotic system based on a plurality of origin data samples from an origin domain and corresponding labels in a label space the neural network of the original controller comprising a plurality of encoder parameters and a plurality of classifier parameters, the neural network being trained to:

map an input data sample from the origin domain to a feature vector in a feature space in accordance with the encoder parameters; and

assign a label of the label space to the input data sample based on the feature vector in accordance with the classifier parameters;

updating the encoder parameters to minimize a dissimilarity, in the feature space, between:

a plurality of origin feature vectors computed from the origin data samples; and

a plurality of target feature vectors computed from a plurality of target data samples from the target domain, the target data samples having a smaller cardinality than the origin data samples; and

updating the controller with the updated encoder parameters to control the robotic system in the target domain.

2. The method of claim 1 , wherein the dissimilarity is computed in accordance with a sliced Wasserstein distance between the origin feature vectors in the feature space and the target feature vectors in the feature space.

3. The method of claim 1 , wherein the updating the encoder parameters comprises iteratively computing a plurality of intermediate encoder parameters, each iteration comprising:

computing the origin feature vectors in the feature space;

computing the target feature vectors in the feature space in accordance with the intermediate encoder parameters;

computing the dissimilarity between the origin feature vectors and the target feature vectors;

updating the intermediate encoder parameters to reduce the dissimilarity between the origin feature vectors and the target feature vectors;

determining whether the dissimilarity is minimized; in response to determining that the dissimilarity is not minimized, proceeding with another iteration with the updated intermediate encoder parameters as the intermediate encoder parameters; and

in response to determining that the dissimilarity is minimized, outputting the intermediate encoder parameters as the updated encoder parameters.

4. The method of claim 3, wherein the dissimilarity is computed in accordance with a sliced Wasserstein distance between the origin feature vectors in the feature space and the target feature vectors in the feature space.

5. The method of claim 3, wherein the computing the origin feature vectors is performed by an origin encoder.

6. The method of claim 3, wherein the computing the origin feature vectors is performed in accordance with the intermediate encoder parameters.

7. The method of claim 1 , wherein the target data samples comprise a plurality of target samples and a plurality of corresponding target labels.

8. The method of claim 1 , wherein the target data samples comprise a plurality of unlabeled target samples.

9. The method of claim 8, wherein the updating the encoder parameters comprises iteratively computing a plurality of intermediate encoder parameters, each iteration comprising:

computing the origin feature vectors in the feature space in accordance with the intermediate encoder parameters;

computing the target feature vectors in the feature space in accordance with the intermediate encoder parameters;

computing predicted labels for the target feature vectors in accordance with the classifier parameters, each of the predicted labels being associated with a confidence;

defining a plurality of pseudo-labels corresponding to the predicted labels having confidences exceeding a threshold;

updating the intermediate encoder parameters based on at least one of: minimizing a dissimilarity between the origin feature vectors and the target feature vectors; and

minimizing a classification loss of the origin data samples; determining whether a stopping condition has been met, wherein the stopping condition comprises at least one of:

a dissimilarity between the origin feature vectors and the target feature vectors; and

a saturation of a number of the pseudo-labels between iterations;

in response to determining that the stopping condition has not been met, proceeding with another iteration with the updated intermediate encoder parameters as the intermediate encoder parameters; and

in response to determining that the stopping condition is met, outputting the intermediate encoder parameters as the updated encoder parameters.

10. The method of claim 9, wherein the updating the intermediate encoder parameters alternates between:

the minimizing the dissimilarity between the origin feature vectors and the target feature vectors; and

the minimizing the classification loss of the origin data samples.

11. The method of claim 1 , wherein the neural network comprises a convolutional neural network, a recurrent neural network, a capsule network, or combinations thereof.

12. A system for training a controller to control a robotic system in a target domain, the system comprising:

a processor; and

non-volatile memory storing instructions that, when executed by the processor, cause the processor to:

receive a neural network of an original controller for controlling the robotic system based on a plurality of origin data samples from an origin domain and corresponding labels in a label space, the neural network of the original controller comprising a plurality of encoder parameters and a plurality of classifier parameters, the neural network being trained to:

map an input data sample from the origin domain to a feature vector in a feature space in accordance with the encoder parameters; and assign a label of the label space to the input data sample based on the feature vector in accordance with the classifier parameters;

update the encoder parameters to minimize a dissimilarity between: a plurality of origin feature vectors computed from the origin data samples; and a plurality of target feature vectors computed from a plurality of target data samples from the target domain, the target data samples having a smaller cardinality than the origin data samples; and

update the controller with the updated encoder parameters to control the robotic system in the target domain.

13. The system of claim 12, wherein the dissimilarity is computed in accordance with a sliced Wasserstein distance between the origin feature vectors in the feature space and the target feature vectors in the feature space.

14. The system of claim 12, wherein the instructions that cause the processor to update the encoder parameters comprise instructions that, when executed by the processor cause the processor to iteratively compute a plurality of intermediate encoder parameters, each iteration comprising:

computing the origin feature vectors in the feature space;

computing the target feature vectors in the feature space in accordance with the intermediate encoder parameters;

computing the dissimilarity between the origin feature vectors and the target feature vectors;

updating the intermediate encoder parameters to reduce the dissimilarity between the origin feature vectors and the target feature vectors;

determining whether the dissimilarity is minimized;

in response to determining that the dissimilarity is not minimized, proceeding with another iteration with the updated intermediate encoder parameters as the intermediate encoder parameters; and

in response to determining that the dissimilarity is minimized, outputting the intermediate encoder parameters as the updated encoder parameters.

15. The system of claim 12, wherein the target data samples comprise a plurality of target samples and a plurality of corresponding target labels.

16. The system of claim 12, wherein the target data samples comprise a plurality of unlabeled target samples.

17. The system of claim 16, wherein the instructions that cause the processor to update the encoder parameters comprise instructions that, when executed by the processor, cause the processor to compute the updated encoder parameters by iteratively computing a plurality of intermediate encoder parameters, each iteration comprising: computing the origin feature vectors in the feature space in accordance with the intermediate encoder parameters;

computing the target feature vectors in the feature space in accordance with the intermediate encoder parameters;

computing predicted labels for the target feature vectors in accordance with the classifier parameters, each of the predicted labels being associated with a confidence;

defining a plurality of pseudo-labels corresponding to the predicted labels having confidences exceeding a threshold;

updating the intermediate encoder parameters based on at least one of:

minimizing a dissimilarity between the origin feature vectors and the target feature vectors; and

minimizing a classification loss of the origin data samples; determining whether a stopping condition has been met, wherein the stopping condition comprises at least one of:

a dissimilarity between the origin feature vectors and the target feature vectors; and

a saturation of a number of the pseudo-labels between iterations;

in response to determining that the stopping condition has not been met, proceeding with another iteration with the updated intermediate encoder parameters as the intermediate encoder parameters; and

in response to determining that the stopping condition is met, outputting the intermediate encoder parameters as the updated encoder parameters.

18. The system of claim 17, wherein the updating the intermediate encoder parameters alternates between:

the minimizing the dissimilarity between the origin feature vectors and the target feature vectors; and

the minimizing a classification loss of the origin data samples.

19. The system of claim 12, wherein the neural network comprises a convolutional neural network, a recurrent neural network, a capsule network, or combinations thereof.

20. A non-transitory computer readable medium having instructions stored thereon that, when executed by a processor, cause the processor to:

receive a neural network of an original controller for controlling a robotic system based on a plurality of origin data samples from an origin domain and corresponding labels in a label space, the neural network of the original controller comprising a plurality of encoder parameters and a plurality of classifier parameters, the neural network being trained to:

map an input data sample from the origin domain to a feature vector in a feature space in accordance with the encoder parameters; and

assign a label of the label space to the input data sample based on the feature vector in accordance with the classifier parameters;

update the encoder parameters to minimize a dissimilarity between:

a plurality of origin feature vectors computed from the origin data samples; and

a plurality of target feature vectors computed from a plurality of target data samples from a target domain, the target data samples having a smaller cardinality than the origin data samples; and

update the controller with the updated encoder parameters to control a robotic system in the target domain.

21. The non-transitory computer readable medium of claim 20, wherein the dissimilarity is computed in accordance with a sliced Wasserstein distance between the origin feature vectors in the feature space and the target feature vectors in the feature space.

22. The non-transitory computer readable medium of claim 20, wherein the instructions that cause the processor to update the encoder parameters comprise instructions that, when executed by the processor cause the processor to iteratively compute a plurality of intermediate encoder parameters, each iteration comprising: computing the origin feature vectors in the feature space;

computing the target feature vectors in the feature space in accordance with the intermediate encoder parameters;

computing the dissimilarity between the origin feature vectors and the target feature vectors;

updating the intermediate encoder parameters to reduce the dissimilarity between the origin feature vectors and the target feature vectors;

determining whether the dissimilarity is minimized;

in response to determining that the dissimilarity is not minimized, proceeding with another iteration with the updated intermediate encoder parameters as the intermediate encoder parameters; and

in response to determining that the dissimilarity is minimized, outputting the intermediate encoder parameters as the updated encoder parameters.

23. The non-transitory computer readable medium of claim 20, wherein the target data samples comprise a plurality of target samples and a plurality of corresponding target labels.

24. The non-transitory computer readable medium of claim 20, wherein the target data samples comprise a plurality of unlabeled target samples.

25. The non-transitory computer readable medium of claim 24, wherein the instructions that cause the processor to update the encoder parameters comprise instructions that, when executed by the processor, cause the processor to compute the updated encoder parameters by iteratively computing a plurality of intermediate encoder parameters, each iteration comprising:

computing the origin feature vectors in the feature space in accordance with the intermediate encoder parameters;

computing the target feature vectors in the feature space in accordance with the intermediate encoder parameters;

computing predicted labels for the target feature vectors using the classifier parameters, each of the predicted labels being associated with a confidence;

defining a plurality of pseudo-labels corresponding to the predicted labels having confidences exceeding a threshold;

updating the intermediate encoder parameters based on at least one of: minimizing a dissimilarity between the origin feature vectors and the target feature vectors; and

minimizing a classification loss of the origin data samples; determining whether a stopping condition has been met, wherein the stopping condition comprises at least one of:

a dissimilarity between the origin feature vectors and the target feature vectors; and

a saturation of a number of the pseudo-labels between iterations; in response to determining that the stopping condition has not been met, proceeding with another iteration with the updated intermediate encoder parameters as the intermediate encoder parameters; and

in response to determining that the stopping condition is met, outputting the intermediate encoder parameters as the updated encoder parameters.

26. The non-transitory computer readable medium of claim 25, wherein the updating the intermediate encoder parameters alternates between: the minimizing the dissimilarity between the origin feature vectors and the target feature vectors; and

the minimizing the classification loss of the origin data samples.

27. The system of claim 20, wherein the neural network comprises a convolutional neural network, a recurrent neural network, a capsule network, or combinations thereof.

AMENDED CLAIMS

received by the International Bureau on 07 JAN 2020 (07.01.2020) WHAT IS CLAIMED IS:

1. A computer-implemented method for training a controller to control a robotic system in a target domain, the method comprising:

receiving a neural network of an original controller for controlling the robotic system based on a plurality of origin data samples from an origin domain and corresponding labels in a label space^ the neural network of the original controller comprising a plurality of encoder parameters and a plurality of classifier parameters, the neural network being trained to;

map an input data sample from the origin domain to a feature vector in a feature space in accordance with the encoder parameters; and

assign a label of the label space to the input data sample based on the feature vector in accordance with the classifier parameters;

updating the encoder parameters to minimize a dissimilarity, in the feature space, between:

a plurality of origin feature vectors computed from the origin data samples; and

a plurality of target feature vectors computed from a plurality of target data samples from the target domain, the target data samples having a smaller cardinality than the origin data samples,

wherein the dissimilarity is computed in accordance with a sliced Wasserstein distance between the origin feature vectors in the feature space and the target feature vectors in the feature space; and

updating the controller with the updated encoder parameters to control the robotic system in the target domain.

2. (Cancelled)

3. The method of claim 1 , wherein the updating the encoder parameters comprises iteratively computing a plurality of intermediate encoder parameters, each iteration comprising:

computing the origin feature vectors in the feature space; computing the target feature vectors in the feature space in accordance with the intermediate encoder parameters;

computing the dissimilarity between the origin feature vectors and the target feature vectors;

updating the intermediate encoder parameters to reduce the dissimilarity between the origin feature vectors and the target feature vectors;

determining whether the dissimilarity is minimized;

in response to determining that the dissimilarity is not minimized, proceeding with another iteration with the updated intermediate encoder parameters as the intermediate encoder parameters; and

in response to determining that the dissimilarity is minimized, outputting the intermediate encoder parameters as the updated encoder parameters.

5. The method of claim 3, wherein the computing the origin feature vectors is performed by an origin encoder.

6. The method of claim 3, wherein the computing the origin feature vectors is performed in accordance with the intermediate encoder parameters.

7. The method of claim 1 , wherein the target data samples comprise a plurality of target samples and a plurality of corresponding target labels.

8. The method of claim 1 , wherein the target data samples comprise a plurality of unlabeled target samples.

9. The method of claim 8, wherein the updating the encoder parameters comprises iteratively computing a plurality of intermediate encoder parameters, each iteration comprising: computing the origin feature vectors in the feature space in accordance with the intermediate encoder parameters;