Title:
A LEARNING-BASED METHOD AND SYSTEM FOR PATH PLANNING OF AN AUTONOMOUS TRACTOR-TRAILER
Document Type and Number:
WIPO Patent Application WO/2023/192987
Kind Code:
A1
Abstract:
A path planning approach based on semi-supervised learning, including a trained encoder-decoder type of deep neural network that generates and plans paths with the objective of minimizing the off-track of the tractor-trailer swept area. The encoder encodes input information such as lane markings, static obstacles, and potentially other features, and passes it to the decoder to generate a planned path. A path cost function scores and penalizes each network-generated path based on its deviation from the lane center, the path smoothness, and collision with any static obstacles, and the cost of the paths is backpropagated through the encoder-decoder network to train it. Because the path cost function acts as a critic of the path quality, no data collected from expert driving is required for training, only randomly generated samples of many possible combinations of lane shapes and obstacle arrangements.

Inventors:
ZHANG XIAN (US)
Application Number:
PCT/US2023/065210
Publication Date:
October 05, 2023
Filing Date:
March 31, 2023
Assignee:
CONTINENTAL AUTONOMOUS MOBILITY US LLC (US)
International Classes:
G06N3/0455; B60W30/08; B62D53/00; G06N3/0442; G06N3/084; G06N3/0895; G06Q10/047; G08G1/16
Other References:
SUNG INKYUNG ET AL: "On the training of a neural network for online path planning with offline path planning algorithms", INTERNATIONAL JOURNAL OF INFORMATION MANAGEMENT, ELSEVIER SCIENCE LTD, GB, vol. 57, 5 June 2020 (2020-06-05), XP086492083, ISSN: 0268-4012, [retrieved on 20200605], DOI: 10.1016/J.IJINFOMGT.2020.102142
LI PENG ET AL: "Human-like motion planning of autonomous vehicle based on probabilistic trajectory prediction", APPLIED SOFT COMPUTING, ELSEVIER, AMSTERDAM, NL, vol. 118, 29 January 2022 (2022-01-29), XP086979460, ISSN: 1568-4946, [retrieved on 20220129], DOI: 10.1016/J.ASOC.2022.108499
CEN HANGJIE ET AL: "Optimization-based Maneuver Planning for a Tractor-Trailer Vehicle in Complex Environments using Safe Travel Corridors", 2021 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), IEEE, 11 July 2021 (2021-07-11), pages 974 - 979, XP034005990, DOI: 10.1109/IV48863.2021.9575439
Attorney, Agent or Firm:
BEZAK, Christopher J et al. (US)
Claims:
CLAIMS

1. A method for training a path planning module of a tractor-trailer combination, the path planning module including a neural network, the neural network receiving input data comprising lane marking information for a target lane of a roadway and generating output data comprising a reference path for autonomous movement by the tractor-trailer combination along the target lane, the method comprising: based on the input data and the output data, determining a cost value associated with the reference path and providing the cost value to the neural network, and updating the neural network based upon the cost value.

2. The method of claim 1, wherein the neural network comprises an encoder-decoder architecture in which an encoder portion and a decoder portion receive the input data and the decoder portion generates the output data, and updating the neural network comprises updating parameters of the encoder portion and the decoder portion by backpropagating the cost value through the encoder and decoder portions to reduce the cost value.

3. The method of claim 1, wherein the cost value is determined based on a set of criteria including collision avoidance data relative to at least one static obstacle in or near the target lane.

4. The method of claim 1, wherein the cost value is determined based on a set of criteria including deviation of a tractor and a trailer of the tractor-trailer combination from the reference path.

5. The method of claim 4, wherein a path of the trailer is inferred from the path of the tractor based on a kinematic model associated with the tractor-trailer combination.

6. The method of claim 1, wherein the cost value is represented as a linear combination of a plurality of individual costs.

7. The method of claim 1, wherein the input data comprises randomly generated data without expert driving data.

8. An autonomous driving system for a tractor-trailer combination, comprising: a path planner module which generates a reference path based on input data corresponding to a roadway and data corresponding to at least one static obstacle disposed along the roadway, the path planner module comprising a neural network, the neural network receiving the input and generating output data comprising a reference path for autonomous movement by the tractor-trailer combination along a target lane of the roadway; and a path cost function module coupled to the neural network and which receives the input data and the output data, determines a cost value associated with the reference path and provides the cost value to the neural network for updating the neural network, the path cost function module being used to train the neural network during a training phase thereof.

Description:
A LEARNING-BASED METHOD AND SYSTEM FOR PATH PLANNING OF AN AUTONOMOUS TRACTOR-TRAILER

BACKGROUND

1. Field

[0001] The present application relates to a method and system for training a deep neural network to generate paths for tractor-trailer combinations to minimize or otherwise reduce off-track of the tractor and the swept area of the trailer.

2. Description of Related Art

[0002] The trucking industry has been seen as the most promising sector for deploying autonomous driving technology. The articulated structures and larger dimensions of semi-trucks (tractor-trailer combinations) may lead to off-tracking in on-road driving without proper planning. Current path/trajectory planners for normal or passenger vehicles do not consider such exceedingly large dimensions and could produce infeasible trailer paths, while complex planners, e.g., those based on model predictive control (MPC) algorithms, usually require large computational power to meet real-time requirements.

[0003] While the problems of perception, mapping and localization, and prediction are largely the same for autonomous semi-trucks as for autonomous passenger cars, tractor-trailer combinations (see FIG. 1 for an example) raise new challenges when planning the path/trajectory due to the articulated structures and large dimensions of the systems. Human drivers, when maneuvering these tractor-trailer systems, tend to go to the outer side of the lane when making a turn, since the trailer tends to sweep the inner side of the turn. However, current path planners designed for normal autonomous cars do not consider these attached trailers and, if applied directly to autonomous semi-trucks, could lead to infeasible trailer paths that go off track, especially when making sharp turns. While numerical optimization approaches such as MPC-based path planning algorithms do exist, these complex planners usually require large computation resources to meet real-time requirements.

SUMMARY

[0004] Example embodiments are directed to a method and system for training a deep neural network to generate paths for tractor-trailer combinations to minimize or otherwise reduce off-track of the tractor and the swept area of the trailer.

[0005] According to an aspect of an embodiment of the present disclosure, there is provided a method of training a neural network that, at inference time, can relatively quickly generate a feasible path with the objective of minimizing the off-track of the tractor and the trailer swept areas and avoiding collisions with static obstacles. When a human driver plans such a low-level path or trajectory, a complex, meticulous calculation is not always involved; in most cases, the plan is made rather quickly, out of the driver’s past experience or intuition.

[0006] According to an aspect of an embodiment of the present application, there is provided an AI path planner having an encoder-decoder architecture that generates feasible paths for autonomous semi-trucks to minimize the off-track of the tractor and the trailer swept areas and at the same time avoid collisions with static obstacles.

[0007] According to an aspect of an embodiment of the present application, there is provided a path cost function that evaluates the “true” cost of a path in an absolute, objective sense and is used to guide the training of the AI path planner. This eliminates the need for expert driving data collection/augmentation and enables training on entirely fake (randomly generated) data.

[0008] According to an aspect of an embodiment of the present application, there is provided a method in which the corresponding trailer path is inferred from a given tractor path using the kinematic model of the tractor-trailer system, which plays a key role in constructing the path cost function.

[0009] [Path planning for tractor-trailer systems]

[0010] Literature in this area has focused mainly on off-road, or unstructured, driving of tractor-trailer systems. Classical motion planning approaches, which can be broadly grouped into three categories - graph search-based (e.g., hybrid A*), incremental sampling and searching (e.g., rapidly exploring random trees, or RRT), and optimization-based (e.g., model predictive control, or MPC) - are employed and extended to tackle this new problem with consideration of the size and the dynamics of the attached trailer.

[0011] For on-road, or structured, driving of tractor-trailer systems, only a few papers in the literature investigate path planning, all of which are optimization-based. In one paper, the path planning problem is formulated as a nonlinear Optimal Control Problem (OCP) with the control objective of minimizing the off-track of the vehicle bodies’ swept area and avoiding collision with obstacles. The OCP is then solved using a Sequential Quadratic Programming (SQP) approach.

[0012] [Motion planning with deep learning]

[0013] While the majority of work applying deep learning techniques to autonomous driving problems focuses on perception, recent years have seen more and more applications of deep learning to prediction and planning as well. Two of the most representative deep learning paradigms for motion planning are imitation learning and deep reinforcement learning.

[0014] With imitation learning, the idea is to replicate the driving behaviors of some driving experts, usually human drivers, via supervised learning. However, pure imitation is insufficient for handling complex driving scenarios even with a huge amount of expert driving data. A previously proposed system, ChauffeurNet, is a neural network that takes map and perceived environment information in the form of pixel images as input and outputs a planned trajectory at each time step. Rather than purely imitating all data, the authors synthesized additional data in the form of perturbations to the expert’s driving and augmented the imitation loss with additional losses that penalize undesirable events, which made the learned model more robust and improved its performance.

[0015] In contrast, a deep reinforcement learning (DRL) based agent learns to drive a vehicle without any supervision, by exploring a simulated environment with defined reward functions. A trained DRL agent does not explicitly plan a path or trajectory; rather, it learns a policy that maps a state to an action, and executing that action yields a trajectory implicitly. The “action” can be a low-level action such as a steering wheel angle command, or a high-level action such as abstracted driving strategies. In Schwartz (2016), the authors decomposed their motion planner into a learned driving strategies planner and a trajectory planner with hard constraints that is not learned. The DRL was then applied to learn a policy that maps a state in the (agnostic) state space into a set of Desires, which basically represent high-level driving strategies. The goal of Desires is to enable comfort of driving, while hard constraints guarantee the safety of driving.

BRIEF DESCRIPTION OF THE DRAWINGS

[0016] The above and other aspects will be more clearly understood from the following brief description taken in conjunction with the accompanying drawings, in which:

[0017] FIG. 1 is a diagram of a conventional semi-truck and tractor-trailer system;

[0018] FIG. 2 is a diagram of input features and an output path of a network, according to an embodiment;

[0019] FIG. 3 is a diagram of an encoder-decoder architecture for an artificial intelligence (AI) path planner, according to an embodiment;

[0020] FIG. 4 is a diagram illustrating training pipelines, according to an embodiment;

[0021] FIG. 5 is a diagram illustrating a truck-trailer tricycle model, according to an embodiment;

[0022] FIG. 6 is a diagram illustrating data samples, according to an embodiment;

[0023] FIG. 7 is a diagram illustrating data set results, according to an embodiment;

[0024] FIG. 8 is a diagram illustrating a simulation scene, according to an embodiment;

[0025] FIGS. 9A and 9B are diagrams illustrating an encoder portion of a neural network and a decoder portion of a neural network, according to an embodiment; and

[0026] FIG. 10 is a block diagram of an autonomous driving system, according to an embodiment.

DETAILED DESCRIPTION

[0027] Example embodiments of the present disclosure are directed to a so-called “semi-supervised” learning approach for path planning of a tractor-trailer system. A neural network is configured and trained to map lane features and nearby static obstacles to a feasible path with the objective of minimizing or otherwise reducing the off-track of the tractor and the trailer swept areas and avoiding collisions with static obstacles. Vector form representations of the input and the output of the neural network and an encoder-decoder network architecture are described below. In addition, the training of the neural network utilizes a path cost function. The path cost function is described below, together with the kinematic model of the tractor-trailer system, which plays an important role in constructing the path cost function.

[0028] [Input output representation]

[0029] The input to the network, i.e., lane features and static obstacles, uses an intermediate, vector representation. As pointed out in earlier systems, using a vector representation avoids lossy rendering and computationally intensive encoding steps, while using an intermediate representation enhances the transferability of the model compared to using raw sensor data, meaning that a model trained in simulation with an intermediate representation may be more easily applied to real scenarios with similar model performance expected. Also, using a vector and intermediate representation facilitates the fake data generation used for model training, as described below.

[0030] As depicted in FIG. 2, a series of waypoints (each point in x, y coordinates) is used to represent the target lane center, or the reference path the tractor-trailer is trying to follow (in colored rendition: black crosses are behind the current tractor rear axle center position, and cyan crosses are ahead); a hexagon with six vertices is used to represent a static obstacle; and the output planned path is represented using a series of waypoints (circular dots). To avoid confusion, the input reference path is referred to as the “anchor path,” as each waypoint in the planned path corresponds to, and is supposed to stay not too far from (and is thus anchored to), a waypoint in the input reference path (except for the first point of the planned path). The output planned path is for the tractor (as opposed to the trailer) and can be used as a reference path by a lower-level trajectory planner, which adds time or velocity information to the path.

[0031] Note that everything is expressed in the tractor coordinate frame at the current step, with the origin at the rear axle center of the tractor and the x-axis aligned with the tractor body heading. Therefore, the planned path always starts from the origin of the coordinate frame, i.e., $(x_0, y_0) = (0, 0)$.
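The change of coordinates described above can be illustrated with a minimal sketch. The function name `to_tractor_frame` and its arguments are hypothetical; the sketch only assumes that a world-frame rear axle position and heading (in radians) are available.

```python
import math


def to_tractor_frame(points, rear_axle_xy, heading):
    """Express world-frame (x, y) points in the tractor frame.

    The tractor frame has its origin at the rear axle center and its
    x-axis along the tractor body heading (radians, world frame).
    """
    ox, oy = rear_axle_xy
    c, s = math.cos(heading), math.sin(heading)
    out = []
    for x, y in points:
        dx, dy = x - ox, y - oy
        # Rotate by -heading so that the heading direction becomes +x.
        out.append((c * dx + s * dy, -s * dx + c * dy))
    return out
```

With this convention, the rear axle center itself maps to (0, 0), and a point one meter straight ahead of the tractor maps to approximately (1, 0).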

[0032] [Model architecture]

[0033] An encoder-decoder type of neural network is used as the path planner, with all the encoders and the decoder implemented using gated recurrent units (GRUs). For clarity, the GRU cells are unrolled in FIG. 3, with corresponding inputs and outputs denoted. In the encoder case, each cell takes a point (either a waypoint or a vertex of an obstacle hexagon, in the form of x, y coordinates) as its input and produces a hidden state; in the case of the decoder, the input to a cell at each step is the concatenation of the output from the previous step and the corresponding anchor waypoint at this step, while the output at this step is the sum of the GRU cell output and the anchor point at this step.

[0034] FIG. 3 illustrates a drawing of the encoder and a drawing of the decoder portions of the neural network.
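The decoder step described above (input: previous output concatenated with the current anchor waypoint; output: cell output added to the anchor waypoint) may be sketched as follows. This is not the GRU implementation itself: `cell` is a hypothetical stand-in for a GRU cell followed by a projection to two dimensions, and using the tractor-frame origin as the first "previous output" is an assumption.

```python
def decode_path(cell, context, anchors, start=(0.0, 0.0)):
    """Unroll a decoder over the anchor waypoints.

    `cell(inputs, hidden)` is a hypothetical callable returning
    ((dx, dy), new_hidden); it stands in for a GRU cell plus projection.
    """
    hidden = context
    prev = start  # assumed initial "previous output": the frame origin
    path = []
    for ax, ay in anchors:
        # Input: previous output concatenated with the current anchor point.
        (dx, dy), hidden = cell((prev[0], prev[1], ax, ay), hidden)
        # Output: cell output added to the anchor point (residual form).
        prev = (ax + dx, ay + dy)
        path.append(prev)
    return path


def zero_cell(inputs, hidden):
    """Toy cell that outputs no offset, so the path equals the anchors."""
    return (0.0, 0.0), hidden
```

One consequence of the additive form is that a cell producing zero output reproduces the anchor path exactly, so the network only has to learn offsets from the lane center.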

[0035] For simplicity, all the static obstacle encodings are combined using the summation operation; the encoded obstacles are then concatenated with the encoded lane to form the context vector that the decoder uses to generate the planned path. Other ways to combine the encodings from different encoders may also be used, e.g., a graph neural network (GNN) or a self-attention mechanism.

[0036] [The semi-supervised approach]
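A minimal sketch of the sum-then-concatenate combination follows, using plain Python lists in place of tensors. The function name `build_context` and the explicit `obstacle_dim` argument are assumptions made for illustration.

```python
def build_context(lane_encoding, obstacle_encodings, obstacle_dim):
    """Sum the obstacle encodings and concatenate them to the lane encoding.

    `obstacle_dim` is passed explicitly so that a scene without obstacles
    still yields a context vector of the same length (all-zero obstacle part).
    """
    summed = [0.0] * obstacle_dim
    for enc in obstacle_encodings:
        if len(enc) != obstacle_dim:
            raise ValueError("obstacle encoding has unexpected length")
        summed = [s + e for s, e in zip(summed, enc)]
    return list(lane_encoding) + summed
```

Summation keeps the context length fixed for any number of obstacles and does not depend on the order in which obstacles are listed.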

[0037] Once the model is ready, the next problem is to train the model to generate a desired path that, if the tractor follows, yields minimal or otherwise reduced off-track of the tractor and the trailer swept areas and avoids collisions with static obstacles, while being smooth and feasible at the same time.

[0038] With expert driving data, it is straightforward to train the model in a supervised learning fashion, with the expert-driven path as the target and a function such as the mean squared error (MSE) as the loss function, which essentially evaluates the point-to-point proximity between the network-generated path and the target path. Any discrepancy between the two paths is calculated as the training loss and is backpropagated through the encoder-decoder network for the weight update. The training pipeline for the supervised approach is illustrated in FIG. 4(a).

[0039] The supervised approach is a pure imitation of the “expert” driving and, while it works for the simplest scenarios (i.e., without static obstacles), suffers from several drawbacks: 1) it requires collection of a large amount of data of “good” driving with enough “interesting” scenarios (turns of various curvatures); 2) it might learn the “random wandering” about the lane center, as human drivers never stay perfectly aligned with the lane center; 3) since the network learns from the expert, it can never hope to better the expert driving; and 4) it is extremely hard, if not impossible, to teach the network to avoid static obstacles using the supervised approach, as the training data needed for that purpose would require so many combinations of the lane shapes and different sizes and placements of static obstacles with respect to the lanes that it is impractical to collect from real vehicle tests.

[0040] To overcome these drawbacks and difficulties, the semi-supervised approach is utilized, in which the MSE kind of loss function (which requires a target path from expert driving and evaluates the discrepancy between the network output path and the target path) is replaced with an absolute, “true” path cost function that evaluates the quality of a path without having to compare it to some expert path, but by applying a set of predefined criteria. The training pipeline for the semi-supervised approach is illustrated in FIG. 4(b).

[0041] The path cost function calculates a cost for each path based on its relationship with the anchor path (e.g., the deviation from the anchor path) and its own attributes (e.g., the path smoothness). The higher the cost, the “worse” the path: a path with 0 cost is a “perfect” path (in terms of the definition of the cost function). The path cost is then backpropagated through the neural network to guide the end-to-end training, updating the parameters of the encoders and the decoder in the direction that would minimize or otherwise reduce the cost. This, of course, requires a design of a differentiable path cost function, discussed below.
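The idea of training against a cost function rather than against expert targets can be illustrated with a deliberately tiny example. Everything here is an assumption made for illustration: the "planner" has a single parameter `w`, the cost is only the mean squared deviation from the anchor, and an analytic gradient replaces backpropagation through a network.

```python
import random


def planner(w, anchor_ys):
    """Toy one-parameter planner: scales the anchor lateral coordinates."""
    return [w * a for a in anchor_ys]


def cost_and_grad(w, anchor_ys):
    """Mean squared deviation from the anchor and its derivative w.r.t. w."""
    n = len(anchor_ys)
    path = planner(w, anchor_ys)
    cost = sum((p - a) ** 2 for p, a in zip(path, anchor_ys)) / n
    grad = sum(2.0 * (p - a) * a for p, a in zip(path, anchor_ys)) / n
    return cost, grad


def train(steps=200, lr=0.05, seed=0):
    """Gradient descent on freshly sampled random anchors at every step."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        # No expert path is used: only a random input and the cost function.
        anchor_ys = [rng.uniform(-2.0, 2.0) for _ in range(20)]
        _, grad = cost_and_grad(w, anchor_ys)
        w -= lr * grad
    return w
```

In this toy setting the parameter converges toward 1, the value at which the cost is zero, even though no target path is ever shown to the planner.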

[0042] [Path cost function as the loss function]

[0043] The path cost function is configured with the following criteria of a “good” path in mind: a good path should be safe (collision free), rule-following (e.g., keeping in the lane), feasible and comfortable (smooth). Based on this set of criteria, the total cost of a path may be represented as a linear combination of several individual costs, as in the following equation, each corresponding to one criterion.

[0044]
$$C_{total} = w_1 C_{deviation} + w_2 C_{yaw\text{-}acc} + w_3 C_{collision} + w_4 C_{other}$$

[0045] Each individual cost is described in detail below.
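As a minimal sketch of such a linear combination, the individual costs can be stored by name and combined with weights. The dictionary keys and the weight values used in the usage note are hypothetical; the source does not state the weights.

```python
def total_path_cost(costs, weights):
    """Weighted sum of individual path costs, keyed by cost name."""
    missing = set(costs) - set(weights)
    if missing:
        raise KeyError(f"no weight given for: {sorted(missing)}")
    return sum(weights[name] * value for name, value in costs.items())
```

For example, with costs `{"deviation": 0.5, "yaw_acc": 0.1, "collision": 2.0, "other": 0.0}` and weights `{"deviation": 1.0, "yaw_acc": 10.0, "collision": 5.0, "other": 1.0}`, the total is 0.5 + 1.0 + 10.0 + 0.0 = 11.5.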

[0046] [Path deviation cost]

[0047] The term $C_{deviation}$ is the path deviation cost. It penalizes the deviation of both the tractor path and the trailer path from the anchor path and corresponds to the rule-following criterion.

[0048] The deviation of a path from the anchor path is quantified using the mean squared lateral distance of all the points of that path to the anchor path. In the case of the tractor path, the deviation cost can be written as:

[0049]
$$C_{deviation}^{tractor} = \frac{1}{N} \sum_{j=1}^{N} \mathrm{lat\_dist}\left(p_j^{tractor},\, path_{anchor}\right)^2$$

[0050] where $p_j^{tractor}$ is the j-th point of the tractor path and $N$ is the number of path points; the anchor path $path_{anchor}$ is piece-wise linear, as it is represented with a series of waypoints; and lat_dist(·,·) is a function that calculates the lateral distance of a point to a path.
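A minimal sketch of the deviation cost follows, assuming that the lateral distance of a point to a piece-wise linear path is the unsigned distance to the nearest point on any of its segments. That interpretation, and the helper name `point_to_segment_distance`, are assumptions.

```python
import math


def point_to_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all 2-D tuples)."""
    px, py = p
    ax, ay = a
    bx, by = b
    abx, aby = bx - ax, by - ay
    seg_len_sq = abx * abx + aby * aby
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * abx + (py - ay) * aby) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * abx), py - (ay + t * aby))


def lat_dist(p, path):
    """Smallest distance from p to a piece-wise linear path (>= 2 points)."""
    return min(
        point_to_segment_distance(p, path[i], path[i + 1])
        for i in range(len(path) - 1)
    )


def deviation_cost(path, anchor_path):
    """Mean squared lateral distance of the path points to the anchor path."""
    return sum(lat_dist(p, anchor_path) ** 2 for p in path) / len(path)
```

For a straight anchor path along the x-axis and two path points offset laterally by 1 m and 2 m, the cost is (1 + 4) / 2 = 2.5.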

[0051] Note that the AI path planner plans a path for the tractor to follow, i.e., the network outputs only the tractor path. To evaluate the deviation of the trailer path from the anchor path, the trailer path is inferred from the output tractor path, which requires a kinematic model of the tractor-trailer system, described below.

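[0052] [Tractor yaw acceleration cost]

[0053] The term $C_{yaw\text{-}acc}$ is the tractor yaw acceleration cost and enforces smoothness of the planned path. It can be computed as the mean squared yaw acceleration at all (tractor) path points:

$$C_{yaw\text{-}acc} = \frac{1}{N} \sum_{j} \left( \frac{\omega_{j+1} - \omega_j}{\Delta s_j} \right)^2$$

[0054] where $N$ is the number of path points at which the yaw acceleration is evaluated, $\Delta s_j$ is the delta distance between path point $j$ and path point $j+1$, and $\omega_j$ is the tractor yaw rate at path point $j$, defined as

$$\omega_j = \frac{\theta_{j+1} - \theta_j}{\Delta s_j}$$

The tractor yaw angle $\theta_j$ is in turn defined as

$$\theta_j = \operatorname{atan2}\left(y_{j+1} - y_j,\; x_{j+1} - x_j\right)$$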
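A finite-difference sketch of the yaw acceleration cost is given below. The exact discretization is not recoverable from the text, so the forward differences, the use of path distance (rather than time) as the independent variable, and the angle wrapping are all assumptions.

```python
import math


def wrap(angle):
    """Wrap an angle to [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def yaw_acceleration_cost(path):
    """Mean squared yaw acceleration along a path of (x, y) waypoints.

    Yaw angle: heading of each segment. Yaw rate and yaw acceleration:
    forward differences per unit path distance (assumed discretization).
    """
    segments = list(zip(path, path[1:]))
    ds = [math.hypot(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in segments]
    theta = [math.atan2(y1 - y0, x1 - x0) for (x0, y0), (x1, y1) in segments]
    rate = [wrap(theta[j + 1] - theta[j]) / ds[j] for j in range(len(theta) - 1)]
    acc = [(rate[j + 1] - rate[j]) / ds[j] for j in range(len(rate) - 1)]
    if not acc:
        raise ValueError("need at least 4 waypoints")
    return sum(a * a for a in acc) / len(acc)
```

A straight path has zero cost, while a path with an abrupt change of heading is penalized.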

[0055] [Collision cost]

[0056] The term $C_{collision}$ is the collision cost, which promotes collision avoidance of the tractor and the trailer bodies with static obstacles and corresponds to the safe criterion. With n static obstacles, $C_{collision}$ can be written as

$$C_{collision} = \sum_{i=1}^{n} \left( C_i^{tractor} + C_i^{trailer} \right)$$

[0057] where $C_i^{tractor}$ ($C_i^{trailer}$) is the collision cost incurred by the i-th static obstacle with the tractor body (the trailer), and is implemented as a penalty on small distances between an obstacle and the tractor (trailer) path:

$$C_i^{tractor} = \sum_{j=1}^{6} k \cdot \max\left(0,\; thresh - \mathrm{dist}\left(v_{i,j},\, path_{tractor}\right)\right)$$

[0058] where $v_{i,j}$ is the j-th vertex point of the i-th static obstacle; k and thresh are tunable constants. An obstacle-to-path distance smaller than the threshold distance is penalized linearly with respect to the distance difference (obstacle-to-path distance minus thresh), while an obstacle-to-path distance larger than the threshold distance is not penalized.
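The thresholded linear penalty can be sketched as follows. Summing the penalty over the obstacle vertices, and measuring the obstacle-to-path distance as the distance from a vertex to the nearest point on the piece-wise linear path, are assumptions; `k` and `thresh` are the tunable constants named in the text.

```python
import math


def dist_to_path(p, path):
    """Distance from point p to a piece-wise linear path (>= 2 points)."""
    best = float("inf")
    for (ax, ay), (bx, by) in zip(path, path[1:]):
        abx, aby = bx - ax, by - ay
        seg_len_sq = abx * abx + aby * aby
        t = 0.0
        if seg_len_sq > 0.0:
            t = ((p[0] - ax) * abx + (p[1] - ay) * aby) / seg_len_sq
            t = max(0.0, min(1.0, t))
        best = min(best, math.hypot(p[0] - (ax + t * abx), p[1] - (ay + t * aby)))
    return best


def obstacle_cost(vertices, path, k, thresh):
    """Linear penalty for obstacle vertices closer to the path than thresh."""
    # Assumption: per-vertex penalties are summed.
    return sum(k * max(0.0, thresh - dist_to_path(v, path)) for v in vertices)


def collision_cost(obstacles, tractor_path, trailer_path, k, thresh):
    """Sum of tractor and trailer collision costs over all obstacles."""
    return sum(
        obstacle_cost(obs, tractor_path, k, thresh)
        + obstacle_cost(obs, trailer_path, k, thresh)
        for obs in obstacles
    )
```

Because `max(0, ...)` is piece-wise linear, the penalty is differentiable almost everywhere, which is compatible with gradient-based training.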

[0059] [Other costs]

[0060] In addition to the three main costs introduced above, some other costs are included (grouped in the $C_{other}$ term) for different purposes, namely the initial yaw cost and the anchor points approximation cost, which are only briefly mentioned here without going into detail.

[0061] The initial yaw cost penalizes a large yaw angle at the first point of the planned path and can be written as $\|\theta_0\|^2$. This cost enforces a (soft) constraint that the tractor path starts with 0 heading, since the tractor coordinate frame at the current time is chosen as the coordinate system (FIG. 2).

[0062] The anchor points approximation cost is equivalently a regularization term and is simply the mean squared error of the tractor path and the anchor path, or

$$\frac{1}{N} \sum_{j=1}^{N} \left\| p_j^{tractor} - p_j^{anchor} \right\|^2$$

where $p_j^{tractor}$ and $p_j^{anchor}$ are the j-th points of the tractor path and the anchor path, respectively.

[0063] When implemented properly, all the individual costs are differentiable, thus the total cost can be backpropagated through the neural network for training. In addition to being used as the loss function at training time, the path cost function can also be used at execution time to evaluate the quality of a planned path to see if it is feasible.

[0064] [Tractor-Trailer Model and Trailer Path Inference]

[0065] An integral piece in constructing the path cost function is a kinematic model for the tractor-trailer system, as it enables the inference of the corresponding trailer path from a given tractor path.

[0066] Similar to the idea of a bicycle model for two-axle vehicles, a tricycle model is used to represent the articulated structure of a truck and trailer system, as illustrated in FIG. 5.

[0067] In FIG. 5, $\alpha$ is the steering angle of the front wheel of the tractor; $(x, y)$ is the position of the rear axle center of the tractor; $\theta$ is the yaw angle of the tractor; $\beta$ is the trailer angle; $L_0$ is the tractor wheelbase and $L_1$ the trailer wheelbase; and $M_0$ is the distance from the trailer hinge point to the tractor rear axle.

[0068] The system dynamics of the tractor-trailer kinematic model is now presented with respect to s, the distance along the path, as we are interested in path instead of trajectory.

[0069]

[0070] It is noted that, given a tractor path in the form of a series of waypoints $(x_j, y_j)$, the (modified) control input $u$, which is expressed in terms of $\tan\alpha$, may be calculated along the path and then substituted into the trailer angle dynamics equation to calculate the trailer angle:

[0071]

[0072] Once the trailer angles are obtained, the full trailer path may be computed.
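The trailer path inference can be sketched with a standard kinematic model of a tractor with one off-axle hitched trailer, integrated with forward Euler steps along the tractor path. This is not necessarily the model of the disclosure: the sketch assumes that the hitch lies a distance `M0` behind the tractor rear axle, that the trailer angle is the difference between tractor and trailer headings, that the path curvature stands in for the control input, and that the trailer starts with a given initial angle `beta0`.

```python
import math


def wrap(angle):
    """Wrap an angle to [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def infer_trailer_path(tractor_path, L1, M0, beta0=0.0):
    """Trailer axle positions for a tractor rear-axle path of (x, y) points.

    Assumed model (per unit distance s travelled by the tractor rear axle):
        d(theta1)/ds = (sin(beta) - M0 * kappa * cos(beta)) / L1
    with beta = theta - theta1 and kappa the tractor path curvature.
    Waypoints must be distinct so that segment lengths are non-zero.
    """
    n = len(tractor_path)
    segments = list(zip(tractor_path, tractor_path[1:]))
    ds = [math.hypot(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in segments]
    theta = [math.atan2(y1 - y0, x1 - x0) for (x0, y0), (x1, y1) in segments]
    theta.append(theta[-1])  # reuse the last heading for the final waypoint
    theta1 = theta[0] - beta0  # trailer heading at the first waypoint
    trailer = []
    for j, (x, y) in enumerate(tractor_path):
        # Hitch is M0 behind the tractor axle; trailer axle is L1 behind the hitch.
        trailer.append((
            x - M0 * math.cos(theta[j]) - L1 * math.cos(theta1),
            y - M0 * math.sin(theta[j]) - L1 * math.sin(theta1),
        ))
        if j < n - 1:
            kappa = wrap(theta[j + 1] - theta[j]) / ds[j]
            beta = theta[j] - theta1
            theta1 += ds[j] * (math.sin(beta) - M0 * kappa * math.cos(beta)) / L1
    return trailer
```

On a straight tractor path with zero initial trailer angle, the trailer axle simply follows `M0 + L1` behind the tractor rear axle; on a curved path the integration makes the trailer cut toward the inside of the turn.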

[0073] [Tractor-Trailer Combination]

[0074] In FIG. 4(b), a neural network, forming at least part of a path planning module of an autonomous system for a tractor-trailer combination, has an encoder-decoder architecture and is illustrated by a dotted block having therein encoder and decoder network portions. The neural network receives, as input, data pertaining to, for example, roadway lane markings from the target lane and static obstacles disposed along the lane. The neural network input is provided to the encoder and decoder portions. The decoder portion of the neural network provides the neural network output corresponding to a reference path along the roadway lane. Example neural network models of the encoder/decoder architecture include recurrent neural networks, fully-connected neural networks, graph neural networks, transformers, etc.

[0075] As mentioned, the encoder-decoder architecture of the neural network is, in accordance with an example embodiment, GRU-cell based. FIGS. 3 and 9A-9B illustrate a GRU-based implementation of the encoder and decoder portions of the neural network according to an example embodiment.

[0076] The block titled “Path Cost Function” depicts the block used to train the neural network. The Path Cost Function block receives the neural network input as well as the neural network output and generates a path cost value which is backpropagated through the neural network and in particular updates the parameters of the encoders and the decoder in the direction that would minimize or otherwise reduce the cost. The path cost function block may be implemented in hardware, software or both hardware and software. As described above, the output of the path cost function block is based on a set of criteria including collision avoidance of the tractor and trailer with static obstacles, and path deviation for both the tractor and trailer relative to the reference (anchor) path.

[0077] Once the training phase of the path planning neural network is complete, the inference phase is performed, in which the path cost function block is no longer utilized and the neural network generates the reference path based on the input data provided to the network.

[0078] FIG. 10 is a block diagram of an autonomous drive system of the tractor in the tractor-trailer combination. Sensors mounted to, on or within the tractor-trailer combination, such as cameras, radar, Lidar, and/or ultrasonic sensors, provide sensor data to a perception module or algorithm which analyzes the sensor data. The perception module perceives and/or recognizes objects, moving and stationary, from the sensor data. The perception module identifies traffic participants (other vehicles, pedestrians, cyclists, etc.) from the perceived objects and provides the perceived objects to a prediction module or algorithm. The prediction module predicts future trajectories of the perceived traffic participants and provides the predicted trajectories of the traffic participants to a planning block.

[0079] The planning block includes a maneuver planner module or algorithm which determines, based upon the global route provided by a route planner block and the predicted trajectories of the perceived traffic participants, the particular lane of a roadway in which the tractor-trailer combination is to travel. The output of the maneuver planner module is provided to a trajectory planner block. The AI path planner, which corresponds to the neural network described above, forms part of the trajectory planner block and receives the output of the maneuver planner module. As described above, the input to the neural network includes lane information and any static obstacles associated with the determined lane in which the tractor-trailer combination is to travel. The output of the AI path planner (neural network) is the reference path. Because the reference path is a static path, the reference path output of the AI path planner is used by the trajectory planner block to include or add motion (velocity) information of the tractor-trailer combination to provide a reference trajectory (i.e., a path with velocity information). The reference trajectory is provided by the trajectory planner block to a controller of the tractor-trailer combination which uses the reference trajectory for sending instructions to the tractor-trailer combination’s drive system (e.g., the steering system, brake system and the acceleration system) for moving the tractor-trailer combination along the determined lane of the roadway. The controller may be a microcontroller and/or include one or more (core) processors which, when executing program code instructions stored in memory associated with the controller, cause the controller to perform the operations discussed herein. The modules/algorithms discussed above may be implemented in program code having instructions that may be executed by a controller of the tractor autonomous driving system, such as a controller not shown in FIG. 10.

[0080] No data collection is needed for training the neural network. The neural network is shown millions of possible combinations of lane shapes and obstacle arrangements, which are randomly generated/sampled from a predefined distribution relevant to reality.

[0081] The path cost function (or network that is trained otherwise with supervised learning) is constructed to score or penalize each neural network generated path based on its deviation from the lane center, the path smoothness and collision with any obstacles.

[0082] During the training phase, the path planning neural network is trained to minimize or reduce the cost of the generated paths.

[0083] [Results]

[0084] Described below are how the data is generated for training and validation, some details of the training process, and the training results, both with a validation dataset and in closed-loop simulation in CarMaker.

[0085] [Data generation]

[0086] To train the AI path planner, a very large number (on the order of millions) of combinations of different lane shapes and placements of static obstacles is randomly generated by independently sampling from an (infinite) set of anchor paths and an (infinite) set of static obstacles with predefined distributions.

[0087] The set of anchor paths that is sampled from contains two types of lane shapes: a 5th-order polynomial, and an arc preceded by a line segment. The polynomial coefficients are sampled from uniform distributions with zero mean; the arc angle is uniformly sampled from $[\pi/3, \pi]$; and the path length varies in the range of [80, 250] meters.

[0088] Then 0 to n static obstacles (n being 2 in this example) are randomly generated along the generated anchor path. Each obstacle is positioned within [0, 10] meters of a randomly selected anchor point, at a random angle. Each obstacle is represented as a hexagon whose size is in the range of [0.2, 2] meters.

[0089] FIG. 6 shows two examples of the generated data. In FIG. 6 (a), a polynomial anchor path is generated without any static obstacle nearby, and in FIG. 6 (b) there is a line segment + arc type of anchor path with one static obstacle near the path.
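The sampling procedure of the data generation paragraphs can be sketched roughly as follows. Several details are assumptions made only for this illustration: the coefficient scaling (`amp`), dropping the constant and linear polynomial terms so the path starts at the origin heading along x, treating the x-extent of the polynomial as its length, the line/arc split, reading the arc-angle range as (π/3, π), and interpreting obstacle "size" as the hexagon circumradius. All function names are hypothetical.

```python
import math
import random


def polynomial_anchor(rng, length, n_points=50, amp=20.0):
    # Zero-mean uniform coefficients, scaled so each term stays within +/- amp metres.
    coeffs = {k: rng.uniform(-1.0, 1.0) * amp / length ** k for k in range(2, 6)}
    xs = [length * i / (n_points - 1) for i in range(n_points)]
    return [(x, sum(c * x ** k for k, c in coeffs.items())) for x in xs]


def line_arc_anchor(rng, length, n_points=50,
                    angle_range=(math.pi / 3, math.pi)):
    theta = rng.uniform(*angle_range)           # total arc angle
    line_len = rng.uniform(0.1, 0.5) * length   # assumed share of the straight part
    radius = (length - line_len) / theta
    sign = rng.choice((-1.0, 1.0))              # left or right turn
    points = []
    for i in range(n_points):
        s = length * i / (n_points - 1)         # distance travelled along the path
        if s <= line_len:
            points.append((s, 0.0))
        else:
            a = (s - line_len) / radius         # angle swept so far
            points.append((line_len + radius * math.sin(a),
                           sign * radius * (1.0 - math.cos(a))))
    return points


def hexagon(cx, cy, size):
    # Six vertices of a regular hexagon with circumradius `size`.
    return [(cx + size * math.cos(math.pi / 3 * k),
             cy + size * math.sin(math.pi / 3 * k)) for k in range(6)]


def sample_scene(rng, max_obstacles=2):
    length = rng.uniform(80.0, 250.0)
    anchor = rng.choice((polynomial_anchor, line_arc_anchor))(rng, length)
    obstacles = []
    for _ in range(rng.randint(0, max_obstacles)):  # 0 to n obstacles, inclusive
        ax, ay = rng.choice(anchor)                 # randomly selected anchor point
        dist = rng.uniform(0.0, 10.0)
        angle = rng.uniform(0.0, 2.0 * math.pi)
        size = rng.uniform(0.2, 2.0)
        obstacles.append(hexagon(ax + dist * math.cos(angle),
                                 ay + dist * math.sin(angle), size))
    return anchor, obstacles
```

Calling `sample_scene(random.Random(seed))` repeatedly yields fresh (anchor path, obstacles) pairs, so a new batch of samples can be drawn for every epoch without storing a dataset.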

[0090] [Training]

[0091] The model is implemented in PyTorch, with each encoder hidden size selected to be 1000, and the decoder hidden size 2000; the number of recurrent layers of the GRU is selected to be 3. Since the GRU is used for the encoders and decoder, Adam is chosen as the default optimizer. The learning rate is set to 2e-5 initially and is decreased towards the end of the training.

[0092] For each epoch, 5e4 new data samples (we have unlimited data) are generated. The learning process takes about 40 to 100 epochs, meaning the model would have seen 2-5 million data examples to learn to plan a decent path.

[0093] [Validation result]

[0094] FIG. 7 shows some examples of the results of applying the model to the validation dataset after training.

[0095] In FIG. 7 (a) and (b), the model has learned the desired driving behavior of going to the outer side of the lane when turning. FIG. 7 (c) and (d) are examples of the planned path swerving to avoid one static obstacle in the lane, while FIG. 7 (e) and (f) are cases with two static obstacles. The planned path in FIG. 7 (g) is not affected much by the static obstacle since the obstacle is at a safe distance from the driving lane. Lastly, FIG. 7 (h) shows a failed example where the planned path collides with the obstacles.

[0096] To quantify the model performance in terms of collision avoidance, the model is trained and validated with datasets where each data sample contains exactly one static obstacle. A collision rate of 0.67% is obtained when applying the trained model to the validation dataset (compared to 48.2% before training). While the performance may improve by tuning some hyperparameters and/or modifying the model architecture, due to the (probabilistic) nature of the deep learning approach, it may be impossible to ever achieve a 0% collision rate, which is a requirement for lower-level motion planners. Nevertheless, by combining the AI path planner with a backup motion planner (such as an MPC-based planner), fast inference may be achieved and safety ensured at the same time. One way to complement the AI path planner and guarantee a 0% collision rate with static obstacles is as follows: in the rare cases where the AI-planned path fails the feasibility check, the planned path is not used directly but only as a starting point for further optimization by the backup planner.
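The fallback logic described above might look roughly like the following sketch. The point-wise circle check, the `clearance` value and the planner call signatures are hypothetical; a real feasibility check would consider the full swept area of the tractor and trailer, and also the segments between sampled points.

```python
import math


def is_collision_free(path, obstacles, clearance=1.5):
    """Point-wise check against circular obstacles given as (x, y, radius).

    `clearance` stands in for the vehicle half-width plus a safety margin
    (an assumed value).
    """
    return all(math.hypot(px - ox, py - oy) >= r + clearance
               for px, py in path
               for ox, oy, r in obstacles)


def plan_with_fallback(scene, ai_planner, backup_planner, obstacles):
    """Use the fast AI-planned path when feasible, otherwise refine it.

    ai_planner(scene) -> path
    backup_planner(scene, initial_guess=path) -> path
    Both callables are placeholders for the actual planners.
    """
    candidate = ai_planner(scene)
    if is_collision_free(candidate, obstacles):
        return candidate, "ai"
    # Rare case: the planned path only seeds the backup optimization.
    refined = backup_planner(scene, initial_guess=candidate)
    return refined, "backup"
```

Because the backup planner is invoked only when the check fails, the common case keeps the inference speed of the neural network, while the safety guarantee rests on the backup planner and the feasibility check.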

[0097] [Closed-loop simulation]

[0098] The trained AI path planner has been integrated with a lower-level trajectory planner, which takes the planned path as a reference path and adds velocity information to it. A closed-loop simulation in CarMaker was conducted (for the no-obstacle case), in which the vehicle successfully drove the predefined route, demonstrating the model's transferability from being trained with randomly generated fake data segments to being applied to (semi) real scenarios. As can be seen in FIG. 8, the planned path tends to go to the outer side of the lane when making the turn.

[0099] [Conclusion]

[0100] A novel AI path planner is presented that can be trained entirely on randomly generated fake data to plan a feasible path for tractor-trailer combinations, with the objective to minimize or otherwise reduce the off-track of the tractor and the trailer swept areas and to avoid collisions with static obstacles. To guide the training of the path planner, which is implemented in an encoder-decoder architecture, a differentiable path cost function is constructed to evaluate the cost of a path, which is backpropagated through the network during the training. A “semi-supervised” approach is utilized, as no expert data is needed for the “supervision” of the training. Compared to the supervised approach, the present approach 1) requires no collection of expert-driven data; 2) can better the expert driving, as it is not imitating the expert but rather is trained to minimize or otherwise reduce the “true” cost of a path in an absolute, objective sense, given that the path cost function is properly designed; and 3) makes it possible to teach the network to avoid static obstacles, as the data needed to train the network for that purpose can be easily generated without limitation.

[0101] The trained path planning model was validated with a validation dataset and in closed-loop simulation in CarMaker as well. The results showed that the model successfully learned the desired driving behavior of going to the outer side of the lane when turning, and that it also learned to avoid static obstacles in most cases (> 99%). Due to the probabilistic nature of deep learning approaches, though, 100% path feasibility can be hard to achieve for the paths generated by the AI path planner. Additional safety mechanisms (e.g., a feasibility check and a backup motion planner) may be integrated with the AI path planner to ensure safety while leveraging the fast inference speed of the AI path planner.

[0102] Future work may include experimenting with new model architectures (e.g., other ways to combine the encodings from different encoders) to further boost the model performance, and considering other features, such as free space, as the model input.