

Title:
AN AUTONOMOUS DRIVING SYSTEM WITH AIR SUPPORT
Document Type and Number:
WIPO Patent Application WO/2023/154717
Kind Code:
A1
Abstract:
Aspects of an autonomous driving system with air support are described herein. The aspects may include an unmanned aerial vehicle (UAV) in the air and a land vehicle on the ground communicatively connected to the UAV. The UAV may include at least one UAV camera configured to collect first ground traffic information and a UAV communication module configured to transmit the collected first ground traffic information. The land vehicle may include one or more vehicle sensors configured to collect second ground traffic information surrounding the land vehicle, a land communication module configured to receive the first ground traffic information from the UAV, and a processor configured to combine the first ground traffic information and the second ground traffic information to generate a world model.

Inventors:
XU LEI (CN)
CHENG ERKANG (CN)
ZHU TINGTING (CN)
Application Number:
PCT/US2023/062163
Publication Date:
August 17, 2023
Filing Date:
February 07, 2023
Assignee:
NULLMAX HONG KONG LTD (US)
International Classes:
G08C17/00; B64C39/02; G05D1/02; G08G1/16; B60R16/02; B64U101/00; B64U101/30; G01C15/00; G08G1/01
Foreign References:
US20210132612A1 (2021-05-06)
US20220009630A1 (2022-01-13)
US20210046925A1 (2021-02-18)
US20200109954A1 (2020-04-09)
US20190227555A1 (2019-07-25)
Attorney, Agent or Firm:
YE, Jun (US)
Claims:
CLAIMS

What is claimed is:

1. An autonomous driving system, comprising: an unmanned aerial vehicle (UAV) in the air, wherein the UAV includes: at least one UAV camera configured to collect first ground traffic information, and a UAV communication module configured to transmit the collected first ground traffic information; and a land vehicle communicatively connected to the UAV in the air, wherein the land vehicle includes: one or more vehicle sensors configured to collect second ground traffic information surrounding the land vehicle, a land communication module configured to receive the first ground traffic information from the UAV, and a processor configured to combine the first ground traffic information and the second ground traffic information to generate a world model.

2. The autonomous driving system of claim 1, wherein the first ground traffic information is formatted in a coordinate system with a position of the UAV as a coordinate origin.

3. The autonomous driving system of claim 2, wherein the first ground traffic information includes at least one of: a position of the land vehicle in the coordinate system; first position information that indicates locations of one or more still objects on the ground in the coordinate system with the position of the UAV as the coordinate origin; first motion information that indicates velocities of one or more moving objects on the ground; first predicted trajectories of the one or more moving objects on the ground; first status information that indicates statuses of one or more traffic signals; and first area information that indicates one or more accessible areas to the land vehicle from a perspective of the UAV.

4. The autonomous driving system of claim 3, wherein the first position information of the one or more still objects is formed as one or more sets of coordinates in the coordinate system, one or more point clouds, one or more semantic segments, or features of the still objects extracted by a neural network.

5. The autonomous driving system of claim 1, wherein the second ground traffic information is formatted in a coordinate system with a position of the land vehicle as a coordinate origin.

6. The autonomous driving system of claim 5, wherein the second ground traffic information includes at least one of: second position information that indicates locations of one or more still objects surrounding the land vehicle in the coordinate system with the position of the land vehicle as the coordinate origin; second motion information that indicates velocities of one or more moving objects surrounding the land vehicle; second predicted trajectories of the one or more moving objects surrounding the land vehicle; second status information that indicates statuses of one or more traffic signals; and second area information that indicates one or more accessible areas to the land vehicle from a perspective of the land vehicle.

7. The autonomous driving system of claim 6, wherein the second position information is formed as one or more sets of coordinates in the coordinate system, one or more point clouds, one or more semantic segments, or features of the still objects extracted by a neural network.

8. The autonomous driving system of claim 1, wherein the processor is configured to: convert the first ground traffic information from a first coordinate system with a position of the UAV as a coordinate origin to a second coordinate system with a position of the land vehicle as the coordinate origin, convert coordinates of one or more still objects, one or more moving objects, one or more traffic signals, and one or more accessible areas and predicted trajectories of the one or more moving objects identified in the first coordinate system to coordinates in the second coordinate system; and determine coordinates of the one or more still objects, the one or more moving objects, the one or more traffic signals, and the one or more accessible areas in the world model based on the converted coordinates and the second ground traffic information.

9. The autonomous driving system of claim 1, wherein the processor is configured to: convert semantic segments of one or more still objects, one or more moving objects, one or more traffic signals, and one or more accessible areas identified in the first coordinate system to semantic segments in the second coordinate system; and determine semantic segments of the one or more still objects, the one or more moving objects, the one or more traffic signals, and the one or more accessible areas and predicted trajectories of the one or more moving objects identified in the world model based on the converted semantic segments and the second ground traffic information.

10. The autonomous driving system of claim 1, wherein the processor is configured to: convert point clouds of one or more still objects, one or more moving objects, one or more traffic signals, and one or more accessible areas identified in the first coordinate system to point clouds in the second coordinate system; and determine point clouds of the one or more still objects, the one or more moving objects, the one or more traffic signals, and the one or more accessible areas and predicted trajectories of the one or more moving objects identified in the world model based on the converted point clouds and the second ground traffic information.

11. A land vehicle in an autonomous driving system, comprising: a land communication module configured to receive first ground traffic information surrounding the land vehicle, wherein the first ground traffic information is collected by an unmanned aerial vehicle (UAV) in the air; one or more vehicle sensors configured to collect second ground traffic information surrounding the land vehicle; a processor configured to combine the first ground traffic information and the second ground traffic information to generate a world model.

12. The land vehicle of claim 11, wherein the first ground traffic information is formatted in a coordinate system with a position of the UAV as a coordinate origin.

13. The land vehicle of claim 12, wherein the first ground traffic information includes at least one of: a position of the land vehicle in the coordinate system; first position information that indicates locations of one or more still objects on the ground in the coordinate system with the position of the UAV as the coordinate origin; first motion information that indicates velocities of one or more moving objects on the ground; first predicted trajectories of the one or more moving objects on the ground; first status information that indicates statuses of one or more traffic signals; and first area information that indicates one or more accessible areas to the land vehicle from a perspective of the UAV.

14. The land vehicle of claim 13, wherein the first position information of the one or more still objects is formed as one or more sets of coordinates in the coordinate system, one or more point clouds, one or more semantic segments, or features of the still objects extracted by a neural network.

15. The land vehicle of claim 11, wherein the second ground traffic information is formatted in a coordinate system with a position of the land vehicle as a coordinate origin.

16. The land vehicle of claim 15, wherein the second ground traffic information includes at least one of: second position information that indicates locations of one or more still objects surrounding the land vehicle in the coordinate system with the position of the land vehicle as the coordinate origin; second motion information that indicates velocities of one or more moving objects surrounding the land vehicle; second predicted trajectories of the one or more moving objects surrounding the land vehicle; second status information that indicates statuses of one or more traffic signals; and second area information that indicates one or more accessible areas to the land vehicle from a perspective of the land vehicle.

17. The land vehicle of claim 16, wherein the second position information is formed as one or more sets of coordinates in the coordinate system, one or more point clouds, one or more semantic segments, or features of the still objects extracted by a neural network.

18. The land vehicle of claim 11, wherein the processor is configured to: convert the first ground traffic information from a first coordinate system with a position of the UAV as a coordinate origin to a second coordinate system with a position of the land vehicle as the coordinate origin, convert coordinates of one or more still objects, one or more moving objects, one or more traffic signals, and one or more accessible areas and predicted trajectories of the one or more moving objects identified in the first coordinate system to coordinates in the second coordinate system; and determine coordinates of the one or more still objects, the one or more moving objects, the one or more traffic signals, and the one or more accessible areas in the world model based on the converted coordinates and the second ground traffic information.

19. The land vehicle of claim 11, wherein the processor is configured to: convert semantic segments of one or more still objects, one or more moving objects, one or more traffic signals, and one or more accessible areas identified in the first coordinate system to semantic segments in the second coordinate system; and determine semantic segments of the one or more still objects, the one or more moving objects, the one or more traffic signals, and the one or more accessible areas and predicted trajectories of the one or more moving objects identified in the world model based on the converted semantic segments and the second ground traffic information.

20. The land vehicle of claim 11, wherein the processor is configured to: convert point clouds of one or more still objects, one or more moving objects, one or more traffic signals, and one or more accessible areas identified in the first coordinate system to point clouds in the second coordinate system; and determine point clouds of the one or more still objects, the one or more moving objects, the one or more traffic signals, and the one or more accessible areas and predicted trajectories of the one or more moving objects identified in the world model based on the converted point clouds and the second ground traffic information.

Description:
AN AUTONOMOUS DRIVING SYSTEM WITH AIR SUPPORT

TECHNICAL FIELD

[0001] The present disclosure generally relates to the technical field of autonomous driving, and specifically, relates to an apparatus and method for autonomous driving with air support.

BACKGROUND

[0002] Autonomous driving systems have been proposed to replace the manual driving mode in which a vehicle travels under the control of the driver. An autonomous driving vehicle, or the autonomous driving system embedded therein, typically includes multiple sensors to detect the objects around the vehicle. Those objects should be promptly detected and located to avoid possible collision with the vehicle. Many of the existing autonomous driving systems include Light Detection and Ranging (LiDAR) devices, cameras, or Radio Detection and Ranging (radar) sensors.

[0003] However, none of these sensors can detect objects blocked by another object, for example, a pedestrian running behind another vehicle. Those sensors also have difficulty detecting other vehicles in low-visibility weather. Even on a sunny day, the range of those sensors is limited to around one hundred meters at best.

SUMMARY

[0004] The following presents a simplified summary of one or more aspects to provide a basic understanding of such aspects. This summary is not an extensive overview of all contemplated aspects and is intended to neither identify key or critical elements of all aspects nor delineate the scope of any or all aspects. Its sole purpose is to present some concepts of one or more aspects in a simplified form as a prelude to the more detailed description that is presented later.

[0005] One example aspect of the present disclosure provides an example autonomous driving system. The example autonomous driving system may include an unmanned aerial vehicle (UAV) in the air. The UAV may include at least one UAV camera configured to collect first ground traffic information, and a UAV communication module configured to transmit the collected first ground traffic information. The example autonomous driving system may further include a land vehicle communicatively connected to the UAV in the air. The land vehicle may include one or more vehicle sensors configured to collect second ground traffic information surrounding the land vehicle, a land communication module configured to receive the first ground traffic information from the UAV, and a processor configured to combine the first ground traffic information and the second ground traffic information to generate a world model.

[0006] To the accomplishment of the foregoing and related ends, the one or more aspects comprise the features hereinafter fully described and particularly pointed out in the claims. The following description and the annexed drawings set forth in detail certain illustrative features of the one or more aspects. These features are indicative, however, of but a few of the various ways in which the principles of various aspects may be employed, and this description is intended to include all such aspects and their equivalents.

BRIEF DESCRIPTION OF THE DRAWINGS

[0007] The disclosed aspects will hereinafter be described in conjunction with the appended drawings, provided to illustrate and not to limit the disclosed aspects, wherein like designations denote like elements, and in which:

[0008] Fig. 1 illustrates a diagram showing an autonomous driving system with air support in accordance with the disclosure;

[0009] Fig. 2 illustrates a diagram showing the autonomous driving system with air support in accordance with the disclosure;

[0010] Fig. 3 illustrates a diagram showing another autonomous driving system with air support in accordance with the disclosure;

[0011] Fig. 4 illustrates a diagram showing example components of the autonomous driving system with air support in accordance with the disclosure;

[0012] Fig. 5 illustrates a diagram showing a conversion of traffic information in the example autonomous driving system with air support in accordance with the disclosure;

[0013] Fig. 6 illustrates a diagram showing a detection of accessible areas by the example autonomous driving system with air support in accordance with the disclosure;

[0014] Fig. 7 illustrates a diagram showing a combined detection range of the example autonomous driving system with air support in accordance with the disclosure;

[0015] Fig. 8 illustrates a diagram showing an example perception neural network in the example autonomous driving system with air support in accordance with the disclosure;

[0016] Fig. 9 illustrates a diagram showing another example neural network in the example autonomous driving system with air support in accordance with the disclosure;

[0017] Fig. 10 illustrates a diagram showing another example neural network in the example autonomous driving system with air support in accordance with the disclosure;

[0018] Fig. 11 illustrates a diagram showing an example neural network in the example autonomous driving system with air support in accordance with the disclosure;

[0019] Fig. 12 illustrates a flow chart of an example method for performing autonomous driving in the example autonomous driving system in accordance with the disclosure; and

[0020] Fig. 13 illustrates a diagram showing the autonomous driving system with air support in an example scenario.

DETAILED DESCRIPTION

[0021] Various aspects are now described with reference to the drawings. In the following description, for purpose of explanation, numerous specific details are set forth in order to provide a thorough understanding of one or more aspects. It may be evident, however, that such aspect(s) may be practiced without these specific details.

[0022] In the present disclosure, the terms “comprising” and “including,” as well as their derivatives, mean to contain rather than to limit; the term “or,” which is also inclusive, means and/or.

[0023] In this specification, the following various embodiments used to illustrate principles of the present disclosure are only for illustrative purposes, and thus should not be understood as limiting the scope of the present disclosure by any means. The following description, taken in conjunction with the accompanying drawings, is to facilitate a thorough understanding of the illustrative embodiments of the present disclosure defined by the claims and their equivalents. There are specific details in the following description to facilitate understanding. However, these details are only for illustrative purposes. Therefore, persons skilled in the art should understand that various alterations and modifications may be made to the embodiments illustrated in this description without going beyond the scope and spirit of the present disclosure. In addition, for the sake of clarity and conciseness, some known functionalities and structures are not described. Besides, identical reference numbers refer to identical functions and operations throughout the accompanying drawings.

[0024] Fig. 1 illustrates a diagram showing an example autonomous driving system 100 with air support in accordance with the disclosure.

[0025] As depicted, the example autonomous driving system 100 includes a land vehicle 102 on the ground and an unmanned aerial vehicle (UAV) 104 in the air. In some examples, the UAV 104 may be a drone stored in the land vehicle 102 for charging and released or launched if preferred. In some other examples, the UAV 104 may be any vehicle above the ground, e.g., a satellite.

[0026] When the UAV 104 is released and hovering above the land vehicle, multiple sensors of the UAV 104 may be configured to collect ground traffic information. In some examples, the UAV 104 may include camera sensors, radar sensors, and/or LiDAR sensors. As the UAV 104 is in the air, the sensors of UAV 104 may capture more, or at least different, ground traffic information. For example, as depicted here, when the land vehicle 102 is traveling behind a vehicle 108, sensors on the land vehicle 102 may not be able to capture any information of a vehicle 110 in front of the vehicle 108. In some other examples, an exit or a sharp turn on the road may be blocked by the vehicles 108 and 110 such that the sensors on the land vehicle 102 may not be able to detect the exit. Unlike the sensors on the land vehicle 102, the sensors on UAV 104 may be able to gather ground traffic information typically unperceivable or undetectable by the sensors on the land vehicle 102.

[0027] In some examples, the sensors on the UAV 104 may be configured to collect visual images and/or distances from ground objects to the UAV 104. The visual images and the distances may be further converted to ground traffic information in a three-dimensional (3D) coordinate system (“first ground traffic information” hereinafter). The position of the UAV 104 may be the coordinate origin of the 3D coordinate system. The conversion from the collected visual images and distance information may be performed by a processor of the UAV 104. Alternatively, the collected visual images and the distance information may be transmitted to a control center 106 or the land vehicle 102. The conversion can also be performed by a processor of the control center 106 or a processor on the land vehicle 102.
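
As a rough illustration of this conversion, the sketch below back-projects a camera pixel and a measured distance into a 3D point in a UAV-centered frame. The camera intrinsics (fx, fy, cx, cy) and the function name are assumed example values for illustration only and are not taken from the disclosure.

```python
import numpy as np

def pixel_to_uav_frame(u, v, depth_m, fx=1000.0, fy=1000.0, cx=960.0, cy=540.0):
    """Back-project pixel (u, v) with a measured depth (e.g., from LiDAR or radar)
    into the UAV camera frame, which here stands in for the UAV-centered 3D frame.

    The intrinsics are hypothetical placeholder values for a 1920x1080 camera.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    z = depth_m
    return np.array([x, y, z])

# Example: a ground object seen near the image center, 35 m from the UAV.
point_uav = pixel_to_uav_frame(u=980, v=560, depth_m=35.0)
```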

[0028] In some examples, the first ground traffic information may include a position of the land vehicle 102 in the 3D coordinate system, positions of the still objects on the ground (e.g., curbside, lane dividing lines, stop sign, etc.), positions and motion information such as velocity and acceleration of the moving objects on the ground (e.g., pedestrian, other land vehicles, etc.), status information of the traffic signals (e.g., traffic light 112), and area information that indicates areas accessible to the land vehicle 102 from the perspective of the UAV 104. The positions in the first ground traffic information may be formed as sets of coordinates in the 3D coordinate system, one or more point clouds, one or more semantic segments, or features of those objects.
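
One way to picture this collection of information is as a simple set of records. The following Python sketch is only a hypothetical organization of the fields listed above; the class and field names do not appear in the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical container types for the "first ground traffic information";
# all names below are illustrative assumptions.

Coordinate = Tuple[float, float, float]  # (x, y, z) in the UAV-centered 3D frame

@dataclass
class StillObject:
    label: str                      # e.g. "curbside", "lane_divider", "stop_sign"
    coordinates: List[Coordinate]   # one or more sets of coordinates

@dataclass
class MovingObject:
    label: str                      # e.g. "pedestrian", "vehicle"
    position: Coordinate
    velocity: Tuple[float, float, float]
    acceleration: Tuple[float, float, float]
    predicted_trajectory: List[Coordinate] = field(default_factory=list)

@dataclass
class FirstGroundTrafficInfo:
    ego_vehicle_position: Coordinate          # land vehicle 102 in the UAV frame
    still_objects: List[StillObject]
    moving_objects: List[MovingObject]
    traffic_signal_states: List[str]          # e.g. "red", "green"
    accessible_areas: List[List[Coordinate]]  # area polygons seen from the UAV
```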

[0029] While the land vehicle 102 is on the road, the sensors (e.g., camera sensors, radar sensors, and/or LiDAR sensors) on the land vehicle may be configured to collect information surrounding the land vehicle 102. Similarly, visual images and/or distance information of surrounding objects may be collected by the sensors on the land vehicle. The collected visual images and the distance information may be further converted to ground traffic information in a two-dimensional (2D) coordinate system (“second ground traffic information” hereinafter). The position of the land vehicle 102 may be the coordinate origin of the 2D coordinate system.

[0030] Similarly, the second ground traffic information may include positions of the still objects on the ground (e.g., curbside, lane dividing lines, stop sign, etc.), positions and motion information such as velocity and acceleration of the moving objects on the ground (e.g., pedestrian, other land vehicles, etc.), status information of the traffic signals, and area information that indicates areas accessible to the land vehicle 102 from the perspective of the land vehicle 102.

[0031] The ground traffic information collected by the UAV 104 and the land vehicle 102 respectively may be further combined to generate a world model. The world model may include a combination of the information collected respectively by the UAV 104 and the land vehicle 102. In some examples, the first ground traffic information in the 3D coordinate system may be converted to the 2D coordinate system with the land vehicle 102 as the coordinate origin. Such conversion may be performed by a processor of the UAV 104, a processor of the land vehicle 102, or a processor at the control center 106. The process of generating the world model is further described in more detail in accordance with Figs. 5 and 8-11.

[0032] As the world model includes the position information of objects that are difficult for the sensors on the land vehicle 102 to perceive, it becomes more efficient, and possibly safer, to control the routing, the behavior, and the motion of the land vehicle based on the world model. For example, when the world model includes the velocities and accelerations of the vehicles 108 and 110, the processor of the land vehicle may be configured to generate an instruction to pass the vehicle 108 if the distance between the vehicles 108 and 110 is and will remain safe for a time period sufficient for passing.

[0033] From the perspective of system complexity and entropy, vehicle-UAV cooperative autonomous driving introduces an entropy-reducing intelligent element to counter the entropy increase that accompanies the natural iterative growth of a single-vehicle intelligent autonomous driving system. Through vehicle-UAV collaboration, the perception and collaborative planning capabilities of the air-side subsystem can be introduced to solve the problem of blind-spot perception, while expanding the perception range and improving the safety and robustness of decision-making and planning. In addition, vehicle-UAV collaboration provides better conditions for data accumulation and cooperation, and further enhances individual single-vehicle intelligence and learning growth through data mining. In this way, vehicle-UAV synergy introduces orthogonal elements, such as the high-dimensional data of UAV-side intelligence, and realizes a new form of entropy-reducing intelligence against the entropy increase of system complexity.

[0034] Fig. 2 illustrates a diagram showing the example autonomous driving system with air support in accordance with the disclosure.

[0035] As an example scenario depicted in Fig. 2, when the land vehicle 102 detects that it is approaching an intersection, the UAV 104 may be released to the air and fly toward the intersection ahead of the land vehicle 102, typically before the land vehicle 102 reaches the intersection. In some examples, the UAV 104 may be in the air following or leading the land vehicle 102 during the trip.

[0036] When the UAV 104 is close to or around the intersection, the sensors of UAV 104 may be configured to collect the first ground traffic information including the positions of the crosswalks, the lane dividing lines, the curbsides, or a moving vehicle 202. The first ground traffic information may then be combined with the second ground traffic information by a processor of the land vehicle 102 or the control center 106 to generate the world model. As the world model includes the motion information collected in the first ground traffic information by the UAV 104, a processor of the land vehicle 102 may be configured to determine the speed of a right turn of the land vehicle 102, or whether the land vehicle 102 needs to stop to yield in the case that the velocity of the moving vehicle 202 reaches a given threshold.
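
For illustration, a minimal yield/turn decision of the kind described above might look like the following sketch; the speed threshold, gap time, and turn-speed values are assumed placeholders, not figures from the disclosure.

```python
def right_turn_decision(approaching_speed_mps, distance_m,
                        yield_speed_threshold_mps=8.0, min_clear_gap_s=4.0):
    """Toy yield logic for the scenario above; all thresholds are hypothetical.

    Returns "stop_and_yield" if the oncoming vehicle (e.g., vehicle 202) is fast
    enough and close enough that the time gap is insufficient; otherwise returns
    a suggested turn speed that shrinks as the oncoming vehicle gets closer.
    """
    if approaching_speed_mps >= yield_speed_threshold_mps:
        time_gap_s = distance_m / max(approaching_speed_mps, 0.1)
        if time_gap_s < min_clear_gap_s:
            return "stop_and_yield"
    turn_speed_mps = min(5.0, max(2.0, distance_m / 20.0))
    return f"turn_at_{turn_speed_mps:.1f}_mps"

# Example: vehicle 202 approaches at 12 m/s from 30 m away -> stop and yield.
decision = right_turn_decision(approaching_speed_mps=12.0, distance_m=30.0)
```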

[0037] Fig. 3 illustrates a diagram showing another autonomous driving system with air support in accordance with the disclosure.

[0038] As depicted, one or more UAVs (e.g., UAVs 104-107) may be hovering near the intersection. These UAVs may be originally paired to different land vehicles respectively, or may be a part of smart-city infrastructure collecting information for government traffic-control agencies. These UAVs may be communicatively connected to each other, to the land vehicle 102, or to the control center 106. In the example, the first ground traffic information collected/generated respectively by the UAVs 104-107 may be transmitted to the land vehicle 102, the control center 106, or any of the UAVs 104-107 to generate the world model.

[0039] Since the first ground traffic information collected/generated respectively by the UAVs 104-107 theoretically includes traffic information of a larger geographic range, the world model may include information of more moving objects. The autonomous driving decisions made based upon the world model may be safer or more efficient. For example, the processor on the land vehicle 102 may force a hard stop if the world model includes motion information of a running person around the blind spot of the land vehicle 102.

[0040] Fig. 4 illustrates a diagram showing example components of the autonomous driving system with air support in accordance with the disclosure.

[0041] As depicted, the UAV 104 may include sensors such as one or more UAV cameras 402, one or more UAV LiDAR sensors 404, and other UAV sensors 406 (e.g., radar sensors). The UAV camera 402 may be configured to capture images of the ground traffic. The UAV LiDAR sensor 404 may be configured to determine distance information of the objects on the ground, i.e., distances from ground objects to the UAV 104. Other UAV sensors 406, such as radar sensors, may be similarly configured to determine the distance information of the ground objects. The collected images and distance information may be sent to a UAV processor 410 to be converted to the first ground traffic information. The first ground traffic information may then be transmitted to the land vehicle 102 via a UAV communication module 408. The UAV communication module 408 may be in communication with a land vehicle communication module 418 and/or a control center communication module 422 in accordance with wireless communication protocols such as Wi-Fi, Bluetooth, ZigBee, Z-Wave, MiWi, etc. In other examples, the images and the distance information may be sent, via the UAV communication module 408, to the land vehicle 102 or the control center 106 for the conversion.

[0042] The land vehicle 102 may include sensors such as one or more land vehicle cameras 412, one or more land vehicle LiDAR sensors 414, and other land vehicle sensors 416 (e.g., radar sensors). Similarly, the land vehicle camera 412 may be configured to capture images of the ground traffic surrounding the land vehicle 102. The land vehicle LiDAR sensor 414 and other land vehicle sensors 416 may be configured to collect distance information of the surrounding objects. Typically, sensors on the land vehicle 102 may collect traffic information within several hundred meters of the land vehicle 102.

[0043] Similarly, the collected images and distance information may be sent to a land vehicle processor 420 to be converted to the second ground traffic information. In some other examples, the collected images and distance information may be sent to the UAV 104 or the control center 106 for the conversion.

[0044] Based on the first ground traffic information collected by the UAV 104 and the second ground traffic information collected by the land vehicle 102, the land vehicle processor 420 may be configured to generate the world model. Notably, in at least some examples, the generating of the world model may be performed by the UAV processor 410 or the center processor 424.

[0045] Based on the world model, the land vehicle processor 420 may be configured to generate decisions for the land vehicle 102.

[0046] Fig. 5 illustrates a diagram showing a conversion of traffic information in the example autonomous driving system with air support in accordance with the disclosure.

[0047] As depicted, the processor of the UAV 104 may be configured to generate the first ground traffic information in the 3D coordinate system. In the 3D coordinate system, each of the ground objects may be associated with one or more sets of coordinates. For example, each segment of the lane dividing lines may be associated with two sets of coordinates that indicate a beginning and an end thereof. A vehicle 502 may be associated with four sets of coordinates that respectively indicate four corners of a virtual boundary box that encloses the vehicle 502.

[0048] Some of the objects may be formatted as function curves in the 3D coordinate system. Some other objects may be formatted as a chain of links. For example, each segment of the lane dividing lines may be associated with a number of itself and a number of the next segment.

[0049] In some other examples, each of the ground objects may be represented as a semantic segment (or instance segment). The semantic segment may also be associated with coordinates in the 3D coordinate system. Additionally, each semantic segment may include a probability of a category to which the object belongs. For example, a portion of the curbside may be represented as “(x, y, z) (95%) (curbside)” showing the object at coordinate (x, y, z) is highly likely to be a curbside.

[0050] In some other examples, each of the ground objects may be represented as a point cloud that includes a set of data points in space. Each of the data points may be associated with a set of coordinates.

[0051] Additionally, some of the ground objects may be associated with a direction to which the objects are facing. For example, the direction of a bike, a pedestrian, or a car may be determined based on the images collected by the UAV camera 402.

[0052] Additionally, motion information may be associated with each moving object on the ground. For example, a velocity formatted as (v_x, v_y, v_z) and an acceleration formatted as (a_x, a_y, a_z) may be associated with the vehicle 502. In some examples, the first ground traffic information may include predicted trajectories of the moving objects on the ground. The predicted trajectories of the moving objects may be generated by the UAV processor 410 in accordance with existing approaches, e.g., a model-based approach and/or a data-driven approach.

[0053] Different from the first ground traffic information, which is from the perspective of the UAV 104, the second ground traffic information is represented in a 2D coordinate system with the position of the land vehicle 102 as the coordinate origin. The second ground traffic information may similarly include coordinates of the ground objects, 2D function curves in the 2D coordinate system that represent some of the ground objects, semantic segments of some ground objects, point clouds of some ground objects, directions in which some objects are facing, motion information of some moving objects on the ground, and predicted trajectories of the moving objects.
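
As one example of a model-based approach to the trajectory prediction mentioned in paragraph [0052], a constant-acceleration motion model could be used; the sketch below is illustrative only, and the horizon and time-step values are assumptions.

```python
import numpy as np

def predict_trajectory(position, velocity, acceleration, horizon_s=3.0, dt=0.1):
    """Predict a trajectory with a constant-acceleration motion model.

    This is only one simple model-based approach; the disclosure leaves the
    prediction method open (model-based and/or data-driven).
    position, velocity, acceleration: length-3 sequences (x, y, z) in the UAV frame.
    Returns an (N, 3) array of predicted positions sampled every dt seconds.
    """
    p = np.asarray(position, dtype=float)
    v = np.asarray(velocity, dtype=float)
    a = np.asarray(acceleration, dtype=float)
    times = np.arange(dt, horizon_s + dt, dt).reshape(-1, 1)
    return p + v * times + 0.5 * a * times ** 2

# Example: a vehicle such as vehicle 502 moving at 10 m/s along x while braking mildly.
trajectory = predict_trajectory([5.0, 2.0, 0.0], [10.0, 0.0, 0.0], [-1.0, 0.0, 0.0])
```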

[0054] During the process of generating the world model, the first ground traffic information may be converted to the 2D coordinate system. The positions of the ground objects in the 3D coordinate system may be aligned with the positions of the same objects in the 2D coordinate system to generate the world model.

[0055] For example, the land vehicle processor 420, the UAV processor 410, or the center processor 424, may be configured to convert coordinates of one or more still objects, one or more moving objects, one or more traffic signals, and one or more accessible areas identified in the 3D coordinate system to coordinates in the 2D coordinate system; and determine coordinates of the one or more still objects, the one or more moving objects, the one or more traffic signals, and the one or more accessible areas in the world model based on the converted coordinates and the second ground traffic information.
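
A minimal sketch of this coordinate conversion, assuming a known rigid transform between the UAV-centered frame and the vehicle-centered frame, could look like the following; the example UAV pose and the helper names are hypothetical.

```python
import numpy as np

def make_transform(rotation, translation):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a translation."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

def uav_to_vehicle(points_uav, T_vehicle_from_uav):
    """Convert Nx3 points from the UAV-centered frame to the vehicle-centered frame."""
    pts = np.asarray(points_uav, dtype=float)
    homo = np.hstack([pts, np.ones((pts.shape[0], 1))])
    return (T_vehicle_from_uav @ homo.T).T[:, :3]

def project_to_ground(points_vehicle):
    """Drop the height component to obtain 2D coordinates on the ground plane."""
    return points_vehicle[:, :2]

# Example: the UAV hovers 30 m above a point 10 m ahead of the land vehicle,
# with its axes aligned to the vehicle's (identity rotation).
T = make_transform(np.eye(3), translation=[10.0, 0.0, 30.0])
still_object_uav = np.array([[3.0, -1.5, -30.0]])   # a curbside point in the UAV frame
still_object_2d = project_to_ground(uav_to_vehicle(still_object_uav, T))
```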

[0056] In other examples, the land vehicle processor 420, the UAV processor 410, or the center processor 424, may be configured to convert semantic segments of one or more still objects, one or more moving objects, one or more traffic signals, and one or more accessible areas identified in the 3D coordinate system to semantic segments in the 2D coordinate system; and determine semantic segments of the one or more still objects, the one or more moving objects, the one or more traffic signals, and the one or more accessible areas identified in the world model based on the converted semantic segments and the second ground traffic information.

[0057] In yet other examples, the land vehicle processor 420, the UAV processor 410, or the center processor 424, may be configured to convert point clouds of one or more still objects, one or more moving objects, one or more traffic signals, and one or more accessible areas identified in the 3D coordinate system to point clouds in the 2D coordinate system; and determine point clouds of the one or more still objects, the one or more moving objects, the one or more traffic signals, and the one or more accessible areas identified in the world model based on the converted point clouds and the second ground traffic information.

[0058] The conversion of the coordinate systems is described in greater detail below.

[0059] Fig. 6 illustrates a diagram showing a detection of accessible areas by the example autonomous driving system with air support in accordance with the disclosure.

[0060] As described above, the first ground traffic information and the second ground traffic information may include area information that indicates the areas accessible to the land vehicle 102. The areas may also be identified as sets of coordinates. Thus, the world model may also indicate the accessible areas, as marked in patterns in Fig. 6. In some examples, the determination of the accessible areas 604 may be based on traffic rules and the motion information of the ground objects, and may be dynamically adjusted. For example, when the vehicle 602 is detected to apply a hard brake, the accessible areas 604 may be adjusted such that the land vehicle 102 may keep a safe distance.
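
One hypothetical way to dynamically trim the accessible area behind a braking vehicle is sketched below; the stopping-distance formula and all numeric values are assumptions and are not taken from the disclosure.

```python
def safe_following_distance(ego_speed_mps, lead_decel_mps2=8.0,
                            ego_decel_mps2=6.0, reaction_time_s=0.5):
    """Rough stopping-distance gap used to trim the accessible area ahead.

    Assumed formula (not from the disclosure): the ego vehicle needs its own
    stopping distance plus a reaction margin, minus what the braking lead
    vehicle still travels, clamped to a small minimum gap.
    """
    ego_stop = ego_speed_mps * reaction_time_s + ego_speed_mps ** 2 / (2 * ego_decel_mps2)
    lead_stop = ego_speed_mps ** 2 / (2 * lead_decel_mps2)  # assume similar initial speeds
    return max(ego_stop - lead_stop + 5.0, 5.0)

def trim_accessible_area(area_length_m, distance_to_lead_m, ego_speed_mps):
    """Shrink the drivable corridor so it ends a safe distance behind a braking lead vehicle."""
    gap = safe_following_distance(ego_speed_mps)
    return min(area_length_m, max(distance_to_lead_m - gap, 0.0))

# Example: a lead vehicle such as vehicle 602 brakes hard 40 m ahead while the
# ego vehicle drives at 20 m/s; the usable corridor shrinks to roughly 17 m.
usable_length = trim_accessible_area(area_length_m=120.0,
                                     distance_to_lead_m=40.0,
                                     ego_speed_mps=20.0)
```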

[0061] Fig. 7 illustrates a diagram showing a combined detection range of the example autonomous driving system with air support in accordance with the disclosure.

[0062] As shown, due to the limits of the sensors on the land vehicle 102, the detection range 702 of the land vehicle 102 may be within several hundred meters from the land vehicle 102. Since the sensors on the UAV 104 may collect information that normally cannot be perceived by the sensors on the land vehicle 102, a detection range 704 of the UAV 104 may be much larger than the detection range 702.

[0063] Further, because the world model essentially includes a combination of the first ground traffic information and the second ground traffic information, the range of the world model may be greater than, or at least equal to, the detection range 704 of the UAV 104.

[0064] Fig. 8 illustrates a diagram showing an example perception neural network in the example autonomous driving system with air support in accordance with the disclosure.

[0065] As shown, the images and the position information (e.g., images 810 and 812, LiDAR points such as point clouds 814 and 816) respectively collected by the sensors of the UAV 104 and the land vehicle 102 (e.g., UAV camera 402, UAV LiDAR sensor 404, UAV sensors 406, land vehicle camera 412, land vehicle LiDAR sensor 414, land vehicle sensors 416, etc.) may be input to a perception neural network 802 via one or more feature extraction networks 818. The feature extraction networks 818 may be configured to extract features from the images and the position information. The extracted features may then be input into the perception neural network 802.
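
A minimal sketch of this structure, assuming a PyTorch implementation, is shown below; the layer sizes, the detection head, and the flat-feature fusion are illustrative choices rather than the network defined in the disclosure.

```python
import torch
import torch.nn as nn

class FeatureExtractor(nn.Module):
    """Stands in for the feature extraction networks 818 (one per sensor stream)."""
    def __init__(self, in_channels=3, feat_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )

    def forward(self, x):
        return self.backbone(x)

class PerceptionNetwork(nn.Module):
    """Stands in for perception neural network 802: fuses UAV and vehicle features
    and predicts per-object class scores and 3D box parameters."""
    def __init__(self, feat_dim=128, num_classes=10, max_objects=20):
        super().__init__()
        self.max_objects = max_objects
        self.num_classes = num_classes
        self.head = nn.Sequential(
            nn.Linear(2 * feat_dim, 256), nn.ReLU(),
            nn.Linear(256, max_objects * (num_classes + 7)),  # class + box (x,y,z,l,w,h,yaw)
        )

    def forward(self, uav_feat, vehicle_feat):
        fused = torch.cat([uav_feat, vehicle_feat], dim=-1)
        out = self.head(fused)
        return out.view(-1, self.max_objects, self.num_classes + 7)

# Example forward pass with dummy stand-ins for camera images 810 and 812.
uav_images = torch.randn(1, 3, 224, 224)
vehicle_images = torch.randn(1, 3, 224, 224)
extract_uav, extract_vehicle = FeatureExtractor(), FeatureExtractor()
perceived = PerceptionNetwork()(extract_uav(uav_images), extract_vehicle(vehicle_images))
```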

[0066] A system administrator (a person) may label the objects on the images 810 and 812 based on his/her experience to set ground truth values of the perception neural network 802. With sufficient teaching input by the system administrator, the perception neural network 802 may detect the objects described in the images and the position information and output perceived objects 806 (e.g., other vehicles on the road, accessible areas, lane dividing lines, etc.) as the results.

[0067] Fig. 9 illustrates a diagram showing another example transformation neural network in the example autonomous driving system with air support in accordance with the disclosure.

[0068] As described above, the first ground traffic information may be converted into the 2D coordinate system to be consistent with the second ground traffic information; the first ground traffic information and the second ground traffic information may then be combined to generate the world model. The world model may include information of the objects perceivable to the sensors of the UAV 104 or to the sensors of the land vehicle 102. A transformation network 916 may be configured to output a transformation matrix 918 that can convert the coordinates in the 3D coordinate system to coordinates in the 2D coordinate system.

[0069] Images 908 collected by the UAV camera, e.g., 402, may be sent to a feature extraction network 912 to extract features of the objects contained in the images 908. When properly trained, the feature extraction network 912 may output features of those perceived objects by UAV 904.

[0070] Similarly, images 910 collected by the land vehicle camera, e.g., 412, may be sent to a feature extraction network 914 to extract features of the objects contained in the images 910. When properly trained, the feature extraction network 914 may output features of the perceived objects by land vehicle 902.

[0071] Features of both the perceived objects by UAV 904 and the perceived objects by land vehicle 902 may be combined and input to the transformation network 916. After training, the transformation network 916 may output the transformation matrix 918. With the transformation matrix 918, features of the perceived objects by UAV 904 may be converted to features of objects in the 2D coordinate system. The features of the perceived objects by UAV in 2D system 920 may be compared to the perceived objects by land vehicle 902 to determine if the features of the objects perceived by the UAV 104 and the land vehicle 102 are consistent after converting the coordinates. The results of the comparison may be fed back to the transformation network 916 as constraints to further train the transformation network 916 to yield a better transformation matrix 918.
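
A minimal sketch of the transformation network and its consistency feedback, again assuming a PyTorch implementation, could look like the following; regressing the matrix with a small MLP and using a mean-squared consistency loss are illustrative assumptions, not the method defined in the disclosure.

```python
import torch
import torch.nn as nn

class TransformationNetwork(nn.Module):
    """Stands in for transformation network 916: regresses a 2x4 matrix mapping
    homogeneous UAV-frame (3D) coordinates to vehicle-frame (2D) coordinates."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * feat_dim, 256), nn.ReLU(),
            nn.Linear(256, 8),   # flattened 2x4 transformation matrix (cf. matrix 918)
        )

    def forward(self, uav_feat, vehicle_feat):
        fused = torch.cat([uav_feat, vehicle_feat], dim=-1)
        return self.mlp(fused).view(-1, 2, 4)

def consistency_loss(transform, uav_points_3d, vehicle_points_2d):
    """Convert UAV-frame points with the predicted matrix and compare them with
    matching points perceived by the land vehicle; the error is fed back as a
    constraint to further train the transformation network."""
    ones = torch.ones(uav_points_3d.shape[0], uav_points_3d.shape[1], 1)
    homo = torch.cat([uav_points_3d, ones], dim=-1)            # (B, N, 4)
    converted = torch.einsum('bij,bnj->bni', transform, homo)  # (B, N, 2)
    return torch.mean((converted - vehicle_points_2d) ** 2)

# One hypothetical training step on dummy features and matched object points.
net = TransformationNetwork()
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
uav_feat, veh_feat = torch.randn(1, 128), torch.randn(1, 128)
uav_pts, veh_pts = torch.randn(1, 8, 3), torch.randn(1, 8, 2)
loss = consistency_loss(net(uav_feat, veh_feat), uav_pts, veh_pts)
loss.backward()
optimizer.step()
```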

[0072] Notably, in at least some examples, the transformation network 916 may be configured to generate a transformation matrix intended to convert the coordinates in the 2D system to the 3D system. Processes and operations are similar to those described above.

[0073] Fig. 10 illustrates a diagram showing another example neural network in the example autonomous driving system with air support in accordance with the disclosure.

[0074] As described above, the first ground traffic information may be converted into the 2D coordinate system to be consistent with the second ground traffic information; the first ground traffic information and the second ground traffic information may then be combined to generate the world model. The world model may include information of the objects perceivable to the sensors of the UAV 104 and/or to the sensors of the land vehicle 102.

[0075] Alternatively, images 1002 collected by the UAV camera, e.g., 402, and images 1004 collected by the land vehicle camera, e.g., 412, may be submitted to a fusion neural network directly without a transformation of coordinates.

[0076] A system administrator (a person) may label the objects on the images 1002 and 1004 based on his/her experience to set ground truth values of the fusion neural network 1006. With sufficient teaching input by the system administrator, the fusion neural network 1006 may eventually detect the objects described in the images and the position information and output perceived objects 1008 (e.g., other vehicles on the road, accessible areas, lane dividing lines, etc.) as the results.

[0077] Fig. 11 illustrates a diagram showing an example combined structure of multiple neural networks in the example autonomous driving system with air support in accordance with the disclosure.

[0078] As depicted, the structure described in accordance with Fig. 9 may be combined with the fusion neural network described in Fig. 10.

[0079] Images 1108 collected by the UAV 104 and images 1110 collected by the land vehicle 102 may be input to feature extraction networks 1112 and 1114 respectively. Features of the perceived objects by UAV 1104 may be generated by the feature extraction network 1112; features of the perceived objects by land vehicle 1102 may be generated by the feature extraction network 1114. As described above in accordance with Fig. 9, the features can be utilized to generate a transformation matrix 1118. Meanwhile, the features can also be input to a fusion neural network 1126 to recognize objects in the images 1108 and 1110.

[0080] Although the transformation matrix 1118 may not be required to recognize the objects in the images 1108 and 1110, the transformation matrix 1118 may be utilized for future route planning and other decisions for autonomous driving.

[0081] Fig. 12 illustrates a flow chart of an example method for performing autonomous driving in the example autonomous driving system in accordance with the disclosure. The flowchart illustrates a process of implementing autonomous driving with air support.

[0082] At block 1202, the operations of the example method may include collecting, by at least one UAV camera, first ground traffic information. For example, the sensors on the UAV 104 may be configured to collect visual images and/or distances from ground objects to the UAV 104. The visual images and the distances may be further converted to first ground traffic information in a 3D coordinate system.

[0083] At block 1204, the operations of the example method may include transmitting, by a UAV communication module, the collected first ground traffic information to a land vehicle. For example, the first ground traffic information may then be transmitted to the land vehicle 102 via the UAV communication module 408.

[0084] At block 1206, the operations of the example method may include collecting, by one or more vehicle sensors, second ground traffic information surrounding the land vehicle. For example, the land vehicle camera 412 may be configured to capture images of the ground traffic surrounding the land vehicle 102. The land vehicle LiDAR sensor 414 and other land vehicle sensors 416 may be configured to collect distance information of the surrounding objects. Similarly, the collected images and distance information may be sent to a land vehicle processor 420 to be converted to the second ground traffic information.

[0085] At block 1208, the operations of the example method may include receiving, by a land communication module, the first ground traffic information from the UAV. For example, the land vehicle communication module 418 may be configured to receive the first ground traffic information from the UAV 104 via one or more wireless communication links.

[0086] At block 1210, the operations of the example method may include fusing, by a processor, the first ground traffic information and the second ground traffic information to generate a world model. For example, the land vehicle processor 420 may be configured to generate the world model. The world model may include a combination of the information collected respectively by the UAV 104 and the land vehicle 102. In some examples, the first ground traffic information in the 3D coordinate system may be converted to the 2D coordinate system with the land vehicle 102 as the coordinate origin. Such conversion may be performed by a processor of the UAV 104, a processor of the land vehicle 102, or a processor at the control center 106.
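
Putting blocks 1202 through 1210 together, the overall cycle might be sketched as follows; the helper methods on the uav and land_vehicle objects are hypothetical placeholders, not a defined API.

```python
def autonomous_driving_cycle(uav, land_vehicle):
    # Block 1202: the UAV camera(s) collect the first ground traffic information.
    first_info = uav.collect_first_ground_traffic_info()

    # Block 1204: the UAV communication module transmits it to the land vehicle.
    uav.communication_module.transmit(first_info)

    # Block 1206: vehicle sensors collect the second ground traffic information.
    second_info = land_vehicle.collect_second_ground_traffic_info()

    # Block 1208: the land communication module receives the first information.
    received_first_info = land_vehicle.communication_module.receive()

    # Block 1210: fuse both into the world model (after converting the UAV's
    # 3D coordinates into the vehicle-centered 2D frame).
    converted = land_vehicle.processor.convert_to_vehicle_frame(received_first_info)
    world_model = land_vehicle.processor.fuse(converted, second_info)

    # Downstream, routing/behavior/motion decisions are made from the world model.
    return land_vehicle.processor.plan(world_model)
```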

[0087] Fig. 13 illustrates a diagram showing the autonomous driving system with air support in an example scenario. As depicted, in some example scenarios, while multiple vehicles are travelling around the land vehicle 102, sensors on the land vehicle 102 may be blocked and cannot identify information necessary to determine the position of the land vehicle 102 itself, which may further lead to incorrect driving decisions such as running a red light or missing an exit.

[0088] In this example scenario, since the first ground traffic information includes the position of the land vehicle and other objects unperceivable by the sensors on the land vehicle 102, the world model may include the information necessary for generating correct driving decisions, e.g., changing lanes when approaching the exit.

[0089] The process and method as depicted in the foregoing drawings may be executed through processing logic including hardware (e.g., circuit, special logic, etc.), firmware, software (e.g., software embodied in a non-transitory computer readable medium), or a combination thereof. Although the above describes the process or method in terms of certain sequential operations, it should be understood that certain operations described herein may be executed in different orders. Additionally, some operations may be executed concurrently rather than sequentially.

[0090] In the above description, each embodiment of the present disclosure is illustrated with reference to certain illustrative embodiments. Any of the above-mentioned components or devices may be implemented by a hardware circuit (e.g., an application specific integrated circuit (ASIC)). It is apparent that various modifications may be made to each embodiment without going beyond the wider spirit and scope of the present disclosure presented by the appended claims. Correspondingly, the description and accompanying figures should be understood as illustration only rather than limitation. It is understood that the specific order or hierarchy of steps in the processes disclosed is an illustration of exemplary approaches. Based upon design preferences, it is understood that the specific order or hierarchy of steps in the processes may be rearranged. Further, some steps may be combined or omitted. The accompanying method claims present elements of the various steps in a sample order and are not meant to be limited to the specific order or hierarchy presented.

[0091] The previous description is provided to enable any person skilled in the art to practice the various aspects described herein. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects. Thus, the claims are not intended to be limited to the aspects shown herein but are to be accorded the full scope consistent with the language of the claims, wherein reference to an element in the singular is not intended to mean “one and only one” unless specifically so stated, but rather “one or more.” Unless specifically stated otherwise, the term “some” refers to one or more. All structural and functional equivalents to the elements of the various aspects described herein that are known or later come to be known to those of ordinary skill in the art are expressly incorporated herein by reference and are intended to be encompassed by the claims. Moreover, nothing disclosed herein is intended to be dedicated to the public regardless of whether such disclosure is explicitly recited in the claims. No claim element is to be construed as a means plus function unless the element is expressly recited using the phrase “means for.”

[0092] Moreover, the term “or” is intended to mean an inclusive “or” rather than an exclusive “or.” That is, unless specified otherwise, or clear from the context, the phrase “X employs A or B” is intended to mean any of the natural inclusive permutations. That is, the phrase “X employs A or B” is satisfied by any of the following instances: X employs A; X employs B; or X employs both A and B. In addition, the articles “a” and “an” as used in this application and the appended claims should generally be construed to mean “one or more” unless specified otherwise or clear from the context to be directed to a singular form.