Title:
SYSTEM AND METHOD FOR AUTOMATICALLY GENERATING A DECISION IN REAL-TIME IN A SPORTING EVENT
Document Type and Number:
WIPO Patent Application WO/2022/175983
Kind Code:
A1
Abstract:
The present disclosure relates to a method and system for automatically generating a decision in real time in a sporting event. The method comprises: (1) capturing, by camera device(s) [106], a first graphical data; (2) determining, by a processing unit [102], a set of coordinates of a popping crease based on the first graphical data; (3) receiving, by the processing unit [102], a second graphical data from the camera devices in real time; (4) automatically determining, by a detection unit [104], a set of coordinates of a contact portion of a player with the popping crease, based on the second graphical data; and (5) automatically generating, by the processing unit [102], the decision in real time based on at least a comparison of the set of coordinates of the popping crease and the set of coordinates of the contact portion of the player, and a set of pre-defined rules.

Inventors:
KUMBLE ANIL (IN)
BINAYKIA ABHISHEK (IN)
ARAVIND PAI DANDU (IN)
Application Number:
PCT/IN2022/050144
Publication Date:
August 25, 2022
Filing Date:
February 18, 2022
Assignee:
SPEKTACOM TECH PRIVATE LIMITED (IN)
International Classes:
A63B71/06; G06V40/20
Foreign References:
IN201941052041A (2019-12-20)
US20210004953A1 (2021-01-07)
Attorney, Agent or Firm:
SAHNEY, Garima (IN)
Claims:
We Claim:

1. A method for automatically generating a decision in real time in a sporting event, the method comprising: capturing, by one or more camera devices [106], a first graphical data; determining, by a processing unit [102], a set of coordinates of a popping crease based on the first graphical data; receiving, by the processing unit [102], a second graphical data from the one or more camera devices [106] in real time; automatically determining, by a detection unit [104], a set of coordinates of a contact portion of a player with the popping crease, based on the second graphical data; and automatically generating, by the processing unit [102], the decision in real time based on at least a comparison of the set of coordinates of the popping crease and the set of coordinates of the contact portion of the player, and a set of pre-defined rules.

2. The method as claimed in claim 1, wherein at least one camera device [106] is placed on either side of the popping crease.

3. The method as claimed in claim 1, wherein the determining, by the processing unit [102], the set of coordinates of the popping crease based on the first graphical data, further comprises: processing, by the processing unit [102], the first graphical data to automatically detect the popping crease, processing, by the processing unit [102], the first graphical data to automatically detect edges of the popping crease, and calculating, by the processing unit [102], the set of coordinates of the popping crease based on the detected edges.

4. The method as claimed in claim 1, wherein prior to the automatically determining, by the detection unit [104], the set of coordinates of the contact portion of the player, the method further comprises: identifying, by the detection unit [104], a set of objects in the second graphical data, classifying, by the detection unit [104], each object of the set of objects into one of a first object type, a second object type, and a third object type, identifying, by the detection unit [104], a first set of bounding coordinates for the first object type, and a second set of bounding coordinates for the second object type, determining, by the processing unit [102], a set of intersection over union (IOU) scores between the first set of bounding coordinates and the second set of bounding coordinates, and identifying, by the processing unit [102], the contact portion of the player based on a comparison of each score in the set of IOU scores with an IOU score threshold value.

5. The method as claimed in claim 1, wherein the detecting, by the detection unit [104], the contact portion of the player is based on a keypoint detection model.

6. The method as claimed in claim 5, wherein the keypoint detection model is a pre-trained model.

7. A system for automatically generating a decision in real time in a sporting event, the system comprising: one or more camera devices [106] configured to capture a first graphical data; a processing unit [102] configured to: determine a set of coordinates of a popping crease based on the first graphical data, and receive a second graphical data from the one or more camera devices [106] in real time; and a detection unit [104] configured to automatically determine a set of coordinates of a contact portion of a player with the popping crease, based on the second graphical data, wherein: the processing unit [102] is further configured to automatically generate the decision in real time based on at least a comparison of the set of coordinates of the popping crease and the set of coordinates of the contact portion of the player, and a set of pre-defined rules.

8. The system as claimed in claim 7, wherein at least one camera device [106] is placed on either side of the popping crease.

9. The system as claimed in claim 7, wherein to determine the set of coordinates of the popping crease based on the first graphical data, the processing unit [102] is further configured to: process the first graphical data to automatically detect the popping crease, process the first graphical data to automatically detect edges of the popping crease, and calculate the set of coordinates of the popping crease based on the detected edges.

10. The system as claimed in claim 7, wherein prior to automatically determining the set of coordinates of the contact portion of the player, the detection unit [104] is configured to: identify a set of objects in the second graphical data, classify each object of the set of objects into one of a first object type, a second object type, and a third object type, and identify a first set of bounding coordinates for the first object type, and a second set of bounding coordinates for the second object type.

11. The system as claimed in claim 10, wherein the processing unit [102] is further configured to: determine a set of intersection over union (IOU) scores between the first set of bounding coordinates and the second set of bounding coordinates, and identify the contact portion of the player based on a comparison of each score in the set of IOU scores with an IOU score threshold value.

12. The system as claimed in claim 7, wherein the detection unit [104] detects the contact portion of the player based on a keypoint detection model.

13. The system as claimed in claim 12, wherein the keypoint detection model is a pre-trained model.

Description:
SYSTEM AND METHOD FOR AUTOMATICALLY GENERATING A DECISION IN REAL-TIME IN A SPORTING EVENT

FIELD OF THE INVENTION

The present disclosure relates to the field of sports technology. More particularly, the present disclosure relates to a system and method for automatically generating a decision in real time in a sporting event.

BACKGROUND

The following description of related art is intended to provide background information pertaining to the field of the disclosure. This section may include certain aspects of the art that may be related to various features of the present disclosure. However, it should be appreciated that this section should be used only to enhance the understanding of the reader with respect to the present disclosure, and not as an admission of prior art.

There are a variety of sporting games involving rules around the footwork of the player. For instance, in the game of tennis or badminton, if the player steps over the baseline when striking a serve, a foot fault is said to occur. Similarly, in kabaddi, there are rules involving the foot of the raider, including those for determining a bonus point and a successful or unsuccessful raid. Similarly, in a cricket match, a no ball is an illegal delivery to a batsman that is also judged by the footwork of the player. There are many types of no balls defined by the MCC laws of cricket; one of them, for instance, is the no ball due to overstepping of the crease by the bowler.

In the current state of the art, such no balls due to overstepping the crease, and other footwork-related events, are judged and evaluated manually by umpires, referees, etc. This is, however, disadvantageous, as on-field umpires and referees already carry a heavy workload. In the game of cricket, for instance, the umpire at the bowler's end, in addition to evaluating no balls, has to carefully examine all bowled deliveries and make various other on-field decisions such as wides, byes, leg byes, leg before wicket, boundaries and sixes. To reduce the workload of the on-field umpires and in an attempt to improve decision making, various efforts have been made to shift certain responsibilities of the on-field umpires to third umpires, who watch the recorded video of the live game and make decisions on that basis. In recent experiments in cricket, the no ball decision was also shifted to the third umpire. The third umpire was provided with a live camera feed from the match and would manually examine each frame to assess whether the bowler had overstepped the crease and, consequently, whether a no ball decision was to be made. Since a no ball decision must be made for every delivery bowled, the persistent attention previously required of the on-field umpire has been eliminated by shifting this responsibility to the third umpire. Similar attempts have been made in other sports such as tennis and badminton as well.

The problem with the current state of the art is that the process of footwork event detection is still very much a manual process and thus prone to errors. For instance, in cricket, umpires are prone to erring when calling a front-foot no ball in close cases. Apart from being error-prone due to manual intervention, the known techniques are also time consuming. When an on-field umpire assesses the footwork event, only a fraction of time is available to make the assessment. Typically, whenever the on-field umpire has a doubt about the assessment of the footwork event, he takes the opinion of the third umpire as well. This two-step checking makes the final decision even more time consuming. Even when the footwork event detection is performed solely by the third umpire, the third umpire checks the footwork of the player in multiple frames from the live camera feed, multiple times, before making the decision. Thus, overall, manual determination of footwork events is a time consuming process. In live games, assessment of footwork events often occurs at crucial moments, for instance when a decision for a batsman being out has been made but the no ball decision is still pending. In such crucial and exciting moments for both the players and the fans, it is disheartening to see incorrect or delayed no ball decisions being taken manually by the umpires.

Thus, there exists an imperative need in the art to provide a system and method to alleviate the above problems of the existing state of the art. This will help reduce the workload of the on-field umpires, enhance the accuracy of decision making, and save a lot of time.

SUMMARY

This section is intended to introduce certain objects and aspects of the disclosed method and system in a simplified form and is not intended to identify the key advantages or features of the present disclosure.

Accordingly, one object of the present disclosure is to provide a method and system for automatically generating a decision in real time in a sporting event that is not prone to errors and provides accurate results. Another object of the present disclosure is to provide a method and system for generating a decision in real time in a sporting event that is automatic and does not require human intervention/manual operations. Yet another object of the present disclosure is to provide a method and system for automatically generating a decision in real time in a sporting event that quickly provides results and saves time that might be consumed in manual operations.

In order to achieve at least one of the above-mentioned objects, one aspect of the present disclosure is to provide a method for automatically generating a decision in real time in a sporting event. The method comprises capturing, by one or more camera devices, a first graphical data. Further, the method comprises determining, by a processing unit, a set of coordinates of a popping crease based on the first graphical data. Thereafter, the method comprises receiving, by the processing unit, a second graphical data from the one or more camera devices in real time. Further, the method comprises automatically determining, by a detection unit, a set of coordinates of a contact portion of a player with the popping crease, based on the second graphical data. Finally, the method comprises automatically generating, by the processing unit, the decision in real time based on at least a comparison of the set of coordinates of the popping crease and the set of coordinates of the contact portion of the player, and a set of pre-defined rules.

Another aspect of the present disclosure is to provide a system for automatically generating a decision in real time in a sporting event. The system comprises one or more camera devices configured to capture a first graphical data. Further, the system comprises a processing unit configured to determine a set of coordinates of a popping crease based on the first graphical data. The processing unit is further configured to receive a second graphical data from the one or more camera devices in real time. Further, the system comprises a detection unit configured to automatically determine a set of coordinates of a contact portion of a player with the popping crease, based on the second graphical data. Finally, the processing unit is further configured to automatically generate the decision in real time based on at least a comparison of the set of coordinates of the popping crease and the set of coordinates of the contact portion of the player, and a set of pre-defined rules.

BRIEF DESCRIPTION OF DRAWINGS

The accompanying drawings, which are incorporated herein, and constitute a part of this disclosure, illustrate exemplary embodiments of the disclosed methods and systems in which like reference numerals refer to the same parts throughout the different drawings. Components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the present disclosure. Some drawings may indicate the components using block diagrams and may not represent the internal circuitry of each component. It will be appreciated by those skilled in the art that disclosure of such drawings includes disclosure of electrical components, electronic components or circuitry commonly used to implement such components.

Fig. 1 illustrates an architecture of a system for automatically generating a decision in real time in a sporting event, in accordance with exemplary embodiments of the present disclosure.

Fig. 2 illustrates an exemplary method flow diagram depicting a method for automatically generating a decision in real time in a sporting event, in accordance with exemplary embodiments of the present disclosure.

Fig. 3 illustrates an instance implementation of a method step of a method for automatically generating a decision in real time in a sporting event, in accordance with exemplary embodiments of the present disclosure.

Fig. 4A illustrates an exemplary image frame showing detection of popping crease, in accordance with an embodiment of the present invention.

Fig. 4B illustrates an exemplary image frame showing detection of the tip of the heel of the bowler, in accordance with an embodiment of the present invention.

Fig. 4C illustrates an exemplary image frame showing detection of bowler and shoes, in accordance with an embodiment of the present invention.

DETAILED DESCRIPTION

In the following description, a broad description of the invention is provided to ensure understanding of embodiments of the present disclosure. Several features described hereafter can each be used independently of one another or in any combination with other features. An individual feature may not address all of the problems discussed above, or might address only some of them, and some of the problems discussed above might not be fully addressed by any of the features described herein. For the purposes of explanation, various specific details are set forth in order to provide a thorough understanding of embodiments of the present disclosure. It will be apparent, however, that embodiments of the present disclosure may be practiced without these specific details. Example embodiments of the present disclosure are described below, as illustrated in various drawings in which like reference numerals refer to the same parts throughout the different drawings. The ensuing description provides exemplary embodiments only and is not intended to limit the scope, applicability, or configuration of the disclosure. Rather, it will provide those skilled in the art with an enabling description for implementing an exemplary embodiment. It should be understood that various changes may be made in the function and arrangement of elements without departing from the spirit and scope of the disclosure as set forth.

Specific details are given in the following description to provide a thorough understanding of the embodiments. However, it will be understood by one of ordinary skill in the art that the embodiments may be practiced without these specific details. For example, circuits, systems, networks, processes, and other components may be shown as components in block diagram form in order not to obscure the embodiments in unnecessary detail. In other instances, well-known circuits, processes, algorithms, structures, and techniques may be shown without unnecessary detail in order to avoid obscuring the embodiments.

Also, it is noted that individual embodiments may be described as a process which is depicted as a flowchart, a flow diagram, a data flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process is terminated when its operations are completed but could have additional steps not included in a figure.

The word "exemplary" and/or "demonstrative" is used herein to mean serving as an example, instance, or illustration. For the avoidance of doubt, the subject matter disclosed herein is not limited by such examples. In addition, any aspect or design described herein as "exemplary" and/or "demonstrative" is not necessarily to be construed as preferred or advantageous over other aspects or designs, nor is it meant to preclude equivalent exemplary structures and techniques known to those of ordinary skill in the art. Furthermore, to the extent that the terms "includes," "has," "contains," and other similar words are used in either the detailed description or the claims, such terms are intended to be inclusive— in a manner similar to the term "comprising" as an open transition word— without precluding any additional or other elements.

As used herein, a "processor" or "processing unit" includes processing unit, wherein processor refers to any logic circuitry for processing instructions. A processor may be a general-purpose processor, a special purpose processor, a conventional processor, a digital signal processor, a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, Application Specific Integrated Circuits, Field Programmable Gate Array circuits, any other type of integrated circuits, etc. The processor may perform signal coding data processing, input/output processing, and/or any other functionality that enables the working of the system according to the present disclosure. Also, the 'processor' or the 'processing unit' may comprise one or more processors for performing different operations. The 'processor' or 'processing unit' may be present at one place or be distributed and connected to other components using a wired or wireless means. As used herein, "storage unit" or "memory unit" refers to a machine or computer-readable medium including any mechanism for storing information in a form readable by a computer or similar machine. For example, a computer-readable medium includes read-only memory ("ROM"), random access memory ("RAM"), magnetic disk storage media, optical storage media, flash memory devices or other types of machine-accessible storage media. The storage unit stores at least the data that may be required by one or more units of the system to perform their respective functions.

As discussed in the background section, the detection of footwork events in sporting games such as no ball detection in the game of cricket is a manual operation or assessment in the current state of the art. To alleviate the problems of the prior art, the present disclosure provides a system and a method of automatic detection of footwork events seamlessly in real time. The present disclosure completely automates the process of assessing footwork events without any human intervention.

Hereinafter, exemplary embodiments of the present disclosure will be described in detail with reference to the accompanying drawings so that those skilled in the art can easily carry out the solution provided by the present disclosure.

Figure 1 illustrates an architecture of a system for automatically generating a decision in real time in a sporting event, in accordance with exemplary embodiments of the present disclosure. The system [100] provides the hardware for implementing a method for automatically generating a decision in real time in a sporting event. As shown in Figure 1, the system [100] comprises a processing unit [102], a detection unit [104], a camera unit [106] comprising one or more camera devices, and a memory unit [108]. All the components of the system [100] should be construed to be operably connected to each other unless indicated otherwise in the disclosure.

The disclosure is described below with an exemplary reference to the game of cricket and with respect to detection of no ball event for the purposes of understanding by a person skilled in the art. However, it will be appreciated by those skilled in the art that the disclosure described below can be used in any sporting game without departing from the spirit and scope of the disclosure.

The one or more camera devices [106] are configured to capture a first graphical data. The first graphical data can be video data and/or image data. In an implementation, the image frames may be extracted from the video data. This first graphical data is used to determine the coordinates of a crease or a popping crease. This crease or popping crease is the line of reference for the system to take a decision regarding an action of a player in real time in a sporting event. For example, this popping crease can be the crease beyond which a bowler in the game of cricket is not allowed to step, or else a decision of default, i.e., a no-ball decision, may be generated by the system. Also, the line of reference (hereinafter, 'popping crease') is determined with reference to the placement of the camera devices [106]. The camera devices [106] could be fixed or moving, so the location of the popping crease may also change with a change in the position, location or placement of the camera devices [106].

The present disclosure encompasses the use of two side view cameras [106] providing their output to the processing unit [102] or a separate image processing unit [not shown]. In an implementation, the two side view cameras are placed on either side of the popping crease. In the above example of the game of cricket, the advantage of using a dual camera set-up on either side of the popping crease is that it reduces chances of occlusion of the camera output by the bowler or the non-striker batsman.

For the purpose of determining the popping crease, the camera devices [106] are configured to provide their output to the processing unit [102]. Alternatively, the camera devices [106] may provide their output to the memory unit [108], and, for determining the popping crease, the processing unit [102] may fetch the first graphical data (as captured by the one or more camera devices [106]) from the memory unit [108]. Thus, the processing unit [102] is configured to determine a set of coordinates of a popping crease based on the first graphical data. For this determination of the exact coordinates of the popping crease, the processing unit [102] first fetches the camera output or camera frames/images from the camera devices [106] and processes it automatically to detect the popping crease. In an implementation, to detect the popping crease, the present disclosure encompasses blurring the image to remove noise and smoothening the image. Also, in this implementation, the popping crease is detected using the image processing technique of the Hough Line transform. More particularly, a region of interest is selected around the bowler to detect the popping crease. In an implementation, the region of interest is selected around the stumps present near the bowler to detect the popping crease, wherein the region of interest is selected based on one of a manual operation and an automatic operation. Thereafter, an edge detection technique, such as Canny edge detection, is used to identify edges, and the Hough line transform is then applied to the edges. The Hough Line transform works on the principle that any shape that can be represented in mathematical form can be detected in an image. It works even if the popping crease is distorted or broken. The Hough line transform identifies all vertical lines and horizontal lines in the image. From this set, the horizontal lines are first removed. The vertical line falling within the region of interest is identified as the popping crease.
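By way of illustration only, the crease-detection steps above can be sketched in Python. This is a simplified stand-in, not the described implementation: instead of Canny edge detection plus the Hough Line transform (for which a library such as OpenCV would typically be used), it scores each image column in the region of interest by its count of bright pixels and treats the strongest column as the vertical popping crease. The function name, threshold and data layout are hypothetical.

```python
import numpy as np

def detect_crease_column(frame, roi, brightness_thresh=200):
    """Simplified stand-in for the blur + Canny + Hough pipeline described
    above: within the region of interest, find the image column containing
    the most bright pixels and treat it as the vertical popping crease.

    frame : 2-D grayscale image (numpy array)
    roi   : (x0, y0, x1, y1) region of interest in pixel coordinates
    Returns the two end points (p1, p2) of the detected vertical line.
    """
    x0, y0, x1, y1 = roi
    patch = frame[y0:y1, x0:x1]
    bright = patch >= brightness_thresh       # crude "white crease" mask
    column_scores = bright.sum(axis=0)        # bright pixels per column
    cx = int(column_scores.argmax()) + x0     # strongest vertical line
    rows = np.nonzero(bright[:, cx - x0])[0]
    p1 = (cx, int(rows.min()) + y0)           # top end point
    p2 = (cx, int(rows.max()) + y0)           # bottom end point
    return p1, p2
```

With static cameras these end points would be computed once and reused, mirroring the text's observation that the crease coordinates stay constant until the cameras move.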
Pursuant to this processing, the two end points of the popping crease are identified. For instance, refer to Figure 4A, which shows coordinates p1 (x1, y1) and p2 (x2, y2) identifying the starting and end points of the identified popping crease. Referring to Figure 4A, p1(788, 392) and p2(783, 531) denote the endpoints of a popping crease in an image. Thus, for a particular cricket match, if the cameras are static, the values of the popping crease coordinates remain constant throughout. However, if the cameras are movable, these coordinates may have to be recalculated each time the position of the cameras changes. Thus, to determine the set of coordinates of the popping crease based on the first graphical data, the processing unit [102] processes the first graphical data to automatically detect the popping crease. Thereafter, the processing unit [102] processes the first graphical data to automatically detect edges of the popping crease. And finally, the processing unit [102] calculates the set of coordinates of the popping crease based on the detected edges. It may be appreciated by a person skilled in the art that this determination of the coordinates of the popping crease may be performed by the processing unit [102] or, alternatively, by a separate image processing unit [not shown]. After the processing of the first graphical data, the processing unit [102] or the image processing unit [not shown] provides the exact coordinates of the popping crease.

Once the popping crease is identified, the present disclosure encompasses identification of the contact portion of the player. For example, in the game of cricket, for determining a decision regarding a no-ball, this contact portion may be the tip of the heel of the bowler. As the bowler takes a run-up for delivering/throwing a ball to the batsman, he has to make sure that he does not step beyond the popping crease. As he jumps before the popping crease in the process of delivering a ball and lands on the ground, generally on his heel, the tip of the heel that is close to the popping crease, on which the bowler lands, is thus considered to be the contact portion of the bowler. For this purpose of detecting the coordinates of the contact portion of the player by the detection unit [104], the processing unit [102] receives a second graphical data from the one or more camera devices [106] in real time and sends the data to the detection unit [104], or alternatively, the detection unit [104] receives the second graphical data directly from the one or more camera devices [106] in real time. For instance, the point indicated by the reference R1 (x3, y3) in Figure 4B represents the coordinates of the tip of the heel of the bowler detected by the system of the present invention.
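The final comparison between the crease coordinates and the heel-tip coordinate R1 can be illustrated with a minimal sketch. It assumes a near-vertical crease (as in Figures 4A and 4B), so only x-coordinates are compared; the rule encoded here (the foot landing behind the crease is legal) is a hypothetical simplification of the pre-defined rule set, and all names are illustrative.

```python
def no_ball_decision(crease_p1, crease_p2, heel_tip, behind_is_smaller_x=True):
    """Sketch of the final comparison step: compare the heel-tip
    coordinate R1 (x3, y3) against the popping-crease line p1-p2 and
    apply a pre-defined rule.  Assumes a near-vertical crease, so only
    the x-coordinates need comparing; behind_is_smaller_x encodes which
    side of the crease is legal for this camera view.  The rule is a
    hypothetical simplification, not the patent's exact rule set.
    """
    crease_x = (crease_p1[0] + crease_p2[0]) / 2.0   # average the end points
    heel_x = heel_tip[0]
    behind = heel_x < crease_x if behind_is_smaller_x else heel_x > crease_x
    return "LEGAL" if behind else "NO BALL"
```

Using the Figure 4A end points p1(788, 392) and p2(783, 531), a heel tip detected at x = 700 would land behind the crease, while one at x = 800 would not.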

In an implementation, the disclosure encompasses using a keypoint detection technique for detecting the coordinates of the contact portion of the player. The model is trained with a customised data set of thousands of images comprising a bowler bowling a delivery and the popping crease. This customised dataset is further enlarged by creating different variants of the same image, such as by random rotation, changing brightness and contrast, blurring and noise, shifting and scaling, and cropping the images. The model is then trained with this customised dataset to make it more robust. In an instance implementation, the dataset is divided into a training set and a test set in the ratio of 70:30. Augmentation is applied to each of the sets prior to the training process. In an implementation, the pre-trained keypoint detection model may be a neural network model, and the AdamW optimizer is used to optimize the weights of the neural network to further improve the performance of the pre-trained keypoint detection model. In an implementation, the optimizer is not limited to the AdamW optimizer, and any other optimizer that is obvious to a person skilled in the art for implementing the features of the present disclosure may be used. Also, in an implementation, adaptive learning rates are used during training to achieve better performance of the pre-trained keypoint detection model. In order to adjust the learning rates to improve the performance, a step decay learning rate scheduler is set which drops the learning rate after every n epochs by y, wherein y refers to a percentage by which the learning rates are required to be dropped. The pre-trained keypoint detection model is finetuned on the custom dataset to identify the contact portion of the player, for example, the tip of the heel of the bowler in cricket. It uses a Faster R-CNN architecture with ResNet50 and a Feature Pyramid Network (FPN) as a backbone. Thus, an input image is passed through the backbone for feature extraction.
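The step decay schedule described above (drop the learning rate by y percent after every n epochs) can be written out directly. This is a generic sketch, not the patent's training code; frameworks such as PyTorch provide an equivalent `StepLR` scheduler.

```python
def step_decay_lr(initial_lr, epoch, n, y):
    """Step-decay learning rate schedule as described in the text:
    after every n epochs, the learning rate is dropped by y percent.
    E.g. initial_lr=0.01, n=10, y=50 halves the rate every 10 epochs.
    """
    drops = epoch // n                     # number of decays applied so far
    factor = (1.0 - y / 100.0) ** drops    # each drop keeps (100 - y)% of the rate
    return initial_lr * factor
```

In PyTorch this corresponds to `torch.optim.lr_scheduler.StepLR(optimizer, step_size=n, gamma=1 - y/100)` paired with the AdamW optimizer mentioned above.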
The backbone in this exemplary implementation is a combination of ResNet50 and FPN. During training, each layer of the keypoint detection model is finetuned on the training dataset except the first two layers of the ResNet50. After each epoch, the model is evaluated on the test set, and the best model is then saved for inference. In an instance implementation, the keypoint detection model is a pre-trained model, trained based on neural network deep learning techniques, implemented in the detection unit [104]. The detection unit [104] detects the contact portion of the player using this keypoint detection model. Thus, in this implementation, the detection unit [104] is a pre-trained unit, trained based on neural network deep learning techniques.

In the above example of implementing the present disclosure for making a decision regarding a no-ball, as shown in Figure 4C, the keypoint detection model localizes the bowler and his front foot, and the tip of the heel is then detected within the bounding box B2 of the front foot. The keypoint detection model identifies and draws the bounding box B1 around the bowler and the bounding box B2 around the front foot of the bowler, and the keypoint is then detected within the bounding box of the front foot. The keypoint detection model uses a deep learning architecture known as Faster R-CNN (Region-based Convolutional Neural Network) for the object detection and Convolutional Neural Networks for the keypoint detection. Faster R-CNN is a combination of Fast R-CNN and a Region Proposal Network.

Thus, for detecting the contact portion of the player using this keypoint detection model, the detection unit [104] identifies a set of objects in the second graphical data. This second graphical data is real time graphical data captured by the camera devices [106]. The camera devices [106] are configured to send the second graphical data to the detection unit [104]. Also, the camera devices [106] are configured to send the second graphical data to the processing unit [102] and the memory unit [108] for further use of the data. After this, the processing unit [102] is configured to classify each object of the set of objects into one of a first object type, a second object type, and a third object type. In the above example of detecting a no-ball in the game of cricket, the first object type can be the bowler, the second object type can be all the shoes in the frame of the second graphical data, and the third object type can be other things in the frame such as the wickets, the non-striker batsman, etc. As shown in Figure 4C, the box B1, i.e., the box around the bowler, shows the first object type, i.e., the bowler. In an implementation, the bowler is detected using the second graphical data. In another implementation, the bowler is detected using sensors placed on the arm or body of the bowler, or using pressure sensors placed on the ground. Thus, the detection unit [104] identifies a first set of bounding coordinates for the first object type, and a second set of bounding coordinates for the second object type. Referring to Figure 4C, it may be noted that the box B2 as shown in Figure 4C represents a box for the bounding coordinates of the shoe of the bowler, i.e., one of the coordinates in the second set of bounding coordinates for the second object type. Thus, in an implementation, the detection unit [104] identifies the landing frame of the bowler.
An intersection over union (IOU) of the relevant shoe for the current frame and the previous frame is computed for multiple frames while the bowler is bowling a delivery (detected by the round arm action using sensors, etc., as explained above). The frame for which the intersection over union is above a minimum predefined threshold is then detected as the target frame.
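The landing-frame detection above can be illustrated with a short Python sketch. The (x1, y1, x2, y2) box format, the function names, and the 0.8 threshold are assumptions chosen for illustration; the disclosure only specifies that the IOU between the shoe boxes of consecutive frames must exceed a minimum predefined threshold.

```python
def iou(box_a, box_b):
    """Standard intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def detect_target_frame(shoe_boxes, threshold=0.8):
    """Return the index of the first frame whose relevant-shoe box overlaps
    the previous frame's box by more than `threshold`, i.e. the shoe has
    effectively stopped moving (the landing frame). Returns None if the
    shoe never settles."""
    for i in range(1, len(shoe_boxes)):
        if iou(shoe_boxes[i - 1], shoe_boxes[i]) > threshold:
            return i
    return None
```

Because the shoe is nearly stationary at landing, its bounding box barely moves between consecutive frames, so the frame-to-frame IOU spikes toward 1.0 at the target frame.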

Further, continuing with the above system performing the method steps of the present disclosure, the detection unit [104] detects the correct contact portion of the player. Referring to the above example of detecting a no-ball in the game of cricket, once the bounding box for all the shoes and the bounding box for the bowler in the second graphical data are identified, the shoes of the bowler are identified based on the location of the detected shoes. For instance, in Figure 4C, four shoes may be detected, represented by SH1, SH2, SH3 and SH4. To detect the shoes of the bowler, the intersection over union (IOU) for each of the shoes is computed with respect to the bounding box of the bowler, and then those shoes are identified for which the IOU score is greater than a minimum threshold. Intersection over union is an evaluation metric used to measure the accuracy of an object detector on a particular dataset. Thus, the processing unit [102] is configured to determine a set of IOU scores between the first set of bounding coordinates and the second set of bounding coordinates. Since SH1 and SH2 lie completely within the bounding box B1 of the bowler, these are identified as the shoes of the bowler. Further, the processing unit [102] identifies the contact portion of the player based on a comparison of each score in the set of IOU scores with an IOU score threshold value. The disclosure also encompasses identifying the shoes of the bowler when there are more than two shoes in the bounding box of the bowler. Referring to the above example, it may be a case that the non-striker batsman is standing in a position such that the shoes of the non-striker batsman fall within the bounding box of the bowler. In such a scenario, the detection unit [104] processes the graphical data by drawing an imaginary line from the stumps, thus creating a region of interest. Then, the bowler's shoes are selected based on the region of interest.
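The shoe-filtering step above can be sketched in plain Python. One interpretation caveat: the text selects the shoes that "lie completely within" the bowler's box, so this sketch scores each shoe by the fraction of its own area covered by the bowler's box (a fully enclosed shoe scores 1.0); whether the disclosure's "IOU score" is exactly this quantity or a standard IOU is our assumption. The function names, the (x1, y1, x2, y2) box format, and the 0.9 threshold are likewise illustrative.

```python
def overlap_fraction(shoe, bowler):
    """Fraction of the shoe box that lies inside the bowler box.
    A shoe completely inside the bowler's box scores 1.0."""
    ix1, iy1 = max(shoe[0], bowler[0]), max(shoe[1], bowler[1])
    ix2, iy2 = min(shoe[2], bowler[2]), min(shoe[3], bowler[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    shoe_area = (shoe[2] - shoe[0]) * (shoe[3] - shoe[1])
    return inter / shoe_area if shoe_area else 0.0

def bowler_shoes(shoe_boxes, bowler_box, threshold=0.9):
    """Keep only the shoes whose overlap score against the bowler's
    bounding box exceeds the threshold (e.g. SH1 and SH2 in Figure 4C)."""
    return {name: box for name, box in shoe_boxes.items()
            if overlap_fraction(box, bowler_box) > threshold}
```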
Following this, the detection unit [104] automatically determines a set of coordinates of a contact portion of a player with the popping crease. The disclosure also encompasses identifying the correct shoe of interest. Referring to the above example, if the front foot no-ball is to be assessed, the shoe closer to the popping crease is identified as the relevant shoe. Here, the shoe of interest is identified by the coordinates of the shoes of the bowler, i.e., the shoe closer to the bottom right of the bowler's bounding box B1 is identified as the shoe of interest. Once the relevant shoe of interest is identified, the coordinates for the contact portion of the player are determined. In the above example, this contact portion is the tip of the heel of the bowler, and its coordinates may be determined as R1 (793, 381). Thereafter, the processing unit [102] automatically generates the decision in real time based on at least a comparison of the set of coordinates of the popping crease and the set of coordinates of the contact portion of the player, and a set of pre-defined rules. This set of pre-defined rules changes for every sporting event, and also changes for detecting different actions in the same sporting event for which the present disclosure may be used. Referring to the above example, the processing unit [102] generates a decision regarding no-ball based on the coordinates of the popping crease and the coordinates of the tip of the heel of the bowler. The set of pre-defined rules in this example may contain that if the value of the coordinates of the tip of the heel of the bowler, as shown in Figure 4B, is outside an accepted value range associated with the coordinates of the popping crease, then the decision regarding no-ball may be generated so as to indicate that the ball delivered by the bowler is a no-ball delivery. Also, the pre-defined rules may contain the following determinant, wherein (x1, y1) and (x2, y2) are the end points of the popping crease and (x3, y3) is the point to be tested:

| x2 - x1    x3 - x1 |
| y2 - y1    y3 - y1 |  =  (y3 - y1)(x2 - x1) - (x3 - x1)(y2 - y1)

The result will be positive for points lying on one side of the line, negative for points lying on the other side, and zero for points lying on the line. So, based on the value of the above determinant, a no-ball or a fair delivery is decided as follows:

1. Value of determinant > 0, meaning that the point lies behind the popping crease. In this case, a decision of 'fair delivery' should be generated.

2. Value of determinant ≤ 0, meaning that the point lies on or after the popping crease. In this case, a decision of 'no-ball' should be generated.

Considering the above exemplary points, p1(x1, y1): (788, 392) and p2(x2, y2): (783, 531) are the coordinates of the popping crease, and R1(x3, y3): (793, 381) is the tip of the heel of the bowler. Computing the determinant:

(y3 - y1)(x2 - x1) - (x3 - x1)(y2 - y1) = (381 - 392)(783 - 788) - (793 - 788)(531 - 392)

= (-11)(-5) - (5)(139) = 55 - 695 = -640, which is a negative value. Since the sign of the determinant is negative, the processing unit [102] understands that the tip of the heel of the shoe of the bowler crossed the popping crease, which is a fault in the game of cricket, and thus automatically generates a decision in real time that the ball delivered is a no-ball delivery.

Now referring to Figure 2, which illustrates an instance implementation of a method step of a method for automatically generating a decision in real time in a sporting event, in accordance with exemplary embodiments of the present disclosure. The method starts at step 202 and goes to step 204. At step 204, the one or more camera devices [106] capture a first graphical data. The first graphical data can be a video data and/or an image data. In an implementation, the image frames may be extracted from the video data. This first graphical data is used to determine the coordinates of a crease or a popping crease. This crease or popping crease is the line of reference for the system to take a decision regarding an action of a player in real time in a sporting event. This line of reference, or popping crease, is determined with reference to the placement of the camera devices [106]. The camera devices [106] could be fixed or moving. So, the location of the popping crease may also change with a change in the position or location or placement of the camera devices [106].

The present disclosure encompasses the use of two side view cameras [106] providing their output to the processing unit [102] or a separate image processing unit [not shown]. In an implementation, the two side view cameras are placed on either side of the popping crease. In the above example of the game of cricket, the advantage of using a dual camera set-up on either side of the popping crease is that it reduces chances of occlusion of the camera output by the bowler or the non-striker batsman.

For the purpose of determining the popping crease, the camera devices [106] provide their output to the processing unit [102]. Alternatively, the camera devices [106] may provide their output to the memory unit [108] and further, for determining the popping crease, the processing unit [102] may fetch the first graphical data (as captured by the one or more camera devices [106]) from the memory unit [108]. Thus, at Step 206, the processing unit [102] determines a set of coordinates of a popping crease based on the first graphical data. For this determination of the exact coordinates of the popping crease, the processing unit [102] first fetches the camera output or camera frames/images from the camera devices [106] and processes them automatically to detect the popping crease. In an implementation, the present disclosure encompasses blurring the image to remove noise and smoothening the image. In an implementation, the popping crease is detected using the image processing technique of the Hough line transform. A region of interest is selected around the bowler to detect the popping crease. In an implementation, the region of interest is selected around the stumps present near the bowler to detect the popping crease, wherein the region of interest is selected based on one of a manual operation and an automatic operation. Thereafter, an edge detection technique, such as Canny edge detection, is used to identify edges, and then the Hough line transform is applied to the edges. The Hough line transform identifies all vertical lines and horizontal lines in the image. From this set, the horizontal lines are first removed. The vertical line falling within this region of interest is identified as the popping crease. Pursuant to this processing, the two end points of the popping crease are identified. It is pertinent to note that for a particular cricket match, if the cameras are static, the values of the popping crease coordinates remain constant throughout.
However, if the cameras are movable, these coordinates may have to be recalculated each time the position of the cameras is changed. Thus, to determine the set of coordinates of the popping crease based on the first graphical data, the processing unit [102] processes the first graphical data to automatically detect a popping crease. Thereafter, the processing unit [102] processes the first graphical data to automatically detect the edges of the popping crease. And finally, the processing unit [102] calculates the set of coordinates of the popping crease based on the detected edges. It may be appreciated by a person skilled in the art that this determination of the coordinates of the popping crease may be performed by the processing unit [102] or, alternatively, by a separate image processing unit [not shown]. After the processing of the first graphical data, the processing unit [102] or the image processing unit [not shown] provides the exact coordinates of the popping crease.
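The line-selection step after the Hough transform can be sketched in plain Python. The edge detection and line extraction themselves would typically be done with an image processing library (e.g. OpenCV's `cv2.Canny` and `cv2.HoughLinesP`) and are not shown; this sketch only illustrates the subsequent filtering, i.e. removing the horizontal lines and keeping the vertical line inside the region of interest. The segment and ROI formats and the function name are assumptions for the example.

```python
def find_crease(segments, roi):
    """From Hough line segments ((x1, y1), (x2, y2)), discard the
    horizontal lines and return the first vertical line whose end points
    fall inside the region of interest (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = roi
    for (x1, y1), (x2, y2) in segments:
        dx, dy = abs(x2 - x1), abs(y2 - y1)
        if dy <= dx:
            # Horizontal (or diagonal) line: remove from the set.
            continue
        if all(x_min <= x <= x_max for x in (x1, x2)) and \
           all(y_min <= y <= y_max for y in (y1, y2)):
            return (x1, y1), (x2, y2)
    return None
```

The two end points of the returned segment are then used as the popping crease coordinates, e.g. p1 and p2 in the determinant rule.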

Once the popping crease is identified, the present disclosure encompasses identification of the contact portion of the player. For example, in the game of cricket, for determining a decision regarding a no-ball, this contact portion may be the tip of the heel of the bowler. As the bowler takes a run-up for delivering/throwing a ball to the batsman, he has to make sure that he does not step beyond the popping crease. As he jumps before the popping crease in the process of delivering a ball and lands on the ground, generally on his heel, the tip of the heel that is close to the popping crease, on which the bowler lands on the ground, is thus considered to be the contact portion of the bowler. For this above purpose of detecting the coordinates of the contact portion of the player by the detection unit [104], the processing unit [102], at Step 208, receives a second graphical data from the one or more camera devices [106] in real time, or alternatively, the detection unit [104] receives the second graphical data directly from the one or more camera devices [106] in real time.

In an embodiment, the disclosure encompasses using a keypoint detection technique for detecting the coordinates of the contact portion of the player. The model is trained with a customised dataset of thousands of images comprising a bowler bowling a delivery and the popping crease. This customised dataset is further enlarged by creating different variants of the same image, such as by random rotation, changing brightness and contrast, blurring and noise, shifting and scaling, and cropping the images. The model is then trained with this customised dataset to make it more robust. In an instance implementation, the dataset is divided into a training set and a test set in the ratio of 70:30. Augmentation is applied to each of the sets prior to the training process. In an implementation, the pre-trained keypoint detection model may be a neural network model, and an AdamW optimizer is used to optimize the weights of the neural network to further improve the performance of the pre-trained keypoint detection model. Also, in an implementation, the optimizer is not limited to the AdamW optimizer, and any other optimizer that is obvious to a person skilled in the art for implementing the features of the present disclosure may be used. Adaptive learning rates are used during training to achieve better performance of the pre-trained keypoint detection model. In order to adjust the learning rates to improve the performance, a step decay learning rate scheduler is set which drops the learning rate after every n epochs by y, wherein y refers to a percentage by which the learning rates are required to be dropped. So, during training, each layer of the keypoint detection model is finetuned on the training dataset except the first two layers of the ResNet50. After each epoch, the model is evaluated on the test set and the best model is then saved for inference.
In an instance implementation, the keypoint detection model is a pre-trained model, trained based on neural network deep learning techniques, implemented in the detection unit [104]. And, the detection unit [104] detects the contact portion of the player using this keypoint detection model. Thus, in this implementation, the detection unit [104] is a pre-trained unit, trained based on neural network deep learning techniques.

Referring now to Figure 3, which illustrates an instance implementation of a method step of a method for automatically generating a decision in real time in a sporting event, in accordance with exemplary embodiments of the present disclosure. The method starts at step 302 and goes to step 304.

At Step 304, for detecting the contact portion of the player using this keypoint detection model, the detection unit [104] identifies a set of objects in the second graphical data. This second graphical data is real time graphical data captured by the camera devices [106]. The camera devices [106] send the second graphical data to the processing unit [102], or alternatively, the detection unit [104] receives the second graphical data directly from the one or more camera devices [106] in real time. Also, the camera devices [106] send the second graphical data to the processing unit [102] and the memory unit [108] for further use of the data.

After this, at Step 306, the processing unit [102] classifies each object of the set of objects into one of a first object type, a second object type, and a third object type. In the above example of detecting a no-ball in the game of cricket, the first object type can be the bowler, the second object type can be all the shoes in the frame of the second graphical data, and the third object type can be other things in the frame such as the wickets, the non-striker batsman, etc.

Further, the processing unit [102] has to determine the correct contact portion of the player. For this, at Step 308, the detection unit [104] identifies a first set of bounding coordinates for the first object type, and a second set of bounding coordinates for the second object type. Referring to the above example of detecting a no-ball in the game of cricket, once the bounding box for all the shoes and the bounding box for the bowler in the second graphical data are identified, the shoes of the bowler are identified based on the location of the detected shoes. For instance, in Figure 4C, four shoes may be detected, represented by SH1, SH2, SH3 and SH4. To detect the shoes of the bowler, the intersection over union (IOU) for each of the shoes is computed with respect to the bounding box of the bowler, and then those shoes are identified for which the IOU score is greater than a minimum threshold. Intersection over union is an evaluation metric used to measure the accuracy of an object detector on a particular dataset. Thus, at Step 310, the processing unit [102] determines a set of IOU scores between the first set of bounding coordinates for the first object type, and the second set of bounding coordinates for the second object type. Referring to the above example, since SH1 and SH2 lie completely within the bounding box B1 of the bowler, these are identified as the shoes of the bowler. Further, at Step 312, the processing unit [102] identifies the contact portion of the player based on a comparison of each score in the set of IOU scores with an IOU score threshold value. The disclosure also encompasses identifying the shoes of the bowler when there are more than two shoes in the bounding box of the bowler. And, the method of the instance implementation as shown in Figure 3 ends at step 314.
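Once the bowler's shoes are known, the shoe of interest (the front foot for the front-foot no-ball assessment) is described earlier as the shoe closer to the bottom right of the bowler's bounding box B1. A minimal Python sketch of that selection follows; the function name, the (x1, y1, x2, y2) box format, and the use of box centres for the distance are assumptions for illustration.

```python
def shoe_of_interest(shoes, bowler_box):
    """Among the bowler's shoes, pick the one closest to the bottom-right
    corner of the bowler's bounding box, per the front-foot no-ball
    example. `shoes` maps names (e.g. 'SH1') to (x1, y1, x2, y2) boxes."""
    bx, by = bowler_box[2], bowler_box[3]  # bottom-right corner of B1

    def distance_sq(box):
        # Squared distance from the shoe box centre to the corner.
        cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
        return (cx - bx) ** 2 + (cy - by) ** 2

    name = min(shoes, key=lambda n: distance_sq(shoes[n]))
    return name, shoes[name]
```

The keypoint (e.g. the tip of the heel) is then detected within the returned shoe's bounding box, giving the contact-portion coordinates such as R1 (793, 381).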

Following this, now referring back to Figure 2, at step 210, the detection unit [104] automatically determines a set of coordinates of a contact portion of a player with the popping crease.

Thereafter, at Step 212, the processing unit [102] automatically generates the decision in real time based on at least a comparison of the set of coordinates of the popping crease and the set of coordinates of the contact portion of the player, and a set of pre-defined rules. This set of pre-defined rules changes for every sporting event, and also changes for detecting different actions in the same sporting event for which the present disclosure may be used. Referring to the above example, the processing unit [102] generates a decision regarding no-ball based on the coordinates of the popping crease and the coordinates of the tip of the heel of the bowler. The set of pre-defined rules in this example may contain that if the value of the coordinates of the tip of the heel of the bowler, as shown in Figure 4B, is outside an accepted value range associated with the coordinates of the popping crease, then the decision regarding no-ball will be generated so as to indicate that the ball delivered by the bowler is a no-ball.

It is evident from the above disclosure that the solution provided by the disclosure is technically advanced as compared to the prior known solutions. The present disclosure provides a method and system for automatically generating a decision in real time in a sporting event that is not prone to errors and provides accurate results. The existing solutions were manual and, therefore, not very accurate. Further, the present disclosure provides a method and system for generating a decision in real time in a sporting event that is automatic and does not require human intervention/manual operations. Also, the manual solutions, due to their nature, were very time-consuming. Further, the present disclosure provides a method and system for automatically generating a decision in real time in a sporting event that quickly provides results and saves the time that might otherwise be consumed in manual operations.

While considerable emphasis has been placed herein on the disclosed embodiments, it will be appreciated that many embodiments can be made and that many changes can be made to the embodiments without departing from the principles of the present disclosure. These and other changes in the embodiments of the present disclosure will be apparent to those skilled in the art, whereby it is to be understood that the foregoing descriptive matter is illustrative and non-limiting.