

Title:
METHODS AND APPARATUSES FOR CLASSIFYING GAME PROPS AND TRAINING NEURAL NETWORK
Document Type and Number:
WIPO Patent Application WO/2023/047173
Kind Code:
A1
Abstract:
According to one or more embodiments of the present disclosure, a method and an apparatus for classifying a game prop as well as a method and an apparatus for training a neural network are provided. A first target classification network and a second target classification network are obtained by jointly training a first initial classification network and a second initial classification network. The first initial classification network and the second initial classification network share a same feature extraction network, the first initial classification network is configured to classify a game prop involved in an input image based on a feature extracted by the feature extraction network from the input image, and the second initial classification network is capable of classifying a game environment where a game prop involved in the input image is located based on the feature extracted by the feature extraction network from the input image.

Inventors:
MA JIABIN (SG)
CHEN JINGHUAN (SG)
LIU CHUNYA (SG)
Application Number:
PCT/IB2021/058828
Publication Date:
March 30, 2023
Filing Date:
September 28, 2021
Assignee:
SENSETIME INT PTE LTD (SG)
International Classes:
G06N3/04; G06N3/08; A63F13/60; G07F17/32
Domestic Patent References:
WO2018227294A1, 2018-12-20
Foreign References:
US20200402342A1, 2020-12-24
CN101655914A, 2010-02-24
Claims:
CLAIMS

1. A method of classifying a game prop, comprising: inputting a to-be-processed image involving a target game prop into a pre-trained first target classification network; obtaining a class of the target game prop output by the first target classification network; wherein the first target classification network and a second target classification network are obtained by jointly training a first initial classification network and a second initial classification network, and the first initial classification network and the second initial classification network share a same initial feature extraction network; the first initial classification network is configured to classify a game prop involved in an input image based on a feature extracted by the initial feature extraction network from the input image; the second initial classification network is configured to classify a game environment where the game prop involved in the input image is located based on a feature extracted by the initial feature extraction network from the input image.

2. The method according to claim 1, wherein the first initial classification network is trained based on a prop classification loss, and the prop classification loss indicates a classification loss of the first initial classification network classifying a game prop involved in a first sample image based on a feature of the first sample image; the second initial classification network is trained based on an environment classification loss, and the environment classification loss indicates a classification loss of the second initial classification network classifying a game environment where a game prop involved in a first sample image or a second sample image is located based on a feature of the first sample image or the second sample image; the feature of the first sample image and the feature of the second sample image are both obtained by the initial feature extraction network.

3. The method according to claim 2, wherein the first initial classification network comprises: the initial feature extraction network configured to perform feature extraction for the input image and an initial prop classification network configured to classify the game prop involved in the input image based on a feature extracted by the initial feature extraction network; the second initial classification network comprises: the initial feature extraction network and an initial environment classification network configured to classify the game environment where the game prop involved in the input image is located based on a feature extracted by the initial feature extraction network; the method further comprises: obtaining an intermediate classification network comprising an intermediate feature extraction network and a target prop classification network by first training the initial feature extraction network and the initial prop classification network based on the first sample image and the prop classification loss; obtaining a target feature extraction network and a target environment classification network by second training the intermediate feature extraction network and the initial environment classification network based on the first sample image, the second sample image, and the environment classification loss, wherein the first target classification network comprises the target feature extraction network and the target prop classification network.

4. The method according to claim 3, wherein the environment classification loss comprises: a first loss indicating a difference between a game environment class predicted by the initial environment classification network and a true game environment class where the game prop involved in the first sample image or the second sample image is located, and a second loss indicating a capability of the initial environment classification network for identifying a game environment class to which a feature extracted by the intermediate feature extraction network belongs.

5. The method according to claim 4, wherein obtaining the target feature extraction network and the target environment classification network by second training the intermediate feature extraction network and the initial environment classification network based on the first sample image, the second sample image, and the environment classification loss comprises: obtaining the target environment classification network by fixing the intermediate feature extraction network and training the initial environment classification network based on the first loss, the first sample image, and the second sample image; obtaining the target feature extraction network by fixing the target environment classification network and training the intermediate feature extraction network based on the second loss, the first sample image, and the second sample image.

6. The method according to any one of claims 2-5, wherein the first sample image and the second sample image are screened out from game images based on a preset condition, the game images are obtained by imaging a game area, and the preset condition comprises that a preset event relating to a game prop occurred in the game area is detected from the game image.

7. The method according to any one of claims 2-6, further comprising: obtaining an image set comprising a first image sub-set and a second image sub-set, each image in the first image sub-set is marked with a first label, each image in the second image sub-set is not marked with the first label, and the first label indicates a class of a game prop; marking images in the first image sub-set and images in the second image sub-set with different second labels, wherein the second label indicates a class of a game environment where a game prop is located; determining an image in the first image sub-set as the first sample image; and determining an image in the second image sub-set as the second sample image.

8. A method of training a neural network, being used to train a first target classification network from a first initial classification network, wherein the first target classification network is used to classify a game prop; and the method comprises: obtaining a first sample image and a second sample image; obtaining the first target classification network by jointly training the first initial classification network and a second initial classification network based on the first sample image and the second sample image, wherein the first initial classification network and the second initial classification network share a same initial feature extraction network; wherein the first initial classification network is configured to classify a game prop involved in the first sample image based on a feature extracted by the initial feature extraction network from the first sample image; the second initial classification network is configured to classify a game environment where a game prop involved in the first sample image or the second sample image is located based on a feature extracted by the initial feature extraction network from the first sample image or the second sample image.

9. The method according to claim 8, wherein the first initial classification network is trained based on a prop classification loss, and the prop classification loss indicates a classification loss of the first initial classification network classifying the game prop involved in the first sample image based on the feature of the first sample image; the second initial classification network is trained based on an environment classification loss, and the environment classification loss indicates a classification loss of the second initial classification network classifying a game environment where the game prop involved in the first sample image or the second sample image is located based on the feature of the first sample image or the second sample image.

10. The method according to claim 9, wherein the first initial classification network comprises: the initial feature extraction network configured to perform feature extraction on an input image and an initial prop classification network configured to classify a game prop involved in the input image based on the feature extracted by the initial feature extraction network; the second initial classification network comprises: the initial feature extraction network and an initial environment classification network configured to classify a game environment where the game prop involved in the input image is located based on the feature extracted by the initial feature extraction network; jointly training the first initial classification network and the second initial classification network comprises: obtaining an intermediate classification network comprising an intermediate feature extraction network and a target prop classification network by first training the initial feature extraction network and the initial prop classification network based on the first sample image and the prop classification loss; obtaining a target feature extraction network and a target environment classification network by second training the intermediate feature extraction network and the initial environment classification network based on the first sample image, the second sample image, and the environment classification loss; the first target classification network comprises the target feature extraction network and the target prop classification network.

11. The method according to claim 10, wherein the environment classification loss comprises: a first loss indicating a difference between a game environment class predicted by the initial environment classification network and a true game environment class where the game prop involved in the first sample image or the second sample image is located, and a second loss indicating a capability of the initial environment classification network for identifying a game environment class to which a feature extracted by the intermediate feature extraction network belongs.

12. The method according to claim 11, wherein obtaining the target feature extraction network and the target environment classification network by second training the intermediate feature extraction network and the initial environment classification network based on the first sample image or the second sample image and the environment classification loss comprises: obtaining the target environment classification network by fixing the intermediate feature extraction network and training the initial environment classification network based on the first loss, the first sample image, and the second sample image; obtaining the target feature extraction network by fixing the target environment classification network and training the intermediate feature extraction network based on the second loss, the first sample image, and the second sample image.

13. The method according to any one of claims 9-12, wherein the first sample image and the second sample image are screened out from game images based on a preset condition, the game images are obtained by imaging a game area, and the preset condition comprises that a preset event relating to a game prop occurred in the game area is detected from the game image.

14. The method according to any one of claims 9-13, further comprising: obtaining an image set comprising a first image sub-set and a second image sub-set, each image in the first image sub-set is marked with a first label, each image in the second image sub-set is not marked with the first label, and the first label indicates a class of a game prop; marking images in the first image sub-set and images in the second image sub-set with different second labels, wherein the second label indicates a class of a game environment where a game prop is located; determining an image in the first image sub-set as the first sample image; and determining an image in the second image sub-set as the second sample image.

15. An apparatus for classifying a game prop, comprising: an inputting module, configured to input a to-be-processed image involving a target game prop into a pre-trained first target classification network; a classifying module, configured to obtain a class of the target game prop output by the first target classification network; wherein the first target classification network and a second target classification network are obtained by jointly training a first initial classification network and a second initial classification network, and the first initial classification network and the second initial classification network share a same initial feature extraction network; the first initial classification network is configured to classify a game prop involved in an input image based on a feature extracted by the initial feature extraction network from the input image; the second initial classification network is configured to classify a game environment where the game prop involved in the input image is located based on a feature extracted by the initial feature extraction network from the input image.

16. An apparatus for training a neural network, being used to train a first target classification network from a first initial classification network, wherein the first target classification network is used to classify a game prop; and the apparatus comprises: a sample image obtaining module, configured to obtain a first sample image and a second sample image; a training module, configured to obtain the first target classification network by jointly training the first initial classification network and a second initial classification network based on the first sample image and the second sample image, wherein the first initial classification network and the second initial classification network share a same initial feature extraction network; wherein the first initial classification network is configured to classify a game prop involved in the first sample image based on a feature extracted by the initial feature extraction network from the first sample image; the second initial classification network is configured to classify a game environment where a game prop involved in the first sample image or the second sample image is located based on a feature extracted by the initial feature extraction network from the first sample image or the second sample image.

17. A computer readable storage medium storing computer programs, wherein the programs are executed by a processor to implement the method according to any one of claims 1-14.

18. A computer device, comprising a memory, a processor and computer programs stored in the memory and run on the processor, wherein the programs are executed by the processor to implement the method according to any one of claims 1-14.

19. A computer program, comprising computer-readable codes which, when executed in an electronic device, cause a processor in the electronic device to perform the method of any one of claims 1-14.

Description:
METHODS AND APPARATUSES FOR CLASSIFYING GAME PROPS AND TRAINING NEURAL NETWORK

CROSS-REFERENCE TO RELATED APPLICATION

[01] The present disclosure claims priority to Singapore Patent Application No. 10202110639U, filed on September 26, 2021, which is incorporated herein by reference in its entirety.

TECHNICAL FIELD

[02] The present disclosure relates to the field of computer vision technologies, and in particular to a method and an apparatus for classifying a game prop as well as a method and an apparatus for training a neural network.

BACKGROUND

[03] In a game scene, it is required to perform classification for game props. The above process is usually achieved by a neural network. Usually, there are various game props, and game environment conditions such as lamp light, background, and shade in an actual game area are complex and changeable. In order to obtain a robust neural network, the above two factors should both be taken into account. In this case, sample images for training a neural network need to involve several different classes of game props in several different game environments, and the complexities of collecting and marking the sample images are relatively high.

SUMMARY

[04] The present disclosure provides methods and apparatuses for classifying a game prop and for training a neural network.

[05] According to a first aspect of embodiments of the present disclosure, a method of classifying a game prop is provided. The method includes: inputting a to-be-processed image involving a target game prop into a pre-trained first target classification network; obtaining a class of the target game prop output by the first target classification network; where the first target classification network and a second target classification network are obtained by jointly training a first initial classification network and a second initial classification network, and the first initial classification network and the second initial classification network share one initial feature extraction network; the first initial classification network is configured to classify a game prop involved in an input image based on a feature extracted by the initial feature extraction network from the input image; the second initial classification network is configured to classify a game environment where the game prop involved in the input image is located based on a feature extracted by the initial feature extraction network from the input image.

[06] In some embodiments, the first initial classification network is trained based on a prop classification loss, and the prop classification loss is a classification loss incurred when the first initial classification network classifies a game prop involved in a first sample image based on a feature of the first sample image; the second initial classification network is trained based on an environment classification loss, and the environment classification loss is a classification loss incurred when the second initial classification network classifies a game environment where a game prop involved in a first sample image or a second sample image is located based on a feature of the first sample image or the second sample image; the feature of the first sample image and the feature of the second sample image are both obtained by the initial feature extraction network.

[07] In some embodiments, the first initial classification network includes the initial feature extraction network configured to perform feature extraction for the input image and an initial prop classification network configured to classify the game prop involved in the input image based on a feature extracted by the initial feature extraction network; the second initial classification network includes the initial feature extraction network and an initial environment classification network configured to classify the game environment where the game prop involved in the input image is located based on a feature extracted by the initial feature extraction network; the method further includes: obtaining an intermediate classification network including an intermediate feature extraction network and a target prop classification network by first training the initial feature extraction network and the initial prop classification network based on the first sample image and the prop classification loss; obtaining a target feature extraction network and a target environment classification network by second training the intermediate feature extraction network and the initial environment classification network based on the first sample image, the second sample image, and the environment classification loss, where the first target classification network includes the target feature extraction network and the target prop classification network.

[08] In some embodiments, the environment classification loss includes a first loss and a second loss, the first loss indicates a difference between a game environment class predicted by the initial environment classification network and a true game environment class where the game prop involved in the first sample image or the second sample image is located, and the second loss indicates a capability of the initial environment classification network for identifying a game environment class to which a feature extracted by the intermediate feature extraction network belongs.

[09] In some embodiments, obtaining the target feature extraction network and the target environment classification network by second training the intermediate feature extraction network and the initial environment classification network based on the first sample image, the second sample image, and the environment classification loss includes: obtaining the target environment classification network by fixing the intermediate feature extraction network, and training the initial environment classification network based on the first loss, the first sample image, and the second sample image; obtaining the target feature extraction network by fixing the target environment classification network, and training the intermediate feature extraction network based on the second loss, the first sample image, and the second sample image.

[10] In some embodiments, the first sample image and the second sample image are screened out from a game image based on a preset condition, the game image is obtained by imaging a game area, and the preset condition includes: detecting, from the game image, that a preset event relating to a game prop occurs in the game area.

[11] In some embodiments, the method further includes: obtaining an image set, where the image set includes a first image sub-set and a second image sub-set, an image in the first image sub-set includes a first label, an image in the second image sub-set does not include a first label, and the first label indicates a class of a game prop; marking a second label of an image in the first image sub-set and a second label of an image in the second image sub-set as different second labels, where the second label indicates a class of a game environment where a game prop is located; determining an image in the first image sub-set as the first sample image and determining an image in the second image sub-set as the second sample image.

[12] According to a second aspect of embodiments of the present disclosure, a method of training a neural network is provided. The method is used to obtain a first target classification network by training a first initial classification network, where the first target classification network is used to classify a game prop. The method includes: obtaining a first sample image and a second sample image; obtaining the first target classification network by jointly training the first initial classification network and a second initial classification network based on the first sample image and the second sample image, where the first initial classification network and the second initial classification network share one initial feature extraction network. The first initial classification network is configured to classify a game prop involved in the first sample image based on a feature extracted by the initial feature extraction network from the first sample image. The second initial classification network is configured to classify a game environment where a game prop involved in the first sample image or the second sample image is located based on a feature extracted by the initial feature extraction network from the first sample image or the second sample image.

[13] In some embodiments, the first initial classification network is trained based on a prop classification loss, and the prop classification loss is a classification loss incurred when the first initial classification network classifies the game prop involved in the first sample image based on the feature of the first sample image; the second initial classification network is trained based on an environment classification loss, and the environment classification loss is a classification loss incurred when the second initial classification network classifies a game environment where the game prop involved in the first sample image or the second sample image is located based on the feature of the first sample image or the second sample image.

[14] In some embodiments, the first initial classification network includes the initial feature extraction network configured to perform feature extraction for an input image and an initial prop classification network configured to classify a game prop involved in the input image based on a feature extracted by the initial feature extraction network; the second initial classification network includes the initial feature extraction network and an initial environment classification network configured to classify a game environment where the game prop involved in the input image is located based on a feature extracted by the initial feature extraction network; jointly training the first initial classification network and the second initial classification network includes: obtaining an intermediate classification network including an intermediate feature extraction network and a target prop classification network by first training the initial feature extraction network and the initial prop classification network based on the first sample image and the prop classification loss; obtaining a target feature extraction network and a target environment classification network by second training the intermediate feature extraction network and the initial environment classification network based on the first sample image, the second sample image, and the environment classification loss, where the first target classification network includes the target feature extraction network and the target prop classification network.

[15] In some embodiments, the environment classification loss includes a first loss and a second loss, the first loss indicates a difference between a game environment class predicted by the initial environment classification network and a true game environment class where the game prop involved in the first sample image or the second sample image is located, and the second loss indicates a capability of the initial environment classification network for identifying a game environment class to which a feature extracted by the intermediate feature extraction network belongs.

[16] In some embodiments, obtaining the target feature extraction network and the target environment classification network by second training the intermediate feature extraction network and the initial environment classification network based on the first sample image, the second sample image, and the environment classification loss includes: obtaining the target environment classification network by fixing the intermediate feature extraction network, and training the initial environment classification network based on the first loss, the first sample image, and the second sample image; obtaining the target feature extraction network by fixing the target environment classification network, and training the intermediate feature extraction network based on the second loss, the first sample image, and the second sample image.

[17] In some embodiments, the first sample image and the second sample image are screened out from a game image based on a preset condition, the game image is obtained by imaging a game area, and the preset condition includes: detecting, from the game image, that a preset event relating to a game prop occurs in the game area.

[18] In some embodiments, the method further includes: obtaining an image set, where the image set includes a first image sub-set and a second image sub-set, an image in the first image sub-set includes a first label, an image in the second image sub-set does not include a first label, and the first label indicates a class of a game prop; marking a second label of an image in the first image sub-set and a second label of an image in the second image sub-set as different second labels, where the second label indicates a class of a game environment where a game prop is located; determining an image in the first image sub-set as the first sample image and determining an image in the second image sub-set as the second sample image.

[19] According to a third aspect of embodiments of the present disclosure, an apparatus for classifying a game prop is provided. The apparatus includes: an inputting module, configured to input a to-be-processed image involving a target game prop into a pre-trained first target classification network; a classifying module, configured to obtain a class of the target game prop output by the first target classification network; where the first target classification network and a second target classification network are obtained by jointly training a first initial classification network and a second initial classification network, and the first initial classification network and the second initial classification network share one initial feature extraction network; the first initial classification network is configured to classify a game prop involved in an input image based on a feature extracted by the initial feature extraction network from the input image; the second initial classification network is configured to classify a game environment where a game prop involved in the input image is located based on a feature extracted by the initial feature extraction network from the input image.

[20] In some embodiments, the first initial classification network is trained based on a prop classification loss, and the prop classification loss is a classification loss incurred when the first initial classification network classifies a game prop involved in a first sample image based on a feature of the first sample image; the second initial classification network is trained based on an environment classification loss, and the environment classification loss is a classification loss incurred when the second initial classification network classifies a game environment where a game prop involved in a first sample image or a second sample image is located based on a feature of the first sample image or the second sample image; the feature of the first sample image and the feature of the second sample image are both obtained by the initial feature extraction network.

[21] In some embodiments, the first initial classification network includes the initial feature extraction network configured to perform feature extraction for the input image and an initial prop classification network configured to classify the game prop involved in the input image based on a feature extracted by the initial feature extraction network; the second initial classification network includes the initial feature extraction network and an initial environment classification network configured to classify the game environment where the game prop involved in the input image is located based on a feature extracted by the initial feature extraction network; the apparatus further includes: a first training module, configured to obtain an intermediate classification network including an intermediate feature extraction network and a target prop classification network by first training the initial feature extraction network and the initial prop classification network based on the first sample image and the prop classification loss; a second training module, configured to obtain a target feature extraction network and a target environment classification network by second training the intermediate feature extraction network and the initial environment classification network based on the first sample image, the second sample image, and the environment classification loss, where the first target classification network includes the target feature extraction network and the target prop classification network.

[22] In some embodiments, the environment classification loss includes a first loss and a second loss, the first loss indicates a difference between a game environment class predicted by the initial environment classification network and a true game environment class where the game prop involved in the first sample image or the second sample image is located, and the second loss indicates a capability of the initial environment classification network for identifying a game environment class to which a feature extracted by the intermediate feature extraction network belongs.

[23] In some embodiments, the second training module is configured to: obtain the target environment classification network by fixing the intermediate feature extraction network, and training the initial environment classification network based on the first loss, the first sample image, and the second sample image; obtain the target feature extraction network by fixing the target environment classification network, and training the intermediate feature extraction network based on the second loss, the first sample image, and the second sample image.

[24] In some embodiments, the first sample image and the second sample image are screened out from a game image based on a preset condition, the game image is obtained by imaging a game area, and the preset condition includes: detecting, from the game image, that a preset event relating to a game prop occurs in the game area.

[25] In some embodiments, the apparatus further includes: an image set obtaining module, configured to obtain an image set, where the image set includes a first image sub-set and a second image sub-set, an image in the first image sub-set includes a first label, an image in the second image sub-set does not include a first label, and the first label indicates a class of a game prop; a marking module, configured to mark a second label of an image in the first image sub-set and a second label of an image in the second image sub-set as different second labels, where the second label indicates a class of a game environment where a game prop is located; and a sample image determining module, configured to determine an image in the first image sub-set as the first sample image and determine an image in the second image sub-set as the second sample image.

[26] According to a fourth aspect of embodiments of the present disclosure, an apparatus for training a neural network is provided. The apparatus is used to obtain a first target classification network by training a first initial classification network, where the first target classification network is used to classify a game prop. The apparatus includes: a sample image obtaining module, configured to obtain a first sample image and a second sample image; and a training module, configured to obtain the first target classification network by jointly training the first initial classification network and a second initial classification network based on the first sample image and the second sample image, where the first initial classification network and the second initial classification network share one initial feature extraction network. The first initial classification network is configured to classify a game prop involved in the first sample image based on a feature extracted by the initial feature extraction network from the first sample image. The second initial classification network is configured to classify a game environment where a game prop involved in the first sample image or the second sample image is located based on a feature extracted by the initial feature extraction network from the first sample image or the second sample image.

[27] In some embodiments, the first initial classification network is trained based on a prop classification loss, and the prop classification loss is a classification loss incurred when the first initial classification network classifies the game prop involved in the first sample image based on the feature of the first sample image; the second initial classification network is trained based on an environment classification loss, and the environment classification loss is a classification loss incurred when the second initial classification network classifies a game environment where the game prop involved in the first sample image or the second sample image is located based on the feature of the first sample image or the second sample image.

[28] In some embodiments, the first initial classification network includes the initial feature extraction network configured to perform feature extraction for the input image and an initial prop classification network configured to classify the game prop involved in the input image based on a feature extracted by the initial feature extraction network; the second initial classification network includes the initial feature extraction network and an initial environment classification network configured to classify the game environment where the game prop involved in the input image is located based on a feature extracted by the initial feature extraction network; the training module includes: a first training unit, configured to obtain an intermediate classification network including an intermediate feature extraction network and a target prop classification network by first training the initial feature extraction network and the initial prop classification network based on the first sample image and the prop classification loss; a second training unit, configured to obtain a target feature extraction network and a target environment classification network by second training the intermediate feature extraction network and the initial environment classification network based on the first sample image, the second sample image, and the environment classification loss, where the first target classification network includes the target feature extraction network and the target prop classification network.

[29] In some embodiments, the environment classification loss includes a first loss and a second loss, the first loss indicates a difference between a game environment class predicted by the initial environment classification network and a true game environment class where the game prop involved in the first sample image or the second sample image is located, and the second loss indicates a capability of the initial environment classification network for identifying a game environment class to which a feature extracted by the intermediate feature extraction network belongs.

[30] In some embodiments, the second training unit is configured to: obtain the target environment classification network by fixing the intermediate feature extraction network, and training the initial environment classification network based on the first loss, the first sample image, and the second sample image; obtain the target feature extraction network by fixing the target environment classification network, and training the intermediate feature extraction network based on the second loss, the first sample image, and the second sample image.

[31] In some embodiments, the first sample image and the second sample image are screened out from a game image based on a preset condition, the game image is obtained by imaging a game area, and the preset condition includes: detecting, from the game image, that a preset event relating to a game prop occurs in the game area.

[32] In some embodiments, the apparatus further includes: an image set obtaining module, configured to obtain an image set, where the image set includes a first image sub-set and a second image sub-set, an image in the first image sub-set includes a first label, an image in the second image sub-set does not include a first label, and the first label indicates a class of a game prop; a marking module, configured to mark a second label of an image in the first image sub-set and a second label of an image in the second image sub-set as different second labels, where the second label indicates a class of a game environment where a game prop is located; and a sample image determining module, configured to determine an image in the first image sub-set as the first sample image and determine an image in the second image sub-set as the second sample image.

[33] According to a fifth aspect of embodiments of the present disclosure, there is provided a computer readable storage medium storing computer programs, where the programs are executed by a processor to implement the method according to any embodiment.

[34] According to a sixth aspect of embodiments of the present disclosure, there is provided a computer device, including a memory, a processor, and computer programs stored in the memory and run on the processor, where the programs are executed by the processor to implement the method according to any embodiment.

[35] According to a seventh aspect of embodiments of the present disclosure, there is provided a computer program, including computer-readable codes which, when executed in an electronic device, cause a processor in the electronic device to perform the method according to any embodiment.

[36] In the embodiments of the present disclosure, a first target classification network and a second target classification network are obtained by jointly training a first initial classification network and a second initial classification network. The first initial classification network and the second initial classification network share the same feature extraction network, and the second initial classification network is capable of classifying a game environment where a game prop involved in an input image is located based on the feature extracted by the feature extraction network from the input image. Therefore, training the first initial classification network can be assisted with class information of the game environment determined by the second initial classification network from the input image. In this way, the robustness of the trained first target classification network can be improved so that the first target classification network is applicable to different game environments, without collecting a large number of sample images involving different classes of game props in different game environments. Thus, the complexities of collecting and marking the sample images are reduced, and the training cost of the neural network is lowered.

[37] It is understood that the above general descriptions and subsequent detailed descriptions are merely illustrative and explanatory rather than limiting of the present disclosure.

BRIEF DESCRIPTION OF THE DRAWINGS

[38] The accompanying drawings, which are incorporated in and constitute a part of the present description, illustrate examples consistent with the present disclosure and serve to explain the principles of the present disclosure together with the description.

[39] FIGS. 1A, 1B, 1C, and 1D schematically illustrate different game props respectively.

[40] FIGS. 2A and 2B schematically illustrate game props in different game environments respectively.

[41] FIG. 3 illustrates a flowchart of a method of classifying a game prop according to an embodiment of the present disclosure.

[42] FIG. 4A schematically illustrates a general flow of training a neural network according to an embodiment of the present disclosure.

[43] FIG. 4B schematically illustrates a network structure in which a neural network is trained according to an embodiment of the present disclosure.

[44] FIG. 5 illustrates a flowchart of a method of training a neural network according to an embodiment of the present disclosure.

[45] FIG. 6 illustrates a block diagram of an apparatus for classifying a game prop according to an embodiment of the present disclosure.

[46] FIG. 7 illustrates a block diagram of an apparatus for training a neural network according to an embodiment of the present disclosure.

[47] FIG. 8 schematically illustrates a structure of a computer device according to an embodiment of the present disclosure.

DETAILED DESCRIPTION OF THE EMBODIMENTS

[48] Exemplary embodiments will be described in detail herein, with the illustrations thereof represented in the drawings. When the following descriptions involve the drawings, like numerals in different drawings refer to like or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the present application as described in detail in the appended claims.

[49] The terms used in the present disclosure are for the purpose of describing particular examples only and are not intended to limit the present disclosure. Terms “a”, “the” and “said” in their singular forms in the present disclosure and the appended claims are also intended to include plurality, unless clearly indicated otherwise in the context. It should also be understood that the term “and/or” as used herein refers to and includes all possible combinations of one or more of the associated listed items. Further, the term “at least one” herein represents any one of multiple or any combination of at least two of multiple.

[50] It is to be understood that although different information may be described using the terms such as first, second, third, etc. in the present disclosure, the information should not be limited to these terms. These terms are used only to distinguish the same type of information from each other. For example, the first information may also be referred to as the second information without departing from the scope of the present disclosure, and similarly, the second information may also be referred to as the first information. Depending on the context, the word “if” as used herein may be interpreted as “when” or “as” or “in response to determining”.

[51] In order to help those skilled in the art to better understand the technical solutions of the embodiments of the present disclosure and make the above object, features and advantages of the embodiments of the present disclosure clearer, the technical solution of the embodiments of the present disclosure will be further detailed in combination with accompanying drawings.

[52] A game scene usually includes several classes of game props such as game coins, cards, game markers, and dice. Different game props may further include several sub-classes; for example, the cards include various cards having different points or different patterns, and the game coins include several types of game coins having different face values. FIGS. 1A and 1B are schematic diagrams of two different classes of cards. FIGS. 1C and 1D are schematic diagrams of two different classes of game coins. It can be seen that different classes of cards have different patterns, and different classes of game coins have different colors and patterns. Of course, in an actual application, different classes of game props may have other different properties (for example, size and material) in addition to color and pattern.

[53] Furthermore, the game scene may have different classes of game environments in different areas or in different periods of time. Factors affecting the game environment class may include, but are not limited to, a class of game area, lamp light color, and shade area. Different game environment classes may differ in one or more of these factors; for example, a game environment A and a game environment B may have the same game area class and the same lamp light color but different shade areas; the game environment B and a game environment C may have the same lamp light color but different game area classes as well as different shade areas.

[54] To train a robust neural network for identifying a game prop in a game scene, the sample images are desired to cover as many different classes of game props in as many different game environments as possible. For example, if a total class number of game props is M and a total class number of game environments is N, the class number of desired sample images is M × N. Thus, the neural network can learn features of different game props in different game environments. FIGS. 2A and 2B are schematic diagrams of game props in different game environments. Different game environments are illustrated with different shade classes. It can be seen that in the game environment shown in FIG. 2A, the lamp light is approximately directly above the game coins, the shade areas of the game coins are smaller, and the shade of the game coin A blocks a smaller part of the game coin B; in the game environment shown in FIG. 2B, the lamp light is located at the upper left of the game coins, the shade areas of the game coins are larger, and the shade of the game coin A blocks a larger part of the game coin B. Therefore, the game environment may have a certain impact on the identification by the neural network. As a result, to improve the robustness of the neural network, the sample images for training the neural network need to include both a sample image of game coins in the game environment shown in FIG. 2A and a sample image of game coins in the game environment shown in FIG. 2B.

[55] However, in the above manner, sample images are collected in several different game scenes and all collected sample images are marked, which leads to high complexities of collecting and marking the sample images and hence increases the training cost of the neural network.

[56] Based on this, an embodiment of the present disclosure provides a method of classifying a game prop. As shown in FIG. 3, the method includes the following steps.

[57] At step 301, a to-be-processed image involving a target game prop is input into a pre-trained first target classification network.

[58] At step 302, a class of the target game prop output by the first target classification network is obtained.

[59] The first target classification network and a second target classification network are obtained by jointly training a first initial classification network and a second initial classification network, and the first initial classification network and the second initial classification network share one initial feature extraction network.

[60] The first initial classification network is configured to classify a game prop involved in an input image based on a feature extracted by the initial feature extraction network from the input image.

[61] The second initial classification network is configured to classify a game environment where the game prop involved in the input image is located based on a feature extracted by the initial feature extraction network from the input image.

[62] In step 301, the target game prop may be a game prop in a game area, for example, a game coin or a card. A to-be-processed image involving the game prop may be obtained by imaging the game area. In some embodiments, an image collection apparatus may be disposed around the game area to collect a video of a game in real time while the game is proceeding, and a video frame involving the target game prop in the collected video is input into the first target classification network as the to-be-processed image. Alternatively, all video frames in the collected video are input into the first target classification network, which screens out the to-be-processed image involving the target game prop for subsequent processing.

[63] In step 302, the first target classification network may output a class of the target game prop, for example, may indicate whether the target game prop is a game coin or a card. Alternatively, the first target classification network may also output a sub-class of the target game prop. For example, the target game prop is a game coin, and each sub-class of game coins corresponds to one face value. Thus, the first target classification network may output a face value class of the game coin. For another example, the target game prop is a card, and each sub-class of cards corresponds to one point number and one suit. Thus, the first target classification network may output the point number and the suit of the card. Further, the first target classification network may also output the number of target game props. For example, for game coins stacked in a vertical direction, the first target classification network may output the number of game coins forming the stack.
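
As a hedged illustration of steps 301 and 302, the following PyTorch-style sketch feeds a preprocessed frame into a trained first target classification network and reads out the predicted prop class. The model object, the 224×224 input size, and the class-name list are hypothetical assumptions; the disclosure does not prescribe a framework or input format.

```python
# Minimal sketch of steps 301-302 (inference). The preprocessing pipeline and
# class names are illustrative only, not part of the disclosure.
import torch
from torchvision import transforms
from PIL import Image

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

def classify_prop(model: torch.nn.Module, frame: Image.Image, class_names: list) -> str:
    """Feed a to-be-processed frame to the first target classification network
    and return the predicted class of the target game prop."""
    x = preprocess(frame).unsqueeze(0)   # add a batch dimension
    model.eval()
    with torch.no_grad():
        logits = model(x)                # forward pass through the trained network
    return class_names[logits.argmax(dim=1).item()]
```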

[64] In this embodiment, the first target classification network may be obtained through multi-task joint training, that is, the first target classification network and the second target classification network are obtained by jointly training the first initial classification network and the second initial classification network. The first target classification network is a neural network obtained from jointly training the first initial classification network, and the second target classification network is a neural network obtained from jointly training the second initial classification network.

[65] The first initial classification network is used to perform a prop classification task, that is, classify a game prop involved in an input image based on a feature extracted by the initial feature extraction network from the input image. The second initial classification network is used to perform an environment classification task, that is, classify a game environment where a game prop involved in the input image is located based on a feature extracted by the initial feature extraction network from the input image. The input image refers, in a general sense, to an image input into the feature extraction network; depending on the training stage, it may be the first sample image or the second sample image. This shared structure is sketched below.
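
The following sketch illustrates that structure: one shared feature extraction network feeding two heads, a prop classifier (the first network's task-specific part) and a game-environment classifier (the second network's). The ResNet-18 backbone and layer sizes are assumptions for illustration, not the disclosure's prescription.

```python
# Sketch of the shared-backbone structure: the two initial classification
# networks share one initial feature extraction network and differ only in
# their classification heads. ResNet-18 is an assumed backbone.
import torch
import torch.nn as nn
from torchvision import models

class JointClassifier(nn.Module):
    def __init__(self, num_prop_classes: int, num_env_classes: int):
        super().__init__()
        backbone = models.resnet18(weights=None)
        # Shared initial feature extraction network: everything up to and
        # including the global average pooling layer.
        self.feature_extractor = nn.Sequential(*list(backbone.children())[:-1])
        feat_dim = backbone.fc.in_features                       # 512 for ResNet-18
        self.prop_head = nn.Linear(feat_dim, num_prop_classes)  # first network's head
        self.env_head = nn.Linear(feat_dim, num_env_classes)    # second network's head

    def forward(self, x: torch.Tensor):
        feat = self.feature_extractor(x).flatten(1)
        return self.prop_head(feat), self.env_head(feat)
```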

[66] By a multi-task joint training, training the first initial classification network can be assisted with class information of the game environment determined by the second initial classification network from the input image. In this way, the robustness of the trained first target classification network can be improved such that the first target classification network is applicable to several different game environments without collecting a large number of sample images involving different classes of game props in different game environments. And thus, the complexities of collecting and marking the sample images are reduced, and the training cost of the neural network is lowered.

[67] In some embodiments, the first initial classification network is trained based on a prop classification loss, and the prop classification loss indicates a classification loss of the first initial classification network classifying a game prop involved in the first sample image based on a feature of the first sample image. The second initial classification network is trained based on an environment classification loss, and the environment classification loss indicates a classification loss of the second initial classification network classifying a game environment where a game prop involved in the first sample image or the second sample image is located based on a feature of the first sample image or the second sample image. The feature of the first sample image and the feature of the second sample image are both obtained by the initial feature extraction network.
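
Assuming the JointClassifier sketch above, the two losses of this paragraph could be computed as follows. Cross-entropy is one plausible choice; the disclosure specifies only what each loss indicates, not its exact form.

```python
# Sketch of the two training losses. `model` is the JointClassifier above;
# cross-entropy is an assumed loss form.
import torch.nn.functional as F

def prop_classification_loss(model, first_images, prop_labels):
    # First sample images carry prop-class labels (the first label).
    prop_logits, _ = model(first_images)
    return F.cross_entropy(prop_logits, prop_labels)

def env_classification_loss(model, images, env_labels):
    # `images` may come from the first or the second sample images; both
    # carry an environment label (the second label).
    _, env_logits = model(images)
    return F.cross_entropy(env_logits, env_labels)
```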

[68] By training the first initial classification network based on the prop classification loss, the first target classification network is trained to learn features for distinguishing different classes of game props and thus achieve an adequate classification accuracy. By training the second initial classification network based on the environment classification loss, the shared part (i.e., the feature extraction network) between the second initial classification network and the first initial classification network can be constrained by the classification result of the second initial classification network during the training, thereby reducing the influence of different game environments on the first target classification network and improving the robustness of the first target classification network in different game environments.
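
Concretely, the constraint described here (detailed in [07]-[09]) alternates between training the environment head with the feature extractor fixed (the first loss) and training the feature extractor with the environment head fixed (the second loss). A sketch follows, again assuming the JointClassifier above; driving the frozen environment head's predictions toward a uniform distribution is one assumed way to realize the second loss, which the disclosure describes only as measuring the environment classifier's capability to identify a feature's environment class.

```python
# Sketch of the alternating second training stage. opt_env holds only the
# environment head's parameters; opt_feat holds only the feature extractor's.
# The uniform-target confusion loss is an assumption.
import torch
import torch.nn.functional as F

def train_env_head_step(model, opt_env, images, env_labels):
    # First loss: environment cross-entropy with the (intermediate) feature
    # extractor fixed, realized here by detaching its output.
    feats = model.feature_extractor(images).flatten(1).detach()
    loss = F.cross_entropy(model.env_head(feats), env_labels)
    opt_env.zero_grad()
    loss.backward()
    opt_env.step()

def train_extractor_step(model, opt_feat, images):
    # Second loss: with the target environment head fixed (opt_feat updates
    # only the extractor), push the head's predictions toward uniform so that
    # the extracted features stop revealing the game environment.
    env_logits = model.env_head(model.feature_extractor(images).flatten(1))
    uniform = torch.full_like(env_logits, 1.0 / env_logits.size(1))
    loss = F.kl_div(F.log_softmax(env_logits, dim=1), uniform, reduction="batchmean")
    opt_feat.zero_grad()
    loss.backward()
    opt_feat.step()
```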

[69] The first sample image may be labeled with a first label and a second label, where the first label indicates a class of a game prop involved in the first sample image. The second sample image may be labeled only with a second label indicating a class of a game environment where a game prop involved in the second sample image is located. In some embodiments, some images may be collected, marked with the first label, and taken as the first sample images, while some other images are collected, marked with the second label, and taken as the second sample images. Thus, the first sample image and the second sample image are different images.

[70] In some embodiments, an image set may also be obtained, where the image set includes a first image sub-set and a second image sub-set, an image in the first image sub-set may include a first label and a second label, and an image in the second image sub-set does not include a first label. A second label of an image in the first image sub-set and a second label of an image in the second image sub-set are marked as different second labels. An image in the first image sub-set is determined as the first sample image, and an image in the second image sub-set is determined as the second sample image.

[71] For example, the images in the first image sub-set may be pre-collected and marked with the first label, and the first image sub-set may include images collected in several different game environments. All images in the first image sub-set may be marked with the same second label, for example, "1". The second image sub-set may also include images collected in several different game environments. The images in the second image sub-set are not marked with the first label. All images in the second image sub-set are marked with the same second label, which is different from the second label of the images in the first image sub-set; for example, the images in the second image sub-set are marked with the second label "0". In this manner, it is possible to directly specify just two different second labels, without determining a specific game environment class for each image in the image set, and to mark the first label only for the images in the first image sub-set, without determining the first label for the images in the second image sub-set. Thus, the marking complexity of the sample images is reduced.
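As an illustration of this labeling scheme, the following Python sketch builds the two sample sets with the shared binary second label; the tuple layout and helper name are assumptions introduced here for exposition, not part of the disclosed method.

```python
def build_sample_sets(first_subset_paths, first_labels, second_subset_paths):
    """Attach labels: images in the first sub-set carry a first (prop) label
    and the shared second label "1"; images in the second sub-set carry only
    the shared second label "0"."""
    first_samples = [
        (path, prop_label, 1)   # first label + second label "1"
        for path, prop_label in zip(first_subset_paths, first_labels)
    ]
    second_samples = [
        (path, None, 0)         # no first label, second label "0"
        for path in second_subset_paths
    ]
    return first_samples, second_samples
```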

[72] The images in the first image sub-set and the second image sub-set may be video frames streamed back from video collected in real time at the scene of a game area, or may be images collected in a batch from a simulated game scene obtained by simulating a real game scene.

[73] In an embodiment of the present disclosure, it is not required to collect and mark sample images for every combination of game prop class and game environment class; instead, it suffices to collect and mark the first sample images carrying the first label, and to assign different second labels to the first sample images (marked with the first label) and the second sample images (not marked with the first label). In the related art, assume the total number of game prop classes is M, the total number of game environment classes is N, and one sample image is needed per class combination; the number of sample images to be collected and marked is then M × N. In this embodiment, only M first sample images, each carrying both the first label and the second label, and K (K ≤ N) second sample images, each carrying only the second label, need to be collected and marked, thereby effectively reducing the number of required sample images and lowering the complexities of collecting and marking the sample images.
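For a concrete sense of the savings (the figures here are assumed purely for illustration), take $M = 10$ prop classes and $N = 5$ environment classes:

$$M \times N = 10 \times 5 = 50 \ \text{images (related art)} \qquad \text{vs.} \qquad M + K \le 10 + 5 = 15 \ \text{images (this embodiment)}.$$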

[74] In order to obtain sample images that are more valuable for training and to improve the classification accuracy of the first target classification network, the first sample images and the second sample images may be screened out from game images based on a preset condition. The preset condition may be that a preset event relating to a game prop occurring in the game area is detected from the game image. The preset event may be an event in which a game prop is incorrectly operated. In one application scenario, a detection and identification result for the game images obtained by imaging the game area may be output to a service layer, such that the service layer may determine whether the game prop is incorrectly operated. For example, a placement position of a specific game prop is detected from one game image and then output to the service layer, which may determine whether the placement position of the game prop is within a preset permitted placement area; if it is not within the permitted placement area, the service layer reports an error. For another example, a placement sequence of specific game props may be identified from multiple game images and then sent to the service layer, which may determine whether the placement sequence of the game props is consistent with a preset placement sequence; if not consistent, the service layer reports an error. Some or all of the game images for which the service layer reports an error may be marked to obtain the first sample images and/or the second sample images.
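The following Python sketch illustrates the two service-layer checks described above; the permitted-area representation and the function names are assumptions made for illustration, not the disclosed implementation.

```python
def placement_error(position, permitted_area):
    """True if a detected prop position (x, y) falls outside the preset
    permitted placement area, given as (x_min, y_min, x_max, y_max)."""
    x, y = position
    x_min, y_min, x_max, y_max = permitted_area
    return not (x_min <= x <= x_max and y_min <= y <= y_max)

def sequence_error(observed_sequence, preset_sequence):
    """True if the identified placement sequence of game props is
    inconsistent with the preset placement sequence."""
    return list(observed_sequence) != list(preset_sequence)

# Game images for which either check reports an error are candidates
# for marking as first and/or second sample images.
```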

[75] In some embodiments, the first initial classification network includes an initial feature extraction network configured to perform feature extraction on an image input into the first initial classification network, and an initial prop classification network configured to classify a game prop involved in the image input into the first initial classification network based on a feature extracted by the initial feature extraction network. The second initial classification network includes the initial feature extraction network and an initial environment classification network configured to classify a game environment where a game prop involved in the image input into the second initial classification network is located based on a feature extracted by the initial feature extraction network. Optionally, the initial feature extraction network is a convolutional neural network, and its network structure may be a ResNet backbone. The initial prop classification network may include a fully connected layer and a softmax layer to output the class of a single game prop, or may include a fully connected layer and a Connectionist Temporal Classification (CTC) network to identify game props such as stacked game coins and output a sequence identification result.
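A minimal PyTorch sketch of this shared-backbone structure is given below, assuming a ResNet-18 backbone and single-prop classification with a fully connected head (softmax is applied inside the loss at training time); all class and attribute names are illustrative, not the disclosed implementation.

```python
import torch.nn as nn
import torchvision.models as models

class SharedBackboneClassifiers(nn.Module):
    def __init__(self, num_prop_classes, num_env_classes=2):
        super().__init__()
        resnet = models.resnet18(weights=None)
        # Shared initial feature extraction network: ResNet without its
        # final fully connected layer.
        self.feature_extractor = nn.Sequential(*list(resnet.children())[:-1])
        feat_dim = resnet.fc.in_features
        # Initial prop classification network (fully connected layer).
        self.prop_head = nn.Linear(feat_dim, num_prop_classes)
        # Initial environment classification network.
        self.env_head = nn.Linear(feat_dim, num_env_classes)

    def forward(self, x):
        feat = self.feature_extractor(x).flatten(1)
        return self.prop_head(feat), self.env_head(feat)
```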

[76] With reference to FIGS. 4A and 4B, the training of the first target classification network is described below.

[77] Firstly, an intermediate classification network including an intermediate feature extraction network and a target prop classification network is obtained by first training the initial feature extraction network and the initial prop classification network based on the first sample images and the prop classification loss. The prop classification loss may be determined based on a difference between a classification result of the target prop classification network and a true class of a game prop involved in the first sample image. In some embodiments, the prop classification loss may be a cross entropy loss.
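A hedged sketch of this first training stage, assuming the illustrative SharedBackboneClassifiers model above and a dataloader yielding (images, prop_labels, env_labels) batches:

```python
import torch
import torch.nn as nn

def first_training_stage(model, first_sample_loader, epochs=10, lr=1e-3):
    """Jointly train the initial feature extraction network and the initial
    prop classification network on the first sample images; the environment
    head is left untouched at this stage."""
    prop_criterion = nn.CrossEntropyLoss()  # cross-entropy prop classification loss
    optimizer = torch.optim.Adam(
        list(model.feature_extractor.parameters())
        + list(model.prop_head.parameters()), lr=lr)
    for _ in range(epochs):
        for images, prop_labels, _env_labels in first_sample_loader:
            prop_logits, _ = model(images)
            loss = prop_criterion(prop_logits, prop_labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model  # now holds the intermediate feature extractor and target prop head
```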

[78] Next, a target feature extraction network and a target environment classification network are obtained by second training the intermediate feature extraction network and the initial environment classification network based on the first sample images, the second sample images, and the environment classification loss. The first target classification network includes the target feature extraction network and the target prop classification network. In the second training, the initial environment classification network and the intermediate feature extraction network are trained adversarially, so that the target environment classification network becomes incapable of identifying the environment class corresponding to a feature extracted by the target feature extraction network; that is, the features extracted by the target feature extraction network for different game environments follow the same distribution. In this manner, the influence of different classes of game environments on the classification result of the game prop can be reduced, thereby increasing the robustness of the first target classification network.

[79] In some embodiments, the environment classification loss includes a first loss and a second loss. The first loss indicates a difference between a game environment class predicted by the initial environment classification network and the true game environment class where the game prop involved in the first sample image or the second sample image is located, and the second loss indicates a capability of the initial environment classification network for identifying the game environment class to which a feature extracted by the intermediate feature extraction network belongs. With the first loss, the target environment classification network is trained to have a good classification performance; with the second loss, the target feature extraction network is trained to have a good feature extraction performance. Because the first sample images and the second sample images are marked with different second labels even when they involve the same game environment, the target feature extraction network, which is trained on both sample sets starting from the intermediate feature extraction network (itself pre-trained only on the first sample images), learns to extract features that confuse the target environment classification network, so that the target environment classification network cannot identify an environment class from the features extracted by the target feature extraction network.

[80] The second training includes two stages. In the first stage, the intermediate feature extraction network is fixed, and a target environment classification network with good classification performance is obtained by training the initial environment classification network based on the first loss, the first sample images, and the second sample images. In the second stage, the target environment classification network is fixed, and a target feature extraction network with good feature extraction performance is obtained by training the intermediate feature extraction network based on the second loss, the first sample images, and the second sample images. Through these two training stages, the convergence speed of the training can be increased, and the first target classification network is trained more stably.

[81] Throughout the training of the first target classification network, the following training steps are iterated alternately until convergence (i.e., the target environment classification network cannot identify an environment class and the target prop classification network performs well). The specific process is described below.

[82] A) An intermediate neural network including an intermediate feature extraction network and a target prop classification network is obtained by training the initial feature extraction network and the initial prop classification network based on the first sample images and the prop classification loss (loss1 being reserved for the environment loss below, this loss is denoted loss3), updating the network parameters of both networks. In this step, the initial environment classification network is not trained.

[83] B) A target environment classification network is obtained by fixing the intermediate feature extraction network while training the initial environment classification network based on the first sample images, the second sample images, and the first loss (loss1). In this step, the target prop classification network is not trained. The first loss may be a cross-entropy loss. With the number of environment classes being 2, the first loss is denoted as follows:

$$loss_1 = -\sum_{i=0}^{1} p(i)\,\log q(i)$$

[84] where $p(i)$ is the true environment class probability vector ([1, 0] represents the second label "1" and [0, 1] represents the second label "0"), and $q(i)$ is the environment class probability predicted by the initial environment classification network.

[85] C) A target feature extraction network is obtained by fixing the target environment classification network while training the intermediate feature extraction network based on the first sample images, the second sample images, and the second loss (loss2). In this step, the target prop classification network is not trained. This step aims to make the features extracted by the target feature extraction network carry no identifiable game environment class; that is, the optimization target is the uniform distribution [0.5, 0.5]. The second loss is denoted as follows:

$$loss_2 = -\sum_{i=0}^{1} 0.5\,\log q(i)$$
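A hedged PyTorch sketch of steps B) and C), reusing the illustrative SharedBackboneClassifiers model from above; the dataloader here is assumed to mix first and second sample images with their binary second labels:

```python
import torch
import torch.nn.functional as F

def step_b(model, mixed_loader, lr=1e-3):
    """Step B: fix the intermediate feature extraction network and train
    the environment classification network with the cross-entropy loss1."""
    opt = torch.optim.Adam(model.env_head.parameters(), lr=lr)
    for images, _prop_labels, env_labels in mixed_loader:
        with torch.no_grad():  # backbone is fixed in this step
            feat = model.feature_extractor(images).flatten(1)
        loss1 = F.cross_entropy(model.env_head(feat), env_labels)
        opt.zero_grad()
        loss1.backward()
        opt.step()

def step_c(model, mixed_loader, lr=1e-3):
    """Step C: fix the environment classification network and train the
    feature extractor so its features look environment-agnostic, i.e.
    push the environment prediction toward the uniform [0.5, 0.5]."""
    opt = torch.optim.Adam(model.feature_extractor.parameters(), lr=lr)
    for images, _prop_labels, _env_labels in mixed_loader:
        feat = model.feature_extractor(images).flatten(1)
        log_q = F.log_softmax(model.env_head(feat), dim=1)
        loss2 = -(0.5 * log_q).sum(dim=1).mean()  # loss2 from the text
        opt.zero_grad()
        loss2.backward()
        opt.step()  # only backbone parameters are updated
```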

[86] Through the above training, the first target classification network, including the target feature extraction network and the target prop classification network, and the second target classification network, including the target feature extraction network and the target environment classification network, are obtained. In the reasoning stage, a game prop involved in a to-be-processed image can be classified by using only the first target classification network, without using the second target classification network.
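A minimal sketch of this reasoning stage, again under the illustrative model above (the preprocessing of the to-be-processed image is assumed):

```python
import torch

@torch.no_grad()
def classify_game_prop(model, image_tensor):
    """Classify the target game prop in a preprocessed (1, C, H, W) image
    using only the first target classification network; the environment
    head's output is discarded."""
    model.eval()
    prop_logits, _env_logits = model(image_tensor)
    return prop_logits.argmax(dim=1).item()  # predicted prop class index
```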

[87] Furthermore, test images may also be collected and marked from a real game scene to perform a performance test on the first target classification network. If the classification accuracy of the first target classification network is higher than a preset accuracy threshold, the first target classification network is determined as the final neural network for classifying game props; otherwise, the first target classification network is trained again.

[88] In actual applications, different game scenes usually have different game environments. When a first target classification network trained for one game scene is used in another game scene, it is usually necessary to collect and mark sample images again and retrain the network. In the embodiments of the present disclosure, the first target classification network is capable of adapting to different game scenes: a first target classification network that achieves high accuracy in a specified scene but is error-prone in a new test environment can undergo a fast improvement of its generalization performance to adapt to several entirely new and different test environments, thereby improving the robustness of the first target classification network.

[89] As shown in FIG. 5, an embodiment of the present disclosure provides a method of training a neural network, which is used to obtain a first target classification network by training a first initial classification network. The first target classification network is used to classify a game prop. The method includes the following steps.

[90] At step 501, a first sample image and a second sample image are obtained.

[91] At step 502, the first target classification network is obtained by jointly training the first initial classification network and a second initial classification network based on the first sample images and the second sample images, where the first initial classification network and the second initial classification network share one initial feature extraction network.

[92] The first initial classification network is configured to classify a game prop involved in the first sample image based on a feature extracted by the initial feature extraction network from the first sample image.

[93] The second initial classification network is configured to classify a game environment where a game prop involved in the first sample image or the second sample image is located based on a feature extracted by the initial feature extraction network from the first sample image or the second sample image.

[94] In some embodiments, the first initial classification network is trained based on a prop classification loss, where the prop classification loss is a classification loss incurred when the first initial classification network classifies the game prop involved in the first sample image based on the feature of the first sample image; the second initial classification network is trained based on an environment classification loss, where the environment classification loss is a classification loss incurred when the second initial classification network classifies a game environment where the game prop involved in the first sample image or the second sample image is located based on the feature of the first sample image or the second sample image.

[95] In some embodiments, the first initial classification network includes the initial feature extraction network configured to perform feature extraction for an input image and an initial prop classification network configured to classify a game prop involved in the input image based on a feature extracted by the initial feature extraction network; the second initial classification network includes the initial feature extraction network and an initial environment classification network configured to classify a game environment where the game prop involved in the input image is located based on a feature extracted by the initial feature extraction network. Jointly training the first initial classification network and the second initial classification network includes: obtaining an intermediate classification network including an intermediate feature extraction network and a target prop classification network by first training the initial feature extraction network and the initial prop classification network based on the first sample images and the prop classification loss; obtaining a target feature extraction network and a target environment classification network by second training the intermediate feature extraction network and the initial environment classification network based on the first sample images, the second sample images, and the environment classification loss, where the first target classification network includes the target feature extraction network and the target prop classification network.

[96] In some embodiments, the environment classification loss includes a first loss and a second loss, the first loss indicates a difference between a game environment class predicted by the initial environment classification network and a true game environment class where the game prop involved in the first sample image or the second sample image is located, and the second loss indicates a capability of the initial environment classification network for identifying a game environment class to which a feature extracted by the intermediate feature extraction network belongs.

[97] In some embodiments, obtaining the target feature extraction network and the target environment classification network by second training the intermediate feature extraction network and the initial environment classification network based on the first sample images, the second sample images, and the environment classification loss includes: obtaining the target environment classification network by fixing the intermediate feature extraction network, and training the initial environment classification network based on the first loss, the first sample images, and the second sample images; obtaining the target feature extraction network by fixing the target environment classification network, and training the intermediate feature extraction network based on the second loss, the first sample images, and the second sample images.

[98] In some embodiments, the first sample image and the second sample image are screened out from a game image based on a preset condition, the game image is obtained by imaging a game area, and the preset condition includes: detecting that a preset event relating to a game prop occurs in the game area from the game image.

[99] In some embodiments, the method further includes: obtaining an image set, where the image set includes a first image sub-set and a second image sub-set, an image in the first image sub-set includes a first label, an image in the second image sub-set does not include a first label, and the first label indicates a class of a game prop; marking a second label of an image in the first image sub-set and a second label of an image in the second image sub-set as different second labels, where the second label indicates a class of a game environment where a game prop is located; and determining an image in the first image sub-set as the first sample image and determining an image in the second image sub-set as the second sample image.

[100] The details of the method of training a neural network may be seen from the embodiments of the above method of classifying a game prop and will not be repeated herein.

[101] Those skilled in the art may understand that, in the above methods of the specific embodiments, the order in which the steps are described does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of the steps should be determined based on their functions and possible internal logic.

[102] As shown in FIG. 6, an embodiment of the present disclosure provides an apparatus for classifying a game prop. The apparatus includes the following modules.

[103] An inputting module 601 is configured to input a to-be-processed image involving a target game prop into a pre-trained first target classification network.

[104] A classifying module 602 is configured to obtain a class of the target game prop output by the first target classification network.

[105] The first target classification network and a second target classification network are obtained by jointly training a first initial classification network and a second initial classification network, and the first initial classification network and the second initial classification network share one initial feature extraction network.

[106] The first initial classification network is configured to classify a game prop involved in an input image based on a feature extracted by the initial feature extraction network from the input image.

[107] The second initial classification network is configured to classify a game environment where the game prop involved in the input image is located based on a feature extracted by the initial feature extraction network from the input image.

[108] In some embodiments, the first initial classification network is trained based on a prop classification loss, where the prop classification loss is a classification loss incurred when the first initial classification network classifies a game prop involved in a first sample image based on a feature of the first sample image; the second initial classification network is trained based on an environment classification loss, where the environment classification loss is a classification loss incurred when the second initial classification network classifies a game environment where a game prop involved in the first sample image or a second sample image is located based on the feature of the first sample image or the second sample image. The feature of the first sample image and the feature of the second sample image are both obtained by the initial feature extraction network.

[109] In some embodiments, the first initial classification network includes the initial feature extraction network configured to perform feature extraction for the input image and an initial prop classification network configured to classify the game prop involved in the input image based on a feature extracted by the initial feature extraction network; the second initial classification network includes the initial feature extraction network and an initial environment classification network configured to classify the game environment where the game prop involved in the input image is located based on a feature extracted by the initial feature extraction network; the apparatus further includes: a first training module, configured to obtain an intermediate classification network including an intermediate feature extraction network and a target prop classification network by first training the initial feature extraction network and the initial prop classification network based on the first sample images and the prop classification loss; and a second training module, configured to obtain a target feature extraction network and a target environment classification network by second training the intermediate feature extraction network and the initial environment classification network based on the first sample images, the second sample images, and the environment classification loss, where the first target classification network includes the target feature extraction network and the target prop classification network.

[110] In some embodiments, the environment classification loss includes a first loss and a second loss, the first loss indicates a difference between a game environment class predicted by the initial environment classification network and a true game environment class where the game prop involved in the first sample image or the second sample image is located, and the second loss indicates a capability of the initial environment classification network for identifying a game environment class to which a feature extracted by the intermediate feature extraction network belongs.

[111] In some embodiments, the second training module is configured to: obtain the target environment classification network by fixing the intermediate feature extraction network, and training the initial environment classification network based on the first loss, the first sample images, and the second sample images; obtain the target feature extraction network by fixing the target environment classification network, and training the intermediate feature extraction network based on the second loss, the first sample images, and the second sample images.

[112] In some embodiments, the first sample image and the second sample image are screened out from a game image based on a preset condition, the game image is obtained by imaging a game area, and the preset condition includes: detecting that a preset event relating to a game prop occurs in the game area from the game image.

[113] In some embodiments, the apparatus further includes: an image set obtaining module, configured to obtain an image set, where the image set includes a first image sub-set and a second image sub-set, an image in the first image sub-set includes a first label, an image in the second image sub-set does not include a first label, and the first label indicates a class of a game prop; a marking module, configured to mark a second label of an image in the first image sub-set and a second label of an image in the second image sub-set as different second labels, where the second label indicates a class of a game environment where a game prop is located; and a sample image determining module, configured to determine an image in the first image sub-set as the first sample image and determine an image in the second image sub-set as the second sample image.

[114] As shown in FIG. 7, an embodiment of the present disclosure provides an apparatus for training a neural network. The apparatus is used to obtain a first target classification network by training a first initial classification network, where the first target classification network is used to classify a game prop. The apparatus includes the following modules.

[115] A sample image obtaining module 701 is configured to obtain a first sample image and a second sample image.

[116] A training module 702 is configured to obtain the first target classification network by jointly training the first initial classification network and a second initial classification network based on a first sample image and a second sample image, wherein the first initial classification network and the second initial classification network share one initial feature extraction network.

[117] The first initial classification network is configured to classify a game prop involved in the first sample image based on a feature extracted by the initial feature extraction network from the first sample image; the second initial classification network is configured to classify a game environment where a game prop involved in the first sample image or the second sample image is located based on a feature extracted by the initial feature extraction network from the first sample image or the second sample image.

[118] In some embodiments, the first initial classification network is trained based on a prop classification loss, where the prop classification loss is a classification loss incurred when the first initial classification network classifies a game prop involved in the first sample image based on a feature of the first sample image; the second initial classification network is trained based on an environment classification loss, where the environment classification loss is a classification loss incurred when the second initial classification network classifies a game environment where a game prop involved in the first sample image or the second sample image is located based on a feature of the first sample image or the second sample image.

[119] In some embodiments, the first initial classification network includes the initial feature extraction network configured to perform feature extraction for the input image and an initial prop classification network configured to classify the game prop involved in the input image based on a feature extracted by the initial feature extraction network; the second initial classification network includes the initial feature extraction network and an initial environment classification network configured to classify the game environment where the game prop involved in the input image is located based on a feature extracted by the initial feature extraction network; the training module includes: a first training unit, configured to obtain an intermediate classification network including an intermediate feature extraction network and a target prop classification network by first training the initial feature extraction network and the initial prop classification network based on the first sample images and the prop classification loss; a second training unit, configured to obtain a target feature extraction network and a target environment classification network by second training the intermediate feature extraction network and the initial environment classification network based on the first sample images, the second sample images, and the environment classification loss, where the first target classification network includes the target feature extraction network and the target prop classification network.

[120] In some embodiments, the environment classification loss includes a first loss and a second loss, the first loss indicates a difference between a game environment class predicted by the initial environment classification network and a true game environment class where the game prop involved in the first sample image or the second sample image is located, and the second loss indicates a capability of the initial environment classification network for identifying a game environment class to which a feature extracted by the intermediate feature extraction network belongs.

[121] In some embodiments, the second training unit is configured to: obtain the target environment classification network by fixing the intermediate feature extraction network, and training the initial environment classification network based on the first loss, the first sample images, and the second sample images; obtain the target feature extraction network by fixing the target environment classification network, and training the intermediate feature extraction network based on the second loss, the first sample images, and the second sample images.

[122] In some embodiments, the first sample image and the second sample image are screened out from a game image based on a preset condition, the game image is obtained by imaging a game area, and the preset condition includes: detecting that a preset event relating to a game prop occurs in the game area from the game image.

[123] In some embodiments, the apparatus further includes: an image set obtaining module, configured to obtain an image set, where the image set includes a first image sub-set and a second image sub-set, an image in the first image sub-set includes a first label, an image in the second image sub-set does not include a first label, and the first label indicates a class of a game prop; a marking module, configured to mark a second label of an image in the first image sub-set and a second label of an image in the second image sub-set as different second labels, where the second label indicates a class of a game environment where a game prop is located; and a sample image determining module, configured to determine an image in the first image sub-set as the first sample image and determine an image in the second image sub-set as the second sample image.

[124] The apparatuses according to the embodiments of the present disclosure have functions and modules which may be used to execute the methods according to the above method embodiments. For the specific implementation, reference may be made to the above method embodiments, which will not be repeated herein for simplicity.

[125] An embodiment of the present disclosure further provides a computer device at least including a memory, a processor, and computer programs stored in the memory and runnable on the processor, where the programs, when executed by the processor, implement the method according to any one of the above embodiments.

[126] FIG. 8 is a schematic diagram of a hardware structure of a more specific computer device according to an embodiment of the present disclosure. The device may include a processor 801, a memory 802, an input/output interface 803, a communication interface 804, and a bus 805. The processor 801, the memory 802, the input/output interface 803 and the communication interface 804 are in communication connection inside the device via the bus 805.

[127] The processor 801 may be implemented by a general Central Processing Unit (CPU), a microprocessor, an Application-Specific Integrated Circuit (ASIC), one or more integrated circuits, or the like, to execute relevant programs so as to realize the technical solutions of the embodiments of the present disclosure. The processor 801 may further include a graphics card, which may be an NVIDIA Titan X graphics card, a 1080 Ti graphics card, or the like.

[128] The memory 802 may be implemented by a Read-Only Memory (ROM), a Random Access Memory (RAM), a static storage device, a dynamic storage device, or the like. The memory 802 stores an operating system and other application programs. When the technical solutions of the embodiments of the present disclosure are realized by software or firmware, the relevant program codes are stored in the memory 802 and may be invoked and executed by the processor 801.

[129] The input/output interface 803 is used to connect with an inputting/outputting module to realize information input and output. The inputting/outputting module may be configured as a component in the device (not shown), or externally connected with the device to provide corresponding functions. The input device may include a keyboard, a mouse, a touch screen, a microphone, various sensors, and the like, and the output device may include a display, a loudspeaker, a vibrator, an indicator lamp, and the like.

[130] The communication interface 804 is used to connect with a communication module (not shown) to realize communication interaction between the present device and other devices. The communication module may realize communication in a wired manner (for example, USB, network cable, and the like) or in a wireless manner (for example, mobile network, Wi-Fi, Bluetooth, and the like).

[131] The bus 805 includes a passage which transmits information among various components of the device (e.g. the processor 801, the memory 802, the input/output interface 803 and the communication interface 804).

[132] It should be noted that although the above device only shows the processor 801, the memory 802, the input/output interface 803, the communication interface 804 and the bus 805, the device may further include other components to realize normal operation during a specific implementation. Furthermore, those skilled in the art may understand that the above device may only include the components required for realizing the technical solution of the embodiments of the present disclosure rather than all components shown in figures.

[133] An embodiment of the present disclosure further provides a computer readable storage medium, storing computer programs, where the programs are executed by a processor to implement the method according to any one of the above embodiments.

[134] The computer readable storage medium includes permanent, non-permanent, removable, and non-removable media, which can realize information storage by any method or technology. The information may be computer readable instructions, data structures, program modules, or other data. Examples of the computer storage medium include, but are not limited to: a Phase-change Random Access Memory (PRAM), a Static Random Access Memory (SRAM), a Dynamic Random Access Memory (DRAM) or other types of RAM, a Read-Only Memory (ROM), an Electrically-Erasable Programmable Read-Only Memory (EEPROM), a Flash Memory or other memory technologies, a CD-ROM, a Digital Versatile Disc (DVD) or other optical storage, a cassette magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium for storing information accessible by computing devices. As defined in the present disclosure, the computer readable medium does not include transitory computer readable media such as modulated data signals or carriers.

[135] An embodiment of the present disclosure further provides a computer program including computer-readable codes which, when executed in an electronic device, cause a processor in the electronic device to perform the method according to any one of the above embodiments.

[136] From the descriptions of the above embodiments, persons skilled in the art may clearly understand that the embodiments of the present disclosure may be implemented by means of software plus a necessary general hardware platform. Based on such understanding, the technical solutions of the embodiments of the present disclosure essentially, or the part contributing to the prior art, may be embodied in the form of a software product. The computer software product may be stored in a storage medium, such as a ROM/RAM, a diskette, or a compact disk, and includes several instructions for enabling a computer device (such as a personal computer, a server, or a network device) to perform the methods of different embodiments or some parts of the embodiments of the present disclosure.

[137] The systems, apparatuses, modules or units described in the above embodiments may be specifically implemented by a computer chip or an entity or may be implemented by a product with a particular function. A typical implementing device may be a computer and the computer may be specifically a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email transceiver, a game console, a tablet computer, a wearable device, or a combination of any several devices of the above devices.

[138] Different embodiments in the present disclosure are described in a progressive manner, with each embodiment focusing on its differences from the other embodiments; for the same or similar parts among the embodiments, reference may be made to one another. In particular, since the apparatus embodiments are basically similar to the method embodiments, they are described briefly, with the relevant parts referred to the descriptions of the method embodiments. The apparatus embodiments described above are merely illustrative, where the modules described as separate members may or may not be physically separate, and the functions of the various modules may be implemented in one or more pieces of software and/or hardware when implementing the embodiments of the present disclosure. Alternatively, part or all of the modules may be selected according to actual requirements to achieve the objectives of the solutions of the embodiments. Those of ordinary skill in the art may understand and implement them without creative work.

[139] The above descriptions are merely some embodiments of the present disclosure. It should be pointed out that those skilled in the art may also make several improvements and modifications without departing from the principle of the present disclosure, and these improvements and modifications shall all fall within the scope of protection of the present disclosure.