Q2ai Neural Network Tutorial
Introduction
Writing a versatile set of neural network functions in Scheme is difficult. Fortunately, one already exists that was created for previous Quake2AI research. Get the file called "nn.ss" and compile it with this command:
csc -shared nn.ss
Or, if you want the nn library to run faster:
csc -no-warnings -shared -optimize-leaf-routines -no-trace -lambda-lift nn.ss
This will create a file called "nn.so", which you can load directly in Chicken Scheme.
The NN library
The neural network library has a function that creates a standard neural network with one hidden layer. This network has a bias neuron for the input layer and for the hidden layer. You can specify the number of input, hidden ("middle"), and output neurons. The function returns the neural network itself, which is a fairly complicated Scheme list. We'll create a network named "nn" with 4 inputs, 3 hidden neurons, and 2 outputs:
(define nn (nn.construct-standard 4 3 2))
Now we need to set the weights for the neural network, which means we must first calculate the number of weights. In this case there are 4 inputs + 1 bias feeding 3 hidden neurons, which gives 5*3 = 15 weights between the input and hidden layers. The hidden layer has 3 neurons plus a bias feeding 2 outputs, so there are 4*2 = 8 weights between the hidden layer and the outputs. In total we have 15+8 = 23 weights in our network.
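The same count can be checked with a small helper. This is only a sketch: count-weights is a hypothetical name, not part of nn.ss, and it assumes the layout described above (one hidden layer, with a bias neuron on the input layer and on the hidden layer).

```scheme
; Hypothetical helper, not part of nn.ss: counts the weights in a network
; with one hidden layer and a bias neuron on the input and hidden layers.
(define count-weights
  (lambda (inputs hidden outputs)
    (+ (* (+ inputs 1) hidden)      ; input layer + bias -> hidden layer
       (* (+ hidden 1) outputs))))  ; hidden layer + bias -> output layer

(count-weights 4 3 2) ; 5*3 + 4*2 = 23
```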
Now make a list of random weights between -2.0 and 2.0. This is an elementary Scheme problem that is left to the reader. We'll call this list of random weights nn.weights. Now set the weights of our network "nn":
(nn.setweights nn nn.weights)
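The list of random weights passed to nn.setweights above could be built along the following lines. This is a minimal sketch rather than the method used in Quake2AI: make-lcg and random-weights are hypothetical names, and the small hand-rolled linear congruential generator is only there so the example does not depend on any particular random-number library. The definitions would need to be evaluated before the nn.setweights call.

```scheme
; Hypothetical helpers, not part of nn.ss.

; Returns a procedure that yields pseudo-random reals in [0, 1) from a
; small linear congruential generator. The constants are kept small so the
; arithmetic never overflows. Fine for initial weights, nothing more.
(define make-lcg
  (lambda (seed)
    (let ((state (modulo seed 65537)))
      (lambda ()
        (set! state (modulo (+ (* state 75) 74) 65537))
        (/ state 65537.0)))))

; Builds a list of n weights spread over [-2.0, 2.0).
; `next` is a procedure of no arguments returning a real in [0, 1).
(define random-weights
  (lambda (n next)
    (if (= n 0)
        '()
        (cons (- (* 4.0 (next)) 2.0)
              (random-weights (- n 1) next)))))

; 23 weights for the 4-3-2 network; 42 is an arbitrary seed.
(define nn.weights (random-weights 23 (make-lcg 42)))
```

Changing the seed gives a different starting list; using the same seed reproduces the same weights, which can help when comparing training runs.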
Now that the network has weights we can give it inputs and use it to calculate some outputs. Try giving it 4 zeros for inputs:
(nn.setinputs nn '(0 0 0 0))
Now that we have set the weights and the inputs we can calculate. The nn.calc function returns a list containing the values of the outputs. In this case we will bind that list to our variable outs.
(define outs (nn.calc nn))
We would normally use the values in "outs" to control something, but in this case we won't do anything with it.
After you have calculated the output values for the neural network you can use the backpropagation function to train it. The nn.backprop function takes a calculated neural network, a list of the desired outputs, and a learning rate value. The learning rate controls how much the network changes its weight values per backprop. Too high a value will make the network change so much that it has difficulty settling on precise weights, but too low a value will make the network learn too slowly to be useful. In Quake2AI tests we've found 0.001 to be a good learning rate value.
Suppose in this example that we want the outputs to always be 1.0:
(nn.backprop nn '(1.0 1.0) 0.001)
Running this function once will change the weights in the network so that its output next time should be closer to 1.0, 1.0 when all 4 inputs are zero.
Note: backpropagation is optional and is only used to teach the network. If you just want to use the outputs of the neural network to control something you don't need to use the nn.backprop function.
Now that we've calculated the outputs and used them, and in this case have even trained the network, we must uncalculate the network:
(nn.uncalc nn)
To make further calculations and training we would call nn.setinputs, nn.calc, nn.backprop, nn.uncalc, nn.setinputs, nn.calc, .... and so on until we were satisfied with the training of our neural network.
The nn.backprop function actually destructively modifies the weight values in the list that was specified in the nn.setweights function. This means that after you are done training your neural network you can just look at your weights list, which in this example we called nn.weights, and it will have the new trained weights instead of the old ones that you started with. This is quite useful for saving the weight information.
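Because the trained weights end up in an ordinary Scheme list, saving them can be as simple as writing that list to a file. The following is a sketch using only standard Scheme port procedures; save-weights, load-weights, and the file name "weights.txt" are hypothetical and not part of nn.ss.

```scheme
; Hypothetical helpers, not part of nn.ss: write a weight list to a file as
; a single S-expression and read it back.
(define save-weights
  (lambda (filename weights)
    (call-with-output-file filename
      (lambda (port)
        (write weights port)
        (newline port)))))

(define load-weights
  (lambda (filename)
    (call-with-input-file filename read)))

; Example with a stand-in list; in practice pass the trained nn.weights.
(save-weights "weights.txt" '(0.5 -1.25 2.0))
(load-weights "weights.txt")
```

A list loaded this way could then be handed to nn.setweights in a later session in place of freshly generated random weights.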
An Actual Quake2AI Neural Network Example
SETUP
First of all, change the settings of your quake2 client so that it runs in 320x240 resolution. Higher resolutions are slower and won't work correctly with this example.
Now we are going to train a neural network in quake2ai that will use the visual field for inputs and will learn to imitate the hand-coded bot that we made in the previous bot tutorial. We are going to keep things simple by having only 4 grayscale inputs from the center of the screen. This should be quite enough for such simple behavior.
Before we begin, grab the file room_ctf.bsp and put it in the ctf/maps folder of your quake2ai test folder (create the maps dir if needed). We are going to use ctf (capture the flag) instead of regular deathmatch because ctf allows you to respawn in thin air instead of on a platform, which makes reaching the floor easier.
The map that you downloaded, room_ctf.bsp, is a visually simple map that makes it easy for the visual system to spot the dark enemy against the light walls.
Now start a dedicated quake2 ctf server with the command:
./quake2 +set dedicated 1 +game ctf +map room_ctf +set port 884400 +set dmflags 1024
Now launch a regular quake2 client and join the server by pressing the tilde (~) key to bring down the console; type
connect localhost:884400
into the console and hit enter.
You should now be connected to the server. Join the blue team by pressing the right or left bracket keys [ or ] and hit enter when blue is highlighted. Now just leave your quake2 player in the map; it will be the target that the neural network learns to shoot.
NN BOT
Create a file called "testnn.ss" in your quake2ai test folder. Now copy and paste all your code from the bot tutorial; we are going to use this bot code for the backpropagation. Change the name of the "AImain" function to "bot.main". We will call this every frame to see what the bot would do. Instead of having the bot.main function make the bot perform the actions, we'll have it return the values for the two actions that it wants to do, shoot and turn, as a list. Here is the new bot.main function:
(define bot.main
  (lambda ()
    (let* ((enemy (getEnemy))
           (enemy_xdir (if enemy
                           (xdir (AIself.x) (AIself.y) (car enemy) (cadr enemy))
                           0))
           (turndiff (anglediff (AIself.yaw) enemy_xdir)))
      (cond ((and (not (null? enemy))
                  (> (abs turndiff) 0)
                  (< (abs turndiff) 5))
             (list 1 (random 5)))
            (else (list 0 -4.0))))))
Now let's make our neural network. We'll have 4 inputs, 3 hidden, and 2 outputs. The 4 inputs will just be grayscale values averaged from a block of pixels. In this example we'll do 4 blocks that run across the middle of the screen:
(define nnblocks
  '((140 115 10 10)
    (150 115 10 10)
    (160 115 10 10)
    (170 115 10 10)))
Each block specifies: (x y width height)
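If (x y) is the top-left corner of each block, which is an assumption here, the four blocks cover x = 140 to 180 and y = 115 to 125, a 40x10 strip around the center of the 320x240 screen. A row like this could also be generated rather than typed out. The helper below is a sketch; make-block-row is a hypothetical name that does not appear in testnn.ss.

```scheme
; Hypothetical helper, not part of testnn.ss: builds a row of `count`
; blocks, each `width` x `height`, starting at (x, y) and stepping right
; by one block width each time.
(define make-block-row
  (lambda (x y width height count)
    (if (= count 0)
        '()
        (cons (list x y width height)
              (make-block-row (+ x width) y width height (- count 1))))))

(make-block-row 140 115 10 10 4)
; => ((140 115 10 10) (150 115 10 10) (160 115 10 10) (170 115 10 10))
```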
Remember, we'll use the nn.setinputs function to set the inputs of the network every frame of gameplay, so we need a function that gets the grayscale values for the blocks in nnblocks and returns them as a list. Here is one:
(define calc_blocks
  (lambda (blocks)
    (if (null? blocks)
        '()
        (cons (gray (AIview.block (car (car blocks))
                                  (cadr (car blocks))
                                  (caddr (car blocks))
                                  (cadddr (car blocks))))
              (calc_blocks (cdr blocks))))))
This we can call with the command: (calc_blocks nnblocks)
AIview.block returns a hexadecimal RGB value, which we can convert to a grayscale value between 0 and 1 with this function:
(define gray
  (lambda (color)
    (/ (/ (+ (red color) (green color) (blue color)) 3) 255)))
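The red, green, and blue helpers used by gray are not shown on this page. As an illustration only, and assuming the color is an integer packed as #xRRGGBB (the actual format returned by AIview.block may differ), the channels could be pulled apart as below. The rgb-* names are hypothetical stand-ins chosen so they do not clash with the real helpers.

```scheme
; Assumption: a color is an integer packed as #xRRGGBB. These are
; illustrative stand-ins, not the tutorial's own red/green/blue.
(define rgb-red   (lambda (color) (quotient color 65536)))
(define rgb-green (lambda (color) (modulo (quotient color 256) 256)))
(define rgb-blue  (lambda (color) (modulo color 256)))

; Same averaging as gray above, using the stand-in channel helpers:
; mean of the three channels, scaled from 0..255 down to 0..1.
(define rgb-gray
  (lambda (color)
    (/ (/ (+ (rgb-red color) (rgb-green color) (rgb-blue color)) 3) 255)))

(rgb-gray #xFFFFFF) ; white -> 1
(rgb-gray #x000000) ; black -> 0
```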
Creating the actual network is exactly the same as we did in the first half of this tutorial. We start the weights out as random values between -2.0 and 2.0.
If we wanted the bot to learn as quickly as possible we would let it backpropagate constantly. Because we want to watch it learn, we will have it alternate between learning mode and watching mode, so that we can see it learn for 10 seconds, then watch what it learned for 10 seconds. Since Quake2ai runs at 40 frames per second, 10 seconds of learning time is 400 frames:
(define learntime (* 40 10))
(define learning #f)
(define learncount 0)
Now for the AImain function. We'll first set the inputs to the network. Then we'll see if we need to flip the "learning" variable. Then we'll calculate the control that the bot would do, and the control that the nn would do. We'll backpropagate if it's in learning mode, using the control that the bot does to train the network. Then we'll actually control the output of the agent, and also uncalculate the neural network.
(define AImain
  (lambda ()
    ; calculate block grayscale values and set the inputs to nn
    (nn.setinputs nn (calc_blocks nnblocks))
    ; determine if learning or watching
    (set! learncount (add1 learncount))
    (if (= learncount learntime)
        (begin
          (set! learning (not learning))
          (if learning (print "LEARNING ++++") (print "WATCHING ----"))
          (set! learncount 0)))
    (let* ((bot (bot.main))      ; calculate bot control
           (outs (nn.calc nn)))  ; calculate nn control
      ; if it's in learning mode then do backpropagation.
      (if learning
          (nn.backprop nn
            (list
              ; shooting target. The nn shoots if its output is positive,
              ; not if negative. We only correct if it shoots when it
              ; shouldn't, or doesn't shoot when it should.
              (cond ((and (= (car bot) 0) (<= (car outs) 0)) (car outs))
                    ((and (= (car bot) 0) (> (car outs) 0)) -1.0)
                    ((and (= (car bot) 1) (<= (car outs) 0)) 1.0)
                    ((and (= (car bot) 1) (> (car outs) 0)) (car outs)))
              (/ (cadr bot) 10.0)) ; turn direction
            0.001)) ; learning rate
      (nn.uncalc nn)
      (if learning
          (begin ; learning: bot controls agent
            (AIself.attack (car bot))
            (AIself.yawmove (cadr bot)))
          (begin ; watching: nn controls agent
            (AIself.attack (if (<= (car outs) 0) 0 1))
            (AIself.yawmove (* 10.0 (cadr outs))))))))
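The cond that picks the shooting target can be easier to follow in isolation. The sketch below pulls it into a standalone function; shoot-target is a hypothetical name that does not appear in testnn.ss. When the network already agrees with the bot, the target is simply the network's own output, so (assuming nn.backprop derives its correction from the difference between target and output) that output is left alone. Otherwise the target is pushed to -1.0 or 1.0.

```scheme
; Hypothetical helper, not part of testnn.ss.
; bot-shoot: 0 or 1, what the hand-coded bot decided.
; nn-out:    the network's raw shooting output (positive means shoot).
(define shoot-target
  (lambda (bot-shoot nn-out)
    (cond ((and (= bot-shoot 0) (<= nn-out 0)) nn-out) ; both hold fire
          ((and (= bot-shoot 0) (> nn-out 0)) -1.0)     ; nn fired wrongly
          ((and (= bot-shoot 1) (<= nn-out 0)) 1.0)     ; nn failed to fire
          ((and (= bot-shoot 1) (> nn-out 0)) nn-out)))) ; both fire

(shoot-target 0 -0.3) ; => -0.3
(shoot-target 0 0.7)  ; => -1.0
(shoot-target 1 -0.3) ; => 1.0
(shoot-target 1 0.7)  ; => 0.7
```

The turn output is handled more simply: the bot's turn value is divided by 10.0 to form the training target, and the network's turn output is multiplied by 10.0 again when it controls the agent.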
The most important parts of this neural network test bot have been covered here. Some extra functions must also be included, but these were covered in the previous hand-coded bot tutorial. The full source code for this bot is in the file testnn.ss.
To run the file, compile it:
csc testnn.ss
Then run it:
./testnn
Now connect to the server that you should have already started. Open the console with tilde and type "connect localhost:884400". Then hit enter to join the red team. The bot should drop to the floor. The console from which you launched testnn will print LEARNING or WATCHING, depending on what the bot is doing. After a few hours of training the agent should learn to imitate the bot's behavior fairly well, using raw visual input instead of the entity array.