Description of RoGi 2 Team: Simulation

Josep Lluís de la Rosa, Miquel Montaner, Daniel Bassas, Josep Comas, Santi Figueras and Xavier Pinsach
Institut d’Informàtica i Aplicacions
Universitat de Girona & LEA-SICA
C/Lluís Santaló s/n
E-17071 Girona, Catalonia
peplluis@eia.udg.es, mmontane@eia.udg.es, dbassas@pas.udg.es, jcomas@gna.es, santif@intercom.es, xpins000@correu.udg.es
Abstract. This paper describes the second year of RoGi Team research. Last year's team developed ideas for rational agents that co-operate and use revision of exchanged information and consensus techniques. The purposes of this year are to improve the world perception using noise filters and object tracking, and to evaluate the overall behaviour of different agent decision system implementations playing together.

1 Introduction

The main problem of last year's implementation was the lack of accuracy in the world perception introduced by the SoccerServer. We did not treat this error, and the players' movements were quite poor. This was a big handicap for our team: when a player performed a bad action, we did not know whether the problem came from the decision system or from the inaccurate information. This is the principal reason for our first purpose: improving the world perception.
The second purpose is to evaluate the overall behaviour of different agent decision system implementations playing together. We think that the old team had a good decision system. It was based on two phases: a reactive phase and a deliberative phase. Now, our purpose is to evaluate the interaction of a community of agents with different implementations (such as expert systems, fuzzy systems and, obviously, our own system) that share a common objective.
Finally, it will be necessary to improve the player actuation system with more skills.
2 World Perception
2.1 A Noise Filter
The SoccerServer adds an error to the information that is proportional to the distance. This error produces inconsistent behaviour in the player, which matters most in the actuator system (in the decision system this inaccuracy is absorbed by fuzzy logic). Therefore, the information received from the SoccerServer is filtered with a noise-reduction algorithm (probably a Kalman filter).
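As an illustration, here is a minimal sketch of a one-dimensional Kalman-style filter applied to a noisy distance reading. The class, parameter names and noise values are illustrative assumptions, not the team's actual implementation; the only property taken from the text is that measurement noise grows with distance.

```python
# Minimal 1-D Kalman filter sketch for a noisy distance reading.
# Parameter values are illustrative assumptions; SoccerServer noise
# grows with distance, so measurement variance scales with the reading.

class DistanceFilter:
    def __init__(self, process_var=0.05, noise_per_unit=0.02):
        self.x = None                  # filtered distance estimate
        self.p = 1.0                   # estimate variance
        self.q = process_var           # process noise (objects keep moving)
        self.k_noise = noise_per_unit  # measurement noise per distance unit

    def update(self, z):
        """Fold one raw distance reading z into the running estimate."""
        if self.x is None:             # first reading: take it as-is
            self.x = z
            return self.x
        # Predict: the true distance may have drifted since the last cycle.
        self.p += self.q
        # Measurement variance grows with distance (SoccerServer-like noise).
        r = (self.k_noise * z) ** 2
        # Correct: blend prediction and measurement by their variances.
        k = self.p / (self.p + r)
        self.x += k * (z - self.x)
        self.p *= (1.0 - k)
        return self.x
```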
2.2 Objects Tracking
Another problem to solve is that the agent frequently loses sight of the other players and the ball. Knowing their positions in the field is essential to take the best decision. We can track the objects with a set of prediction algorithms similar to those used in robot vision systems.
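A simple stand-in for such a predictor is constant-velocity extrapolation, sketched below. The class and method names are assumptions for illustration; the actual tracking algorithms may be more sophisticated.

```python
# Constant-velocity prediction for an object (ball or player) that has
# left the field of view: a simplified stand-in for the vision-style
# prediction algorithms mentioned above.

class TrackedObject:
    def __init__(self, x, y, cycle):
        self.x, self.y = x, y
        self.vx = self.vy = 0.0
        self.last_seen = cycle

    def observe(self, x, y, cycle):
        """Update the position and estimate velocity from two sightings."""
        dt = max(cycle - self.last_seen, 1)
        self.vx = (x - self.x) / dt
        self.vy = (y - self.y) / dt
        self.x, self.y, self.last_seen = x, y, cycle

    def predict(self, cycle):
        """Extrapolate where the object should be when it is not seen."""
        dt = cycle - self.last_seen
        return self.x + self.vx * dt, self.y + self.vy * dt
```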
2.3 A Memory of Seen Objects
All the information obtained by the perception system is saved in a memory object. The decision system and the actuator system work with the memory data to take decisions and to control the player.
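Building on the TrackedObject sketch above, one possible shape for this memory object is shown below. The idea that a sighting's confidence decays with its age is an assumed policy, not something stated in the text.

```python
# Sketch of the shared memory object: the perception system writes
# filtered sightings, and the decision/actuator systems read back
# (possibly predicted) positions. Confidence decay is an assumption.

class WorldMemory:
    def __init__(self, decay=0.05):
        self.objects = {}      # object id -> TrackedObject (sketch above)
        self.decay = decay     # confidence lost per cycle without sighting

    def store(self, obj_id, x, y, cycle):
        if obj_id in self.objects:
            self.objects[obj_id].observe(x, y, cycle)
        else:
            self.objects[obj_id] = TrackedObject(x, y, cycle)

    def recall(self, obj_id, cycle):
        """Return ((x, y), confidence) or None for an unknown object."""
        obj = self.objects.get(obj_id)
        if obj is None:
            return None
        confidence = max(0.0, 1.0 - self.decay * (cycle - obj.last_seen))
        return obj.predict(cycle), confidence
```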
3 Decision System
This is the main objective of our team and the part on which we spend most of our effort. The decision is taken in a two-phase algorithm:
3.1 Reactive Decisions
In a first step of reasoning, every agent decides a private action. In last year's team, all the players took the reactive decision with a fuzzy system. This worked well because it was more robust to the error introduced by the SoccerServer. The purpose for this year is to combine different reactive decision algorithms with the same objective, such as expert systems, neural networks, … We want to test the results of the interaction of different agents and compare the team's global behaviour when introducing different numbers of them. A sketch of a common interface for such deciders follows.
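The class and method names below are illustrative assumptions; the point is only that heterogeneous reactive deciders can be mixed in one team behind a single interface.

```python
# Sketch of a common interface so heterogeneous reactive deciders
# (fuzzy, expert-system, neural, ...) can be mixed in one team.

from abc import ABC, abstractmethod

class ReactiveDecider(ABC):
    @abstractmethod
    def decide(self, memory, cycle):
        """Return (action, certainty) from the current world memory."""

class FuzzyDecider(ReactiveDecider):
    def decide(self, memory, cycle):
        # ... fuzzy inference over ball distance, angle, etc. (elided) ...
        return "intercept_ball", 0.8

class ExpertSystemDecider(ReactiveDecider):
    def decide(self, memory, cycle):
        # ... rule firing over the same memory (elided) ...
        return "cover_goal", 0.6
```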
3.2 Rational (Co-operative) Decisions
Rational reasoning in the sense of [Busetta 99] is implemented by communicating the former reactive beliefs. It begins when every agent knows, for some other teammates, the belief set containing the reactive belief, the certainty of this belief and the identification of the player: (reactive_belief, certainty, ID_player). Therefore, when two teammates realise they have conflicting beliefs, the certainty of their beliefs is taken into account and one of the teammates changes its mind by reconsidering its former reactive beliefs.
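A minimal sketch of this conflict resolution is shown below. It assumes the revision described next (filtering a teammate's certainty by its prestige) is a simple multiplication; that choice, and the function names, are illustrative assumptions.

```python
# Sketch of the co-operative phase: beliefs arrive from teammates as
# (reactive_belief, certainty, ID_player) tuples; the receiver revises
# the sender's certainty by its prestige, and in a conflict the lower
# reviewed certainty yields. Revision by multiplication is an assumption.

def revise(certainty, prestige):
    """Filter a teammate's subjective certainty by its prestige."""
    return certainty * prestige

def resolve_conflict(my_belief, my_certainty, msg, prestige_of):
    """Return the belief this agent should keep after comparing with msg."""
    belief, certainty, sender = msg
    if belief == my_belief:                  # no conflict at all
        return my_belief
    their_reviewed = revise(certainty, prestige_of[sender])
    if their_reviewed > my_certainty:
        return belief                        # change our mind: teammate wins
    return my_belief                         # keep our reactive belief
```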
Note that the exchange of beliefs and their certainties requires revision [de la Rosa 92a]. This means that the subjective certainties associated with beliefs incoming from other agents have to be filtered (reviewed) at every agent. This revision process uses extra knowledge about the co-operative world, in the form of a perception of the quality and reliability of teammates and of oneself [de la Rosa 92b, 93] [Acebo 98]. Our improvement (novelty) is to modify the perception of the co-operative world to make the consensus algorithm more adaptive to changing environments: every agent modifies its own perception of the co-operative world. Two methods are proposed: (1) a positional method and (2) a reinforcement method for winners in conflicts, to increase persistence.
Method 1: positional method.
For example, the perception of the co-operative world from a forward player could be: 'I have great need of the middle-forward players and not much need of the goalkeeper'. However, this perception has to be completed with more information according to the positions of the other teammates. This is the assignment of the prestige and necessity parameters; a sketch of one possible assignment follows.
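The exact assignment is not reproduced here. The sketch below shows one plausible positional assignment consistent with the intervals reported in the results (prestige in [0.5, 1], necessity in [0, 1], with necessity driven by distance to the ball); the formulas and constants are assumptions for illustration.

```python
import math

# One plausible positional assignment of the necessity and prestige
# parameters, consistent with the intervals reported below. The exact
# formulas and the MAX_DIST constant are illustrative assumptions.

MAX_DIST = 120.0   # rough field diagonal, illustrative

def necessity(player_pos, ball_pos):
    """Need of going to the ball: high when close, low when far."""
    d = math.dist(player_pos, ball_pos)
    return max(0.0, 1.0 - d / MAX_DIST)

def prestige(necessity_j):
    """Prestige of teammate j as seen from i, driven by j's necessity.
    Kept in [0.5, 1]: every teammate deserves a minimum credibility."""
    return 0.5 + 0.5 * necessity_j
```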
Results of method 1. Collisions in decisions are reduced compared to a non-adaptive perception of the co-operative world, but not eliminated. Prestige is assigned within the interval [0.5, 1] because every teammate deserves a minimum credibility. Necessities vary in the interval [0, 1] but are normally low. It follows that the agents behave as follows: when a player is far from the ball it is passive or conservative, and when the ball is closer it becomes more active and aggressive.
Method 2: reinforcement method.

Prestige is the perception of the co-operative world: it is the confidence in the other teammates. The prestige with which a player i is seen by a teammate j is based on the necessity that player j has of going to the ball. This prestige, which is initialized to a default value (0.5), changes during the game at every conflict:
When an agent has to modify its belief because of a conflict (its reviewed certainty is lower than the reviewed certainty of its teammate), it writes down the identifier of the teammate who won the conflict and that teammate's decision.
When, at some later moment, the agent again has to modify its belief because of a conflict, it considers whether the conflict is with the same teammate as before. If so, and if the conflict is solved in the same way as previously, reinforcement learning is used to reinforce, by modifying the prestige, the persistence of the agents' rational decisions. A sketch of this update follows.
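In the sketch below, the learning rate and the bookkeeping structure are illustrative assumptions; only the rule itself (reinforce a teammate's prestige when it wins the same conflict in the same way again) is taken from the text.

```python
# Sketch of the reinforcement method: when the same teammate wins a
# conflict in the same way twice in a row, its prestige is reinforced,
# making the rational decision more persistent.

class PrestigeLearner:
    def __init__(self, rate=0.1):
        self.rate = rate          # learning rate, illustrative value
        self.prestige = {}        # teammate id -> prestige in [0.5, 1]
        self.last_winner = None   # (teammate id, decision) of last conflict

    def lost_conflict(self, winner_id, decision):
        """Called when this agent yields in a conflict it lost."""
        p = self.prestige.setdefault(winner_id, 0.5)
        if self.last_winner == (winner_id, decision):
            # Same teammate won in the same way again: reinforce its prestige.
            self.prestige[winner_id] = min(1.0, p + self.rate * (1.0 - p))
        self.last_winner = (winner_id, decision)
```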
Results of method 2. The improvement of this method is significant and highly adaptive. Almost all collisions in co-operative decisions are eliminated.
4 Action System
The agent has a set of actions that it can execute. This set of actions can be seen as a high-level language, and these actions are the final decisions of the agent: for instance, intercept ball, drive ball, cover goal, … But the SoccerServer does not understand this high-level language; it only understands commands in a low-level language, such as turn, kick or dash. The functionality of the action system is to "transform" the high-level actions into low-level commands. This is a very difficult part because many geometric and physical formulas are needed to obtain the correct behaviour of the player. A sketch of one such translation follows.
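Below is a much-simplified sketch of translating one high-level action into SoccerServer turn/dash commands. The geometry is deliberately naive (it ignores ball movement) and the thresholds are assumptions; only the command forms (turn Moment) and (dash Power) come from the SoccerServer protocol.

```python
import math

# Sketch: translate the high-level action "intercept ball" into
# low-level SoccerServer commands. Thresholds are illustrative.

def intercept_ball(player_pos, player_dir, ball_pos, power=80):
    """Return the next low-level command for a naive interception.
    player_dir is the player's facing direction in degrees."""
    dx = ball_pos[0] - player_pos[0]
    dy = ball_pos[1] - player_pos[1]
    target_dir = math.degrees(math.atan2(dy, dx))
    # Angle the player must turn, normalized to [-180, 180).
    moment = (target_dir - player_dir + 180.0) % 360.0 - 180.0
    if abs(moment) > 10.0:            # not facing the ball: turn first
        return "(turn %.1f)" % moment
    return "(dash %d)" % power        # roughly facing it: run
```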
In last year's team there were many skills to execute the decisions, such as turning with the ball, kicking in some direction, avoiding players, … but the inaccuracy of the information caused many problems. This year it is easier to improve the player skills than last year because, with the noise filter, the data are more accurate. So, another purpose of this year is to improve the player skills.
5 Conclusion
The results of last year were not satisfactory enough. We played four matches and obtained only one victory (5-0) and one draw (0-0). But it was a valuable experience, and we think that we can obtain a good result in RoboCup 2000.
6 References