The alternative chosen by this project is to concentrate on information that is relatively easy to collect and available for most types of services, hosts, etc., and to decide for each relevant pair of objects whether a dependency exists and (if required) what type and attributes that dependency has. The decision is made by a neural network that is fed with time series of the objects' activities and judges whether they ``have something to do with each other'' or not. Of course, activity values do not show dependencies explicitly. The fact that two services show activity at the same time does not by itself imply that they are dependent, but after this behavior has been observed several times (within a certain period of time), such a conclusion is plausible.
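The intuition of repeated simultaneous activity can be illustrated with a minimal sketch. This is not the neural network used in the project, only a simple counting heuristic; the function names, the activity threshold, and the minimum number of co-occurrences are assumptions made for illustration.

```python
def co_activity_count(a, b, threshold=0.0):
    """Count the intervals in which both series show activity above the threshold."""
    if len(a) != len(b):
        raise ValueError("both series must cover the same intervals")
    return sum(1 for x, y in zip(a, b) if x > threshold and y > threshold)


def plausibly_related(a, b, min_co_occurrences=3, threshold=0.0):
    """A single coincidence proves nothing; require several (hypothetical cutoff)."""
    return co_activity_count(a, b, threshold) >= min_co_occurrences


# Hypothetical activity values per time interval (e.g., packets or CPU ticks).
web = [0, 5, 0, 7, 0, 4, 0, 6]
db = [0, 3, 0, 2, 0, 1, 0, 2]
backup = [9, 0, 0, 0, 0, 0, 0, 0]

print(plausibly_related(web, db))      # web and db are repeatedly active together
print(plausibly_related(web, backup))  # no shared active interval
```

A fixed counting rule like this is sensitive to unrelated internal activity and to clock offsets, which is one reason a trained classifier is preferable to a hand-set threshold.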
Examples of activity values measured per object are:
Generally speaking, this is information taken from lower layers, like the operating system, middleware or the transport system.
For the project, we constructed and trained neural networks. After normalization and pre-selection of relevant intervals in the activity data, they are capable of deciding for a pair of objects whether a relationship exists or not.
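The preprocessing steps named above can be sketched as follows. The text does not specify how normalization or interval pre-selection is performed, so min-max scaling and the rule ``keep an interval if at least one object is active'' are assumptions, as are all function names and the `min_activity` parameter.

```python
def normalize(series):
    """Scale a series to [0, 1] (min-max scaling, an assumed choice)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        # A constant series carries no activity information.
        return [0.0 for _ in series]
    return [(v - lo) / (hi - lo) for v in series]


def select_intervals(a, b, min_activity=0.1):
    """Keep only intervals in which at least one of the two objects is active."""
    return [(x, y) for x, y in zip(a, b) if x >= min_activity or y >= min_activity]


def prepare_pair(raw_a, raw_b, min_activity=0.1):
    """Normalize both series, then pre-select the relevant intervals."""
    return select_intervals(normalize(raw_a), normalize(raw_b), min_activity)


# Hypothetical raw activity values for two objects over four intervals.
pair_input = prepare_pair([0, 10, 0, 20], [0, 1, 0, 4])
print(pair_input)
```

The resulting list of value pairs is the kind of input a per-pair classifier could consume; the network itself is outside the scope of this sketch.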
[Figure: Neural network decides per pair of objects]
and others, described in more detail in [gko]. These advantages are necessary to overcome the lack of explicitly useful information in the simple input values and problems such as small time displacements of values at certain managed objects (e.g., due to poorly synchronized clocks). The second point is especially important because, depending on the kind of values that express activity, there is potentially a lot of ``internal'' activity, meaning that actions are performed which are completely unrelated to other objects outside. The complex training process that the neural networks need in order to achieve the necessary robustness and flexibility cannot be presented within the limited space of this paper. The interested reader will find more details in [ense99b].
A possible disadvantage of deciding pairwise about dependencies between all objects is that it needs O(n^2) time for n objects. For large n, special techniques must be applied. One simple possibility is to exclude in advance those pairs that are either not of interest or for which dependencies are not possible anyway. In the web server scenario, one could omit all calculations for pairs of web clients, which usually make up a significant percentage of all pairs, given the huge number of clients compared with the smaller number of servers. Further reduction comes from applying the domain concept introduced in section . Smaller models are generated per domain of interest. Additionally, the activity of a whole domain is condensed into a single ``domain activity'' (e.g., by summing up the activity values of important objects), which allows dependencies to be calculated between domains and also between a single object in one domain and (other) `outside' domains.
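Both reduction techniques can be sketched briefly, under stated assumptions: objects are labeled with a hypothetical ``kind'' (such as client or server), excluded combinations of kinds are listed explicitly, and domain activity is formed by summing the members' values per interval, as suggested in the text. All names are illustrative.

```python
from itertools import combinations


def candidate_pairs(objects, excluded_kinds=frozenset({("client", "client")})):
    """Return the object pairs that still need a dependency decision.

    objects maps an object name to its kind; pairs whose (sorted) kinds
    appear in excluded_kinds are skipped before any calculation is done.
    """
    pairs = []
    for a, b in combinations(sorted(objects), 2):
        kinds = tuple(sorted((objects[a], objects[b])))
        if kinds in excluded_kinds:
            continue
        pairs.append((a, b))
    return pairs


def domain_activity(series_by_object):
    """Condense a domain into one series by summing its members per interval."""
    return [sum(values) for values in zip(*series_by_object.values())]


# Hypothetical web server scenario: three clients, one server.
objects = {"c1": "client", "c2": "client", "c3": "client", "s1": "server"}
print(candidate_pairs(objects))

# Hypothetical domain with two servers observed over three intervals.
print(domain_activity({"s1": [1, 2, 3], "s2": [0, 1, 0]}))
```

With k clients and one server, excluding client pairs reduces the k(k+1)/2 possible pairs to k, so the saving grows with the share of clients among all objects.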