Online banking system congestion: SOA goes from cure to root of the trouble
IT managers originally hoped that SOA would solve a number of problems in system operations and maintenance. But this cure can bring new ailments of its own, such as an online banking system slowed by congestion. Overcoming the new problems that arise under an SOA architecture has become the primary task facing IT managers.
System operations and maintenance has always been a major undertaking for enterprise IT departments, and the problem is even more pronounced in a large enterprise such as a bank, which runs numerous application systems. With the arrival of SOA, the bank's technology department has to face some brand-new difficulties. The department had hoped SOA would solve some of its operations and maintenance problems, yet the cure brings side effects, and overcoming those side effects under the SOA architecture has become the primary task facing IT managers.
Congestion at peak business hours
System congestion is a common sight. Take the online banking system as an example. Around 10 a.m., the busiest period of the business day, the online banking system becomes congested and customers cannot access it or log in normally. In practice, the first to notice the problem is often not the IT operations department but the customer service department, because it receives a flood of customer complaints. Only as the reports keep accumulating is the problem escalated step by step to IT management and then handed to the operations department to resolve.
By then the system has already been congested for as long as half an hour, with fairly widespread adverse effects. Why did the operations staff not discover the problem in time? It is not that they were irresponsible, nor that their managers were negligent. The cause of the online banking congestion does not lie in any single system. It appears only after SOA integration, when multiple systems processing requests in parallel and in coordination congest one another.
A schematic of a service call path makes the cause of the congestion easier to understand. A, B, C and D are the bank's customer service channels; E, F, G and H are back-end application systems. Suppose A is the online banking channel. A service request submitted by a bank customer on channel A is sent to the ESB. After the service bus processes and transforms the request, it forwards it to back-end application systems such as E and G (there may be one target system or several), ensures the integrity of the service and the consistency of the transaction across them, and then returns the response to channel system A.
My analysis shows that each application system has its own flow control, timeout control, security control and user access control. Because each system passed integration testing and stress testing before going live, accessing any one system point to point causes no problems. After the systems are integrated through SOA, however, the control parameters set across the systems turn out not to be optimal, and can even contradict and constrain one another.
Suppose the flow-control limits set for systems A, B, C, D, E, F, G and H are 60, 5, 40, 10, 80, 50, 30 and 60 respectively, and the flow-control limit of the ESB is 200. If traffic into system A now reaches its peak of 60, the overall service is normal through steps 1-4, but because system G's limit is lower, service requests beyond 30 are rejected by system G. System E then has to roll back its processing, and the user service requests in system A become congested. Seen this way, A's setting of 60 carries a hidden risk whenever traffic bursts. As long as G's processing capacity cannot be raised, A can only be set to 30. This is the so-called shortest-stave effect of the barrel: the lowest limit on the path decides the capacity of the whole.
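The bottleneck arithmetic in this example can be sketched in a few lines of Python. This is only an illustration: the limits are the figures quoted above, while the function names and the assumption that a request from channel A passes through the ESB and then touches both E and G are hypothetical choices made for the sketch.

```python
# Flow-control limits quoted in the example (per system, plus the ESB).
LIMITS = {"A": 60, "B": 5, "C": 40, "D": 10,
          "E": 80, "F": 50, "G": 30, "H": 60, "ESB": 200}


def effective_limit(path, limits=LIMITS):
    """Return (bottleneck system, its limit) for a service path.

    The end-to-end capacity of a path is the smallest limit found on it.
    """
    bottleneck = min(path, key=lambda name: limits[name])
    return bottleneck, limits[bottleneck]


def rejected_at_peak(path, limits=LIMITS):
    """Requests refused downstream when the entry channel runs at its own limit."""
    entry = path[0]
    _, capacity = effective_limit(path, limits)
    return max(0, limits[entry] - capacity)


# Assumed path for an online banking request: channel A -> ESB -> E and G.
path = ["A", "ESB", "E", "G"]
print(effective_limit(path))   # ('G', 30)
print(rejected_at_peak(path))  # 30
```

With these figures, half of the requests admitted by channel A at its peak cannot be served, which is why A's limit has to be aligned with G's until G's capacity improves.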
The difficulty of parameter setting
The example above is only the simplest case; the problems encountered in practice are more complex and more difficult. The difficulties that the move to an SOA architecture brings to operations management show up mainly in two areas: parameter setting and system monitoring.
There are six main reasons why system parameters end up being set unreasonably.
First, each system is built, run and maintained by an independent project team, and each team has developed its own approach to flow control and timeout control. When multiple systems run in parallel within an SOA architecture and no unified rules or mechanisms govern them, the overall operation of the system inevitably lacks coordination.
Second, each system's control parameters were originally set from an estimate of its own processing capacity, but those estimates differ greatly from the access load the system actually experiences in operation. This is another reason for low operating efficiency.
Third, under normal transaction volumes the system as a whole runs stably. When traffic grows and congestion occurs, however, requests are rejected, which inevitably generates reversal (rollback) requests that take up a large amount of channel capacity and affect other normal service requests.
Fourth, each system overemphasizes its own self-protection mechanism. Operations staff want to minimize the risk to their own system, so when setting parameters they tend to configure them toward the lower bound of the system's capacity. Viewed as a whole, however, conservative settings cannot meet the access pressure on the system, and so they reduce resource utilization across the entire SOA architecture.
Fifth, viewed from the SOA architecture as a whole, when congestion occurs it is impossible to tell whether a given system has suffered a failure or an overload, yet operations staff handle the two cases in completely different ways. A system failure calls for isolating the system and recovering from the fault; an overload calls for relieving pressure and diverting the excess load. At present the two situations are hard to tell apart.
Sixth, there is the fault-isolation mechanism. Only when a system failure occurs do the blocked service requests need to be isolated, so that normal user access is not affected. Yet congestion caused by a single point easily spreads to related systems throughout the SOA architecture, until all service requests are affected.
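The two handling modes described in the fifth and sixth points (isolate on failure, shed load on overload) can be sketched as a small per-system guard. This is a minimal sketch under stated assumptions, not a description of any real ESB product: the `BackendGuard` class, the `responding` flag (standing in for whatever health check an operations team actually has) and the returned labels are all hypothetical.

```python
from enum import Enum


class Health(Enum):
    OK = "ok"
    OVERLOADED = "overloaded"
    FAILED = "failed"


class BackendGuard:
    """Hypothetical guard placed in front of one back-end system."""

    def __init__(self, name, limit):
        self.name = name
        self.limit = limit      # flow-control limit of this system
        self.in_flight = 0      # requests currently being processed
        self.isolated = False   # True while the system is treated as failed

    def classify(self, responding):
        """Tell a failure apart from an overload (assumes a health signal exists)."""
        if not responding:
            return Health.FAILED
        if self.in_flight >= self.limit:
            return Health.OVERLOADED
        return Health.OK

    def admit(self, responding=True):
        """Return 'accept', 'shed' (overload) or 'isolated' (failure)."""
        state = self.classify(responding)
        if state is Health.FAILED:
            self.isolated = True    # failure: isolate and await recovery
            return "isolated"
        self.isolated = False
        if state is Health.OVERLOADED:
            return "shed"           # overload: reject or divert, do not isolate
        self.in_flight += 1
        return "accept"

    def release(self):
        """Mark one in-flight request as finished."""
        self.in_flight = max(0, self.in_flight - 1)


guard = BackendGuard("G", limit=30)
print(guard.admit())                  # accept
print(guard.admit(responding=False))  # isolated
```

The point of the sketch is only the separation of the two branches: an overloaded system stays in service while excess requests are turned away early, whereas a failed system is cut off so that its blocked requests do not spread to the related systems.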