In the next stage - shown in Figure 4 - we'll extend our knowledge about the mission-critical Application A to include the fact that it is driven by two outsourced call centre operations, X and Y. The call centres are duplicated for all the usual resilience reasons.
Monitoring the activity of the data associated with Application A on Device D shows that it has dropped to, say, 74% of the activity generated by normal production throughput from incoming customer calls. One of the buzzphrases swarming round today's systems management services is policy-based management. Under such a regime our policies might, quite reasonably, state that when activity on the call centre data falls below a 75% threshold we can relinquish some of the performance-enhancing resources that are directed to that data, to improve the performance of other, contending sets of data belonging to other applications. This might involve reallocating some of the disc cache allocated to Application A's data, reducing the channel bandwidth devoted to it, possibly allowing the data to migrate inwards on the physical disc devices that hold it (where lower surface speeds give lower data rates than on the outer tracks), possibly even switching mirroring to another device rather than doing it in the box (D).
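The threshold rule described above can be sketched in a few lines. This is only an illustration: the 75% threshold and the four kinds of action come from the text, while the class, function and action names are hypothetical and do not correspond to any real management product.

```python
from dataclasses import dataclass

# Threshold from the text: below 75% of normal activity, resources may be released.
RELEASE_THRESHOLD = 0.75


@dataclass
class ActivitySample:
    """Observed activity on one application's data (hypothetical structure)."""
    application: str
    observed_rate: float  # current activity, in any consistent unit
    normal_rate: float    # the same measure under normal production load

    @property
    def ratio(self) -> float:
        return self.observed_rate / self.normal_rate


def reactive_policy(sample: ActivitySample) -> list[str]:
    """Return the actions a purely threshold-driven policy would take."""
    if sample.ratio >= RELEASE_THRESHOLD:
        return []  # activity is at or above the threshold: leave resources alone
    # Action names are illustrative labels for the steps listed in the text.
    return [
        "reallocate_disc_cache",
        "reduce_channel_bandwidth",
        "migrate_data_to_inner_tracks",
        "move_mirroring_to_another_device",
    ]


sample = ActivitySample("Application A", observed_rate=740.0, normal_rate=1000.0)
print(f"{sample.application}: {sample.ratio:.0%} of normal -> {reactive_policy(sample)}")
```

Note that the rule looks only at the activity figure itself; it has no way of knowing why the activity has fallen.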
Figure 4: proactive systems management to minimise the impact of external events
If the 74% traffic figure is the result of lower demand - such as going through the national school holiday season, when about one third of the company's customers are likely to be away on holiday at any one time, rather than calling call centres - all well and good. In this instance, however, the traffic has dropped because Call centre Y has become non-productive for whatever reason (fire, flood, a digger through the 'phone lines, a crashed remote server, and so on), which is exactly the contingency that justifies having two call centres.
Far from reducing the aggregate subsystem bandwidth directed to Application A's data, the correct action in this case is to start directing as much bandwidth to it as possible - freeing up cache and channels to allocate to it, migrating data on discs, moving data for other applications to other devices, and so on.
Why? The number of calls being serviced, despite the fact that Call centre X is working flat out, is only 74% of the probable incoming rate, which means that 26% of calls are not being serviced. Those unserviced calls will need to be serviced with great urgency, and may even be causing additional calls. When Call centre Y comes back on stream Application A's data will become an instant hotspot. Proactive data management will have the system ready to withstand that onslaught, where reactive policy-based management would have the opposite effect.
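The difference between the two approaches can be made concrete with a minimal sketch. The 74% and 75% figures are taken from the example; everything else (the `FeederStatus` structure, the three decision labels, and the choice to check for an outage before looking at the threshold) is an assumption made for illustration only.

```python
from dataclasses import dataclass

RELEASE_THRESHOLD = 0.75  # policy threshold from the example


@dataclass
class FeederStatus:
    """Status of an external resource that drives the workload (hypothetical)."""
    name: str
    online: bool


def unserviced_fraction(activity_ratio: float) -> float:
    """Share of the probable incoming calls that is not being serviced."""
    return max(0.0, 1.0 - activity_ratio)


def proactive_policy(activity_ratio: float, feeders: list[FeederStatus]) -> str:
    """Decide what to do with the resources devoted to the application's data."""
    if any(not f.online for f in feeders):
        # An outage means a backlog is building up: prepare for the hotspot.
        return "boost"
    if activity_ratio < RELEASE_THRESHOLD:
        # Every feeder is healthy, so the drop reflects genuinely lower demand.
        return "release"
    return "hold"


outage = [FeederStatus("Call centre X", True), FeederStatus("Call centre Y", False)]
holiday = [FeederStatus("Call centre X", True), FeederStatus("Call centre Y", True)]

print(f"unserviced: {unserviced_fraction(0.74):.0%}")
print("outage  ->", proactive_policy(0.74, outage))
print("holiday ->", proactive_policy(0.74, holiday))
```

The same 74% reading leads to opposite decisions, depending solely on information that originates outside the storage subsystem.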
To tidy up the loose ends - the remote snapshot for DR needs to be continued, although a further stage of refinement might add a local periodic rolling production (as opposed to DR) snapshot to give additional protection against a hiccup such as an application failure under the extreme workload anticipated when both call centres are working flat out.
The point of this case is that the scope of applications monitoring needs to extend beyond the home enterprise's boundaries to take in what is happening in critical external resources - the call centres. The problem here is not - or not only - to carry out the reconfiguration required within the storage subsystem but to extend the monitoring regime to pick up the necessary alerts, and to equip the management processes to interpret the alerts and opt for a suitable course of action.
A comprehensive solution
Finally, we can suggest what a comprehensive storage management picture might look like some time in the future. It is worth emphasising that this is today's problem but enterprises will have to wait years for a solution. The world of computing will have changed by the time such a level of function might be available, of course. This has always been the case, but - far from negating the validity of setting long-term technical goals - it has consistently emphasised the importance of building sound mechanisms and infrastructures.
We could expect the real situation to be more like Figure 5. The call centres are running a client/server application, Application B, which may have to be revised if core Application A is changed. Each call centre also maintains its own data, perhaps a subset of the central database. The change to our core application, A, has repercussions for the satellite application, B, and for the data formats held at the two call centres. As a result, rolling back after discovering an error in A Rev 2 now involves a whole series of steps, accompanied by logical locks.
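As a rough illustration of what such a multi-step roll back might involve, the sketch below locks every affected component and then reverts them in the reverse of the order in which the change propagated. The component list, the reverse-order rule and the use of in-process locks as stand-ins for logical locks are all assumptions; the text does not specify a mechanism.

```python
import threading
from contextlib import ExitStack

# Hypothetical components, in the order a change to Application A propagates.
PROPAGATION_ORDER = [
    "Application A (core)",
    "Application B (call centre client/server)",
    "Call centre X data format",
    "Call centre Y data format",
]

# One lock per component stands in for the logical locks mentioned in the text.
locks = {name: threading.Lock() for name in PROPAGATION_ORDER}


def roll_back(components: list[str]) -> list[str]:
    """Lock every component, then revert them in reverse propagation order."""
    log = []
    with ExitStack() as stack:
        # Acquire locks in one fixed order so concurrent roll backs cannot deadlock.
        for name in components:
            stack.enter_context(locks[name])
        # Undo the dependants first, the core application last.
        for name in reversed(components):
            log.append(f"reverted {name}")
    # Leaving the ExitStack releases every lock, even if a step raised an error.
    return log


for step in roll_back(PROPAGATION_ORDER):
    print(step)
```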
Figure 5: the challenge of managing collaborative processing environments
Call-centre-to-call-centre data synchronisation without anything more than minimal involvement of the applications servers would also be a useful touch, making use of some of the mechanisms that are already in place for resynchronising data after taking a snapshot.
By the time the management software developers are beginning to deliver the last pieces in this functional jigsaw, technical developments will have made their lives somewhat easier. The wider availability of connectivity bandwidth, in particular, will make a difference. The ability to extend a SAN to a remote site will extend the management and flexibility benefits mentioned earlier, including those enjoyed by third parties and business partners; high-speed application-to-application links will also enable applications function and/or data to be relocated for management convenience rather than to alleviate communications bottlenecks.
As this writer has remarked in the past, close technical co-operation between storage device manufacturers and third-party software specialists will become increasingly essential to promote efficient and effective interfacing between the management and control facilities provided by each. Unfortunately, the dividing lines between device-specific internal control routines - firmware or microcode - and management software products have become blurred. The storage hardware suppliers have promoted this fuzziness to encourage customers to look to them for an increasing proportion of their storage management facilities. While this may have been helpful to date, as it has speeded up the development of in-the-box routines, the type of extension outlined above suggests that specialist software developers will force the pace of progress for the next few years, and clear demarcation lines between product boundaries will help them to define what their products should and can do.
Conclusion
Enterprise storage management strategists would be advised to consider how aware their prospective hardware and software suppliers are of the future form of storage management. Suppliers should either have a clear idea of what is required to provide the level of function outlined above or be able to make an even clearer case for their alternative view. So far the storage vendors have shown no sign of mapping out an appropriate route. In particular, their insistence on treating management APIs as a competitive differentiator rather than as a functional enabler has been discussed at length in these pages. However, early in June, Brocade, Compaq, EMC, Hitachi Data Systems, IBM and McData - under the aegis of the Storage Networking Industry Association (SNIA) - announced what they described as "Open SAN solutions ... standards-based storage networks that provide tested interoperability of products supplied by multiple vendors."
The degree of openness was not immediately clear, as the emphasis appeared to be on certifying specific heterogeneous configurations, rather than providing high levels of commonality across all levels of management for widely varying storage network set-ups. Nevertheless, the signs are that the vendors have been forced to recognise that there are enough management hurdles to clear without placing the barriers of rigid proprietary interfaces in the way.