TSM and tape hardware.
Hardware Issues
Library Sharing
Virtual Tapes
Finding Faulty Tapes
Overwritten Tapes
Investigating Windows tape drive problems
UNIX and Windows Tape utilities
Persistent Binding
Library Sharing
How you define library sharing with TSM depends on whether it's a SCSI library or a 3494 library, and whether or not you have a SAN.
Library sharing for SCSI libraries requires that you define the library as follows:
for the library manager
DEFINE LIBRARY lib-name LIBTYPE=SCSI SHARED=YES DEVICE=drive-name
for the library client
DEFINE LIBRARY lib-name LIBTYPE=SHARED PRIMARYLIBMANAGER=server-name
Library Sharing for 3494 libraries does not use the library manager/client configuration as described above. It needs the '3494SHARED YES' server option instead.
You still need to use separate categories for the different servers, otherwise you may end up with two servers having the same private/scratch volume in the library inventory. What 3494 library sharing brings is the ability to define all drives to all the servers sharing the 3494. The TSM server will detect if a drive is available and will retry based on the retry options that were added in 4.1 (the DRIVEACQUIRERETRY and MPTIMEOUT options).
Virtual Tape Systems
The virtual tape section discusses some of the different virtual tape systems available. Basically there are two types: those that replace tape completely with disk, and those that front real tapes with a big disk cache. Most hardware implementations use post-process deduplication these days, which avoids storing duplicate data. TSM 6.1 will include software de-duplication.
The advantage of using a VTS with TSM is that you get a lot of virtual tape drives, and so can run a lot of processes in parallel. Another advantage usually claimed for a VTS is that it allows you to fill up big tapes with small files, but TSM does this for you anyway. If you need to recall data from physical tape to the VTS cache, this can add a considerable overhead.
A VTS that still uses physical tapes has another quirk - the logical vs physical implementation. VTS emulates physical tape in that when a file is expired, it still uses space on the virtual tape volume. A virtual volume must be re-written to reclaim scratched space. When virtual volumes become scratch they too occupy space on the physical tapes, which again must be re-written to reclaim the space.
The jury is really still out on this one. It seems that TSM can get a lot of benefit from disk only virtual tape, and for example the Sepaton product is used by quite a few customers. The benefits of a physical tape backend are less clear.
Finding faulty tapes
Follow these steps to identify and fix faulty tapes.
List unavailable tapes
q volume access=readonly
q volume access=unavailable
This will give you a list of tapes that have been put in this state, probably because the system has identified an error. However, a tape will also be marked as unavailable if TSM tried to mount it and it was not in the library.
Look for tapes which have errors
select volume_name, read_errors,write_errors from volumes where (read_errors > 0 or write_errors > 0)
This will give a list of tapes that have reported read and write errors. If you have a lot of these, consider upping the thresholds so you can concentrate on tapes with a lot of errors first.
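The threshold idea can be sketched in Python. This is only an illustration, assuming the rows from the select statement have already been exported as (volume_name, read_errors, write_errors) tuples. The function name, the threshold values and the sample volume names are all hypothetical.

```python
def rank_error_tapes(rows, min_read=0, min_write=0):
    """Return tapes whose error counts exceed the thresholds, worst first.

    rows: iterable of (volume_name, read_errors, write_errors) tuples,
    for example exported from the select statement on the volumes table.
    """
    flagged = [
        (name, read_errs, write_errs)
        for name, read_errs, write_errs in rows
        if read_errs > min_read or write_errs > min_write
    ]
    # Sort by total error count, highest first, so the worst tapes lead the list.
    return sorted(flagged, key=lambda row: row[1] + row[2], reverse=True)


# Hypothetical sample data, not taken from a real server.
sample = [("A00001", 0, 0), ("A00002", 3, 1), ("A00003", 12, 0), ("A00004", 0, 25)]
worst_first = rank_error_tapes(sample, min_read=2, min_write=2)
```

With the default thresholds of zero the function returns the same set of tapes as the select statement; raising them trims the list to the tapes most worth auditing first.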
To fix a problem, run the audit command
AUDIT VOLUME volser FIX=YES
If a part of a tape is faulty, the audit command will try to fix it. If it cannot fix a problem file, and a copy exists on another tape, then you need to use the RESTORE VOLUME command. If there is no copy, then the AUDIT command just deletes the entry from the database.
If the tape is hopelessly trashed, and you do not have a copy, the only answer is the 'delete volume discarddata=yes' command. However, it's always worth trying a MOVE DATA command first to see if you can rescue something from the tape.
What happens if a tape is accidentally overwritten by another application?
If you have a copy pool, you can restore the tape, otherwise you have to tell TSM to throw the data away.
To restore the volume, use the command
restore volume volname preview=yes
and look at the activity log after this process finishes. It will show you all the copy pool tapes needed to recreate the primary tape. Get all these tapes back from offsite, and run the command again without the preview=yes. The old tape will be marked as destroyed and the data copied to a new tape. The old volume will then be deleted once all the data is restored.
To discard the data use the command
DELETE VOL volname DISCARDDATA=YES
or if that fails
AUDIT VOL volname FIX=YES
and the active data will be backed up again on the next run. Of course, if the tape held older backup versions, then they are gone forever.
If a tape appears to be assigned to a remote server and the remote server knows nothing about it, you can delete it from the volume history, and so free the tape up for reuse. This particular use of the delete volhist command appears to be undocumented, but it works.
delete volhist todate=today type=remote volume=volumename force=yes
Investigating Windows tape drive problems
Are the Drives / Paths online?
At the TSM Server, check that the drives and paths are online with these commands
q drive
q path
If any drives or paths are not online, use these commands to update them, substituting your own server, library and drive names.
update drive lib-name drive-name online=yes
update path server-name drive-name srct=server destt=drive libr=lib-name online=yes
RSM
TSM does not co-exist very well with Windows RSM, so check that the service is not started and is disabled. If it is started, stop it and disable it unless you plan to use it.
External Tape Problems
It's possible that there is an external problem with the library or with drives. A good place to start is to check that Windows Device Manager lists the library (Medium Changer) and/or Tape Drives. Right click on 'My computer' and select 'Manage' then 'Device Manager'. If Device Manager indicates a problem then TSM will have problems.
If you cannot see any devices in device manager, then right click on the server name in the right side display and select 'Scan for Hardware Changes'. If this does not discover the devices, you have a problem with your cabling or SAN ports.
Device Drivers
If you can see the devices but they are in an incorrect state, it could be that the incorrect device driver is loaded.
Check the device driver version through Windows Device Manager by right-clicking on the device, selecting 'Properties', then the 'Device Driver' tab.
You will need to check up-to-date TSM and product information to see which is the correct device driver, but generally, IBM tape devices use IBMtape, except that 3494 libraries use ibmatl. Non-IBM tape drives and libraries typically use the tsmscsi device driver that comes packaged with TSM Server and Storage Agent. A non-IBM library that uses IBM tape drives will typically use tsmscsi for the library and IBMtape for the drives.
Another error I came across once was that the physical tape library had been swapped out, and the entries in the Windows registry were incorrect. In this case, I deleted the registry entries then rebooted the server, letting Windows pick up the new library and drives, and it then created new registry entries with the correct serial numbers and WWNs. The exact registry entries will depend on which SCSI port and bus you use, so if you think that this is a problem, you will need to check out entries that look like the example below, with each lower case 'n' replaced by your own numeric value.
HKEY_LOCAL_MACHINE\HARDWARE\DEVICEMAP\Scsi\Scsi Port n\Scsi Bus n\Target Id n\Logical Unit Id n\
Make sure TSM is DOWN before you do anything with the drivers, even on the Lanfree clients (put the Storage Agent on the Lanfree client in Disabled or Manual mode until all drivers are in, return the Storage Agent to Auto when all is well).
To install the IBMtape driver, run install-exclusive.exe.
To install the tsmscsi, from Device Manager, right-click on the device, select Update Driver -> Install from a list or specific location (Advanced) -> Don't search. I will select the driver to install -> Select the IBM Tivoli Storage Manager device driver -> Continue Anyway.
Windows device names incorrect
Windows device names can change after a reboot, so use tsmdlst to check them. tsmdlst runs quicker if the TSM service is down, so it's often a good idea to run it as a routine before you start the service.
Open up a DOS command line, navigate to C:\Program Files\tivoli\tsm\console, and run tsmdlst.
You should see output something like the following. The example below shows a single library and drive.
C:\Program Files\Tivoli\TSM\console>tsmdlst
Tivoli Storage Manager -- Device List Utility
Computer Name: z999UPS01
OS Version: 5.2
OS Build #: 3790
TSM Device Driver: TSMScsi - Not Running
2 HBAs were detected.
Manufacturer        Model    Driver      Version   Firmware  Description
----------------------------------------------------------------------------------------------------------------
QLogic Corporation  QLA2340  ql2300.sys  9.1.4.15  3.03.21   QLogic QLA2340 Fibre Channel Adapter
QLogic Corporation  QLA2340  ql2300.sys  9.1.4.15  3.03.21   QLogic QLA2340 Fibre Channel Adapter
TSM Name   ID  LUN  Bus  Port  SSN               WWN               TSM Type  Device Identifier
----------------------------------------------------------------------------------------------------------------
mt0.0.0.2  0   0    0    2     1110018309        500308C1424E8001  LTO       IBM ULT3580-TD2 73V1
lb0.1.0.2  0   1    0    2     0000013201681000  500308C1424E8001  LIBRARY   IBM ULT3583-TL 6.10
Use the query path command to check if paths are using the wrong devices, and if so then correct them with the update path command.
To query library paths:
q path destt=libr f=d
To query tape paths for one server:
q path server-name f=d
To update a library path:
update path server-name lib-name srct=server destt=libr
device=<lb#.#.#.#> online=yes
To update a tape path:
update path server-name drive-name srct=server destt=drive libr=lib-name
device=<mt#.#.#.#> online=yes
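The check can be sketched as a small Python helper that compares the device name TSM holds for each drive path with the one tsmdlst currently reports, matching the two by drive serial number. This is an illustration only: the function name and the three input dictionaries are hypothetical, and you would have to fill them by hand or by your own parsing of the 'q drive f=d', 'q path f=d' and tsmdlst output.

```python
def build_path_updates(server, library, drive_serials, current_devices, tsmdlst_devices):
    """Return update path commands for drives whose device name has changed.

    drive_serials:   {TSM drive name: serial number}, as seen in 'q drive f=d'
    current_devices: {TSM drive name: device name},   as seen in 'q path f=d'
    tsmdlst_devices: {serial number: device name},    as seen in tsmdlst output
    """
    commands = []
    for drive, serial in sorted(drive_serials.items()):
        actual = tsmdlst_devices.get(serial)
        if actual is None:
            # Drive is not visible to Windows; that needs investigating by hand.
            continue
        if current_devices.get(drive) != actual:
            commands.append(
                f"update path {server} {drive} srct=server destt=drive "
                f"libr={library} device={actual} online=yes"
            )
    return commands


# Hypothetical names; the serial number is the one from the tsmdlst example.
updates = build_path_updates(
    "SERVER1",
    "LIB1",
    {"DRIVE00": "1110018309"},
    {"DRIVE00": "mt0.0.0.3"},
    {"1110018309": "mt0.0.0.2"},
)
```

Matching on serial number rather than on device name is the point of the sketch, because the serial number is the one value that does not change across a reboot.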
After a Windows server reboot, many people as a matter of routine will run the following process, just to make sure no problems appear. They have most of these actions scripted, so it's a simple case of running the script for each step.
- Run tsmdlst
- Start TSM server with client sessions disabled
- delete tape and library paths
- delete tape drives
- delete tape library
- generate path definitions from tsmdlst output (various ways to do this with scripts or spreadsheets)
- define tape library
- define library path
- define tape drives
- define tape paths
- checkin scratch tapes
- checkin private tapes
- enable client sessions
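As one possible way to script the 'generate path definitions' step, here is a minimal Python sketch. It assumes the tsmdlst output has been captured with each device on a single line, that define path accepts the same parameters as the update path commands used on this page, and that you keep your own mapping of drive serial numbers to TSM drive names. The function names, server name, library name and drive name are hypothetical.

```python
import re

# A device row starts with a TSM device name such as mt0.0.0.2 or lb0.1.0.2.
DEVICE_ROW = re.compile(r"^(mt|lb)\d+\.\d+\.\d+\.\d+\s")


def parse_tsmdlst(text):
    """Extract device name, serial number and type from tsmdlst device rows."""
    devices = []
    for line in text.splitlines():
        line = line.strip()
        if not DEVICE_ROW.match(line):
            continue
        # Columns: TSM Name, ID, LUN, Bus, Port, SSN, WWN, TSM Type, identifier.
        fields = line.split(None, 8)
        if len(fields) < 8:
            continue
        devices.append({"device": fields[0], "serial": fields[5], "type": fields[7]})
    return devices


def define_path_commands(server, library, devices, drive_names):
    """Build define path commands, library paths first, then drive paths.

    drive_names maps a drive serial number to its TSM drive name; this
    mapping is hypothetical and would be maintained by the administrator.
    """
    commands = []
    for dev in devices:
        if dev["type"] == "LIBRARY":
            commands.append(
                f"define path {server} {library} srct=server destt=libr "
                f"device={dev['device']} online=yes"
            )
    for dev in devices:
        drive = drive_names.get(dev["serial"])
        if dev["type"] != "LIBRARY" and drive is not None:
            commands.append(
                f"define path {server} {drive} srct=server destt=drive "
                f"libr={library} device={dev['device']} online=yes"
            )
    return commands


sample = """\
mt0.0.0.2 0 0 0 2 1110018309 500308C1424E8001 LTO IBM ULT3580-TD2 73V1
lb0.1.0.2 0 1 0 2 0000013201681000 500308C1424E8001 LIBRARY IBM ULT3583-TL 6.10
"""
commands = define_path_commands(
    "SERVER1", "LIB1", parse_tsmdlst(sample), {"1110018309": "DRIVE00"}
)
```

The generated lines could then be pasted into a macro and run after the library and drives have been defined. Drives whose serial number is not in the mapping are skipped rather than guessed at.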
UNIX and Windows Tape utilities
Windows has a utility called ntutil which can be useful for faulty tape drives, to decide if the problem is with TSM or external. The only difficulty with it is that there is no correlation between TSM devices and Windows devices, so you have to work out how to match them. For example, suppose that in TSM you call your tape drives DRIVE00 to DRIVE32, and DRIVE28 is faulty. First go to TSM and run the following (the '*' just means you don't have to input the library name). Make a note of the drive serial number.
Q DRIVE * DRIVE28 F=D
Now open a DOS command prompt and type ntutil. Take option '1 manual test' from the first menu, then take option '20 open' to open the default drive, usually \\.\tape0\.
Then take option '63 get tape bus info', and you will see a list of tape drives called TAPE00 to TAPE32. However, TAPE28 does not correspond to TSM DRIVE28; that would be much too easy. Scan down the list looking at the drive serial numbers until you spot your drive, then make a note of its tape number; in this case it was Tape04.
Now take option '1 set device special file' and reply to the prompt with Tape04.
Take option '21 close' to close the previous tape session, then option '20 open' again, and you will open Tape04, otherwise known by TSM as DRIVE28. If the drive opens up at this point, it's probably OK.
However you can run various commands against your faulty drive once you open it, you can see the list from the menu. '49 enquiry' and '58 device info' can be useful.
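If you have a lot of drives, the serial number lookup can be done with a few lines of Python instead of by eye. This is a sketch only: it assumes you have already copied the Windows tape names and serial numbers from the 'get tape bus info' display into a list of pairs, and the function name and sample values are hypothetical.

```python
def find_windows_device(bus_info, serial):
    """Return the Windows tape name whose serial number matches, or None.

    bus_info: list of (windows_tape_name, serial_number) pairs copied
    from the ntutil display.
    """
    wanted = serial.strip().upper()
    for device, dev_serial in bus_info:
        # Compare case-insensitively and ignore stray spaces from copy and paste.
        if dev_serial.strip().upper() == wanted:
            return device
    return None


# Hypothetical sample values for illustration.
bus_info = [("Tape00", "1110018301"), ("Tape04", "1110018309")]
match = find_windows_device(bus_info, " 1110018309 ")
```

The serial number to search for is the one noted from the Q DRIVE command for the faulty drive.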
Persistent Binding
When a server is rebooted, the tape drive definitions can change, and this can make the tape paths in both servers and storage agents incorrect. You can prevent this from happening by using Persistent Binding.
In AIX, install the Atape driver. This allows you to rename the tapes in AIX to a standard that suits you, and these names will survive a server reboot.
On Windows, you can get persistent binding if you use QLogic adapters. In the QLogic software, bring up the Fibre Channel Port Configuration dialog box, right click on Host Adapter, device or LUN in the HBA tree, then click on Configure in the drop down menu. Select the BIND box, and that will bind each port to its target ID.
