Spectrum Protect Journal based Backups
Journal Based Backup
A standard Spectrum Protect backup will scan entire client file systems and compare what it finds there with data for each file that is held in the Spectrum Protect database. Files are backed up or backups expired based on the differences between the file system and the database.
If you had a journal that was updated every time a file changed then you could just read that journal, with no need to scan a file system or check against the TSM database. This is exactly what Spectrum Protect Journal Based Backup does. A TSM Journal Daemon process runs on your client, traps file changes as they happen and records them in a journal file. A Spectrum Protect backup then reads the journal file. New or updated file or directory events cause those objects to be backed up, while delete events are passed to Spectrum Protect for deletion processing. The journal contains only one entry for each object, recording the latest change to that object. Once a journal entry is processed, Spectrum Protect deletes it from the journal file.
Journaling does not work well with very active file systems, as the journals quickly become large and processing them takes almost as much effort as processing the filespace. Journaling works very well for large archive file systems that contain millions of files, of which only a few hundred change every day. However, even then it should not be used to totally replace a true incremental backup, as some types of file change can be missed by journaling and some file deletions might be missed if the system is very busy. IBM recommends that if you use journaling, you still schedule an occasional full incremental backup.
Journal based backups can be implemented on all Windows clients. Journal-based backup is supported on AIX for JFS and JFS2 file systems. On Linux, journal-based backup is supported on Ext2, Ext3, Ext4, XFS, ReiserFS, JFS, VxFS, and NSS, and for a local file system shared through NFS.
Journaling is not supported on GPFS file systems. The backup-archive client does not use the journaling facility inherent in Windows NTFS or ReFS file systems or any other journaled file system.
How do you install Journal Based Backup?
The basic steps are
- Install the journal service
- Update the journal configuration file
- Run a full incremental backup to initialise journaling
Installing the Spectrum Protect Journal Service on Windows.
You can do this from the GUI with the Spectrum Protect Journal Setup Wizard, or from the command line using dsmcutil.
You can mostly do the install with the GUI by taking defaults so I've only mentioned changes to the defaults in the actions below.
- Start the Backup-Archive GUI and select Utilities > Setup Wizard
- Select "Help me configure the Journal Engine" > Select Next
- Select "Install a new Journal Engine" > Select Next
- Select the appropriate File System(s) for the Journal Engine to monitor. You don't have to journal all your file systems, and in fact, journaling will not be appropriate for all your file systems.
- For each file system, specify the location where the journal will be written, the notification filters and the journal database size. You can take the defaults on all of these.
- Select Yes to "Would you like to start the service upon completion of this wizard?" > Select Next > Select Finish
Without using the wizard, you create your tsmjbbd.ini file as described below, then run the command
dsmcutil install journal /name:"TSM journal service name" /JBBCONFIGFILE:"c:\path\tsmjbbd.ini"
Configuring the Journal Service
On Windows, the journal service configuration file is called tsmjbbd.ini. If you used the GUI as above to install the journal service then it will have created a basic configuration file for you. You can edit this file to change the settings or add some advanced settings. Things you may want to change are:
JournalDBSize
This is the maximum journal database size (in bytes) for a journaled filesystem. A setting of JournalDBSize=0x00000000 means the journal can grow up to 2 gigabytes, subject to that amount of space being available.
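As a worked illustration of the byte arithmetic, the fragment below caps each journal at 256 MiB. This is only a sketch: the value is an arbitrary example rather than a recommendation, and it assumes the setting lives in the [JournaledFileSystemSettings] stanza and that lines starting with a semicolon are treated as comments, as in IBM's sample file.

```ini
[JournaledFileSystemSettings]
; 0x10000000 bytes = 268,435,456 bytes = 256 MiB (illustrative value only)
JournalDbSize=0x10000000
```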
NotifyFilter (Windows only)
Specifies what types of filesystem activity to monitor for a journaled filesystem. Specifying a less comprehensive filter value may reduce the size of journals and improve performance. Probably best to go with the IBM default here, which is NotifyFilter=0x00000117. Possible other values are
0x00000001 File name changes including create, delete and rename
0x00000004 Attribute changes
0x00000008 Size changes, notification is deferred until cache is flushed
0x00000010 Last written time changes, notification is deferred until cache is flushed
0x00000020 Last access time changes
0x00000040 Creation time changes
0x00000100 Security (ACL) changes
Multiple activities may be monitored by adding values together. But when adding up, remember these are hex values. You can also change this setting for each individual filespace.
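To make the hex arithmetic concrete, here is a small sketch that monitors name, attribute, size and last written time changes only. The combination is chosen purely for illustration and is not a recommended filter; the semicolon comment style is assumed from IBM's sample file.

```ini
[JournaledFileSystemSettings]
; 0x00000001  file name changes
; 0x00000004  attribute changes
; 0x00000008  size changes
; 0x00000010  last written time changes
; Hex sum: 0x01 + 0x04 + 0x08 + 0x10 = 0x1D (decimal 1 + 4 + 8 + 16 = 29)
; Adding the digits as if they were decimal would give 23, which is a different filter.
NotifyFilter=0x0000001D
```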
JournaledFileSystems
A list of the file systems that you want to journal, separated by spaces. They can be local fixed drives, Windows mount points or Linux/AIX virtual mount points, but not network or removable filesystems.
Adding an extra file system to an existing journal system is quite simple. For example, to add a G: drive just edit tsmjbbd.ini and look for a line that looks something like
JournaledFileSystems=E: F:
Add your new file system on the end, separated by a space
JournaledFileSystems=E: F: G:
A common mistake is to specify the disk as E:/. This is not valid, so the backup will run as a standard incremental.
There is no need to restart the journaling service when adding a new disk. The next backup of that disk will run as a standard incremental and will build the journal database. Subsequent backups will read the journal database to get the files that require a backup.
JournalDir
Specifies the directory where journal database files are stored and written. The default Windows directory is the Journal Service install directory, and the journals for all journaled file systems will be placed there. This is not suitable for drives that can take part in cluster failover operations, as in this case you want your journal to fail over with the filespace that is being journaled. The easiest way to do this is to provide overrides for each clustered filespace using a JournaledFileSystemSettings override stanza for that filespace.
For example, an override stanza for a clustered drive can set JournalDbSize to 0 so that the journal size is unlimited, and set JournalDir to a directory on the same drive as the data being backed up.
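A sketch of such an override is given here. The drive letter H: and the directory name tsmjournal are hypothetical, and the stanza form [JournaledFileSystemSettings.fs] is assumed from IBM's documented override syntax; check it against the sample file shipped with your client level.

```ini
; Override for one clustered drive only (H: is a hypothetical example)
[JournaledFileSystemSettings.H:\]
; 0 removes the configured size cap for this journal
JournalDbSize=0x00000000
; Keep the journal on the drive it describes so that it fails over with the data
JournalDir=H:\tsmjournal
```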
PreserveDbOnExit
If you take a journaled file system offline then the default action is to delete the journal, and when the file system comes back online you need to run a fresh full incremental backup before you can start journaling again. This is a safety feature to protect against any external updates or errors, but you can override it by specifying PreserveDbOnExit=1.
If you don't specify this parameter then the journals are re-initialised every time the server is rebooted or a cluster fails over. If you have a really large file system, a fresh full backup can take days, so consider carefully which action you want and use the parameter as appropriate. Note that there is an error in the Windows sample journal file, where this parameter is called PreserverDBonExit. Remove that extra 'r' or the parameter will be ignored.
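A minimal sketch of keeping the journal across restarts, assuming the setting is placed in the [JournaledFileSystemSettings] stanza so that it applies to every journaled file system:

```ini
[JournaledFileSystemSettings]
; 1 = keep the journal database when the file system goes offline or the service stops
; Note the spelling: Preserve, not Preserver
PreserveDbOnExit=1
```

If only some file systems should keep their journals, the same line could instead go in a per-file-system override stanza.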
Journal exclude list
Journaling does not read the dsm.opt file to find out which files should never be backed up. It has its own list of excludes in the tsmjbbd.ini file. You may need to add excludes to both files to ensure files are not backed up by either standard or journaled backups. Note that the syntax in tsmjbbd.ini is not quite the same as in dsm.opt. The major difference is that you use \*\ to exclude a set of directories instead of \...\. Some examples are:
%:\*\*test*.txt any file with test in the name
c:\*\temp\*.* any temp directory
c:\winnt\system32\*.* all the files in a specific directory
c:\dir1\*\*.txt all text files in dir1 and all subdirectories
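In the configuration file these patterns go one per line under an exclude stanza. The sketch below reuses the patterns listed above; it assumes the stanza is named [JournalExcludeList], as in IBM's sample file, and that semicolon comments are accepted.

```ini
; Changes matching these patterns are never recorded in the journal
[JournalExcludeList]
%:\*\*test*.txt
c:\*\temp\*.*
c:\winnt\system32\*.*
c:\dir1\*\*.txt
```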
DeferFsMonStart and DeferRetryInterval
These settings are useful for clustered filespaces, where the shared resources can move around the various nodes in a cluster. If DeferFsMonStart is set to 1, a journaled file system will not be brought online until the filesystem is valid and available and the specified journal directory can be created or accessed. The DeferRetryInterval setting determines the interval between resource checks. The default setting for DeferFsMonStart is 0, which means that if journaled file systems are unavailable or inaccessible then the journal service must be recycled before they will be restarted.
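A sketch of enabling deferred start is shown here. The retry value of 60 is purely illustrative, and the assumption that DeferRetryInterval is expressed in seconds should be checked against the documentation for your client level.

```ini
[JournaledFileSystemSettings]
; Wait for the file system and journal directory to become available
DeferFsMonStart=1
; How often to re-check deferred file systems (assumed to be seconds; example value)
DeferRetryInterval=60
```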
On AIX, IBM provides a sample configuration file called tsmjbbd.ini.smp. You need to edit this file then save it as tsmjbbd.ini. Both the sample configuration file and the saved file should be in the default install directory. The journal configuration file (tsmjbbd.ini) needs as a minimum a list of the file systems to monitor. A JournaledFileSystemSettings stanza header followed by a single JournaledFileSystems line is sufficient.
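A minimal AIX configuration might look like the sketch below. The file system names /home and /data are hypothetical placeholders; substitute the JFS or JFS2 file systems you actually want to journal.

```ini
[JournaledFileSystemSettings]
JournaledFileSystems=/home /data
```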
After the configuration file is created, start the journal daemon using the script file /usr/tivoli/tsm/client/ba/bin/rc.tsmjbb. The journal daemon will write initialization information to the error log. When you are satisfied that the journal is working correctly, run the script file /usr/tivoli/tsm/client/ba/bin/jbbinittab. This script adds entries to /etc/inittab so that the journal daemon starts automatically when you restart your system.
Before journaling will start you must take a normal, full incremental backup. Once this completes and updates the Last Backup Completion Date on the IBM Spectrum Protect server, the change journal is marked as valid and the next backup will be a journaled backup. Note that if you need to recover your Spectrum Protect server database for any reason, all your client change journals will be marked as invalid and the next backups will run as full incrementals.
How do you know a journal based backup is working?
When a backup is journal based, the backup client will display the following messages when the journal backup begins:
Querying Journal for '\\hostname\f$'
Processing X Journal entries for '\\hostname\f$'
If journal backup isn't possible, the client will always revert to a normal non-journal based backup.
What types of backup operations will use the journal?
Provided the valid journaling conditions are met, both full file system incremental and partial incremental backups will use journaling, either run manually or scheduled. Incremental-by-date backups do not use journaling.