
Can I manage Verity K2 Collections with Java Ant? - Ask Dr. Search

Last Updated Mar 2009

By: Mark Bennett, Volume 3 Number 1 - January 2006

This month a reader wanted to know if it was possible to use Java Ant, the Apache Project open source build tool, to manage Verity K2 collections. A full production-quality Java Ant script is beyond the scope of Dr. Search, although perhaps in the near future one of his associates will provide a script that you would be proud to put into production. However, in the meantime this should serve as a starting point for many of you.

We won't go into great Java Ant detail here; we suggest you consider the book "Ant: The Definitive Guide" by Steven Holzner from O'Reilly (ISBN 0-596-00609-8). You can purchase it from Amazon. We advise against the book called "Ant Java Notes: An Accelerated Intro Guide to the Java Ant Build Tool" by A. T. Bell.

Note: For you "old timers" out there, Java Ant is very similar to the older Unix / C "make" utility you may have used. In many ways, Ant is just "make" with an XML syntax control file as opposed to the old "makefile" syntax.

The Big Picture

The code in Figure 1 checks whether a collection exists and, if it does, removes its directory. Java Ant then creates a new collection using the same style files and documents we used in the Fall 2005 issue of Enterprise Search, where we showed how to use the little-known extract option in mkvdk.

You can see that the sample build file in Figure 1 simply deletes the existing collection directory, something you would not do in a production environment. You would probably copy the collection to a backup area prior to deleting it, and in the modern K2 environment, you would take the collection offline before doing anything that might upset the server. We do not suggest you follow these brute force tactics in your production environment.
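As a rough illustration of the backup step mentioned above, here is a minimal sketch that copies the collection directory to a date-stamped backup area. The backup_dir property and the backupColl target name are assumptions made up for this example and do not appear in Figure 1. It assumes Ant 1.6 or later, which allows tasks outside a target. Taking the collection offline in K2 is deliberately not shown.

```xml
<?xml version="1.0" ?>
<project default="backupColl">

  <!-- coll_name matches Figure 1; backup_dir is a hypothetical location -->
  <property name="coll_name" value="d:\dev\ant\coll1" />
  <property name="backup_dir" value="d:\dev\ant\backup" />

  <!-- set coll.exists only if the collection directory is present -->
  <available file="${coll_name}" type="dir" property="coll.exists" />

  <target name="backupColl" if="coll.exists">
    <!-- tstamp defines the DSTAMP and TSTAMP date/time properties -->
    <tstamp />
    <copy todir="${backup_dir}\coll1-${DSTAMP}-${TSTAMP}">
      <fileset dir="${coll_name}" />
    </copy>
    <echo>Backed up ${coll_name}</echo>
  </target>

</project>
```

A target like this could run before the collection is erased, so that a failed rebuild still leaves a copy to restore from.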

Now, let's take a look at the code listing in Figure 1.


<?xml version="1.0" ?>
<project default="main">

<!-- Define some variables - known as properties -->
<property name="cmd_shell" value="cmd.exe" />
<property name="k2_util" value="mkvdk.exe" />
<property name="coll_name" value="d:\dev\ant\coll1" />
<property name="remove_parms" value="/c rmdir ${coll_name} /s /q" />
<property name="style_dir" value="d:\dev\ant\style" />
<property name="file_list" value="d:\dev\ant\flist.txt" />
<property name="log_file" value="d:\dev\ant\vlog.txt" />

<property name="message" value="Rebuilding and recreating collection ${coll_name}" />

<target name="main">

<echo>
${message}
</echo>

<!-- if coll_name exists, set property coll.exists -->
<available file="${coll_name}" property="coll.exists" />

<!-- Execute the ant routine ('target') eraseColl -->
<antcall target="eraseColl" />

<!-- Now create the new collection -->

<antcall target="newColl" />

<echo message="All done" />

</target>

<!-- routine ('target') to remove the collection
Only runs if the property 'coll.exists' is defined
by the available tag in main -->

<target name="eraseColl" if="coll.exists" >
<echo>Killing collection ${coll_name}</echo>
<exec executable="${cmd_shell}" >
<arg line="${remove_parms}" />
</exec>
</target>


<!-- routine ('target') to recreate the collection
Runs unconditionally; main calls it after eraseColl -->

<target name="newColl">
<echo>Build New Collection</echo>

<!-- build the parameters to run the program:
mkvdk -create
-collection d:\dev\ant\coll1
-style d:\dev\ant\style
-extract @d:\dev\ant\flist.txt -->

<exec executable="${k2_util}"
output="${log_file}"
failonerror="true" >

<arg line="-create" />
<arg line="-collection" />
<arg line="${coll_name}" />
<arg line="-style" />
<arg line="${style_dir}" />
<arg line="-extract" />
<arg line="@${file_list}" />

</exec>

</target>

</project>

Figure 1

Looks easy, doesn't it?

Figuring out the code

Java Ant uses an XML file to define targets, each one similar to a subroutine. In Figure 1, the <project> tag names "main" as the default target, so when you run Ant with this file, execution starts there. First, however, Java Ant executes the <property> lines it finds. Properties are just that - in this case we are using properties as if they were variables in a more traditional script, although once a property is set its value cannot be changed. Note we've defined property names like k2_util to be mkvdk.exe, our favorite Verity build program; and cmd_shell as the Windows command shell cmd.exe. If you use a Linux/Unix/Solaris environment, substitute the shell of your choice. Note the ${coll_name} syntax in the definition of remove_parms, which inserts the value of the coll_name property.
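For readers on mixed platforms, one possible way to pick the shell is to let Ant test the operating system. This is only a sketch: the shell_flag property and the showShell target are hypothetical names introduced here, and /bin/sh is an assumed default for Unix-like systems.

```xml
<?xml version="1.0" ?>
<project default="showShell">

  <!-- Only the condition that matches the current OS sets cmd_shell -->
  <condition property="cmd_shell" value="cmd.exe">
    <os family="windows" />
  </condition>
  <condition property="cmd_shell" value="/bin/sh">
    <os family="unix" />
  </condition>

  <!-- shell_flag (hypothetical) is the switch that means "run this command" -->
  <condition property="shell_flag" value="/c">
    <os family="windows" />
  </condition>
  <condition property="shell_flag" value="-c">
    <os family="unix" />
  </condition>

  <target name="showShell">
    <echo>Shell: ${cmd_shell} ${shell_flag}</echo>
  </target>

</project>
```

The command passed to the shell would also have to change on Unix, since rmdir /s /q is specific to the Windows command interpreter.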

Once we've defined the properties we want to use, Java Ant starts at target main with a welcome message based on the value of the property message. Note the ${message} syntax in the echo block, which outputs the value of the message property rather than the simple string "message".

Our script next tests for the existence of a file or directory named by the coll_name property. If the file or directory exists, the property coll.exists is defined. The next line calls the Java Ant target eraseColl, which runs only if the coll.exists property is defined. This is not unlike what you might do in a script when you test if a file or directory exists prior to trying to remove it.
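The existence test can be tried on its own before touching a real collection. The following sketch uses the same <available> task, plus the if and unless attributes, to report whether the directory is there. The target names check, collPresent, and collMissing are invented for this example, and it assumes Ant 1.6 or later so that <available> can sit outside a target.

```xml
<?xml version="1.0" ?>
<project default="check">

  <property name="coll_name" value="d:\dev\ant\coll1" />

  <!-- type="dir" makes the test succeed only for a directory -->
  <available file="${coll_name}" type="dir" property="coll.exists" />

  <!-- depends runs both helper targets; each decides for itself whether to act -->
  <target name="check" depends="collPresent,collMissing" />

  <target name="collPresent" if="coll.exists">
    <echo>Found ${coll_name}</echo>
  </target>

  <target name="collMissing" unless="coll.exists">
    <echo>No collection at ${coll_name}</echo>
  </target>

</project>
```

Using depends rather than <antcall> is a design choice here: both work for this purpose, and depends keeps everything in a single pass over the project.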

If the collection exists, Java Ant uses the <exec> call to run the Windows command interpreter using the command:

cmd.exe /c rmdir d:\dev\ant\coll1 /s /q

Here we know that cmd.exe is in our PATH, so we simply use rmdir to remove the entire collection directory and then exit the eraseColl target.
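If you would rather not depend on the Windows shell at all, Ant's built-in <delete> task can remove the directory tree itself. This is a sketch of an alternative, not the approach used in Figure 1:

```xml
<?xml version="1.0" ?>
<project default="eraseColl">

  <property name="coll_name" value="d:\dev\ant\coll1" />

  <target name="eraseColl">
    <echo>Killing collection ${coll_name}</echo>
    <!-- delete removes the directory and everything under it;
         if the directory is absent, nothing is deleted -->
    <delete dir="${coll_name}" />
  </target>

</project>
```

Because <delete> quietly skips a missing directory, this version needs neither the cmd_shell property nor the coll.exists test, and it behaves the same on Windows and Unix.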

Next, back in main, we call the newColl target to create the new collection. We again use the <exec> task, this time with two new attributes: output, which specifies where the standard output from mkvdk will go; and failonerror, which causes newColl to fail if mkvdk returns an error code. Since we have no additional targets after the build in our simple example, we could leave failonerror out; but I wanted you to see that you do have control over programs you execute.
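For a little more control than failonerror gives, the exit code can be captured and tested explicitly. The sketch below assumes mkvdk signals failure with a non-zero exit code, as the text above implies; the mkvdk.rc property name is made up for this example, and the nested <condition> inside <fail> needs Ant 1.6.2 or later.

```xml
<?xml version="1.0" ?>
<project default="newColl">

  <property name="k2_util" value="mkvdk.exe" />
  <property name="coll_name" value="d:\dev\ant\coll1" />
  <property name="style_dir" value="d:\dev\ant\style" />
  <property name="file_list" value="d:\dev\ant\flist.txt" />
  <property name="log_file" value="d:\dev\ant\vlog.txt" />

  <target name="newColl">
    <!-- resultproperty stores the exit code instead of failing right away -->
    <exec executable="${k2_util}" output="${log_file}" resultproperty="mkvdk.rc">
      <arg line="-create -collection ${coll_name} -style ${style_dir} -extract @${file_list}" />
    </exec>

    <!-- stop the build with a message that points at the log file -->
    <fail message="mkvdk exited with code ${mkvdk.rc}; see ${log_file}">
      <condition>
        <not>
          <equals arg1="${mkvdk.rc}" arg2="0" />
        </not>
      </condition>
    </fail>
  </target>

</project>
```

The same mkvdk.rc property could instead drive a cleanup or restore target, which is where this approach starts to pay off over a plain failonerror="true".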

The command line we are building to execute is:

mkvdk -create 
-collection d:\dev\ant\coll1
-style d:\dev\ant\style
-extract @d:\dev\ant\flist.txt

When mkvdk finishes, the newColl target returns to main and our Java Ant build is finished.
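One caveat when building the command line: the line attribute of <arg> splits its text on spaces, so a path containing a space would arrive at mkvdk as two arguments. The value attribute passes each string as exactly one argument. The paths below are hypothetical and chosen only to show a space in the directory name:

```xml
<?xml version="1.0" ?>
<project default="newColl">

  <property name="k2_util" value="mkvdk.exe" />
  <!-- hypothetical paths that contain a space -->
  <property name="coll_name" value="d:\search data\coll1" />
  <property name="style_dir" value="d:\search data\style" />
  <property name="file_list" value="d:\search data\flist.txt" />

  <target name="newColl">
    <exec executable="${k2_util}" failonerror="true">
      <!-- each value is handed to the program as a single argument -->
      <arg value="-create" />
      <arg value="-collection" />
      <arg value="${coll_name}" />
      <arg value="-style" />
      <arg value="${style_dir}" />
      <arg value="-extract" />
      <arg value="@${file_list}" />
    </exec>
  </target>

</project>
```

For the space-free paths used in Figure 1, line and value behave the same way.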

Running the Code

Once you create the file shown above - let's call it rebuild.xml - you are ready to go. You may need to make a few changes to actually get the build file to run on your system:

  • Make sure you have installed Java Ant

    You can install Java Ant in binary format from the Apache Ant Project.

  • Verify the Verity and Java Ant directories are in your PATH

    Open a command window to verify that typing mkvdk and ant both run the respective programs.

  • Define the correct properties in the build script

    Change the properties in the rebuild.xml script so the collection name, style file, and file list match your setup. The properties in the above example use the same files and directories mentioned in the Entity Extraction example in the Fall 2005 issue of Enterprise Search.
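If several people share the build file, one option is to keep machine-specific settings in a separate properties file instead of editing rebuild.xml. In this sketch, rebuild.properties is a hypothetical file name and showSettings is a hypothetical target used only to display the result:

```xml
<?xml version="1.0" ?>
<project default="showSettings">

  <!-- rebuild.properties is a hypothetical file next to this build file;
       values loaded here win because Ant properties cannot be redefined -->
  <property file="rebuild.properties" />

  <!-- defaults, used only for names the file did not set -->
  <property name="coll_name" value="d:\dev\ant\coll1" />
  <property name="style_dir" value="d:\dev\ant\style" />
  <property name="file_list" value="d:\dev\ant\flist.txt" />

  <target name="showSettings">
    <echo>Collection: ${coll_name}</echo>
    <echo>Style dir: ${style_dir}</echo>
    <echo>File list: ${file_list}</echo>
  </target>

</project>
```

A matching rebuild.properties might contain a line such as coll_name=d:/dev/ant/coll2. Forward slashes are used there because a backslash is an escape character in Java properties files. If the file is missing, Ant simply falls back to the defaults.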

Now you're ready to get started. Open a command window, and from the directory where you saved rebuild.xml, enter the command:

ant -f rebuild.xml

You should see something like the output shown in Figure 2.



d:\dev\ant>ant -f rebuild.xml
Buildfile: rebuild.xml

main:
[echo]
[echo] Rebuilding and recreating collection d:\dev\ant\coll1
[echo]

eraseColl:
[echo] Killing collection d:\dev\ant\coll1

newColl:
[echo] Build New Collection
[echo] All done

BUILD SUCCESSFUL
Total time: 3 seconds
d:\dev\ant>

Figure 2

For more or less information as you run Java Ant, you can use the quiet flag "-q" or the debug flag "-d" as shown here:

ant -q -f rebuild.xml

ant -d -f rebuild.xml

These may help you identify the source of any problems you might have. You can always get the usage options by entering:

ant -help

Summary

You've seen that you can use Java Ant as an alternative to shell or Perl scripts to manage Verity collections; and while the sample provided here is probably not going to be bulletproof enough for any production environment, I hope it gives you a good starting point for a new and interesting open source tool, Java Ant.

Feel free to send your enterprise search questions to Dr. Search. Every entry (with name and address) gets a free cup and a pen, and the eternal thanks of Dr. Search and his readers.