Running a QlikView Expressor Dataflow from a Command Window

    Originally published on 07-20-2011 09:00 PM

     

    This description applies to releases prior to Expressor 3.4. The etask utility in Expressor 3.4 and later releases has additional features, which are described at the end of this document.

    You develop and test an expressor data integration application from within Expressor Studio,  and whenever you run the application it runs from within Studio. But  when you move your applications to production, you will run them from  within a command window on a computer on which you have installed the Expressor Engine component.

    The key to deploying and running your applications is understanding how the deployment package is organized.

    You create a deployment package from within Studio. The deployment package becomes another artifact within your project and can be committed to Expressor Repository. When you want to deploy an application, you use the eproject command line utility to check the project out from Expressor Repository onto the computer hosting the Expressor Engine.

    Let’s examine the directory structure of a deployment package created from a project named Project1 that was associated with a library named Library1. The deployment package includes two dataflows from the project and another dataflow from the library.

    [Figure: deployment_package.png, the directory structure of the deployment package]

    In this figure, the deployment package is named DeploymentPackage1 and it contains four sub-directories. The sub-directory Project1...0...Dataflow1 contains the deployment artifact corresponding to Dataflow1 originally included in Project1; this artifact is named Project1...0...Dataflow1.rpx. Note how the directory and deployment artifact names include the project name, the current version of the project (in this example, zero), and the dataflow name. Similarly, the sub-directories Project1...0...Dataflow2 and Library1...0...Dataflow3 contain the deployment artifacts corresponding to Dataflow2 originally included in Project1 (Project1...0...Dataflow2.rpx) and Dataflow3 originally included in Library1 (Library1...0...Dataflow3.rpx). Finally, the sub-directory external contains all of the files that were originally included in the workspace explorer external directories under both the project and the library.
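    If you want to check what a deployment package contains before running anything, a short script can walk the layout just described. The following is a minimal Python sketch, not part of the Expressor tooling; the function name list_deployment_artifacts is hypothetical, and the only thing it relies on is that each dataflow sub-directory holds a file with the .rpx extension.

```python
from pathlib import Path


def list_deployment_artifacts(package_dir):
    """Map each sub-directory name to the .rpx artifact file names inside it.

    Hypothetical helper: it only inspects the file system and does not
    call any Expressor utility.
    """
    package = Path(package_dir)
    artifacts = {}
    for sub in sorted(p for p in package.iterdir() if p.is_dir()):
        rpx_files = sorted(f.name for f in sub.glob("*.rpx"))
        if rpx_files:
            # Sub-directories with no .rpx file are left out of the result.
            artifacts[sub.name] = rpx_files
    return artifacts
```

    Calling list_deployment_artifacts on the deployment package directory returns a dictionary keyed by sub-directory name, which is a quick way to confirm the project name, version, and dataflow name that each artifact carries.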

    Now that you have installed a deployment package onto a computer hosting the Expressor Engine, how do you use etask to run the data integration application?

    In order for etask to run a dataflow, you must provide enough  information for the utility to find the deployment package directory and  determine the name of the project (or library), its version number, and  the name of the dataflow to run. How you provide this information  depends on the file system location of the working directory in which  you run etask.

    The easiest approach is to use the Start > All Programs >  expressor > expressor3 > expressor command prompt menu item to  open a command window and then change directories to either the  deployment package directory or one of its sub-directories.  This immediately solves the issue of locating the deployment package directory.  Then you simply need to specify the project name, version number, and dataflow name as arguments to the etask utility.

    etask -x dataflowName -p projectOrLibraryName -V version#

    For example:

    etask -x Dataflow1 -p Project1 -V 0

    or, if the dataflow is in a library:

    etask -x Dataflow3 -p Library1 -V 0

    What if you want to run the dataflow from a different file system location?  In this case, you must also provide the absolute or relative path to the deployment package directory as an argument to etask (be certain to enter the entire command on a single line).

    etask -x dataflowName -p projectOrLibraryName -V version# -D pathToDeploymentPackageDirectory

    Note that neither of these commands requires that you know the actual names of the sub-directories or of the deployment artifacts; you only need to know the path to the deployment package directory.
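    If you launch dataflows from a script rather than by typing the command, the two forms above can be assembled programmatically. The following is a minimal Python sketch under stated assumptions: the function names build_etask_command and run_dataflow are hypothetical, only the -x, -p, -V, and -D arguments shown above are used, and etask is assumed to resolve on the PATH of the environment in which the script runs.

```python
import subprocess


def build_etask_command(dataflow, project_or_library, version, package_dir=None):
    """Return the etask argument list for the -x/-p/-V form."""
    command = ["etask", "-x", dataflow, "-p", project_or_library, "-V", str(version)]
    if package_dir is not None:
        # -D is added only when a path to the deployment package is supplied,
        # for example when the working directory is outside the package.
        command += ["-D", str(package_dir)]
    return command


def run_dataflow(dataflow, project_or_library, version, package_dir=None):
    """Run etask and return its exit code (assumes etask is on PATH)."""
    command = build_etask_command(dataflow, project_or_library, version, package_dir)
    return subprocess.run(command).returncode
```

    For example, build_etask_command("Dataflow1", "Project1", 0) produces the same arguments as the first example above. Passing the arguments as a list avoids quoting problems when the deployment package path contains spaces.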

    Alternatively, you can provide the name of the deployment artifact as an argument to etask, which relieves you of the need to supply the project name and version number. With this approach, you run a dataflow using one of the following commands; the second form can be used from any file system location.

    etask -x deploymentArtifactName

    or

    etask -x deploymentArtifactName -D pathToDeploymentPackageDirectory

    For example:

    etask -x Project1..0..Dataflow1 [-D pathToDeploymentPackageDirectory]

    or

    etask -x Library1..0..Dataflow3 [-D pathToDeploymentPackageDirectory]
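    When you already have the path to a deployment artifact file, the artifact form of the command can be derived from that path. The following is a minimal Python sketch; the function name command_for_artifact_file is hypothetical. It assumes, based on the examples above, that the name passed to -x is the artifact file name without its .rpx extension, and that the artifact sits one sub-directory below the deployment package directory.

```python
from pathlib import Path


def command_for_artifact_file(rpx_path):
    """Build 'etask -x artifactName -D packageDir' from an .rpx file path."""
    path = Path(rpx_path)
    if not path.name.endswith(".rpx"):
        raise ValueError("expected a path to an .rpx deployment artifact")
    # Assumption: the artifact name is the file name minus the extension.
    artifact_name = path.name[: -len(".rpx")]
    # Assumption: packageDir/artifactSubDir/artifact.rpx, so go up two levels.
    package_dir = path.parent.parent
    return ["etask", "-x", artifact_name, "-D", str(package_dir)]
```

    The returned list can be passed to subprocess.run in the same way as any other command. The sketch always includes -D, so the resulting command does not depend on the current working directory.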

    Additional Expressor 3.4 Command Line Arguments

    Because Expressor 3.4 dataflows may have multiple steps, the -d and -s command line arguments give you control over which steps are executed. Follow the -d flag with the index of the step where you want execution to begin. Steps are numbered sequentially from 1. In a three-step dataflow, for example, if you want to run only steps 2 and 3, enter -d 2 as a command line argument to etask. Use the -s flag to list the specific steps you want to run, in the order in which you want them to execute. Again in the three-step dataflow, to run only steps 1 and 3, enter -s 1,3 as a command line argument to etask.
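    The effect of the two flags can be modeled in a few lines. The following is a minimal Python sketch, not a description of how etask is implemented; the function names step_arguments and steps_that_run are hypothetical. It makes two assumptions that the text above does not state: that all steps run when neither flag is given, and that -d and -s are alternatives (the sketch rejects supplying both).

```python
def step_arguments(start_at=None, order=None):
    """Return the extra etask arguments for -d (start step) or -s (step list)."""
    if start_at is not None and order is not None:
        # Assumption of this sketch only: treat the two flags as alternatives.
        raise ValueError("supply either start_at (-d) or order (-s), not both")
    if start_at is not None:
        return ["-d", str(start_at)]
    if order is not None:
        return ["-s", ",".join(str(step) for step in order)]
    return []


def steps_that_run(total_steps, start_at=None, order=None):
    """Model which steps execute; steps are numbered from 1."""
    if order is not None:
        return list(order)
    first = 1 if start_at is None else start_at
    return list(range(first, total_steps + 1))
```

    For the three-step dataflow in the text, steps_that_run(3, start_at=2) gives [2, 3] and step_arguments(start_at=2) gives ["-d", "2"], while steps_that_run(3, order=[1, 3]) gives [1, 3] and step_arguments(order=[1, 3]) gives ["-s", "1,3"].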

    Expressor 3.4 also includes functionality that allows modifications to operator properties at runtime.  There are three ways you can override an operator's properties:

    1. Use the Write Parameters operator to create a listing of parameters and their values.
    2. Pass a new value for an operator property to the dataflow as an argument to etask.
    3. Pass the name of an operator property substitution file to the dataflow as an argument to etask.

    For further discussion of these options, see the knowledge base article Managing Dataflow Parameters.