JAVA Pipelines is a JAVA application port of CMS Pipelines, the IBM
product by John P. Hartmann that runs on the IBM Virtual Machine (VM)
operating system platforms.
It attempts to closely emulate a significant amount of the
functionality
that has been provided for years to VM REXX language programmers and
CMS users.
Each filter's functionality and limitations are documented in the
filter's description
and use page.
Of course CMS, CP, TSO and Assembly filter stages will not be provided
due to their
specific VM and MVS operating system nature, but other filter
stages that remain operating
system independent will have ported functions.
JAVA Pipelines will operate on any platform that supports the JAVA
Runtime Environment, so JAVA and NetRexx programmers can now enjoy
many of the same programmatic facilities and advantages found using
Pipelines on the IBM VM platform.
What Is a Pipeline?
JAVA pipelines are like the pipelines used in plumbing. Instead of
water flowing through pipes, however, data flows through programs. Data
records enter the pipe from a device (such as a disk), flow through
the pipeline, and eventually exit to another device (such as a file or
console display).
Programs,
like pieces of pipe, can be fit together to solve complex
text manipulation problems. Each program,
or stage, in the pipeline changes or manipulates the data that passes through it. As
data flows through the stages it is transformed, step-by-step, into the
results you need. The data flows from left to right. JAVA Pipelines
lets you combine programs so that the output of one program serves as
the input to the next.
Pipe Stages
In
a pipeline, the output of one stage is the input to the next. The
data itself is in the form of discrete records. Note that it is records
that flow through the pipeline, not
a continuous stream of bytes. A record is simply a string of
characters, perhaps a line of a file or a line entered at the
terminal. More precisely, it is a string of data that is terminated
by either a newline character or a carriage return/linefeed pair.
Imagine a stage
as a small factory through which a conveyor moves records. Records
enter the stage on the left, and leave on the right.
While within the stage, the records can be modified, discarded, or
split apart. Practically any manipulation can happen to them. Precisely
what happens depends on the stage that
is being used. Many stages write one output record for each input
record. Some, however,
do not.
The records entering a stage are called its input stream. The records
leaving a stage
are called its output stream.
Stages can use more than one input stream or output stream. You can use
these secondary
streams to write complex multistream pipelines.
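The record definition above can be illustrated with a minimal sketch. It is not part of JAVA Pipelines; the class name RecordSplitSketch and the method toRecords are hypothetical, and the sketch only assumes that a record ends at a newline or a carriage return/linefeed pair, as described above.

```java
import java.util.Arrays;
import java.util.List;

public class RecordSplitSketch {
    // Split raw text into records at each newline or carriage return/linefeed pair.
    // Hypothetical helper for illustration only; not a JAVA Pipelines API.
    static List<String> toRecords(String text) {
        if (text.isEmpty()) {
            return List.of();               // no data means no records
        }
        return Arrays.asList(text.split("\r?\n"));
    }

    public static void main(String[] args) {
        List<String> records = toRecords("first line\r\nsecond line\nthird line\n");
        // Each element is one discrete record that would flow from stage to stage.
        System.out.println(records.size() + " records: " + records);
    }
}
```

Each element of the returned list corresponds to one record on the conveyor; a stage would receive and emit such records one at a time rather than a stream of bytes.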
Execution Characteristics
Since each stage is assigned to run in its own thread of execution,
the work among the stages is performed concurrently as records become
available to each stage.
So while the PIPE is reading the last records in its input stage, it
may be writing the first records from its output stage and processing
the middle records in the intermediate stages.
This multithreaded design exploits CPU cycles that might otherwise be
wasted while waiting, which can greatly increase throughput compared
with single-threaded operation.
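The thread-per-stage idea can be sketched in plain JAVA as follows. This is a conceptual model only, not the JAVA Pipelines implementation: the class ThreadedStagesSketch, the EOF marker, and the use of BlockingQueue to connect stages are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class ThreadedStagesSketch {
    // Hypothetical end-of-stream marker, compared by identity so it cannot collide with data.
    static final String EOF = new String("<eof>");

    static List<String> run(List<String> input) {
        BlockingQueue<String> toMiddle = new LinkedBlockingQueue<>();
        BlockingQueue<String> toSink = new LinkedBlockingQueue<>();
        List<String> output = new ArrayList<>();

        // Source stage: writes every input record, then signals end of stream.
        Thread source = new Thread(() -> {
            for (String rec : input) toMiddle.add(rec);
            toMiddle.add(EOF);
        });
        // Intermediate stage: transforms each record as soon as it becomes available.
        Thread middle = new Thread(() -> {
            try {
                for (String rec = toMiddle.take(); rec != EOF; rec = toMiddle.take()) {
                    toSink.add(rec.toUpperCase());
                }
                toSink.add(EOF);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        // Sink stage: collects the records that leave the pipeline.
        Thread sink = new Thread(() -> {
            try {
                for (String rec = toSink.take(); rec != EOF; rec = toSink.take()) {
                    output.add(rec);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        source.start(); middle.start(); sink.start();
        try {
            source.join(); middle.join(); sink.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for stages", e);
        }
        return output;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("alpha", "beta", "gamma")));
    }
}
```

Because every stage runs in its own thread, the sink may already be collecting the first record while the source is still writing the last one.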
The PIPE Interfaces
JAVA Pipelines can be used from a supplied graphical User Command
Interface or invoked from JAVA or NetRexx programs using the
Application Programming Interface (API).
A primary and, optionally, one secondary input (or output) stream is
currently supported for those filters that are described as using them.
The Pipelines
Command Interface
Informational messages and
error messages will appear in their separate sub-windows on the Command
Interface. Informational messages will contain information
regarding the execution process. Error messages will contain
information about syntax, semantics, input or output errors.
The Command Interface allows you to enter a PIPE of any length. The
buttons let you start and stop pipe execution and clear the command
line.
Every pipe entered is archived in the pull-down combo box and also in a
disk file named PIPES.log.
Pipe output will appear on the "Terminal Command Screen".
Application Programming Interface for JAVA programmers
pipe apipe = new pipe();
String pipestring = "< test.data | filter1 | filter2 | ... | > output.file";
rc = apipe.apiprocess1(pipestring); // Return Code is rc
Application
Programming Interface for NetRexx programmers
apipe = pipe()
pipestring = "< test.data | filter1 | filter2 | ... | > output.file"
rc = apipe.apiprocess1(pipestring) --Return Code is rc
Pipe Return
Codes
API for NetRexx programmers using STEM input.
LINE = "STEM."                    --Set Default ID for Rexx index string
loop ix = 1 to 10                 --Populate the STEM variables.
  LINE[ix] = 'This is record '||ix
end
LINE[0] = ix - 1                  --Mark number of index strings in array
apipe = pipe()                    --Get a new pipe program instance
apiArg = Rexx ''                  --Rexx index string holds pipeline
apiArg[0] = 4                     --Number of stages (index 1 through n)
apiArg[1] = "STEM"                --Tells pipe that input is a STEM
apiArg[2] = LINE                  --Name of Rexx index string serving as STEM
apiArg[3] = "CHANGE /A/a/*"       --Change all 'A' to lowercase
apiArg[4] = "> OUTPUT.FILE"       --Output
rc = apipe.apiprocess2(apiArg)    --Return Code is rc
API for NetRexx programmers using STEM output.
OUTLINE = "STEM."                 --Set Default ID for Rexx output index string
apipe = pipe()                    --Get a new pipe program instance
apiArg = Rexx ''                  --Rexx index string holds pipeline
apiArg[0] = 4                     --Number of stages (index 1 through n)
apiArg[1] = "LITERAL this is the time for all good men."  --Input
apiArg[2] = "COUNT WORDS"         --Tells pipe to count the words
apiArg[3] = "STEM"                --Tells pipe that output is a STEM
apiArg[4] = OUTLINE               --Name of Rexx index string serving as STEM
OUTLINE = apipe.apiprocess2(apiArg)  --STEM output will be returned as a result
if OUTLINE < 0 then               --Check return code
  do
    say 'Error, RC='OUTLINE
  end
API for NetRexx programmers using STEM input and STEM output.
OUTLINE = "STEM."                 --Set Default ID for Rexx output index string
LINE = "STEM."                    --Set Default ID for Rexx input index string
loop ix = 1 to 10                 --Populate the STEM variables.
  LINE[ix] = 'This is record '||ix
end
LINE[0] = ix - 1                  --Mark number of index strings in array
apipe = pipe()                    --Get a new pipe program instance
apiArg = Rexx ''                  --Rexx index string holds pipeline
apiArg[0] = 5                     --Number of stages (index 1 through n)
apiArg[1] = "STEM"                --Tells pipe that input is a STEM
apiArg[2] = LINE                  --Name of Rexx index string serving as STEM
apiArg[3] = "HEXLATE"             --Tells pipe to translate contents to hex
apiArg[4] = "STEM"                --Tells pipe that output is a STEM
apiArg[5] = OUTLINE               --Name of Rexx index string serving as STEM
OUTLINE = apipe.apiprocess2(apiArg)  --STEM output will be returned as a result
if OUTLINE < 0 then               --Check return code
  do
    say 'Error, RC='OUTLINE
  end
Programmers who employ the API should ensure that both the
PIPELINES.JAR and NetRexxR.JAR
files appear in their CLASSPATH when compiling and executing their
programs.
The PIPE Command String
To run a pipeline, use the PIPE command string. This is entered from
the PIPEGUI command line. PIPE accepts one or more pipelines as
operands.
In a pipeline, stages are separated by a character called a Stage
Separator (the default is |):
(OPTIONS) stage_1 |
stage_2 | stage_3 | stage_4 | stage_5 |
... | stage_n [ another stream of stages ]
Examples:
(trace) < test.data | locate /LUCKY/ | change /LUCKY/Loser/* | console
< /home/temp/file1.txt | b: concat | > combined.txt ? < /home/temp/file2.txt | b:
Do
not place stage separators prior to the first stage or after the last
stage.
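When a PIPE command string is assembled inside a program, the separator rule above is easy to satisfy by joining the stages rather than concatenating them by hand. The following is only a sketch; the class PipeStringSketch and its build method are hypothetical helpers, not part of the JAVA Pipelines API.

```java
import java.util.List;

public class PipeStringSketch {
    // Join the stages with the stage separator. String.join places the separator
    // only between elements, so none appears before the first or after the last stage.
    static String build(char stageSep, List<String> stages) {
        if (stages.isEmpty()) {
            throw new IllegalArgumentException("a pipeline needs at least one stage");
        }
        return String.join(" " + stageSep + " ", stages);
    }

    public static void main(String[] args) {
        String command = build('|', List.of(
                "< test.data", "locate /LUCKY/", "change /LUCKY/Loser/*", "console"));
        System.out.println(command);
        // The resulting string is what would be handed to the API call shown in the
        // API section (apiprocess1), which requires PIPELINES.JAR on the CLASSPATH.
    }
}
```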
For the default stage separator, the PIPE command expects the vertical
bar ( | ), which is X'7C' on ASCII systems and X'4F' on EBCDIC systems.
You must determine which key on your terminal generates this
character. It is a solid vertical bar (|) on most computer
keyboards. Some workstation programs map the solid vertical
bar to the split vertical bar. The solid
vertical bar is the "LOGICAL OR" operator in JAVA and NetRexx programs. In
a pipeline, it indicates where one stage ends and another one begins.
Device
Drivers
Device
drivers are stages that interact with devices or other system
resources. The
simplest pipelines consist of two device drivers. Data read from one
device moves through the pipeline to the other device. For example,
the following commands read data from a file, pass it through the
LOCATE and COUNT filters, and write the result to your terminal
(change TEST.DATA to the name of an existing file):
< test.data | LOCATE -mymatch- | COUNT | console
< test.data | LOCATE /mymatch/ | count | > output.file
Filters
A
filter reads data records from its input stream, does some work using
that data, and writes the
results to its output stream. The difference between a filter and a
device driver is that a filter does not interact with devices or other
system resources, whereas a device driver
lets you get data in and out of a pipeline.
The filters are stages that work on
data records already in the pipeline. The COUNT stage used in the above
example pipeline is a filter. It counts every record that flows into it
from its input stream. Then it writes one record containing that count
to its output stream. The LOCATE stage is also a filter.
It examines the records from its input stream, looking for those that
match a specified string. It also provides for Regular
Expressions. If the record matches, LOCATE writes the
record to its output stream. LOCATE
discards records that do not match.
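The behavior described for LOCATE and COUNT can be modeled in a few lines of plain JAVA. This is a conceptual sketch, not the JAVA Pipelines implementation: the class FilterSketch and its methods are hypothetical, and only plain substring matching is modeled (regular expressions are left out).

```java
import java.util.List;
import java.util.stream.Collectors;

public class FilterSketch {
    // LOCATE-like filter: write records that contain the string, discard the rest.
    static List<String> locate(List<String> input, String match) {
        return input.stream()
                .filter(record -> record.contains(match))
                .collect(Collectors.toList());
    }

    // COUNT-like filter: consume every input record, then write one record with the count.
    static List<String> count(List<String> input) {
        return List.of(String.valueOf(input.size()));
    }

    public static void main(String[] args) {
        List<String> records = List.of("mymatch here", "nothing to see", "another mymatch");
        // Output of one stage is the input of the next, as in: LOCATE /mymatch/ | COUNT
        System.out.println(count(locate(records, "mymatch")));
    }
}
```

Note that locate writes at most one output record per input record, while count writes a single record no matter how many it reads.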
What is a Pipeline stall?
The first pipeline below may cause a stall. Placing the ELASTIC stage
in the second pipe segment, as in the second pipeline, remedies the
possible stall.
(example 1)
< test.data | a: fanout | b: fanin | console ? a: | b:
(example 2)
< test.data | a: fanout | b: fanin | console ? a: | elastic | b:
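One way to picture why such a pipeline can stall is the model below. It rests on assumptions that are not stated on this page: that a FANOUT-like stage copies each record to both of its outputs in turn, that a FANIN-like stage drains its primary input to the end before reading its secondary input, and that an ELASTIC-like stage behaves as a larger buffer. The class StallSketch and everything in it is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class StallSketch {
    static final String EOF = new String("<eof>"); // hypothetical end-of-stream marker

    // Returns the records collected by the FANIN-like reader, or null if the
    // FANOUT-like writer could not hand over a record in time (a modeled stall).
    static List<String> run(List<String> input, int secondaryCapacity) {
        BlockingQueue<String> primary = new ArrayBlockingQueue<>(1);
        BlockingQueue<String> secondary = new ArrayBlockingQueue<>(secondaryCapacity);
        List<String> out = new ArrayList<>();

        // FANIN-like reader: all of the primary stream first, then the secondary.
        Thread fanin = new Thread(() -> {
            try {
                for (String r = primary.take(); r != EOF; r = primary.take()) out.add(r);
                for (String r = secondary.take(); r != EOF; r = secondary.take()) out.add(r);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        fanin.start();

        List<String> toSend = new ArrayList<>(input);
        toSend.add(EOF);
        boolean stalled = false;
        try {
            // FANOUT-like writer: each record goes to both output streams.
            for (String r : toSend) {
                if (!primary.offer(r, 500, TimeUnit.MILLISECONDS)
                        || !secondary.offer(r, 500, TimeUnit.MILLISECONDS)) {
                    stalled = true; // nobody is reading the full stream
                    break;
                }
            }
            if (stalled) fanin.interrupt();
            fanin.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        }
        return stalled ? null : out;
    }

    public static void main(String[] args) {
        List<String> data = List.of("r1", "r2", "r3");
        System.out.println("small buffer:   " + run(data, 1));   // models example 1
        System.out.println("elastic buffer: " + run(data, 16));  // models example 2
    }
}
```

In this model the writer blocks on the secondary stream while the reader is still waiting on the primary stream, so neither can proceed. Giving the secondary path enough buffering, which is the role the sketch assigns to ELASTIC, lets the writer finish and the reader catch up.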
Terminology
SOURCE
This is a file, stem variable, or literal that serves as data input.
SINK
This is a file or console to which final output records are directed.
The
currently available filter stages are:
Pipeline:
stagecmd stagesep stagecmd
Stage:
stagecmd
Label Group:
label:
label stagecmd
Opt A:
(ENDCHAR char)
(ESCAPE)
(STAGESEP char)
(TRACE)
Use the PIPE command or the PIPELINES script to invoke JAVA Pipelines.
Operands
stagesep
is the stage separator character, which separates a stage from a
following stage. By default, the stage separator character is the
character on your terminal with a value of X'7C' on ASCII systems
and X'4F' on EBCDIC systems. (It is the solid
vertical bar on most terminals.) However, you can use the STAGESEP or
SEPARATOR option to assign a different stage separator character. You
cannot specify left parenthesis, right parenthesis, asterisk (*),
period, colon (:), or blank for the stage separator character.
endchar
is the pipeline end character defined by the ENDCHAR option. Use
endchar
to separate multiple pipelines on a single
PIPE command. You
must specify the ENDCHAR option to use endchar.
You cannot
specify
left parenthesis, right parenthesis, asterisk (*), period, colon (:),
or blank for endchar.
escape
assigns an escape character, char, that can be used to override the
processing of a character that has special meaning to the PIPE command.
These special characters include the stage separator character, the
pipeline end character (if defined), and the escape character (if
defined). Left parenthesis,
right parenthesis, asterisk (*), period and colon (:) may have a
special meaning,
depending on their placement. You must place the escape character
IMMEDIATELY
before the character that you do not want treated as a special
character. The escape character must be specified as a single
character, char.
You cannot specify left parenthesis, right parenthesis, asterisk (*),
period,
colon (:), or blank for the escape character. There is no default
escape character.
You cannot specify the ESCAPE option for an individual stage.
label
is a label that identifies where a stream enters or leaves a particular stage that
has multiple input or output streams.
The first occurrence of a label is called a label definition.
It establishes the potential for intersecting pipelines to be attached at the position
in the pipeline where it is specified. Each subsequent use of the same label is
called a label reference.
Use label references to define additional input and output
streams for the stage. To use a label reference, specify only the
label, with no stage following it.
A label is a string of up to 8 alphanumeric characters. A label must
be immediately followed by a stream identifier or a colon with no
intervening blanks.
Example: the following reads two files into separate streams. After
each stream has read a record, the two records are concatenated and
written as a single record to the output file. This continues for all records.
< /home/temp/file1.txt | b: concat | > combined.txt ? < /home/temp/file2.txt | b:
operands
are any operands valid for the specified built-in stage or
user-written stage.
I/O direction indicators
The <, >, and >> indicators must be delimited by spaces, separating
them from any operands on either side.
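The distinction between a label definition and a label reference, described under the label operand above, can be sketched as follows. The class LabelSketch is hypothetical and simplified: it recognizes only the "label:" form (up to 8 alphanumeric characters followed by a colon) and does not model stream identifiers.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LabelSketch {
    // Up to 8 alphanumeric characters immediately followed by a colon,
    // then the (possibly empty) stage text.
    private static final Pattern LABEL = Pattern.compile("^([A-Za-z0-9]{1,8}):\\s*(.*)$");

    static List<String> classify(List<String> stages) {
        Set<String> seen = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String stage : stages) {
            Matcher m = LABEL.matcher(stage.trim());
            if (!m.matches()) {
                result.add("stage: " + stage.trim());
            } else if (seen.add(m.group(1))) {
                // First occurrence of the label is its definition.
                result.add("definition of " + m.group(1) + ": " + m.group(2));
            } else {
                // Every later occurrence is a reference.
                result.add("reference to " + m.group(1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Stages of the concat example above, listed across both pipelines.
        List<String> stages = List.of(
                "< /home/temp/file1.txt", "b: concat", "> combined.txt",
                "< /home/temp/file2.txt", "b:");
        classify(stages).forEach(System.out::println);
    }
}
```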
Options
You can specify options in two ways on a PIPE command:
1. You can specify options immediately after the command name, PIPE. In
this case, the scope of the options is global to the entire PIPE
command.
2. You can specify options at the beginning of a stage. In this case,
the scope of the options is limited to the stage on which the options
are specified. If a label definition is specified, the options must
follow the label definition.
The following options cannot be specified at the beginning of a stage:
ENDCHAR, ESCAPE, NAME, STAGESEP, and SEPARATOR.
Options specified on a stage override options specified globally on a
PIPE command. You must enclose options in parentheses.
ENDCHAR char
defines the pipeline end character. You can specify the character as a
single character, char,
or the 2-character hexadecimal representation of a character, hexchar.
Do not enclose the hexadecimal representation in quotation marks.
You cannot specify left parenthesis, right parenthesis, asterisk (*),
period, colon (:), or blank as the pipeline end character. You cannot
specify ENDCHAR as an option for an individual stage.
ESCAPE char
assigns an escape character, char,
that can be used to override the processing of a character that has
special meaning to the PIPE command. These special characters include
the stage separator character, the pipeline end character (if defined),
and the escape character (if defined). Left parenthesis, right
parenthesis, asterisk
(*), period and colon (:) may have a special meaning, depending on
their placement. You must place the escape character immediately before
the character that you do not want treated as a special character. The
escape character can be specified as a single character, char.
You cannot specify left parenthesis, right parenthesis, asterisk (*),
period, colon (:), or blank for the escape character. There is no
default escape character. You cannot specify the ESCAPE option for an
individual stage.
STAGESEP char
assigns the stage separator character. Use the stage separator
character to separate the specification of a stage from a subsequent
stage. The character can be specified as a single character, char,
or the 2-character hexadecimal representation of a character, hexchar.
Do not enclose the hexadecimal representation in quotation marks.
If you change the definition of the stage separator to a character
other than the default stage separator, you can use the default stage
separator as an argument of a stage.
You cannot specify left parenthesis, right parenthesis, asterisk (*),
period, colon (:), or blank as the STAGESEP or SEPARATOR character. You
cannot specify STAGESEP or SEPARATOR as an option for an individual
stage.
TRACE
displays trace information. TRACE is useful for debugging pipeline
application programs. This option can cause a large amount of data to
be displayed.
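How the stage separator, pipeline end character, and escape character interact can be shown with a small parser. This is a sketch under stated assumptions, not the parser used by JAVA Pipelines: the class PipeSplitSketch is hypothetical, the option block in parentheses is assumed to have been removed already, and surrounding blanks are trimmed from each stage for readability.

```java
import java.util.ArrayList;
import java.util.List;

public class PipeSplitSketch {
    static final char NONE = '\0'; // means "no such character defined"

    // Split a command string into pipelines (at endChar) and stages (at stageSep).
    // A character preceded by the escape character is taken literally.
    static List<List<String>> parse(String text, char stageSep, char endChar, char escape) {
        List<List<String>> pipelines = new ArrayList<>();
        List<String> stages = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (escape != NONE && c == escape && i + 1 < text.length()) {
                current.append(text.charAt(++i));      // keep the escaped character as data
            } else if (endChar != NONE && c == endChar) {
                stages.add(current.toString().trim()); // close the stage and the pipeline
                current.setLength(0);
                pipelines.add(stages);
                stages = new ArrayList<>();
            } else if (c == stageSep) {
                stages.add(current.toString().trim()); // close the stage only
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        stages.add(current.toString().trim());
        pipelines.add(stages);
        return pipelines;
    }

    public static void main(String[] args) {
        // Two pipelines separated by the end character '?'.
        System.out.println(parse("< f1 | b: concat | > out ? < f2 | b:", '|', '?', NONE));
        // With '!' as the escape character, the escaped bar stays inside the LOCATE operand.
        System.out.println(parse("< f1 | locate /a!|b/ | console", '|', NONE, '!'));
    }
}
```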
Usage Notes
1. Specifying a stage defines both the primary input and primary output
streams for that stage. Using label references defines additional input
and output streams for a stage.
2. The stages of a PIPE command can write records up to 2**(7) - 1
bytes in length.
3. If a PIPE command is too long to type conveniently on the command
line you could put each stage on a separate line adhering to the JAVA
or NetRexx continuation rules.
4. The escape character, stage separator character, and pipeline end
character have no effect within the specification of options. These
characters take effect only outside the parentheses in which the
options are enclosed.
For example, in the following PIPE command, only the third and fourth
occurrences of the @ character are treated as a stage separator
character.
(stagesep @ name @PIPE1) < INPUT FILE @ locate /||/ @ console
This command displays any lines of the file INPUT FILE that contain the
character string, ||.
The first occurrence of the @
character defines @ as the stage separator. Because the second
occurrence of
the @ character appears within the specification of the options, the
second @ is treated as part of the name assigned to the pipeline,
@PIPE1.
5. When specifying the ENDCHAR, ESCAPE, SEPARATOR, or STAGESEP option,
you can use the 2-character hexchar
form to define the
character. However, when the character is subsequently used in the
pipeline, a single character must be specified.
(stagesep 6C) < test.file % find ABcd % console
Note that 6C is the hexadecimal value of the percent sign (%).
6. A colon may NOT be used in a file name.
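The char and hexchar forms mentioned in note 5 can be decoded as in the sketch below. The class HexCharSketch is hypothetical, and the code page is an assumption: the sketch decodes hexadecimal values as ASCII/Unicode, under which X'7C' is the vertical bar (as noted for ASCII systems under stagesep). Under that assumption the percent sign is X'25', whereas X'6C' is its EBCDIC code; which interpretation JAVA Pipelines applies is not stated here.

```java
public class HexCharSketch {
    // Accept either a single character or a 2-character hexadecimal code.
    // Hexadecimal values are decoded as ASCII/Unicode code points (an assumption).
    static char parseOptionChar(String operand) {
        if (operand.length() == 1) {
            return operand.charAt(0);
        }
        if (operand.length() == 2) {
            return (char) Integer.parseInt(operand, 16); // throws NumberFormatException on bad hex
        }
        throw new IllegalArgumentException("expected char or 2-digit hexchar: " + operand);
    }

    public static void main(String[] args) {
        System.out.println(parseOptionChar("@"));   // single-character form
        System.out.println(parseOptionChar("7C"));  // hexchar form, vertical bar in ASCII
    }
}
```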
For more information about using pipelines, see the
IBM CMS Pipelines Reference and Pipelines User's Guide
at the IBM website.
Copyright © Cullen Programming 1987, 2017
All Rights Reserved