Super-Kamiokande Anti-Detector Event Building: "Anti-Sorter"

(also available as a PostScript file)

Larry Wai, rev. 26 Feb. 1997 [HGB, 4 Feb. 1998]


  1. Introduction
  2. Anti-Detector Data Acquisition and Data Format
  3. Overview of Anti-Sorter Code Structure
  4. Scanning of Fastbus Data
  5. Merging of VME and Fastbus Data
  6. Repacking of Fastbus TDC Data
  7. Summary of Data Integrity Checks

1. Introduction

In this report we shall describe how the "anti-sorter" online process builds the Super-Kamiokande anti-detector event from block data read out from 4 Fastbus crates and 1 VME crate. The current version at the time of the writing of this report is "sorter.c.sep23". However, the basic structure of the code is essentially the same as that used at the beginning of the data taking period, April 1, 1996.

The anti-sorter performs two important tasks:

1) converts block data as collected from the Fastbus and VME crates into complete anti-detector events, and
2) checks data integrity.

First we give an overview of the parts of the anti-detector data acquisition and describe the block data format which the anti-sorter uses as input, as well as the event data format used as output, in section 2. We then give an overview of the coding structure of the anti-sorter process in section 3, followed by a more detailed description of the individual components of the anti-sorter code in sections 4 through 6. A summary of how data integrity is enforced is given in section 7.


2. Anti-Detector Data Acquisition and Data Format

The anti-detector data acquisition consists of 4 Fastbus crates, one in each quadrant hut, and 1 VME crate in the central hut. The Fastbus crates each contain 5 LeCroy 1877 TDC modules, and one Struck STR137 latch module. The TDC modules produce time data which are converted into time and charge for each PMT channel, and the latch module obtains the event number from the TRG module. The VME crate produces universal time from a GPS module, "Busy In Progress" (BIP) information from the Fastbus crates, as well as the event number from the TRG module. As we shall see in section 5, the event number from the TRG module is essential for providing the reference point for merging the anti-detector data from the 5 crates.

The data from each crate is collected by the "anti-collector" (written, developed, and tested by Jeff George) in block form, which is subsequently passed to the anti-sorter. The Fastbus crate data structure is shown in the figure below. Each module header contains two pieces of useful information: the number of words in the module event, and the event buffer number (8 total for the LeCroy 1877 TDC). The number of words in the module event is essential for finding the next module header, and the event buffer number is used for checking the integrity of the data. All 5 TDC modules are required to have the same event buffer number for a particular event. The use of the event buffer number to check data integrity is described in subsection 4.2 and section 5.

FASTBUS CRATE DATA FORMAT:
module 1 header
ith channel number + time edge word
jth channel number + time edge word
...
module 2 header
ith channel number + time edge word
jth channel number + time edge word
...

Fastbus crate data structure (see also example)

The VME crate data, along with additional information, comes in a format which consists of 15 words per event. Not all 15 words are used, and the anti-sorter adds additional information from the Fastbus crates. These 15 words form the event header for each completed anti-detector event. In the next figure we show the event header, which is filled with VME crate and other data by the anti-collector, and with additional information by the anti-sorter.

ANTI-DETECTOR EVENT HEADER:
word   name             description
 1     data_size        size of event in bytes
 2     data_type        data type flag (50 for anti-detector data)
 3     run_number       run number
 4-5   nodename         name of online workstation; 8-byte char array
                        containing the string "sukant" including null at end
 6     server_number    (10 for sukant)
 7     ltcgps           LTC timestamp at last GPS update (uncorrected)
 8     event_number     event number
 9     gps_sec/NSGPS    UTC at last update to seconds (in seconds since 1/1/70)
10     gps_usec/NUSGPS  higher precision UTC (in microseconds)
11     ltctrg           LTC timestamp at TRG
12     ltcbip           LTC timestamp at end of BIP
13     T0(1),T0(2)      trigger times for huts 1 and 2 (16 bits each)
14     T0(3),T0(4)      trigger times for huts 3 and 4 (16 bits each)
15     status_flag      Busy bits and other event status bits (repacked in offline data)

Anti-detector event header (see also example);
all data is filled in by the anti-collector, except for words 13 and 14 which are filled in by the anti-sorter

The next figure gives an overview of how all the input block data is converted to the output anti-detector event format. The block data header consists of 7 words. The first word is the number of words in the buffer. The next 6 words are memory pointers indicating the beginning of 6 blocks of data. The first block is the VME data, the next four blocks are the Fastbus data, and the last block is currently unused.

The Fastbus data has to be digested carefully before it can be merged correctly with the VME data, and several data integrity checks are performed during this procedure. In the next few sections, we now turn to a detailed discussion of the anti-sorter code to see how this is done.

BLOCK DATA FROM ANTI-COLLECTOR:
block data header (7 words)
VME data
Fastbus crate 1 data
Fastbus crate 2 data
Fastbus crate 3 data
Fastbus crate 4 data
(empty block)

ANTI-SORTER

ANTI-DETECTOR EVENT TO ANTI-SENDER:
Event 1 header (15 words)
ith channel number, leading edge, trailing edge
jth channel number, leading edge, trailing edge
...
Event 2 header (15 words)
ith channel number, leading edge, trailing edge
jth channel number, leading edge, trailing edge
...

Input to and output from the "anti-sorter" online process.
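The 7-word block data header described in this section could be read along the following lines. This is only a sketch: the struct and function names are hypothetical, and the six "memory pointers" are modelled here as word offsets from the start of the buffer, which is an assumption rather than something stated in this report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { N_BLOCKS = 6, HEADER_WORDS = 1 + N_BLOCKS };

/* Hypothetical view of the 7-word block data header. */
typedef struct {
    uint32_t n_words;           /* word 1: number of words in the buffer */
    uint32_t offset[N_BLOCKS];  /* words 2-7: start of each block
                                   (assumed to be word offsets) */
} block_header;

/* Returns 0 on success, -1 if the buffer is too short or an offset
 * points past the end of the buffer. */
int parse_block_header(const uint32_t *buf, size_t len, block_header *h)
{
    assert(buf != NULL && h != NULL);
    if (len < HEADER_WORDS) return -1;

    h->n_words = buf[0];
    if (h->n_words > len) return -1;

    /* Block order: VME, Fastbus crates 1-4, unused block. */
    for (int i = 0; i < N_BLOCKS; i++) {
        h->offset[i] = buf[1 + i];
        if (h->offset[i] > h->n_words) return -1;
    }
    return 0;
}
```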


3. Overview of Anti-Sorter Code Structure

The structure of the anti-sorter code can be broken down roughly as shown in the following figure. Note that in this diagram we have not shown the initialization procedure, nor the parts of the code which allow the run control to stop the anti-sorter at the end of a run.

Data is received in block form from the anti-collector in "NOVA" buffers of roughly 256 events. This number cannot be much larger because of memory limitations in the SUN workstation which serves as the anti-detector online computer. This number can be made smaller, but was originally chosen to be as large as possible due to lack of buffering, and consequent loss of events, during the dual port memory switching in the VME crate. Two parameters in the anti-sorter code are relevant to the buffer size: NEVT_MAX_COLLECTOR (currently set to be 300) and NEVT_MAX_SAVE (currently set to be 40). NEVT_MAX_COLLECTOR sets the size of the internal arrays which buffer the crate data as it is sorted, and NEVT_MAX_SAVE sets the size of the internal arrays which buffer the crate data which are unmatched after finishing a particular buffer of data. This unmatched crate data at the end of a buffer is saved until the next buffer of data is received, when it is put at the front of the crate data to be matched. In this way, the 5 blocks of crate data do not need to be perfectly synchronized. In practice, zero to a few events in a buffer of 256 events must be saved across buffers to form complete events for data taking at 10 Hz. This number increases with the data taking rate.


get buffer of block data from anti-collector

scan Fastbus data (section 4)

merge VME and Fastbus data (section 5)

repack Fastbus data (section 6)

check data quality (see section 7)

send anti-detector events to anti-sender

next buffer

Overview of anti-sorter code (see also flow chart)


4. Scanning of Fastbus Data

A major part of the overall anti-sorter code consists of a scan through all Fastbus data. The figure below lists the important tasks of the Fastbus data scanning code, ordered as the code is actually found. Each buffer of data received from the anti-collector has four blocks of Fastbus data, each from a different Fastbus crate. There are roughly 256 events in each block of data. Each block of data is scanned separately, and the events from each crate are merged later with the VME data to form a complete anti-detector event (see section 5).

  • loop over TDC events in each block of Fastbus crate data

    • check for valid beginning of an event

    • loop over LeCroy 1877 TDC modules in each crate TDC event

      • module header word

      • loop over data words in each TDC module event

        • check module number
        • get time edge sign, time, channel number
        • get trigger time
        • get 4 lowest bits of event number at trigger time

    • get Struck STR137 latch event number

    • take into account rollover of 16 bit event number counter

    • check module event buffer numbers

    • compare TDC event number bits with latch event number

    • loop over triggers in the TDC time window

      • calculate event number

      • copy TDC data from first event in time window

Overview of Fastbus scanning code in anti-sorter

4.1. Checking for Valid Beginning of an Event

The Fastbus data scanning loop over TDC events for a single crate begins with a check for valid beginning of an event. The first two words of a valid event consist of the DC2 (*1) byte count for the event (denoted by the variable dc2_head in the code) and the FSCC (*2) "header and counter" word (denoted fscc_head). The lowest 11 bits of the FSCC header and counter word contain a word count for the event. If these two word counts for the event match, then we consider the beginning of the event to be valid. If not, then we throw away the first word, and repeat the check for valid beginning of event with the next two words.

Note that a current hardware problem in the FSCC occurs on occasion so that the FSCC word counter is incorrect for all events after a certain point in a run. To remedy this, we consider a second criterion for valid beginning of event in case the DC2 byte count and FSCC word count do not match. By chance, the last two words in an event are zero at present. This is because the last module read out is the Struck STR137 latch, which produces 3 words of data. The first word is the 16 bit event number, and the last two words are unused at present. Thus, in case the DC2 byte count and FSCC word count do not match, we look at the last two words of the previous event. If they are both zero, then we assume a valid beginning of event.
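The two criteria for a valid beginning of event can be sketched as follows. The variable names dc2_head and fscc_head are taken from the text; the function names are hypothetical. The sketch assumes that the DC2 byte count equals four times the FSCC word count, i.e. that both counts cover the same words; the report does not state how the two counts are normalised, so the comparison in the real code may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FSCC_COUNT_MASK 0x7FFu   /* lowest 11 bits: FSCC word count */

/* Returns 1 if buf[pos], buf[pos+1] look like the start of an event. */
int valid_event_start(const uint32_t *buf, size_t len, size_t pos)
{
    assert(buf != NULL);
    if (pos + 1 >= len) return 0;

    uint32_t dc2_head  = buf[pos];      /* DC2 byte count */
    uint32_t fscc_head = buf[pos + 1];  /* FSCC header and counter word */

    /* First criterion: the two counts agree
     * (assumption: 4 bytes per counted word). */
    if (dc2_head == 4u * (fscc_head & FSCC_COUNT_MASK)) return 1;

    /* Second criterion: the previous event ended with the two
     * unused (zero) words of the Struck STR137 latch. */
    return pos >= 2 && buf[pos - 2] == 0 && buf[pos - 1] == 0;
}

/* Throws away one word at a time until a valid start is found.
 * Returns len if no valid start exists. */
size_t find_event_start(const uint32_t *buf, size_t len, size_t pos)
{
    while (pos < len && !valid_event_start(buf, len, pos)) pos++;
    return pos;
}
```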

Note that in the FSCC header and counter word there is a "bad event bit" (18th bit). We count the number of times this is set true for each crate, and the result is written to the log file. However, no further action is taken at the present time. The setting of the bad event bit seems to be correlated with the above mentioned FSCC word counter hardware problem.

4.2. Loop Through TDC Data

After a valid beginning of an event has been identified, we then proceed to loop through all of the TDC modules. The first word is the module header for the first TDC. This header contains two useful pieces of information:
1) the number of words for the module event, and
2) the module event buffer number.
The number of words for the module event allows us to know where the beginning of the next module event is located. The module event buffer number is used for comparison with other modules to ensure data integrity. If any TDC module event buffer number does not match the others in the crate, then we skip the crate event.

As we loop through the TDC data words we extract several pieces of information: the module number, the time edge sign, the 16 bit time, and the TDC channel number. The TDC channel number is converted into a global anti-detector PMT number (*3), defined as follows:

  crate*480 + module*96 + channel
where
  crate=0,1,2,3  module=0,1,2,3,4  channel=0,1,2,3,...,95
The module number is used as a simple check on data integrity, i.e., if the module number from the TDC data word does not match the TDC module number, we skip the event. The time edge sign, 16 bit time, and channel number are all saved for repacking into the final anti-detector event format (see section 6).
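As a small illustration of the channel numbering formula, the helper below (a hypothetical name, not taken from the anti-sorter source) computes the global PMT number and rejects indices outside the ranges given above.

```c
#include <assert.h>

enum { N_CRATES = 4, N_MODULES = 5, N_CHANNELS = 96 };

/* Global anti-detector PMT number: crate*480 + module*96 + channel. */
int pmt_number(int crate, int module, int channel)
{
    assert(crate   >= 0 && crate   < N_CRATES);
    assert(module  >= 0 && module  < N_MODULES);
    assert(channel >= 0 && channel < N_CHANNELS);
    return crate * (N_MODULES * N_CHANNELS) + module * N_CHANNELS + channel;
}
```

The largest value, for crate 3, module 4, channel 95, is 1919, which fits in the 12 bit channel field of the repacked hit format (see section 6).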

We also check a certain TDC channel in the crate for the trigger time. Currently, this is in the 5th TDC module (out of 5 modules), 92nd channel (out of 96 channels). This is saved to calibrate the time of the event. Also, we check the 93rd through 96th channels in a 50 ns (parameter EVTNUM_DELTA) time window around the trigger time for edges which give us the lowest 4 bits of the event number from the TRG module. These 4 bits are used to obtain the event number, as well as check the integrity of the event number, described in the next subsection.

4.3. Fastbus Crate Event Numbers

The final step in processing the Fastbus crate event is to determine the event numbers for the data. Note that more than one trigger can occur during the several microsecond time window of the TDC. The Struck STR137 latch obtains the event number for the first trigger in the time window, which also serves as the STOP for the TDCs [After a 6 microsecond delay.]. Subsequent triggers which follow within a few microseconds will have new event numbers, which may or may not increase by units of 1, depending upon the triggering configuration. The lowest 4 bits of the event number are fed into TDC channels, and the event number is reconstructed using this information. The difference between the 4 bit TDC event number for subsequent triggers in the window and the 4 bit TDC event number for the trigger associated with the 16 bit Struck latch event number modulo 16 is used as an offset to be added to the Struck latch event number.

The event number is uniquely determined, as long as there are fewer than 16 triggers within a TDC time window (as is the case in current data taking at Super-Kamiokande).
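The offset arithmetic for later triggers in the TDC window can be written out as a short sketch. The function and argument names are hypothetical; only the modulo 16 difference added to the latch event number is taken from the text.

```c
#include <assert.h>
#include <stdint.h>

/* latch_evnum: event number of the first trigger (Struck latch,
 *              after rollover correction)
 * first_bits:  4 bit TDC event number of that first trigger
 * trig_bits:   4 bit TDC event number of a later trigger in the window */
uint32_t event_number_in_window(uint32_t latch_evnum,
                                unsigned first_bits, unsigned trig_bits)
{
    assert(first_bits < 16 && trig_bits < 16);
    /* Difference modulo 16; unsigned wraparound keeps this non-negative. */
    unsigned offset = (trig_bits - first_bits) & 0xFu;
    return latch_evnum + offset;
}
```

For example, with a latch event number of 1000 whose 4 bit pattern is 14, a later trigger carrying the pattern 1 gets offset (1 - 14) mod 16 = 3 and therefore event number 1003.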

The last three words of the event contain the information latched by the Struck STR137 module in the Fastbus crate. The first word contains the 16 bit event number from the TRG module, and the last two words are currently unused. The 16 bit event number is checked for "rollover" by comparing the event number with the previous event number. If the difference is 65536 (variable MAX_EVTNUM_TDC) to within a margin of 1024 (variable EVTNUM_DELTA), then we increase all subsequent event numbers by 65536.

This completes the description of how Fastbus crate data is converted into crate events with event numbers. In the next section we turn to a discussion of how the 4 Fastbus crate and VME crate events are merged into a complete anti-detector event.


5. Merging of VME and Fastbus Data

The next figure summarizes the anti-sorter code which merges the events from the 4 Fastbus crates and the VME crate for each buffer of data. Note that the basic philosophy at present is to use the event number latched in the VME crate as the reference point. The Fastbus crate events are then matched to the VME crate events. The reason for this is that ideally we will have some dead time while the TDC data is being digitized in the Fastbus crates. This will lead to some missing events. However, there should in principle be no dead time in the VME crate. Thus for an ideal DAQ system, the event numbers latched in the VME crate should provide us with the complete set of anti-detector event numbers for the run.

The first task of this section is to determine the last event number to look at for this buffer. This is defined to be the lowest event number with all 5 crates of data. The events with event numbers higher than this are saved for matching with data from the next buffer. The reason why we need this is because the block data from the various crates are not perfectly synchronized. It is possible for events to have data spread across two input buffers from the anti-collector.
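One way to read this cutoff is as the smallest, over the 5 crates, of the highest event number each crate has delivered so far. The sketch below follows that interpretation, which is an assumption; the function name and the array of per-crate maxima are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { N_CRATES_TOTAL = 5 };  /* 1 VME crate + 4 Fastbus crates */

/* highest[c] is the highest event number seen so far in crate c.
 * Events above the returned value are kept for the next buffer. */
uint32_t last_complete_event_number(const uint32_t *highest)
{
    assert(highest != NULL);
    uint32_t last = highest[0];
    for (int c = 1; c < N_CRATES_TOTAL; c++) {
        if (highest[c] < last) last = highest[c];
    }
    return last;
}
```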

Next, we check for Fastbus crate event numbers which have no corresponding VME crate event numbers. This is recorded and put into the log file; no further action is initiated at the present time.

Now we loop through all VME crate events. First we check to make sure that the event number is higher than the previous event number, and that the event number has not incremented more than 10000 over the previous event number. If the event violates either of these conditions, we skip the event.
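The two conditions on the VME event number can be sketched as a simple filter. The constant name MAX_EVTNUM_JUMP and the function name are hypothetical. The sketch also assumes that "previous event number" means the last accepted one, which the text does not spell out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_EVTNUM_JUMP 10000u  /* hypothetical name for the limit of 10000 */

/* Writes the indices of accepted VME events to keep[] (room for n
 * entries) and returns how many were accepted. prev is the last
 * accepted event number from the previous buffer. */
size_t select_vme_events(const uint32_t *evnum, size_t n,
                         uint32_t prev, size_t *keep)
{
    assert(evnum != NULL && keep != NULL);
    size_t n_keep = 0;
    for (size_t i = 0; i < n; i++) {
        /* Must increase, but by no more than 10000. */
        if (evnum[i] > prev && evnum[i] - prev <= MAX_EVTNUM_JUMP) {
            keep[n_keep++] = i;
            prev = evnum[i];
        }
    }
    return n_keep;
}
```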

Next, we loop through each Fastbus crate to see if all 4 Fastbus crates have matching event data. Also, we check to see if all 4 Fastbus crates have matching TDC module event buffer numbers (*4). If either condition is violated, then we skip the event.

Finally, we are now ready to prepare the event for output to the "anti-sender" (written, developed, and tested by Andy Stachyra) which buffers, checks, and sends the final data to the event builder. First we write the trigger times for each crate into the 13th and 14th long words (32 bit words) in the header. The 16 bit times are packed sequentially by crate number. Next, we repack the Fastbus TDC data in short word (16 bit) format, described in section 6.

  • determine lowest event number with all 5 crates of data;
    this is the last event number to look at in this buffer

  • for each Fastbus crate event number check for missing VME crate event number

  • loop through all VME crate events

    • only use monotonically increasing event number
    • match Fastbus event numbers to VME event numbers
    • only write out event if all 4 Fastbus crates have event data
    • only write out event if all 4 Fastbus crates have matching TDC event buffer numbers
    • write trigger times and set bits in event header
    • repack Fastbus Data (see section 6)

Overview of Fastbus and VME crate event merging code in anti-sorter


6. Repacking of Fastbus TDC Data

There are three reasons for repacking the TDC data:
1) to reduce the size of the anti-detector data,
2) to take care of TDC data anomalies, and
3) to set bits for each PMT hit for online noise filtering.

6.1. Short Word Data Format

The TDC data comes in long word (32 bit) format. The most significant 16 bits of the word contain information pertaining to the channel, module number, and time edge sign. The least significant 16 bits contain the time in TDC counts (0.5 nanosecond per count). Each logic pulse input represents a single PMT pulse, the width of the logic pulse being proportional to the integrated charge of the PMT pulse, and the leading edge of the logic pulse giving the time of the leading edge of the PMT pulse. Since the channel number information is repeated twice for each logic pulse, we can repack the two long words into 3 short words (16 bit), thus reducing the overall data size by roughly 30%.

The figure below summarizes the final PMT hit format of 3 short words. The first short word contains 4 noise filter bits and a 12 bit channel number (see formula in subsection 4.2). The second short word contains the 16 bit leading edge time, and the third short word contains the 16 bit trailing edge time.

  • first short word -- 4 bit filter and 12 bit channel

    • 15th bit is passthrough flag
    • 14th bit is trigger window flag
    • 13th bit is unused
    • 12th bit is sliding window noise filter flag
    • lowest 12 bits are the channel number (see formula in subsection 4.2)

  • second short word -- 16 bit leading time edge

  • third short word -- 16 bit trailing time edge

Short word format of repacked PMT hit
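The 3 short word layout can be illustrated with a small packing helper. The macro, struct, and function names are hypothetical. Bit positions are counted from zero, so that the four flag bits sit above the 12 bit channel field as listed in the figure.

```c
#include <assert.h>
#include <stdint.h>

#define FLAG_PASSTHROUGH (1u << 15)  /* passthrough flag */
#define FLAG_TRG_WINDOW  (1u << 14)  /* trigger window flag */
#define FLAG_SLIDING_WIN (1u << 12)  /* sliding window noise filter flag */
#define CHANNEL_MASK     0x0FFFu     /* lowest 12 bits: channel number */

typedef struct { uint16_t w[3]; } packed_hit;

/* Packs one PMT hit into 3 short words. */
packed_hit pack_hit(unsigned channel, unsigned flags,
                    uint16_t lead, uint16_t trail)
{
    assert(channel <= CHANNEL_MASK);
    assert(flags <= 0xFFFFu && (flags & CHANNEL_MASK) == 0);
    packed_hit h;
    h.w[0] = (uint16_t)(flags | channel);  /* filter bits + channel */
    h.w[1] = lead;                         /* leading edge time */
    h.w[2] = trail;                        /* trailing edge time */
    return h;
}

/* Recovers the channel number from a packed hit. */
unsigned hit_channel(packed_hit h) { return h.w[0] & CHANNEL_MASK; }
```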

6.2. Online Noise Filtering Bits

The anti-sorter code allows for the possibility of online noise filtering. Each PMT hit has 4 bits set to determine if the hit should be written to the output buffer. If any of the 4 filter bits are set true, then the PMT hit is written to the output buffer. One filter bit is a passthrough flag, currently always true. Another filter bit is set true if the time of the PMT hit is after 5 microseconds (variable TRG_TIME_LOWER) before the trigger time, and before 1 microsecond (variable TRG_TIME_UPPER) after the trigger time. Another filter bit is set true if there are more than 5 hits (variable win_cut) in a 64 nanosecond window around the hit (*5). One filter bit is currently unused.

Since the passthrough filter bit is currently set to be always true, there is no filtering of the PMT hits at the present time.
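The two active filter conditions of subsection 6.2 might look roughly like the sketch below. TRG_TIME_LOWER, TRG_TIME_UPPER, and win_cut are names from the text, but their units in the real code are not given; here times are taken to be in nanoseconds on a forward-running axis, the 64 ns window is assumed to be centred on the hit, and the hit itself is counted. All of these are assumptions, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TRG_TIME_LOWER 5000L  /* ns before the trigger time (assumed unit) */
#define TRG_TIME_UPPER 1000L  /* ns after the trigger time (assumed unit) */
#define WIN_WIDTH      64L    /* sliding window width in ns */

static const int win_cut = 5; /* more than this many hits sets the flag */

/* Trigger window flag: hit lies strictly inside the window. */
int in_trigger_window(long t_hit, long t_trg)
{
    return t_hit > t_trg - TRG_TIME_LOWER && t_hit < t_trg + TRG_TIME_UPPER;
}

/* Sliding window flag for hit i among n hit times t[]. */
int passes_sliding_window(const long *t, size_t n, size_t i)
{
    assert(t != NULL && i < n);
    int count = 0;
    for (size_t j = 0; j < n; j++) {
        if (labs(t[j] - t[i]) <= WIN_WIDTH / 2) count++;
    }
    return count > win_cut;
}
```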

6.3. Treatment of TDC Time Edge Anomalies

Examples of TDC data anomalies include multiple leading or trailing edges, and single leading or trailing edges. Multiple same sign edges can occur if the logic pulse input into the TDC is noisy or has reflections. Single leading or trailing edges can occur if the logic pulse input happens to fall across the end of the TDC window.

If a leading time edge for a channel in the data stream is not followed by a trailing edge, we write a short word of zero for the trailing edge time. Similarly, if a trailing edge is not preceded by a leading edge, we write a short word of zero for the leading edge time.

Note that the size of the final output buffer passed to the anti-sender must be an integral number of long words. If the number of short words resulting from the repacking of the event is odd, then we write a "dummy" PMT hit at the end of the event, with channel number 0xffff followed by two times of zero.
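The treatment of unpaired edges and the padding to a whole number of long words can be sketched as follows. The type and function names are hypothetical, and the sketch assumes the edges of a single channel arrive in time order; how the real code groups edges by channel is not described here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { int is_leading; uint16_t time; } tdc_edge;

/* Pairs the edges of one channel into (leading, trailing) times.
 * A missing partner is written as zero. out[] needs room for 2*n
 * short words. Returns the number of hits produced. */
size_t pair_edges(const tdc_edge *e, size_t n, uint16_t *out)
{
    assert(e != NULL && out != NULL);
    size_t hits = 0, i = 0;
    while (i < n) {
        uint16_t lead = 0, trail = 0;
        if (e[i].is_leading) {
            lead = e[i++].time;
            /* Take the trailing edge only if it follows directly. */
            if (i < n && !e[i].is_leading) trail = e[i++].time;
        } else {
            trail = e[i++].time;  /* trailing edge without a leading edge */
        }
        out[2 * hits]     = lead;
        out[2 * hits + 1] = trail;
        hits++;
    }
    return hits;
}

/* Appends a dummy hit (0xffff, 0, 0) if n_short is odd, so that the
 * event occupies an integral number of long words. */
size_t pad_to_long_words(uint16_t *buf, size_t n_short)
{
    assert(buf != NULL);
    if (n_short % 2 != 0) {
        buf[n_short++] = 0xFFFF;
        buf[n_short++] = 0;
        buf[n_short++] = 0;
    }
    return n_short;
}
```

Because each hit occupies 3 short words, an event with an odd number of hits has an odd number of short words, and the dummy hit brings the total to an even count.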


7. Summary of Data Integrity Checks

The anti-sorter has several methods for enforcing data integrity. A few have already been mentioned: those implemented during the scan through the Fastbus data (section 4), and those implemented during the merging of the Fastbus and VME data (section 5). These data integrity checks involve throwing away individual events if they do not meet data quality standards.

After processing each buffer of data, the anti-sorter code also performs online data quality monitoring. Three quantities are monitored for each input data buffer received from the anti-collector:

1) number of TDC-latch event number mismatches in a Fastbus crate,
2) number of TDC event buffer number mismatches within a Fastbus crate, and
3) number of TDC event buffer number mismatches between Fastbus crates.
If the preset threshold of number of mismatches (variable N_MISMATCH_MAX, currently set to be 50) is exceeded for any of these quantities, then the data quality is determined to be abnormal and a signal is issued for the Fastbus crates to be reset (*6). After the Fastbus crate controllers are automatically reset (requiring approximately 2 minutes), the data quality should return to normal.
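The reset decision reduces to a threshold test on the three monitored counts. In the sketch below N_MISMATCH_MAX and its value are from the text, while the struct and function names are hypothetical; "exceeded" is read as strictly greater than the threshold.

```c
#include <assert.h>
#include <stddef.h>

#define N_MISMATCH_MAX 50

/* Mismatch counts accumulated over one input buffer. */
typedef struct {
    int tdc_latch;           /* TDC-latch event number mismatches */
    int buf_within_crate;    /* TDC event buffer number mismatches within a crate */
    int buf_between_crates;  /* TDC event buffer number mismatches between crates */
} mismatch_counts;

/* Returns 1 if the Fastbus crates should be reset. */
int needs_fastbus_reset(const mismatch_counts *c)
{
    assert(c != NULL);
    return c->tdc_latch          > N_MISMATCH_MAX ||
           c->buf_within_crate   > N_MISMATCH_MAX ||
           c->buf_between_crates > N_MISMATCH_MAX;
}
```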

The following figure summarizes the quantities used for enforcing the integrity of the data.

checks performed during the scanning of Fastbus data (section 4):
  • skip words until valid beginning of an event is found
  • skip crate event if TDC module event buffer number is mismatched within the crate
  • count number of TDC-latch event number mismatches -- no action
  • count number of DC2-FSCC word count mismatches -- no action
  • count number of FSCC bad event bits set -- no action
checks performed during the merging of Fastbus and VME data (section 5):
  • skip event if event number is not monotonically increasing
  • skip event if TDC module event buffer number is mismatched between Fastbus crates
online data quality monitoring performed after processing of a buffer of data
(to determine whether to issue Fastbus controller reset signal):
  • count number of TDC-latch mismatches in buffer
  • count number of TDC module event buffer mismatches within a crate in buffer
  • count number of TDC module event buffer mismatches between crates in buffer

Summary of data integrity checks performed by anti-sorter


Footnotes:

(*1)
The DC2 is a small computer which controls VME dual port memory boards. The software was written, developed, and tested by Mei-Li Chen.
(*2)
The FSCC is the Fastbus crate controller. The controller software was written, developed, and tested by Mei-Li Chen.
(*3)
Note that this hardware based number is converted offline into a final PMT number.
(*4)
Recall that we have already checked to make sure that all the TDC modules within a crate have matching TDC module event buffer numbers (see subsection 4.2).
(*5)
This algorithm was written, developed, and tested by Vladimir Chaloupka.
(*6)
Custom hardware (designed and tested by Hans Berns) was developed to veto the global trigger during the reset, as well as to send the reset logic pulse to the Fastbus controller. The hardware interface is implemented through a custom VME board, which is accessed by additional controlling software (written by Jeff George).