Project

General

Profile

Actions

Development #2268

open

Development #2256: Demonstrator for digi-based trigger, STS-only

Development #2265: Migrate algorithms

Migrate STS unpacker

Added by Volker Friese 8 months ago. Updated about 2 months ago.

Status:
In Progress
Priority:
Normal
Target version:
Start date:
10/18/2021
Due date:
10/29/2021 (about 7 months late)
% Done:

80%

Estimated time:
62.00 h
Spent time:

Description

The STS unpacker algorithm is to be re-implemented, deriving from its base class, in the algo directory. The current software is to be kept and place until the new algorithm is verified.

The unpacker should treat the mCBM 2021 raw data format. Backward compatibility is not needed.

Target: algo/detectors/sts/CbmAlgStsUnpack2021


Related issues

Blocks Framework - Development #2272: Task for STS unpackingResolvedVolker Friese10/18/202111/05/2021

Actions
Follows Framework - Development #2264: Base class for (STS) unpackerClosedVolker Friese09/28/202110/15/2021

Actions
Actions #1

Updated by Volker Friese 8 months ago

  • Due date set to 10/18/2021
  • Start date changed from 09/28/2021 to 10/18/2021
  • Follows Development #2264: Base class for (STS) unpacker added
Actions #2

Updated by Volker Friese 8 months ago

  • Due date changed from 10/18/2021 to 10/29/2021
Actions #3

Updated by Volker Friese 8 months ago

Actions #4

Updated by Volker Friese 6 months ago

  • Status changed from Assigned to Feedback
  • Assignee changed from Dominik Smith to Pierre-Alain Loizeau
  • % Done changed from 0 to 10

Dear Pierre-Alain,

by inspecting the code it seems to me that addressing (hardware-software address matching) can and should be substantially simplified and moved largely outside of the algorithm.

The basic entities on the hardware side are the component and the eLink. CROB and FEB are intermediate entities which reflect neither on the hardware address nor in the software address (CbmStsAddress). Unless we want to use the side qualifier in CbmStsAddress for the FEB (there will be one FEB for the front side and one for the back side, I understand). But that will entail changes in the STS reconstruction software.

Each eLink connects to one ASIC (SMX), so logically, eLink and ASIC are identical. The time calibration parameters are ASIC-specific. The charge calibration constants are implemented as FEB-specific (why?), but that can also easily be done on ASIC level (each ASIC within a FEB then taking the same parameter value). I thus would go for dropping CROB ID and FEB ID completely from the algorithm

The task then is to convert the triplet (component, eLink, channelInElink) to the duplet (module, channelInModule).

The algorithm is specific to a certain component (will be called for a single component at a time). So, what it needs to know is
eLink -> module address (CbmStsAddress)
eLink -> ASIC number within module ( 0 - 15 )

I understand that within each component, eLinks are consecutively numbered, independent from their connection to a CROB. So, this mapping can be easily and efficiently realised by

std::vector<std::pair<address, asicNr>>, where the index of the vector is the eLink number.

Having that, the mapping could be written as

// --- Hardware-to-software address
uint32_t numChansPerAsic = 128; // required parameter
uint32_t numAsicsPerModule = 16.; // required parameter
vector<uint32_t> moduleAddress; // required parameter - Software address (CbmStsAddress)
vector<uint32_t> asicNr; // required parameter - ASIC number within module
uint32_t address = moduleAddress[elink];
uint32_t channel = 0;
if ( asicNr < numAsicsPerModule / 2 ) { // front side (p side)
channel = message.GetHitChannel() + numChansPerAsic * asicNr;
}
else { // back side (n side)
channel = (numAsicsPerModule * numChansPerAsic) - message.GetHitChannel() + 1;
}

Do you agree / can you verify this?

Of course, I am here just considering the core functionality (creating a digi from a hit message by expanding / correcting address, time and charge information.

-

Actions #5

Updated by Pierre-Alain Loizeau 6 months ago

I guess it would work. And a mapping class would allow to have a single place where the HW constants are defined. I would definitely not hard-code them in the Algo code, as that could bite us strongly if we ever get an upgrade of the detector or its readout chain.

We would still probably need to add some "converter" methods from {component + elink} or from {address + channel} to {CROB + FEB + channel} in the class doing the mapping abstraction, in order to get these values for the error printouts and in the low level monitoring class (the one working on the unpacker output): CROB and FEBs are the natural objects when you want to find HW problems or evaluate noise levels, mean raw rates or communication errors (ASIC would be too many plots and components do not show where some data are missing or to many).

Actions #6

Updated by Volker Friese 6 months ago

I do see that. In the above, I concentrated on the core functionality, neglecting monitoring. I think once the algo returns the proper monitoring entity, these data can be re-aggregated on the monitor level to CROB, FEB level or whichever makes sense.

Actions #7

Updated by Volker Friese 6 months ago

  • Status changed from Feedback to In Progress
  • Assignee changed from Pierre-Alain Loizeau to Volker Friese

I copy this information of Pierre-Alain's by email (23.11.2021) to here to have it in a common place. This concerns the currently used software stack (macro, task, algo and input files).

=============================
We are using the "run_unpack_tsa.C" macro in "macro/run".
This calls the reco/steer/CbmRecoUnpackTask

If used with the default mcbm_beam_2021_07_surveyed setup and data from run 1588, for example the /lustre/cbm/users/ploizeau/mcbm2021/July2021/1588_node8_1_0000.tsa file, if will call reco/detectors/sts/unpack/CbmStsUnpackConfig which in turns calls reco/detectors/sts/unpack/CbmStsUnpackAlgo (derieved from CbmStsUnpackAlgoBase).
Be aware that a single tsa file from this run does not contain consecutive TS as we were writing to two nodes (load balancing) with 5 HDD each (round-robin). Each tsa file contains around 28-30 TS. So to get a set of the first 280 consecutive TS, you would need to provide the unpacking macro with 10 files in such a way:
=> Single file (string type par): "/lustre/cbm/users/ploizeau/mcbm2021/July2021/1588_node8_1_0000.tsa"
=> Full set (vector of string type par): { "/lustre/cbm/users/ploizeau/mcbm2021/July2021/1588_node8_1_0000.tsa", "/lustre/cbm/users/ploizeau/mcbm2021/July2021/1588_node8_2_0000.tsa", "/lustre/cbm/users/ploizeau/mcbm2021/July2021/1588_node8_3_0000.tsa", "/lustre/cbm/users/ploizeau/mcbm2021/July2021/1588_node8_4_0000.tsa", "/lustre/cbm/users/ploizeau/mcbm2021/July2021/1588_node8_5_0000.tsa", "/lustre/cbm/users/ploizeau/mcbm2021/July2021/1588_node9_1_0000.tsa", "/lustre/cbm/users/ploizeau/mcbm2021/July2021/1588_node9_2_0000.tsa", "/lustre/cbm/users/ploizeau/mcbm2021/July2021/1588_node9_3_0000.tsa", "/lustre/cbm/users/ploizeau/mcbm2021/July2021/1588_node9_4_0000.tsa", "/lustre/cbm/users/ploizeau/mcbm2021/July2021/1588_node9_5_0000.tsa"}

There are two pending changes for the unpacker algo/config not yet in MRs but hopefully coming soon:
- mine preparing the config class for usage in MQ world (should not change anything in Algo logic)
- one by Dario bringing the StsAsicPar/StsModulePar instead of the long list of direct modifiers in the macro

=============================

Actions #8

Updated by Volker Friese 6 months ago

  • % Done changed from 10 to 30

First version is available in https://git.cbm.gsi.de/v.friese/cbmroot/-/tree/stsunpack. This neglects all monitoring and exception handling, except some assert statements. It is surely not complete w.r.t. CbmStsUnpackAlgo, but I think it catches and demonstrates the core functionality.

Actions #9

Updated by Volker Friese 6 months ago

  • Status changed from In Progress to Feedback
  • Assignee changed from Volker Friese to Pierre-Alain Loizeau

Dear Pierre-Alain, would you care to have a look, in particular w.r.t the logic of the handling of the time stamps. I have tried to express the time entirely on an integer base. For the multiplication with the clock cycle 3.125, I treat the latter as ratio 25 / 8 (many thanks to Jan for pointing me to that). The conversion clock cycle to unix time is done only once in order to avoid having rounding errors more than once.

Actions #10

Updated by Pierre-Alain Loizeau 6 months ago

I think your current operator() implementation will not work as it is: you are not converting the pointer to the MS content to a pointer on Messages, so all your accesses will return only 8 bits and grow more and more off (the messages are 64b). That was the purpose of line 312 in the current unpacker

I would also not make the clock and field properties some "runtime" parameters, as these are consequences of the data format and have to be always matching the bit pattern in the messages (consistent with it).
So I would initialize fEpochsPerCycle, fEpochLength, fClockCycleNom and fClockCycleDen with the corresponding constants from the StsXyterMessage file, something like:
  • Add in StsXyterMessage.h around line 159:
    static const uint32_t kulClockCycleNom = 25;            ///< Clock cycle nominator [ns], equivalent to 2*160 MHz clock
    static const uint32_t kulClockCycleDen = 8;             ///< Clock cycle denominator, equivalent to 2*160 MHz clock
    static constexpr double kdClockCycleNs = static_cast<double>(kulClockCycleNom)/kulClockCycleDen;  // ns, not rounded
    
  • Add in StsXyterMessage.h around line 164:
    static const uint32_t kuCycleInTsMsb =  1 << kusLenTsMsbValBinning;
      
  • In algo/detectors/sts/UnpackSts.h
        uint64_t fEpochsPerCycle    = stsxyter::kuCycleInTsMsb;        ///< TS_MSB epochs per epoch cycle
        uint64_t fEpochLength       = stsxyter::kuHitNbTsBinsBinning;  ///< Length of TS_MSB epoch in clock cycles
        uint32_t fClockCycleNom     = stsxyter::kulClockCycleNom;      ///< Clock cycle nominator [ns]
        uint32_t fClockCycleDen     = stsxyter::kulClockCycleDen;      ///< Clock cycle denominator
    

Otherwise it looks really nice.

What I am wondering now is whether we should not take the opportunity to rename the current StsXyterMessage class + namespace to StsXyterDpbXXXX (+all related usage update) and then have an StsXyterMessage cleaned up of all mentions of the old FW (Binning switch and non-binning fields, ...).
One caveat is that we may still have a pending revision of the data format: for backward compatibility we did not move fields around in the HIT message when going to the "binning" method and therefore have bit 9 of the timestamp (which is added by the FPGA) stored in bit 30 of the message instead of being added to the existing field (which would shift all other fields by 1 bit). That is the reason for the hardcoded 10 at line 161-162 of StsXyterMessage
Not sure if we will change this before having a final version of the full software chain as we typically want to be able to compare things with the same raw data, but it may be necessary if we are asked to declare a FW version close to final... somebody else from STS (and maybe MUCH) will have to decide on this...

Actions #11

Updated by Jan de Cuveland 6 months ago

Really nice indeed.

I also think that it would be good if the parameters that are defined by the data format (and not subject to change for the same format version) could be implemented as compile-time constants to enable compiler optimization. To my understanding, this would already be ensured by marking the parameters as constexpr, probably also in StsXyterMessage.h.

For style, I suggest to return the digi from ProcessHitMessage and call digiVec.emplace_back() in the main function instead. This might, however, collide with error counting.

UnpackStsPar::GetElinkPar() seems a bit redundant to std::vector::at(), which also performs the range check and also gives a nice exception message on failure. I would consider using it instead of the custom implementation.

Before moving this new implentation to production, I suggest to add some really minimalistic error handling. In my mind, only two things would need to be changed:
  • Messages of a type other than Hit or TsMsb should be counted and the counts should be returned in a monitoring output struct.
  • All assert statements that depend on the input data should be removed. I would simply default to counting the errors, one counter per assert statement. This might be something like numErrorElinkOutOfRange, numErrorInvalidFirstMessage, numErrorInvalidMicrosliceSize, numErrorTimestampOverflow added to the monitoring output struct.
Actions #12

Updated by Volker Friese 6 months ago

Dear Pierre-Alain and Jan,

thanks a bundle for the feedback and suggestions!

I think your current operator() implementation will not work as it is: you are not converting the pointer to the MS content to a pointer on Messages, so all your accesses will return only 8 bits and grow more and more off (the messages are 64b).

Right, of course. Shame on me.

I would also not make the clock and field properties some "runtime" parameters, as these are consequences of the data format and have to be always matching the bit pattern in the messages (consistent with it).

I also think that it would be good if the parameters that are defined by the data format (and not subject to change for the same format version) could be implemented as compile-time constants to enable compiler optimization. To my understanding, this would already be ensured by marking the parameters as constexpr, probably also in StsXyterMessage.h.

Yes, I agree. My original intention was to remove dependence from StsXyter.h at all, and have it in the configuring and calling instance, but this is hardly possible (need the message format and methods), nor probably desirable.

What I am wondering now is whether we should not take the opportunity to rename the current StsXyterMessage class + namespace to StsXyterDpbXXXX (+all related usage update) and then have an StsXyterMessage cleaned up of all mentions of the old FW (Binning switch and non-binning fields, ...).

I would really love that, since quite some legacy is in that class. I suggest then to leave the current class in state and create a new one called SmxMessage (following the new naming convention for the chip), preferentially adding a tag for the firmware version, such that a switch can possibly be done on user level.

For style, I suggest to return the digi from ProcessHitMessage and call digiVec.emplace_back() in the main function instead. This might, however, collide with error counting.

I wanted to do that. Currently, one digi is created per call to ProcessHitMessage, so a StsDigi could be the return value. However, if some error conditions are caught, it might be that no digi is created, so what should I return in that case? If I could return a (smart) pointer, that would of course be handable (return a null pointer).

UnpackStsPar::GetElinkPar() seems a bit redundant to std::vector::at(), which also performs the range check and also gives a nice exception message on failure. I would consider using it instead of the custom implementation.

I have hardly ever seen a nice exception message from STL :-) but I trust your opinion.

Before moving this new implentation to production, I suggest to add some really minimalistic error handling.

Yes, that is the next step here. With Pierre-Alain, I have to go through the current code and see what is debugging (moved to error counting) and what is in addition necessary for monitoring. I will start with implementing a minimal monitor structure as you suggested, to be easily expanded later.

Actions #13

Updated by Volker Friese 6 months ago

  • Status changed from Feedback to In Progress
  • Assignee changed from Pierre-Alain Loizeau to Volker Friese
Actions #14

Updated by Volker Friese 6 months ago

  • % Done changed from 30 to 60

Modifications according to comments from P.-A. Loizeau and J. de Cuveland.

Actions #15

Updated by Volker Friese 3 months ago

  • % Done changed from 60 to 80
  • Estimated time changed from 15.00 h to 62.00 h

I have a version now which I think can be committed. It took me quite a while to integrate the algorithm into the run/task framework; the main difficulty was to understand the addressing (DAQ/hardware to software). See MR 750.

Debugging and further development of the unpacker algorithm (which does not yet comprise all features of the currently used one) must be done by or at least with the help of the experts (Pierre-Alain).

Actions #16

Updated by Volker Friese about 2 months ago

  • Assignee changed from Volker Friese to Pierre-Alain Loizeau

The new unpacker algorithm still has to be verified against the old one, such as to produce identical output digis as the currently used one from the same input, I dare attribute this to Pierre-Alain. From our last correspondence, it might be that the hardware address mapping needs still corrections.

Actions

Also available in: Atom PDF