Project

General

Profile

Actions

Sim-Development #1088

closed

Problem with MvdDigitizer with FairSoft mar17 and FairRoot v-17.03 and later

Added by Volker Friese over 4 years ago. Updated over 4 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Target version:
-
Start date:
06/12/2018
Due date:
% Done:

100%

Estimated time:
Spent time:

Description

A problem exists with running the MVD digitizer in the new scheme, using the CbmDigitization class, in FairSoft and FairRoot versions of 2017 and later.

A segmentation fault happens when the digitizer processes the seond event (the first one goes smoothly). The issues seems to be related with the dynamic cast in CbmMvdDigitizer::Exec() (lines 128-133):

   TClonesArray* digis = fDetector->GetOutputDigis();
   for (Int_t index = 0; index < digis->GetEntriesFast(); index++) {
     CbmMvdDigi* digi = dynamic_cast<CbmMvdDigi*>(digis->At(index));
     SendDigi(digi);
     nDigis++;
   }

It is strange that there is no problem with earlier external software (2016 and before). CbmMvdDetector::GetOutputDigis() uses the TClonesArray::Absorb() method, which may be one source of the problem.

Stack trace:

The lines below might hint at the cause of the crash.
You may get help by asking at the ROOT forum http://root.cern.ch/forum.
Only if you are really convinced it is a bug in ROOT then please submit a
report at http://root.cern.ch/bugs. Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.
===========================================================
#6  0x00007f70e5c2d640 in __dynamic_cast () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007f70d8350a95 in CbmMvdDigitizer::Exec (this=0x7fb7280) at /home/uhlig/cbm/NIGHTLY/mvd/CbmMvdDigitizer.cxx:130
#8  0x00007f70d019031b in FairTask::ExecuteTasks (this=0x7e4e7e0, option=0x7f70d0200e7f "") at /home/uhlig/fairroot/source/v-17.10a/base/steer/FairTask.cxx:209
#9  0x00007f70d018fe71 in FairTask::ExecuteTask (this=0x7e4e7e0, option=0x7f70d0200e7f "") at /home/uhlig/fairroot/source/v-17.10a/base/steer/FairTask.cxx:173
#10 0x00007f70d0182a5b in FairRunAna::Run (this=0x26b5a30, Ev_start=<optimized out>, Ev_end=3) at /home/uhlig/fairroot/source/v-17.10a/base/steer/FairRunAna.cxx:379
#11 0x00007f70d8a821cf in CbmDigitization::Run (this=0x7ffe135d99e8, event1=0, event2=3) at /home/uhlig/cbm/NIGHTLY/run/CbmDigitization.cxx:362
Actions #1

Updated by Volker Friese over 4 years ago

  • Tracker changed from Bug to Sim-Development
  • Project changed from CbmRoot to Simulation
  • Assignee changed from Volker Friese to Florian Uhlig

Florian: I can hardly test since I first have to upgrade my MacOS. Please try to debug.

Philipp: Maybe you can have a look, too; you probably have an idea what I messed up when adjusting your digitizer to the new scheme.

Actions #2

Updated by Florian Uhlig over 4 years ago

  • Status changed from Assigned to In Progress
  • % Done changed from 0 to 10

Checking with a debugger one finds a malloc error in TClonesArray::AbsorbObjects

malloc: *** error for object 0x15faf7d70: pointer being freed was not allocated
Actions #3

Updated by Volker Friese over 4 years ago

Indeed, there have been changes in TClonesArray::AbsorbObjects from version 6.06 (FairSoft may16p1) to the current ROOT.

The old code was (TClonesArray.cxx, lines 1050-1058):

    // move
    for (Int_t i = idx1; i <= idx2; i++) {
       Int_t newindex = oldSize+i -idx1;
       fCont[newindex] = tc->fCont[i];
       ::operator delete(fKeep->fCont[newindex]);
       (*fKeep)[newindex] = (*(tc->fKeep))[i];
       tc->fCont[i] = 0;
       (*(tc->fKeep))[i] = 0;
    }

This reads in the current ROOT 6.15:


    // move
    for (Int_t i = idx1; i <= idx2; i++) {
       Int_t newindex = oldSize+i -idx1;
       fCont[newindex] = tc->fCont[i];
       R__ReleaseMemory(fClass,fKeep->fCont[newindex]);
       (*fKeep)[newindex] = (*(tc->fKeep))[i];
       tc->fCont[i] = 0;
       (*(tc->fKeep))[i] = 0;
    }

I do not know what behind R__ReleaseMemory() is, nor have I an understanding why memory is released (or, in the old version, a delete is called) when objects are removed from a TClonesArray. This is contrary to what I though a TClonesArray does.

Actions #4

Updated by Volker Friese over 4 years ago

I think I have found the reasons why the MVD digitizer does not produce match objects.

CbmMvdDigitize calls in its Exec() the methods

CbmMvdDetector::GetOutputDigis() and then
CbmMvdDetector::GetOutputDigiMatchs().

The first method does a loop over all sensors and calls for each CbmMvdSensor::GetOutputArray(). This method, in turn, absorbs the digi and match objects from the respective CbmMvdSensorDigiPlugin arrays into the arrays of CbmMvdSensor; a pointer to CbmMvdSensor::fDigiArray is returned.

CbmMvdDetector::GetOutputDigis() then absorbs the digi objects from the sensor array into the "global" CbmMvdDetector::foutputDigis. All sensor arrays are empty after this operation.

Now, CbmMvdDetector::GetOutputDigiMatchs() calls for each sensor CbmMvdSensor::GetOutputArrayLen(), which checks the size of the sensor digi arrays (not match arrays) and returns -1 because all sensor digi arrays are now empty. Consequently, no match objects are transported from the sensor match arrays to the detector match array. The respective code is:

  Int_t nSensors = fSensorArray->GetEntriesFast();
  CbmMvdSensor* sensor;
  for(Int_t i=0; i<nSensors; i++){
    sensor=(CbmMvdSensor*)fSensorArray->At(i);
    fDigiPlugin = sensor->GetDigiPlugin();
    Int_t length = sensor->GetOutputArrayLen(fDigiPlugin);
    if(length >= 0)
    {

        foutputDigiMatchs->AbsorbObjects(sensor->GetOutputMatch(),0,length);

    }
  }

length is always -1, so no matches are in the output array foutputDigiMatchs.

This can be rather easily fixed; there is no good reason to call TClonesArray::AbsorbObjects() with the two index arguments if all objects in the array are meant to be moved.

Whether this is the source of the observed segmentation fault I dare doubt, since the AbsorbObjects method is never executed. Let's see.

Actions #5

Updated by Volker Friese over 4 years ago

No, same problem in CDash with newer ROOT, as expected. But at least, now there are match objects on file.

Actions #6

Updated by Florian Uhlig over 4 years ago

  • % Done changed from 10 to 100

It is not any longer possible to delete objects which are stored in a TClonesArray with newer versions of ROOT. The objects which are stored in the TClonesArray have to be removed by either calling the Delete() or Clear() functions of TClonesArray or they have to be removed explicitly by using the Remove() or RemoveAt() function. I haven't found any way to extract the object from a TClonesArray without doing a copy of the object.

The code in the CbmMvdDigitizer was changed such that the content which is stored in the CbmMvdDetector class is absorbed into a temporary TClonesArray in CbmMvdDigitizer. The content of this temporary TClonesArray is copied and stored in a std::vector to do proper cleanup after the data was written to the final TClonesArray. The data stored in this final TClonesArray is finally written to file. After the data is written to file the temporary and the final TClonesArray as well as the temporary std::vector are properly cleaned to release the used memory.

Actions #7

Updated by Florian Uhlig over 4 years ago

  • Status changed from In Progress to Resolved

Applied in changeset cbmroot:cbmroot|r13074.

Actions #8

Updated by Florian Uhlig over 4 years ago

  • Status changed from Resolved to Closed
Actions #9

Updated by Volker Friese over 4 years ago

Florian Uhlig wrote:

It is not any longer possible to delete objects which are stored in a TClonesArray with newer versions of ROOT. The objects which are stored in the TClonesArray have to be removed by either calling the Delete() or Clear() functions of TClonesArray or they have to be removed explicitly by using the Remove() or RemoveAt() function. I haven't found any way to extract the object from a TClonesArray without doing a copy of the object.

If that is true, I consider it a serious conceptual deficite of TClonesArray. This class return a TObject pointer by the method At(), but the user is not supposed to call delete for this pointer. If he does, you get a hardly debuggable error. No proper catching of the error, no sound ownership concept. Lesson for me is that we should give up the use of TClonesArray wherever dispensable (i.e. where not used for I/O), and use professional software instead (STL containers).

The code in the CbmMvdDigitizer was changed such that the content which is stored in the CbmMvdDetector class is absorbed into a temporary TClonesArray in CbmMvdDigitizer. The content of this temporary TClonesArray is copied and stored in a std::vector to do proper cleanup after the data was written to the final TClonesArray. The data stored in this final TClonesArray is finally written to file. After the data is written to file the temporary and the final TClonesArray as well as the temporary std::vector are properly cleaned to release the used memory.

Mightly complicated, isn't it? It would be so easy if std::vector would be used all over the place.

Actions

Also available in: Atom PDF