Fix mrich 2024 online crash
Try to catch/avoid the out-of-range exception thrown by a map.at() call in the online unpacker algo for the mRICH.
For this, the sequence error detection from the old unpacker was copied with the same treatment: skip only the faulty word.
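As an illustration of the kind of guard meant here, a minimal sketch that replaces a throwing map.at() with a checked lookup. The map type, the function name LookupDirich and the stored value are hypothetical and do not reflect the actual unpacker code:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>

// Hypothetical mapping from DiRICH address to an internal index.
// The real unpacker may store something different per address.
using AddressMap = std::map<uint16_t, uint32_t>;

// Returns the mapped value, or std::nullopt when the address is unknown,
// instead of letting map.at() throw std::out_of_range.
std::optional<uint32_t> LookupDirich(const AddressMap& map, uint16_t address)
{
  const auto it = map.find(address);
  if (it == map.end()) {
    return std::nullopt;  // caller decides what to skip
  }
  return it->second;
}
```

The caller can then treat an empty result as corrupt data and skip it, rather than relying on exception handling in the hot path.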
In its current state, the MR can process the tsa files 2906_node8_0?_0000.tsa without crashing.
However, it is still not 1:1 with the old unpacker: in addition to the errors which the old one finds for addresses 0x7321 and 0x7170, it also finds errors for address 0x9600, which as far as I know does not exist. This hints that the new unpacker algo can sometimes get seriously offset in the buffer.
As a more robust solution, I would propose to skip the full subsubevent instead of a single word, in both the old and the new unpacker algos.
I think this would provide the best chance to rescue the data of the other DiRICHes, including some coming after the corrupted ones in the MS buffer.
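A rough sketch of the skip-the-whole-block idea, under a simplified and assumed buffer layout: each subsubevent starts with a header word holding the payload size (in words) in the upper 16 bits and the address in the lower 16 bits. The function name, the layout and the isWordValid callback (a stand-in for the sequence error detection) are all assumptions for illustration, not the actual unpacker code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Walks a buffer of subsubevents and copies the payload of every clean block
// to `out`. A block with at least one invalid word is dropped as a whole.
// Returns the number of skipped blocks.
size_t ProcessSubSubEvents(const std::vector<uint32_t>& buf,
                           const std::function<bool(uint32_t)>& isWordValid,
                           std::vector<uint32_t>& out)
{
  size_t skipped = 0;
  size_t pos     = 0;
  while (pos < buf.size()) {
    const size_t size  = buf[pos] >> 16;  // assumed: payload size in upper 16 bits
    const size_t begin = pos + 1;
    const size_t end   = begin + size;
    if (end > buf.size()) {  // header claims more words than remain
      ++skipped;
      break;
    }
    bool ok = true;
    for (size_t i = begin; i < end; ++i) {
      if (!isWordValid(buf[i])) {
        ok = false;
        break;
      }
    }
    if (ok) {
      out.insert(out.end(), buf.begin() + static_cast<std::ptrdiff_t>(begin),
                 buf.begin() + static_cast<std::ptrdiff_t>(end));
    }
    else {
      ++skipped;
    }
    pos = end;  // always jump to the next header, so later blocks stay aligned
  }
  return skipped;
}
```

Because the read position is advanced from the header size rather than from the last word that parsed correctly, a corrupt word inside one block cannot shift the interpretation of the blocks after it, provided the header itself is intact.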
You can find an example of this for the new algo as comments in my [TEMP] commit.
Let me know if this is OK, in which case I would implement it in this MR.
PS: For testing purposes, this MR also includes the local commit from cbmfles01 which introduced larger TOT ranges for the NCAL/FSD DiRICHes.
Activity
added BugFix Online Reconstruction mCBM labels
Dear @ma.beyer, @f.uhlig, @p.-a.loizeau,
you have been identified as code owner of at least one file which was changed with this merge request.
Please check the changes and approve them or request changes.
added CodeOwners label
assigned to @v.friese
requested review from @ma.beyer
mentioned in merge request !1742 (merged)
- Resolved by Martin Beyer
added 49 commits
- 269ed6ff...5cb9f2f5 - 45 commits from branch computing:master
- 5eb12c7a - [mRICH] increased TOT range for NCAL/FSD in monitor
- 3f0f0ddd - [TEMP] In mRICH online unpacker, try to catch/skip invalid data
- 1640f04b - [TEMP] In mRICH online unpacker, handle data with less words than normal but proper header info
- e2d738d4 - [TEMP] In mRICH online unpacker ProcessTimeDataWord, fix return to avoid out of bounds access
- Resolved by Martin Beyer
added 21 commits
- e2d738d4...79173df5 - 18 commits from branch computing:master
- c6d518ff - [mRICH] increased TOT range for NCAL/FSD in monitor
- 9971960b - In mRICH online unpacker, skip subsubevents with corrupt/too few data
- 1036c2c5 - In mRICH legacy unpacker, skip subsubevents with corrupt/too few data
I applied the changes while
- keeping the logs at debug level
- adding some monitor counters to keep track of how many blocks are skipped and why
I also applied the same change to the old unpacker.
I am now checking with a "bad" run that both unpackers work fine; then I will rebase and remove the draft state.
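For illustration, per-reason skip counters of the kind described above could look roughly like the following. The class name SkipMonitor and the list of reasons are hypothetical and are not taken from the actual commits:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical reasons for dropping a block; the real code may use other categories.
enum class SkipReason : size_t
{
  SequenceError = 0,
  TooFewWords,
  UnknownAddress,
  Count  // number of reasons, must stay last
};

// Keeps one counter per skip reason so that the monitoring can report
// how many blocks were skipped and why.
class SkipMonitor {
public:
  void Record(SkipReason reason) { ++fCounts[static_cast<size_t>(reason)]; }

  uint64_t Get(SkipReason reason) const { return fCounts[static_cast<size_t>(reason)]; }

  uint64_t Total() const
  {
    uint64_t sum = 0;
    for (const uint64_t count : fCounts) {
      sum += count;
    }
    return sum;
  }

private:
  std::array<uint64_t, static_cast<size_t>(SkipReason::Count)> fCounts{};  // zero-initialised
};
```

Counting instead of logging at a higher level keeps the online output quiet while still making the rate of skipped blocks visible.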
- Resolved by Pierre-Alain Loizeau
added 4 commits
added 1 commit
- e96bfa17 - In mRICH legacy unpacker, skip subsubevents with corrupt/too few data
- Resolved by Martin Beyer
For file /lustre/cbm/prod/beamtime/2024/03/mcbm/2906/2906_node8_00_0001.tsa I observe a difference of 5 digis between online and offline (offline has 5 more):
Comparing RichDigi.fAddress and tree2.RichDigi.fAddress
Different number of entries: 2943817 vs 2943812
added 6 commits
- e96bfa17...939d833c - 2 commits from branch computing:master
- bfa59558 - [mRICH] increased TOT range for NCAL/FSD in monitor
- 6ebbcd75 - In mRICH online unpacker, skip subsubevents with corrupt/too few data
- dbcc1ed3 - In mRICH legacy unpacker, skip subsubevents with corrupt/too few data
- bf9530f0 - [mcbm 2024] in rich only macro, allow switch on/off overlap MS
From my side the MR looks good. Many thanks @p.-a.loizeau for your effort.
I will fix the formatting and then set it to approved so that @v.friese can merge it.
added 9 commits
- d697f0b7...6463a47e - 5 commits from branch computing:master
- d4122fdd - [mRICH] increased TOT range for NCAL/FSD in monitor
- 7f7d3952 - In mRICH online unpacker, skip subsubevents with corrupt/too few data
- f9e77b11 - In mRICH legacy unpacker, skip subsubevents with corrupt/too few data
- b4e72d23 - [mcbm 2024] in rich only macro, allow switch on/off overlap MS