Draft: cbm::algo cleanup

requested review from @d.smith

assigned to @d.smith

Issue 1: Doubly defined fContext in class Reco.

The class Reco contains the following member:

   ChainContext fContext;

It also inherits the following field from its base class SubChain:

   ChainContext* fContext = nullptr;

They are both in use apparently. In Reco::Init(const Options& opts) one is used to initialize the other. This seems to be redundant.

Issue 2: Do we need the ChainContext at all?

It is essentially a wrapper for the following objects:

    Options opts;
    RecoParams recoParams;
    std::unique_ptr<cbm::Monitor> monitor;  //! Monitor

The existence of this wrapper makes the passing of these between different instances of SubChain a bit leaner (one parameter is passed instead of three), at the expense of additional "indirection" (more files to keep track of, more headers; term stolen from @j.decuveland).

However, in cases where nested SubChain instances are used, the same could be achieved by having the above fields be class members, and copying them from parent to child by passing the this pointer of the parent chain to constructor of the the child chain.

Actually, in a lot of cases, we do not even need all of these objects. For instance, the STS unpacker will only need a small subset of what is contained in RecoParams and Options. And presumably the monitoring instance is only needed at the top level, i.e. in the Reco class. So what we are essentially doing is passing a large number of superfluous information to bottom-level object instances.

This brings me to:

Issue 3: Do we need the SubChain base class at all?

It seems to me, that the similarities between the derived classes are not very strong. We are mainly using the base class to distribute a set of parameters, many of which aren't really shared between the "Chains". We also aren't actually using a pointer to or vector of the base class type anywhere (this would be the strongest argument in favor of a base class).

It seems to me that we could easily drop the base class and simply pass the proper sub-set of RecoParams and Options as needed.

Issue 4: I'm conflicted over the use of Property.h.

The Property class attempts to streamline the reading and writing of yaml parameter files a bit. Basically it automates the generation of Save and Load functions for the members of some class or struct. Below are two examples of classes which support yaml files, one with the use of Property (RecoParams) and one without it (MainConfig).

The Property variant is definitely more compact and has the great advantage that one doesn't need to take care of "Save" and "Load" separately (they contain the same operations anyway, just in reverse). So I definitely see the value in having something like this (for the comparatively low complexity-cost of one additional header). I just find the way these "Properties" are added to each struct to be visually a bit awkward and far away from standard C++ style.

I wonder if there is a way this can be streamlined further. If we are going the template / auto-detect route, is there a reason these extra members are needed at all? They are really just a kind of clue given to the parser to help it decide what to do. Perhaps it is possible to extend the Property class so that it can deduce this directly from the struct definition? I mean all of the information is already there in principle. Presumably some STL magic might help here.

Actually, I'm kind of suprised all of this is needed at all with yaml. I was under the impression, that a proper yaml parser can map .yaml files to classes / structs out of the box, without the need of extra layers. Presumably this is part of the motivation to switch to some of the standard storage format in the first place.

In any case, the class name Property is a bit nondescript and should probably be changed.

@j.decuveland I wonder what is your opinion on this topic?

 struct RecoParams {
    enum class SortMode : u8
    {
      BlockSort        = 0,
      CUBSegmentedSort = 1,
    };
    enum class UnpackMode : u8
    {
      CPU,
      XPU,
    };

    struct STS {
      SortMode digiSortMode;
      SortMode clusterSortMode;

      UnpackMode unpackMode;
      u8 findClustersMultiKernels;

      f32 timeCutDigiAbs;
      f32 timeCutDigiSig;
      f32 timeCutClusterAbs;
      f32 timeCutClusterSig;

      struct MemoryLimits {
        u64 maxDigisPerTS;
        u64 maxDigisPerMS;
        u64 maxDigisPerModule;
        f64 clustersPerDigiTS;
        f64 clustersPerDigiModule;
        f64 hitsPerClusterTS;
        f64 hitsPerClusterModule;

        XPU_D u64 maxClustersPerTS() const { return maxDigisPerTS * clustersPerDigiTS; }
        XPU_D u64 maxClustersPerModule() const { return maxDigisPerModule * clustersPerDigiModule; }
        XPU_D u64 maxHitsPerClusterTS() const { return maxClustersPerTS() * hitsPerClusterTS; }
        XPU_D u64 maxHitsPerClusterModule() const { return maxClustersPerModule() * hitsPerClusterModule; }

        static constexpr auto Properties = std::make_tuple(
          config::Property(&MemoryLimits::maxDigisPerTS, "maxDigisPerTS", "Maximal number of digis per time slice"),
          config::Property(&MemoryLimits::maxDigisPerMS, "maxDigisPerMS", "Maximal number of digis per micro slice"),
          config::Property(&MemoryLimits::maxDigisPerModule, "maxDigisPerModule", "Maximal number of digis per module"),
          config::Property(&MemoryLimits::clustersPerDigiTS, "clustersPerDigiTS",
                           "Number of clusters per digi in a time slice"),
          config::Property(&MemoryLimits::clustersPerDigiModule, "clustersPerDigiModule",
                           "Number of clusters per digi in a module"),
          config::Property(&MemoryLimits::hitsPerClusterTS, "hitsPerClusterTS",
                           "Number of hits per cluster in a time slice"),
          config::Property(&MemoryLimits::hitsPerClusterModule, "hitsPerClusterModule",
                           "Number of hits per cluster in a module"));
      } memoryLimits;

    static constexpr auto Properties = std::make_tuple(
        config::Property(&STS::digiSortMode, "digiSortMode",
                         "Digi sort mode (0 = block sort, 1 = cub segmented sort))"),
        config::Property(&STS::clusterSortMode, "clusterSortMode", "Cluster sort mode"),

        config::Property(&STS::unpackMode, "unpackMode", "Unpack mode (0 = legacy, 1 = XPU)"),
        config::Property(&STS::findClustersMultiKernels, "findClustersMultiKernels",
                         "Split cluster finding into multiple kernels"),

        config::Property(&STS::timeCutDigiAbs, "timeCutDigiAbs",
                         "Time delta for neighboring digis to be considered for the same cluster. [ns]"),
        config::Property(
          &STS::timeCutDigiSig, "timeCutDigiSig",
          "Used if timeCutDigiAbs is negative. Time delta must be < 'value * sqrt2 * timeResolution'. [ns]"),
        config::Property(&STS::timeCutClusterAbs, "timeCutClusterAbs",
                         "Maximal time difference between two clusters in a hit [ns]."
                         " Setting to a positive value will override timeCutClustersSig."),
        config::Property(
          &STS::timeCutClusterSig, "timeCutClusterSig",
          "Time cut for clusters."
          " Two clusters are considered it their time difference is below 'value * sqrt(terr1**2 + terr2*+2)'"),
        config::Property(&STS::memoryLimits, "memoryLimits", "Memory limits for STS reco"));
    };

    STS sts;

    static constexpr auto Properties = std::make_tuple(config::Property(&RecoParams::sts, "sts", "STS reco settings"));
  };

  // -----   Load configuration from YAML file   --------------------------------
  void MainConfig::LoadYaml(const std::string& filename)
  {
    YAML::Node config = YAML::LoadFile(filename);

    // --- Digi trigger
    fTriggerDet       = ToCbmModuleIdCaseInsensitive(config["trigger"]["detector"].as<std::string>());
    fTriggerWin       = config["trigger"]["window"].as<double>();
    fTriggerThreshold = config["trigger"]["threshold"].as<size_t>();
    fTriggerDeadTime  = config["trigger"]["deadtime"].as<double>();

    // --- Event builder: (detector -> (tMin, tMax))
    if (auto eventbuilder = config["eventbuilder"]) {
      if (auto windows = eventbuilder["windows"]) {
        for (YAML::const_iterator it = windows.begin(); it != windows.end(); ++it) {
          auto det              = ToCbmModuleIdCaseInsensitive(it->first.as<std::string>());
          auto lower            = it->second[0].as<double>();
          auto upper            = it->second[1].as<double>();
          fEvtbuildWindows[det] = std::make_pair(lower, upper);
        }
      }
    }

    // --- Event selector parameters
    fSelectMinStationsSts = config["selector"]["minStationsSts"].as<size_t>();
    fSelectMinStationsTof = config["selector"]["minStationsTof"].as<size_t>();
    fSelectMinDigisBmon   = config["selector"]["minDigisBmon"].as<size_t>();

    // --- Branch persistence in output file
    fStoreTimeslice = config["store"]["timeslice"].as<bool>();
    fStoreTrigger   = config["store"]["triggers"].as<bool>();
    fStoreEvents    = config["store"]["events"].as<bool>();

    // --- QA publishing
    fHttpServerRefreshRate = config["qa"]["refreshrate"].as<int32_t>(fHttpServerRefreshRate);
  }
  // ----------------------------------------------------------------------------

  // -----   Save configuration to YAML file   ----------------------------------
  void MainConfig::SaveYaml(const std::string& filename)
  {
    YAML::Node config;

    // --- Digi trigger
    config["trigger"]["detector"]  = ToString(fTriggerDet);
    config["trigger"]["window"]    = fTriggerWin;
    config["trigger"]["threshold"] = fTriggerThreshold;
    config["trigger"]["deadtime"]  = fTriggerDeadTime;

    // --- Event builder: (detector -> (tMin, tMax))
    for (const auto& [key, value] : fEvtbuildWindows) {
      auto det = ToString(key);
      config["eventbuilder"]["windows"][det].push_back(value.first);
      config["eventbuilder"]["windows"][det].push_back(value.second);
    };

    // --- Event selector
    config["selector"]["minStationsSts"] = fSelectMinStationsSts;
    config["selector"]["minStationsTof"] = fSelectMinStationsTof;
    config["selector"]["minDigisBmon"]   = fSelectMinDigisBmon;


    // --- Branch persistence in output file
    config["store"]["timeslice"] = fStoreTimeslice;
    config["store"]["triggers"]  = fStoreTrigger;
    config["store"]["events"]    = fStoreEvents;
    // --- QA publishing
    config["qa"]["refreshrate"] = fHttpServerRefreshRate;
    // ---
    std::ofstream fout(filename);
    fout << config;
  }
  // ----------------------------------------------------------------------------

For issues 1–3:

I think we should try to avoid unnecessary global variables in all their various manifestations (or disguises). Clean and lean interfaces make the structure of the code clearer and can make it easier to reason about a block by just looking at its interface.

In this regard, we may have taken a few shortcuts to develop software that runs successfully in a short period of time, and I am very much in favor of continuing to improve the structure.

Yes, that is really the crux of the matter. The short summary of issues 1-3 is:

The main (or even only) effect of ChainContext and SubChain is to promote single instances of Options, RecoParams and cbm::Monitor to quasi-global variables. And really only a small subset of the data contained in them is actually used in each of the places where they are visible.

@fweig @j.decuveland @v.friese After yesterday's discussions, I was convinced that the SubChain base class, while looking like overkill at present time, should become useful in the future, as we will at some point be using constructions like

std::vector<SubChain> chains;
...
for( auto& chain: chains ){ chain.run(); }

This would be a strong justification for the existence of a base class.

The fact that the base class contains a shared RecoParams instance still worries me a bit. Presumably this class will at some point contain a very large number of parameters. The problem with this, is that any person attempting to make changes to RecoParams will have to look every class that derives from SubChain, to be sure that these changes don't break anything. A separate parameter container for each chain, that doesn't expose any unneeded information, would be safer.

Also, it appears that in the case we have here, most of the action happens in the derived classes. The canonical OOP pattern to use in such a case would be the interface, i.e. the purely abstract base class with only virtual and public members, preferably marked with a "I", i.e. ISubChain. Deriving from this represents a low level of commitment on the user side and is usually safe. Full base classes are preferred only when they contain large amounts of functionality.

To phrase this differently, I think a good rule of thumb is that base classes should either be as light as possible or as heavy as possible (and, as stated above, only exist at all if a pointer to / array of the base type is used).

Actually, after thinking about it again, i think we can get rid of the SubChains entirely and make the reconstruction parameters more modular. I still think we need something like these SubChains once we have more GPU code to share state between algorithms. But getting this right is actually kinda tricky and we don't have anyone to work on it right now. So for now I'd just use the Reco class to control the online algorithms. This also restores the idea of the algo library to have a set of algorithms that can be instantiated and used in different contexts. My subchains broke this.

@fweig Sorry for complaining so much. I just warmed up to the idea of having a base class

As an aside, I noticed that we are actually discussing a issue here in a merge request without code. That's probably because we've disabled Gitlab's issue tracker in favor of a link to the Redmine installation, and some developers apparently prefer the Gitlab approach. Maybe we should give in to that and allow issue tracking for online code in Gitlab?

Yes, I started out thinking I could comment on specific lines of the code base this way. I later realized that this wasn't possible. I stuck with using a MR regardless, as I figured I might want to directly push some changes to the code, if they could be agreed upon. Perhaps this discussion would have been better suited for Redmine.

For issue 4:

Unfortunately, C++ is lacking the kind of type introspection that would be required to implement something like this transparently. I feel the implementation using the Property template is as nice and clean as we will probably get. What I also like in this approach is that it allows us to add more information to the properties, such as the documentation string. It may also be possible to add default values and an optional/required flag using this mechanism.

The second example (MainConfig), which I wrote, is only comparable in size because it does not include these features. And it does not currently allow values to be optional. Anything like this would have to be implemented individually and increase the noise.

I also feel that the name Property is not that bad. It reminds me of the @property decorator in Swift, which serves a somewhat similar purpose (auto-generate accessor methods).

I agree that the benefits of the Property class probably outweigh the costs.

Issue 5: What is the deal with the "ca" folder?

This folder contains nothing except a single subfolder "simd", which contains three files: CaSimd.h, CaSimdPseudo.h, CaSimdVc.h.

Of these, the first one contains only an include for CaSimdVc.h and nothing else. Then CaSimdPseudo.h contains only a few typedefs. Then CaSimdVc.h contains some definitions for scalar and vector types, that aren't actually used anywhere in "algo".

My best guess is that this has something to do with tracking, which will be added to "algo" at a later stage. In any case, this can probably be cleaned up. The amount of subfolders and files seems excessive and this can probably be moved to some global include directory to be visible in all of CBMRoot. Maybe @se.gorbunov can comment.

Issue 6: Stub file algo/data/sts/Digi.h.

This file only contains the following:

#include "CbmStsDigi.h"

namespace cbm::algo::sts
{
  using Digi = CbmStsDigi;
}  // namespace cbm::algo::sts

Does this need to be its own file?

Issue 7: File algo/data/sts/Cluster.h.

This file contains a single struct:

  struct Cluster {
    real fCharge;         ///< Total charge
    i32 fSize;            ///< Difference between first and last channel
    real fPosition;       ///< Cluster centre in channel number units
    real fPositionError;  ///< Cluster centre error (r.m.s.) in channel number units
    u32 fTime;            ///< cluster time [ns]
    real fTimeError;      ///< Error of cluster time [ns]
  };

The only place this header file is included is in StsHitfinder.h. This seems to be a transient data type only used by the class StsHitfinder. Can probably be absorbed into StsHitfinder.h.

Edit: The same is probably true for algo/data/sts/Hit.h.

Issue 8: Commented-out lines of code.

I found large blocks in algo/detectors/sts/StsHitfinderChain.cxx but they might also exist in other places. If they aren't needed, they should be removed.

Issue 9: Purpose of algo/base/gpu/DeviceImage.* and algo/base/gpu/Params.*?

Each of the files contain a single "struct" declaration and/or a single XPU command.

It feels a bit like the introduction of global variables. They aren't actually used in a lot of places however.

Perhaps something like this is needed for XPU, this I cannot easily judge. @fweig please clarify. Having a single file for each might not be the most economical thing.

Issue 10: xpu_legacy.h

This header seems to enable compatibility with an older XPU version. It is used by StsHitfinderChain and UnpackStsXpu. We should probably convert these to the latest XPU version and remove the header file.

Issue 11: Purpose of data/ subfolder.

The purpose of of this folder is somewhat ambiguous. It contains some files with auxiliary data types (structs) which might better be absorbed into other (header) files (see Issue 7). It also contains parameter files for the STS hitfinder (Landau table and general configuration). Probably we should decide on a single role for this subfolder, or get rid of it entirely.

For the unpackers, he have until now used the convention that the ReadoutConfig files are stored in the same folder as the unpacker class itself. In my view, we should have a uniform convention, i.e., either all configuration or parameter classes should be moved to data/ or the contents of data/ should be moved into detectors/*. I'd favor the later option, to keep the subfolder structure simpler.

Draft: cbm::algo cleanup

Merge request reports

Activity