@mmusich mmusich commented Sep 17, 2025

backport of #48924

PR description:

From the original PR

In the context of the Next Generation Triggers Task 3.4 ("Optimal HLT Calibrations") we are interested in speeding up the calibration procedures, with the aim of implementing real-time calibrations for the Phase-2 upgrade HLT.
A stepping stone towards that is the NGT Demonstrator, which we plan to deploy and operate during Run 3 (late 2025 and 2026).
The "calibration leg" of this demonstrator is planned to run a trimmed-down version of the Prompt Calibration Loop (PCL), delivering selected alignment and calibrations.
While looking at optimizing the performance of the PCL (see also #48886), we noticed that a non-negligible amount of resources is spent running duplicated modules: different instances of the same EDModule, configured in exactly the same way but with a different label, and therefore each run separately by the framework.
An example of how to identify such modules is provided below [*]. The resulting output is available here.
The goal of this PR is to neutralize the CPU / timing penalty from the worst offenders (a more optimal configuration would likely need attention from AlCaDB and the involvement of different groups).
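
To make the notion of a duplicated module concrete, here is a minimal Python sketch that groups module labels sharing the same type and parameters. The input layout (a dict mapping label to type and parameters), the function name `find_duplicates`, and all labels and types are hypothetical, chosen for illustration only; this is not how `hltFindDuplicates` is implemented, and it ignores the case where two modules differ only by consuming inputs that are themselves duplicates.

```python
import json
from collections import defaultdict


def find_duplicates(modules):
    """Group module labels that share the same type and parameters.

    `modules` maps label -> {"type": str, "params": dict}. This layout is a
    hypothetical stand-in for a dumped configuration, not a CMSSW format.
    """
    groups = defaultdict(list)
    for label, cfg in modules.items():
        # Serialize with sorted keys so that parameter order does not matter.
        fingerprint = (cfg["type"], json.dumps(cfg["params"], sort_keys=True))
        groups[fingerprint].append(label)
    # Keep only the fingerprints shared by more than one label.
    return sorted(sorted(labels) for labels in groups.values() if len(labels) > 1)


# Hypothetical labels, types and parameters, for illustration only.
example_modules = {
    "refitterA": {"type": "ExampleRefitter", "params": {"src": "tracks", "minPt": 1.0}},
    "refitterB": {"type": "ExampleRefitter", "params": {"minPt": 1.0, "src": "tracks"}},
    "refitterC": {"type": "ExampleRefitter", "params": {"src": "tracks", "minPt": 2.0}},
    "selector": {"type": "ExampleSelector", "params": {"src": "refitterA"}},
}

duplicates = find_duplicates(example_modules)
print(duplicates)
```

Here `refitterA` and `refitterB` end up in the same group because they differ only in label and parameter order, while `refitterC` is kept apart by its different `minPt` value.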

PR validation:

We have run the following command:

#!/bin/bash

cmsDriver.py step3 --conditions 140X_dataRun3_Express_v3 \
	-s ALCAOUTPUT:SiStripCalZeroBias+SiStripPCLHistos,ALCA:PromptCalibProd+PromptCalibProdSiStrip+PromptCalibProdSiPixelAli+PromptCalibProdSiStripGains+PromptCalibProdSiStripGainsAAG+PromptCalibProdSiPixel+PromptCalibProdSiPixelLA+PromptCalibProdSiStripHitEff+PromptCalibProdSiPixelAliHG+PromptCalibProdSiPixelAliHGComb \
	--datatier ALCARECO --eventcontent ALCARECO --triggerResultsProcess RECO -n -1 \
	--filein file:step2_nomod.root --no_exec --fileout file:step3_nomod.root --python_filename step3_ALCAOUTPUT_ALCA.py

cat <<@EOF>> step3_ALCAOUTPUT_ALCA.py
process.options.numberOfStreams = 8
process.options.numberOfThreads = 8

process.load('HLTrigger.Timer.FastTimerService_cfi')

process.FastTimerService.writeJSONSummary = True
process.FastTimerService.jsonFileName = "timing_tracking_upperbound_s3_nomod.json"

@EOF

cmsRun step3_ALCAOUTPUT_ALCA.py

on around 40k events from run and measured the timing with and without the changes in this branch:

w/o this PR: [Screenshot from 2025-09-14 17-35-33]
w/ this PR: [Screenshot from 2025-09-14 17-35-43]

The timing reduction is around 30% of the total job time.
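
As a rough illustration of how such a relative reduction can be computed from two timing summaries, the sketch below sums a per-module real time and compares the totals. The dict layout (a `modules` list with `label` and `time_real` fields) is an assumption and should be checked against the actual JSON written by FastTimerService; the function names and all numbers are made up and are not the measurements from this PR.

```python
def total_time(summary):
    """Sum the per-module real time of a timing summary.

    Assumes a layout {"modules": [{"label": str, "time_real": float}, ...]}.
    This schema is an assumption for illustration, to be checked against the
    real FastTimerService JSON output.
    """
    return sum(module["time_real"] for module in summary["modules"])


def relative_reduction(before, after):
    """Return the fraction of the original total time that was saved."""
    t_before = total_time(before)
    t_after = total_time(after)
    return (t_before - t_after) / t_before


# Made-up numbers: one module and its identically configured copy, plus a third module.
before = {"modules": [
    {"label": "moduleA", "time_real": 100.0},
    {"label": "moduleACopy", "time_real": 100.0},
    {"label": "moduleB", "time_real": 200.0},
]}
# Same job with the duplicated copy removed.
after = {"modules": [
    {"label": "moduleA", "time_real": 100.0},
    {"label": "moduleB", "time_real": 200.0},
]}

reduction = relative_reduction(before, after)
print(f"{reduction:.0%}")
```

With real inputs, the two dicts would be loaded with `json.load` from the summaries produced by the jobs run with and without the changes.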

If this PR is a backport, please specify the original PR and why you need to backport it. If this PR will be backported, please specify which release cycle the backport is meant for:

Verbatim backport of #48924 to CMSSW_15_0_X, intended for data-taking operations.


[*]

cmsrel CMSSW_15_0_14
cd CMSSW_15_0_14/src/
cmsDriver.py step3 --conditions 140X_dataRun3_Express_v3 \
    -s ALCAOUTPUT:SiStripCalZeroBias+SiStripPCLHistos,ALCA:PromptCalibProd+PromptCalibProdSiStrip+PromptCalibProdSiPixelAli+PromptCalibProdSiStripGains+PromptCalibProdSiStripGainsAAG+PromptCalibProdSiPixel+PromptCalibProdSiPixelLA+PromptCalibProdSiStripHitEff+PromptCalibProdSiPixelAliHG+PromptCalibProdSiPixelAliHGComb \
    --datatier ALCARECO --eventcontent ALCARECO --triggerResultsProcess RECO -n -1 \
    --filein file:step2.root --no_exec --fileout file:step3.root
edmConfigDump step3_ALCAOUTPUT_ALCA.py > dump.py
hltFindDuplicates dump.py

@cmsbuild

cmsbuild commented Sep 17, 2025

A new Pull Request was created by @mmusich for CMSSW_15_0_X.

It involves the following packages:

  • Alignment/CommonAlignmentProducer (alca)
  • Calibration/TkAlCaRecoProducers (alca)

@arunhep, @atpathak, @cmsbuild, @perrotta can you please review it and eventually sign? Thanks.
@mmusich, @pakhotin, @rsreds, @threus, @tlampen, @tocheng, @yuanchao this is something you requested to watch as well.
@ftenchini, @mandrenguyen, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@cmsbuild

cmsbuild commented Sep 17, 2025

cms-bot internal usage

@mmusich

mmusich commented Sep 17, 2025

type ngt, performance-improvements

@mmusich

mmusich commented Sep 17, 2025

test parameters:

  • workflows = 1001.2,1001.3,1001.4,1002.3,1002.4,1002.5

@mmusich

mmusich commented Sep 17, 2025

@cmsbuild, please test

@cmsbuild

+1

Size: This PR adds an extra 20KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-aa52a5/48177/summary.html
COMMIT: fff845e
CMSSW: CMSSW_15_0_X_2025-09-16-2300/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/48943/48177/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 1 lines to the logs
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 56
  • DQMHistoTests: Total histograms compared: 4150502
  • DQMHistoTests: Total failures: 33
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4150449
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 55 files compared)
  • Checked 247 log files, 194 edm output root files, 56 DQM output files
  • TriggerResults: no differences found

@perrotta

+alca

@cmsbuild

This pull request is fully signed and it will be integrated in one of the next CMSSW_15_0_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_16_0_X is complete. This pull request will now be reviewed by the release team before it's merged. @ftenchini, @sextonkennedy, @mandrenguyen (and backports should be raised in the release meeting by the corresponding L2)

@mandrenguyen

+1
To be checked in replay

@cmsbuild cmsbuild merged commit 0cb0e73 into cms-sw:CMSSW_15_0_X Sep 30, 2025
9 checks passed
@mmusich mmusich deleted the mm_dev_trim_down_alcaprompt_15_0_X branch September 30, 2025 22:56