FAQ

Frequently Asked Questions about pyhf and its use.

Questions

Where can I ask questions about pyhf use?

If you have a question about the use of pyhf that is not covered in the documentation, please ask it on GitHub Discussions.

If you believe you have found a bug in pyhf, please report it on GitHub Issues.

How can I get updates on pyhf?

If you’re interested in getting updates from the pyhf dev team and release announcements you can join the pyhf-announcements mailing list.

Is it possible to set the backend from the CLI?

Yes. Use the --backend option for pyhf cls to specify a tensor backend. The default backend is NumPy. For more information see pyhf cls --help.
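If you drive the CLI from a script, the same option can be passed programmatically. The following is a minimal sketch, not part of pyhf itself: the helper names build_cls_command and run_cls and the path workspace.json are hypothetical, and it assumes that the pyhf executable is on your PATH and that pyhf cls prints its result as JSON on standard output.

```python
import json
import shutil
import subprocess


def build_cls_command(workspace_path, backend="numpy"):
    """Return the argv list for `pyhf cls` with an explicit tensor backend."""
    return ["pyhf", "cls", str(workspace_path), "--backend", backend]


def run_cls(workspace_path, backend="numpy"):
    """Run `pyhf cls` and return its output parsed from JSON.

    Assumes the result is written to standard output as JSON.
    """
    if shutil.which("pyhf") is None:
        raise RuntimeError("the pyhf command-line tool is not on PATH")
    completed = subprocess.run(
        build_cls_command(workspace_path, backend),
        check=True,           # raise if pyhf exits with an error
        capture_output=True,  # collect stdout and stderr
        text=True,            # decode output as str
    )
    return json.loads(completed.stdout)


if __name__ == "__main__":
    # "workspace.json" is a placeholder path, not a file shipped with pyhf.
    print(" ".join(build_cls_command("workspace.json", backend="jax")))
```

Building the argument list separately from running it keeps the command easy to inspect or log before anything is executed.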

I installed an old pyhf release from PyPI, why am I getting an error from a dependency?

For old releases of pyhf that are not actively supported anymore you might need to manually constrain the upper bound of a dependency.

We work hard to make sure that pyhf is well maintained so that it installs correctly “out of the box”, and we have tested all of pyhf’s core dependencies to determine hard lower bounds for compatible dependency releases. However, as pyhf is a Python library we can only define lower bounds for its core dependencies: defining upper bounds would dictate which versions of libraries users can use in the Python applications they build with pyhf, and that would be bad. If pyhf were to define upper bounds, pyhf and other libraries listed in an environment file (e.g., requirements.txt) could end up with directly conflicting dependencies, causing pip to fail to install pyhf.

To give an explicit example, breaking changes in the behavior of jsonschema v4.15.0 resulted in a KeyError if it was used with pyhf v0.6.3 or older. This problem was fixed (cf. Pull Request #1979) in the next release, pyhf v0.7.0, but the interim solution for users was to install an older version of jsonschema that was still compatible with the pyhf release they were using:

# requirements.txt
pyhf==0.6.3
jsonschema<4.15.0
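To check whether an existing environment already satisfies such a constraint, the installed version can be compared with the first incompatible release. The following is a minimal standard-library sketch: the helper names are hypothetical, and the version parsing is deliberately simplified (it reads only the leading numeric release segment) rather than a full implementation of Python's version specification.

```python
import re
from importlib import metadata

# First jsonschema release that broke pyhf v0.6.3 and older (from the text above).
FIRST_BREAKING_JSONSCHEMA = (4, 15, 0)


def release_tuple(version, width=3):
    """Turn a version string such as '4.14.0' into a zero-padded tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version string: {version!r}")
    parts = tuple(int(part) for part in match.group().split("."))
    # Pad so that '4.15' compares equal to '4.15.0'.
    return parts + (0,) * max(0, width - len(parts))


def jsonschema_ok_for_old_pyhf(version):
    """Return True if this jsonschema version is below the breaking release."""
    return release_tuple(version) < FIRST_BREAKING_JSONSCHEMA


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("jsonschema")
    if found is None:
        print("jsonschema is not installed")
    else:
        print(found, "compatible:", jsonschema_ok_for_old_pyhf(found))
```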

Does pyhf support Python 2?

No. Like the rest of the Python community, as of January 2020 the latest releases of pyhf no longer support Python 2. The last release of pyhf that was compatible with Python 2.7 is v0.3.4, which can be installed with

python -m pip install pyhf~=0.3

I only have access to Python 2. How can I use pyhf?

It is recommended that pyhf be used as a standalone step in any analysis, and its environment need not be the same as the rest of the analysis. As Python 2 is not supported, it is suggested that you set up a Python 3 runtime on whatever machine you’re using. If you’re using a cluster, talk with your system administrators to get their help in doing so. If you are unable to get a Python 3 runtime, versioned Docker images of pyhf are distributed through Docker Hub.

Once you have Python 3 installed, see the Installation page to get started.

I validated my workspace by comparing pyhf and HistFactory, and while the expected CLs matches, the observed CLs is different. Why is this?

Make sure you’re using the right test statistic (\(q\) or \(\tilde{q}\)) in both situations. In HistFactory, the asymptotics calculator, for example, will do something more involved for the observed CLs if you choose a different test statistic.
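For reference, the two test statistics can be written out as below. This is a sketch in the usual notation of the asymptotic-formulae literature, and the symbols are assumptions introduced here rather than taken from this page: \(\mu\) is the tested signal strength, \(\hat{\mu}\) and \(\hat{\theta}\) are the unconditional maximum-likelihood estimates, and \(\hat{\hat{\theta}}(\mu)\) is the conditional estimate of the nuisance parameters at fixed \(\mu\).

```latex
\lambda(\mu) = \frac{L\big(\mu, \hat{\hat{\theta}}(\mu)\big)}{L\big(\hat{\mu}, \hat{\theta}\big)},
\qquad
\tilde{\lambda}(\mu) =
\begin{cases}
\dfrac{L\big(\mu, \hat{\hat{\theta}}(\mu)\big)}{L\big(\hat{\mu}, \hat{\theta}\big)}, & \hat{\mu} \ge 0 \\[2ex]
\dfrac{L\big(\mu, \hat{\hat{\theta}}(\mu)\big)}{L\big(0, \hat{\hat{\theta}}(0)\big)}, & \hat{\mu} < 0
\end{cases}

q_\mu =
\begin{cases}
-2 \ln \lambda(\mu), & \hat{\mu} \le \mu \\
0, & \hat{\mu} > \mu
\end{cases}
\qquad
\tilde{q}_\mu =
\begin{cases}
-2 \ln \tilde{\lambda}(\mu), & \hat{\mu} \le \mu \\
0, & \hat{\mu} > \mu
\end{cases}
```

The two definitions give the same value whenever \(\hat{\mu} \ge 0\), so the test statistics themselves can only disagree on data for which the best-fit signal strength is negative.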

I ran validation to compare HistFitter and pyhf, but they don’t match exactly. Why not?

pyhf is validated against HistFactory. HistFitter makes some particular implementation choices that pyhf doesn’t reproduce. Instead of trying to compare pyhf and HistFitter you should try to validate them both against HistFactory.

How is pyhf typeset?

As you may have guessed from this page, pyhf is typeset in all lowercase. This is largely historical, as the core developers had just always typed it that way and it seemed a bit too short of a library name to write as PyHF. When typesetting in LaTeX the developers recommend introducing the command

\newcommand{\pyhf}{\texttt{pyhf}}

If the journal you are publishing in requires you to use textsc for software names it is okay to instead use

\newcommand{\pyhf}{\textsc{pyhf}}

Why use Python?

As of the late 2010s Python is widely considered the lingua franca of machine learning libraries, and is sufficiently high-level and expressive for physicists of various computational skill backgrounds to use. Using Python as the language for development allows for the distribution of the software, as both source files and binary distributions, through the Python Package Index (PyPI) and Conda-forge, which significantly lowers the barrier for use as compared to C++. Additionally, a 2017 DIANA/HEP study [faq-1] demonstrated that the graph structure and automatic differentiation abilities of machine learning frameworks allowed them to be quite effective tools for statistical fits. As the frameworks considered in this study (TensorFlow, PyTorch, MXNet) all provided low-level Python APIs to the libraries, this made Python an obvious choice for a common high-level control language. Given all these considerations, Python was chosen as the development language.

How did pyhf get started?

In 2017 Lukas Heinrich was discussing with colleague Holger Schulz how convenient it would be to share and produce statistical results from LHC experiments if they could be created with tools that didn’t require the large C++ dependencies and tooling expertise that HistFactory does. Around the same time that Lukas began thinking about these ideas, Matthew Feickert was working on a DIANA/HEP fellowship with Kyle Cranmer (co-author of HistFactory) to study whether the graph structure and automatic differentiation abilities of machine learning frameworks would allow them to be effective tools for statistical fits. Lukas would give helpful friendly advice on Matthew’s project, and one night [1] over dinner in CERN’s R1 cafeteria the two were discussing the idea of implementing HistFactory in Python using machine learning libraries to drive the computation. Continuing the discussion in Lukas’s office, Lukas showed Matthew that the core statistical machinery could be implemented rather succinctly, and that night proceeded to do so and dubbed the project pyhf.

Matthew joined him on the project to begin development, and by April 2018 Giordon Stark had learned about the project and begun making contributions, quickly becoming the third core developer. The first physics paper to use pyhf followed closely in October 2018 [faq-2], making Lukas and Holger’s original conversations a reality. pyhf was founded on the ideas of open contributions and community software and continues in that mission today as a Scikit-HEP project, with an open invitation for community contributions and new developers.

Troubleshooting

  • import torch or import pyhf causes a Segmentation fault (core dumped)

    This may be the result of a conflict with the NVIDIA drivers that you have installed on your machine. Try uninstalling and completely removing all of them from your machine

    # On Ubuntu/Debian
    sudo apt-get purge 'nvidia*'

    and then installing the latest versions.

Footnotes

1

24 January, 2018

Bibliography

faq-1

Matthew Feickert. A study of data flow graph frameworks for statistical models in particle physics. Technical Report, DIANA/HEP, Oct 2018. URL: https://doi.org/10.5281/zenodo.1458059, doi:10.5281/zenodo.1458059.

faq-2

Lukas Heinrich, Holger Schulz, Jessica Turner, and Ye-Ling Zhou. Constraining A$_4$ leptonic flavour model parameters at colliders and beyond. JHEP, 04:144, 2019. arXiv:1810.05648, doi:10.1007/JHEP04(2019)144.