# Research

The NNPDF collaboration has pioneered the use of artificial intelligence and machine learning techniques in high-energy physics. More recently, machine learning algorithms have become widespread in theoretical and experimental high-energy physics, with applications ranging from trigger selection to jet substructure classification and detector simulation, among many others. Some of the basic machine learning tools used by NNPDF for PDF determination are illustrated here. These include multi-layer feed-forward neural networks for the model-independent parametrization of parton distributions and fragmentation functions, genetic and covariance matrix adaptation algorithms for training and optimization, and closure testing for the systematic validation of the fitting methodology:

**> GLOBAL QCD ANALYSIS AND MACHINE LEARNING**

NNPDF performs determinations of polarized and unpolarized parton distributions (PDFs) and fragmentation functions, using a variety of machine learning tools.

**> GENERAL STRATEGY**

The NNPDF methodology is based on representing the probability distribution in the space of PDFs by a Monte Carlo sample of replicas, thereby propagating the uncertainty of the underlying data to the PDFs.
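
The Monte Carlo idea can be illustrated with a minimal sketch. It assumes uncorrelated Gaussian uncertainties, which is a simplification, and the names `make_replicas`, `central` and `sigma` as well as the three toy data points are hypothetical, not part of the NNPDF code. Each replica is a fluctuated copy of the data, and statistics over the ensemble approximate the central values and uncertainties that went in.

```python
import random
import statistics


def make_replicas(central, sigma, n_rep, seed=0):
    """Return n_rep pseudo-data replicas fluctuated around `central`.

    Assumption: uncertainties are Gaussian and uncorrelated between points.
    """
    rng = random.Random(seed)
    return [
        [rng.gauss(mu, s) for mu, s in zip(central, sigma)]
        for _ in range(n_rep)
    ]


# Toy data (invented for illustration): central values and 1-sigma errors.
central = [1.0, 2.0, 3.0]
sigma = [0.1, 0.2, 0.3]

replicas = make_replicas(central, sigma, n_rep=2000)

# Averages and spreads over the replica ensemble, one entry per data point.
means = [statistics.fmean(column) for column in zip(*replicas)]
stds = [statistics.stdev(column) for column in zip(*replicas)]
```

In a fit, one set of parameters would be determined for each replica, so that the spread of the resulting fits reflects the spread of the data.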

**> NEURAL NETWORKS**

In the NNPDF approach, parton distribution functions (and fragmentation functions) are parametrized using neural networks as unbiased interpolants.
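
As a rough illustration of such a parametrization, the sketch below evaluates a small multi-layer feed-forward network in pure Python. The layer sizes `[2, 5, 3, 1]`, the choice of `x` and `log(x)` as inputs, the `tanh` activation and all function names are assumptions made for this example; the actual architecture is described in the code documentation.

```python
import math
import random


def init_network(layer_sizes, seed=0):
    """Draw random weights and biases for each pair of consecutive layers."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
        biases = [rng.gauss(0.0, 1.0) for _ in range(n_out)]
        params.append((weights, biases))
    return params


def forward(params, inputs):
    """Propagate inputs through the network; the last layer is linear."""
    activations = list(inputs)
    last = len(params) - 1
    for i, (weights, biases) in enumerate(params):
        pre = [
            sum(w * a for w, a in zip(row, activations)) + b
            for row, b in zip(weights, biases)
        ]
        activations = pre if i == last else [math.tanh(z) for z in pre]
    return activations


def pdf_value(params, x):
    """Network output at momentum fraction x (0 < x <= 1), fed x and log(x)."""
    return forward(params, [x, math.log(x)])[0]


params = init_network([2, 5, 3, 1])
# Total number of free parameters (weights plus biases) in the network.
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in params)
value = pdf_value(params, 0.1)
```

The network has far more parameters than a typical fixed functional form with a handful of coefficients, which is what makes the parametrization redundant.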

**> MINIMIZATION**

Once the PDFs have been parametrized, the optimal fit is obtained by varying the parameters of the neural network in such a way that some chosen figure of merit is minimized.
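
A minimal sketch of this step, assuming a chi-squared figure of merit with uncorrelated uncertainties and a simple mutation-plus-selection loop in the spirit of a genetic algorithm. It is not the NNPDF implementation: the straight-line `line` model stands in for the neural network, and `chi2`, `genetic_minimize`, the toy data and all numerical settings are hypothetical.

```python
import random


def chi2(params, xs, ys, sigmas, model):
    """Figure of merit: sum of squared, uncertainty-weighted residuals."""
    return sum(((model(params, x) - y) / s) ** 2 for x, y, s in zip(xs, ys, sigmas))


def genetic_minimize(loss, start, n_generations=200, n_mutants=20, step=0.1, seed=0):
    """Mutate the current best parameters and keep a mutant only if it improves."""
    rng = random.Random(seed)
    best, best_loss = list(start), loss(start)
    history = [best_loss]
    for _ in range(n_generations):
        mutants = [[p + rng.gauss(0.0, step) for p in best] for _ in range(n_mutants)]
        candidate = min(mutants, key=loss)
        candidate_loss = loss(candidate)
        if candidate_loss < best_loss:  # elitism: never accept a worse fit
            best, best_loss = candidate, candidate_loss
        history.append(best_loss)
    return best, history


def line(params, x):
    """Stand-in model with two parameters: intercept and slope."""
    return params[0] + params[1] * x


# Toy data lying exactly on y = 1 + 2x, invented for illustration.
xs = [0.1, 0.3, 0.5, 0.7, 0.9]
ys = [1.2, 1.6, 2.0, 2.4, 2.8]
sigmas = [0.1] * 5

best, history = genetic_minimize(
    lambda p: chi2(p, xs, ys, sigmas, line), start=[0.0, 0.0]
)
```

Because a mutant is accepted only when it lowers the figure of merit, `history` never increases from one generation to the next.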

**> CROSS-VALIDATION**

The use of a highly redundant parametrization guarantees the absence of bias, but entails that the best fit is not the absolute minimum of the figure of merit, which would correspond to fitting noise. To avoid this, cross-validation is used to determine the optimal fit.
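
One common way to realize this is to split the data into a training and a validation subset, minimize only on the training subset, and retain the parameters at the point where the validation figure of merit is lowest. The sketch below shows that bookkeeping only; the function names, the 50% training fraction and the loss values are invented for illustration and are not taken from an actual fit.

```python
import random


def split_data(n_points, training_fraction=0.5, seed=0):
    """Randomly partition point indices into training and validation sets."""
    rng = random.Random(seed)
    indices = list(range(n_points))
    rng.shuffle(indices)
    n_train = int(round(training_fraction * n_points))
    return sorted(indices[:n_train]), sorted(indices[n_train:])


def look_back_stop(validation_losses):
    """Return the iteration at which the validation loss was lowest."""
    return min(range(len(validation_losses)), key=validation_losses.__getitem__)


train_idx, valid_idx = split_data(10)

# Invented loss curves: the training loss keeps falling, while the validation
# loss turns up once the fit starts to follow statistical noise.
training_losses = [5.0, 3.0, 2.0, 1.5, 1.2, 1.0, 0.9]
validation_losses = [5.2, 3.3, 2.4, 2.1, 2.2, 2.5, 2.9]

stop = look_back_stop(validation_losses)
```

With these numbers the selected stopping point is the iteration with validation loss 2.1, even though the training loss continues to decrease afterwards.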

**> CLOSURE TESTING**

In closure tests, one assumes that the PDFs are known and generates pseudo-data from them. One then carries out the PDF determination and finally compares the result to the known input, thereby providing a validation of the methodology.
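
The logic of a closure test can be sketched on a toy problem. Here a straight line plays the role of the known input, and an ordinary least-squares fit replaces the full PDF determination. Both are deliberate simplifications, and `true_law`, `generate_pseudodata`, `fit_line` and the numerical settings are hypothetical.

```python
import random


def true_law(x):
    """Known input, standing in for the assumed 'true' PDF."""
    return 1.0 + 2.0 * x


def generate_pseudodata(xs, sigma, seed=0):
    """Pseudo-data: the known law plus Gaussian noise of width sigma."""
    rng = random.Random(seed)
    return [rng.gauss(true_law(x), sigma) for x in xs]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x, assuming equal uncertainties."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


xs = [i / 50 for i in range(1, 51)]
ys = generate_pseudodata(xs, sigma=0.05)
a, b = fit_line(xs, ys)

# Closure check: compare fitted parameters with the known input (1.0 and 2.0).
intercept_shift = a - 1.0
slope_shift = b - 2.0
```

If the methodology is sound, the shifts should be compatible with zero within the uncertainty that the fit itself assigns.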

**For a more in-depth discussion of parton distributions see “The Partonic Content of Nucleons and Nuclei” by Juan Rojo.**

For a more detailed discussion of past and current NNPDF machine learning algorithms see **“Parton distribution functions” by Stefano Forte and Stefano Carrazza**.

The technical details of the methodology can be consulted in the code documentation.