Reproduction of Patel and Desmond (2024) claims for non-reproducibility of previous experiments


This page provides code and data for the reproduction of the analysis and results of Patel and Desmond, arXiv:2404.06617v1 (2024).
In Section 4.3 of (Patel and Desmond, 2024), the authors state that they applied Equation 1 from (McAdam & Shamir, 2023) to data they downloaded from https://people.cs.ksu.edu/~lshamir/data/sparcfire. They argue that "our attempt at using this equation on the McAdam & Shamir (2023b) dataset did not yield the results quoted in that paper, so we were unable to reproduce their analysis." They do not disclose what results they did get.

Here I provide the code that implements the equation, and apply it to the exact same data used by Patel and Desmond (2024). The analysis is fully transparent, can be inspected easily, and can also be easily applied to generate the results shown in (McAdam & Shamir, 2023) directly from the data. It proves beyond any doubt that the experiments shown in (McAdam & Shamir, 2023) are fully reproducible.


The code that implements the analysis is dipole.cpp. It is written in C so that it can run fast, but it can very easily be used even by those who are not familiar with C. It is easiest to run it in Linux.

To compile the code type in the Linux terminal: g++ -o dipole dipole.cpp

Patel and Desmond (2024) used two datasets taken from (McAdam & Shamir, 2023). The first is the dataset of mirrored galaxy images annotated by Sparcfire (Dataset "GAN M" in (Patel and Desmond, 2024)).
The dataset can be downloaded at ganalyzer_mirrored_4d.csv.

To run the code on the data type: ./dipole ganalyzer_mirrored_4d.csv

The program prints the results to the standard output. It takes several hours to run, and possibly more than a day, depending on the processor. To make it run faster, you can increase the value of the variable increment, or reduce the number of simulations (line 169) to a smaller number.

The output of applying the code to the data is in the file ganalyzer_mirrored_4d_results.csv
The output is a table that shows the statistical significance (in sigma) of a dipole axis at each of the (ra,dec) coordinates. Here is the comparison between the results of the reproduction and the values in (McAdam & Shamir, 2023), which are also the values mentioned in (Patel & Desmond, 2024).

                           Sigma   Location of maximum axis
Reproduction               4.02    (alpha=185, delta=15)
(McAdam & Shamir, 2023)    3.97    (alpha=184, delta=16)

The 0.05 sigma difference is due to the random values used in the simulation, which is the nature of any simulation and is completely expected. To make it faster, the code tests the axes in intervals of 5 degrees, which explains the very minor difference in the location of the axis. These differences are in any case within statistical error.

The Mollweide visualization also looks the same as the one in the paper.
A visualization of the results of the reproduction using Mollweide projection.



Figure 2(a) in (McAdam & Shamir, 2023).




Another experiment that Patel and Desmond (2024) argue is not reproducible is the dataset of non-mirrored galaxy images annotated by Sparcfire (Dataset "GAN NM" in (Patel and Desmond, 2024)).

The dataset can be downloaded at ganalyzer_non_mirrored_4d.csv.

To run the code on the data type: ./dipole ganalyzer_non_mirrored_4d.csv

The output of applying the code to the data is in the file ganalyzer_non_mirrored_4d_results.csv

Here is the comparison between the results of the reproduction and the values in (McAdam & Shamir, 2023), which are also the values mentioned in (Patel & Desmond, 2024).

                           Sigma   Location of maximum axis
Reproduction               2.34    (alpha=190, delta=20)
(McAdam & Shamir, 2023)    2.33    (alpha=192, delta=24)

The visualization is the following:

A visualization of the results (ganalyzer_non_mirrored_4d_results.csv) using Mollweide projection. (McAdam & Shamir, 2023) does not include a figure of that dataset, so it cannot be compared to a figure in the paper.




Another study that Patel and Desmond (2024) argue was not reproducible is (Longo, 2011). That dataset can be downloaded at longo_z.csv.

To run the code on the data type in the Linux terminal: ./dipole longo_z.csv
The output is in the file longo_z_results.csv

The analysis shows a dipole axis at 2.92 sigma. That is somewhat weaker than the 3.16 sigma reported in (Longo, 2011), but it is close and definitely statistically significant.

A visualization of the results of Longo (2011). The location of the dipole axis is also close to the location reported by (Longo, 2011) at (alpha=217, delta=32).