Here, we provide the input data and analysis results of our manuscript “Topology of synaptic connectivity constrains neuronal stimulus representation, predicting two complementary coding strategies” (https://www.biorxiv.org/content/10.1101/2020.11.02.363929v2). The inputs comprise descriptions of the neurons and connectivity of the model of Markram et al., 2015 (https://www.sciencedirect.com/science/article/pii/S0092867415011915), the identifiers of stimuli injected into the model during a simulation, and the responses of excitatory neurons to the stimuli. You can find them under the “Input data” dashboard below.
In our code repository (https://github.com/BlueBrain/topological_sampling) you can find the pipeline we used to analyze these inputs and write the results to files. Alternatively, you can find the analysis result files on these pages under the “Analysis results” dashboard. The “Configurations” dashboard holds the analysis configuration files, which specify parameters such as the number of samples to obtain or the number of random controls to generate. Finally, the “Notebooks” dashboard holds Jupyter notebooks that read the analysis results and use them to generate the figures of the manuscript.
A mirror of these data can be found on Zenodo (https://zenodo.org/record/4317336). Zenodo also provides a citable DOI for these data.
The data are provided in HDF5, JSON, pickle and NumPy (.npy, .npz) formats. Readers and writers are provided in our code repository (https://github.com/BlueBrain/topological_sampling). We recommend using them to access the data. For examples, refer to the Jupyter notebooks under the “Notebooks” dashboard below.
The data are contained in a large number of files that have to be placed into specific locations. We have decided to provide individual files instead of one single download, because a user might be interested in only a specific, small portion of the results. The downside is that you have to create the expected directory structure manually. The structure is relative to a working directory that you can place anywhere in your file system. In the following sections we will call that location “root”.
The expected structure with ALL files is as follows:
root/config:
classifier_config.json
common_config.json
featurization_config.json
input_data_config.json
manifold_config.json
sampling_config.json
structural_analysis_config.json
struc_volumetric_config.json
topo_db_config.json
triad_config.json
root/data/analyzed_data:
classifier_features_results.json
classifier_manifold_results.json
community_database.pkl
extracted_components.json
features.json
split_spike_trains.npy
structural_parameters.json
structural_parameters_vol.json
triads.json
tribes.json
root/data/input_data:
connectivity.npz
neuron_info.pickle
raw_spikes.npy
stim_stream.npy
root/data/other/classifier:
all_results_components.h5
all_results_features.h5
root/data/other/manifold_analysis:
all_results.h5
root/data/other/topological_featurization:
all_results.h5
root/notebooks:
'Figure 1.ipynb'
'Figure 2 - D, S6.ipynb'
'Figure 2 , S2.ipynb'
'Figure 3, S1.ipynb'
'Figure 4, S9, S10.ipynb'
'Figure 5.ipynb'
'Figure 6 - A.ipynb'
'Figure 6 - B, C, D.ipynb'
'Figure 6 - F, G, H.ipynb'
'Figure 7.ipynb'
component_reacts_to_novelty.py
figure_helper.py
helper_functions.py
pandas_helper.py
plot_helpers.py
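Because the directory structure has to be created by hand, a short script can save some typing. The following is a minimal sketch that only creates the empty sub-directories from the listing above; the function name create_layout is hypothetical and not part of the code repository, and the downloaded files still have to be moved into place afterwards.

```python
from pathlib import Path

# Sub-directories taken from the listing above, relative to the working directory ("root").
SUBDIRS = [
    "config",
    "data/analyzed_data",
    "data/input_data",
    "data/other/classifier",
    "data/other/manifold_analysis",
    "data/other/topological_featurization",
    "notebooks",
]


def create_layout(root):
    """Create the (empty) expected directory tree below `root`; return the directories."""
    root = Path(root)
    created = []
    for sub in SUBDIRS:
        path = root / sub
        # parents=True also creates "data" and "data/other"; exist_ok=True makes re-runs harmless.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling create_layout("/some/path/root") with a location of your choice creates the seven directories and leaves any files that are already there untouched.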
The formatting of the input data files is described in the readme file of the code repository (https://github.com/BlueBrain/topological_sampling). The analysis results are best understood in the context of the analysis pipeline step that generated them. To find out which step generated a file, you can refer to the high-level overview plot on the code repository page. Alternatively, this information is also encoded in the configuration. For example, the entry
    "gen_topo_db": {
        "outputs": {
            "database": "analyzed"
        }
    }
means that the pipeline stage “gen_topo_db” lists “database” among its outputs, so it is the stage we are interested in for the purpose of this example.
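This lookup can also be done programmatically. The sketch below assumes that a configuration is a JSON object whose top-level keys are stage names, each with an “outputs” section as in the fragment above; that layout is inferred from the fragment only, and the function name stages_writing is hypothetical.

```python
import json


def stages_writing(config, output_name):
    """Return the names of all stages whose "outputs" section lists `output_name`."""
    hits = []
    for stage, section in config.items():
        if not isinstance(section, dict):
            continue  # skip top-level entries that are not stage descriptions
        outputs = section.get("outputs", {})
        if isinstance(outputs, dict) and output_name in outputs:
            hits.append(stage)
    return hits


# Example that mirrors the fragment above (not a complete configuration file).
example = json.loads('{"gen_topo_db": {"outputs": {"database": "analyzed"}}}')
print(stages_writing(example, "database"))  # prints ['gen_topo_db']
```

For a real configuration file, the parsed content of that file would take the place of the example object.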
Note that depending on what you want to achieve, you might not need all these files, but only a subset of them. In the following sections we describe two use cases and how to achieve them:
“I want to run the in-depth analysis the same way you did for the manuscript”
In this case you do not need the analysis results, because in running the analysis pipeline you will generate them yourself!
“I want to generate the figures you published in the manuscript / understand how you generated them”
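In that case you won’t need the analysis pipeline code from our GitHub repository. And depending on which figure you are interested in, you might not even need all the analysis results. To see which files a notebook requires, look for the cell that sets up the paths to its data, for example: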
"""
Paths to relevant data.
"""
output_spikes_fn = cfg._cfg['inputs']['raw_spikes']
stim_fn = cfg._cfg['inputs']['stimuli']
tribes_fn = cfg._cfg['analyzed']['tribes']
This indicates that you need the files in the “raw_spikes” and “stimuli” datasets from the “Input data” dashboard, and the files in the “tribes” dataset from the “Analysis results” dashboard. The “tribes” dataset refers to the selected neighborhoods of neurons in the model. The neighborhoods were unfortunately called “tribes” in an early version of the associated manuscript. While we were able to change the manuscript, the unfortunate term is too deeply embedded in the code to change without breaking things. We apologize for the confusion and any offense given.
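Before opening a notebook, it can help to check that the files it needs are in place. The following is a minimal sketch for the three datasets named above. The mapping from dataset names to file names is an assumption inferred from the directory listing (in particular, that “stimuli” corresponds to stim_stream.npy), and the function name missing_files is hypothetical.

```python
from pathlib import Path

# Assumed mapping from (category, dataset) to a file relative to "root".
# It is inferred from the directory listing, not read from the configuration files.
REQUIRED = {
    ("inputs", "raw_spikes"): "data/input_data/raw_spikes.npy",
    ("inputs", "stimuli"): "data/input_data/stim_stream.npy",
    ("analyzed", "tribes"): "data/analyzed_data/tribes.json",
}


def missing_files(root, required=REQUIRED):
    """Return the relative paths from `required` that do not exist below `root`."""
    root = Path(root)
    return [rel for rel in required.values() if not (root / rel).is_file()]
```

An empty list returned by missing_files("/some/path/root") means that all three files were found at the assumed locations.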