| References | Link |
|---|---|
| Methods | |
| Application | |
| Source Code | |
| Updates (Latest Release) | |
| Media | |
| How to use | Start here |

| Implementation (Link) | Requires | Containerizes |
|---|---|---|
The accelerating growth of scientific literature overwhelms our capacity to manually distil complex phenomena like molecular networks linked to diseases. Moreover, biases in biomedical research and database annotation limit our interpretation of facts and generation of hypotheses. ENQUIRE (Expanding Networks by Querying Unexpectedly Inter-Related Entities) offers a time- and resource-efficient alternative to manual literature curation and database mining. ENQUIRE reconstructs and expands co-occurrence networks of genes and biomedical ontologies from user-selected input corpora and network-inferred PubMed queries. The integration of text mining, automatic querying, and network-based statistics mitigating literature biases makes ENQUIRE unique in its broad-scope applications. For example, ENQUIRE can generate co-occurrence gene networks that reflect high-confidence, functional networks. When tested on case studies spanning cancer, cell differentiation and immunity, ENQUIRE identified interlinked genes and enriched pathways unique to each topic, thereby preserving their underlying diversity. ENQUIRE supports biomedical researchers by easing literature annotation, boosting hypothesis formulation, and facilitating the identification of molecular targets for subsequent experimentation.
- If you find ENQUIRE useful to pursue your research, please cite us
INSTALLATION
ENQUIRE can currently run on Linux systems and Linux virtual machines using Apptainer/Singularity, and on Linux, macOS, and Windows using Docker. If you would rather use Docker instead of Singularity, please follow the dedicated README available here. Please check the implementation table for the latest available images and requirements.
If you want to use ENQUIRE with Apptainer/Singularity, please install the latter following the steps for Linux or Windows/Mac. The file called ENQUIRE.sif is a compressed Singularity Image File (SIF) that already contains all the code, dependencies and stable metadata needed to run ENQUIRE, so no further installation steps are needed. The original and latest SIF files are available on Figshare - see implementation table. We recommend adding the path to the apptainer executable to your PATH variable (e.g. by editing your .bashrc file). This allows you to execute ENQUIRE.sif directly, like any other executable (./ENQUIRE.sif).
To follow the next steps in the tutorial, clone the repository:
git clone https://github.com/Muszeb/ENQUIRE.git
cd ENQUIRE
Then, download the SIF image file ENQUIRE.sif from FigShare and place it in the repository. We provide checksum files (md5sum_ENQUIRE_sif.txt and md5sum_original_ENQUIRE_sif.txt) to ensure the download completed successfully. Remember to also make the SIF file executable.
md5sum -c md5sum_ENQUIRE_sif.txt
chmod +x ENQUIRE.sif
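Where md5sum is not available, the same check can be approximated in Python. The following is a minimal sketch, not part of ENQUIRE: the helper names are hypothetical, and it assumes the checksum file uses the usual md5sum layout (`<digest>  <filename>`) and sits in the same directory as the files it lists.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum_file(checksum_file):
    """Check every '<md5>  <filename>' line of an md5sum-style file.

    Assumption: file names are resolved relative to the directory of the
    checksum file. Returns a dict mapping each file name to True or False.
    """
    checksum_path = Path(checksum_file)
    results = {}
    for line in checksum_path.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*").strip()  # md5sum marks binary mode with '*'
        target = checksum_path.parent / name
        results[name] = target.is_file() and md5_of_file(target) == expected.lower()
    return results
```

Calling `verify_checksum_file("md5sum_ENQUIRE_sif.txt")` from the repository folder would then report whether the downloaded image matches the published digest.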
You can then place the ENQUIRE directory or ENQUIRE.sif wherever you wish, and optionally add its location to your PATH variable to make it easier to call.
USAGE
The exemplary code snippets assume that the apptainer location has been added to your PATH variable, and that you are running the commands from the ENQUIRE main directory (run cd /path/to/ENQUIRE first to test them).
Here is how you call ENQUIRE scripts using ENQUIRE.sif:
# assuming the `apptainer` location is in your PATH variable and you did `cd ENQUIRE` or `ENQUIRE.sif` is in your working directory
Usage: ./ENQUIRE.sif <script_name> [script_argument]
Where <script_name> is one of:
- efetch_references.py
- ENQUIRE.sh
- context_aware_gene_sets.R
- context_aware_pathway_enrichment.R
INPUT FILE
A valid input file is a plain text file containing a list of PubMed Identifiers (PMIDs), one PMID per line.
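A quick sanity check of such a file can be sketched as follows. This is an illustrative helper rather than ENQUIRE code: the function names are hypothetical, and the minimum of 3 entries is taken from the ENQUIRE.sh help message.

```python
def read_pmids(text):
    """Parse PMIDs from text holding one identifier per line.

    Blank lines are skipped, duplicates are dropped (first occurrence kept),
    and any line that is not purely numeric raises ValueError.
    """
    pmids = []
    seen = set()
    for number, line in enumerate(text.splitlines(), start=1):
        entry = line.strip()
        if not entry:
            continue
        if not (entry.isascii() and entry.isdigit()):
            raise ValueError(f"line {number}: {entry!r} is not a PMID")
        if entry not in seen:
            seen.add(entry)
            pmids.append(entry)
    return pmids


def check_seed_size(pmids, minimum=3):
    """Return True when the list holds at least `minimum` PMIDs."""
    return len(pmids) >= minimum
```

For example, `read_pmids(open("input.txt").read())` returns the cleaned list, which can then be passed to `check_seed_size` before launching a run.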
The easiest way to generate a valid ENQUIRE input file is to run a PubMed query on the NCBI website. Use of MeSH terms and exclusion of review articles is recommended but not mandatory. Then, click on Save, choose Selection: All results and Format: PMID, and click Create file.
Alternatively, we also offer a Python script to extract the PubMed identifiers of all papers cited in a reading of interest (e.g. a review paper of a particular topic). From the ENQUIRE folder and virtual environment, type on the command line:
# assuming the `apptainer` location is in your PATH variable and you did `cd ENQUIRE` or `ENQUIRE.sif` is in your working directory
./ENQUIRE.sif efetch_references.py tag ref1 ref2 ref3 ...
where tag is the name of the plain text output file, while ref1 ref2 ref3 ... are the PMIDs of the papers whose references you want to extract. The output has the one-PMID-per-line format described above and is therefore ready to be used as ENQUIRE input.
DISCLAIMER: if the references are not annotated in PubMed's API, the script will silently return no match, which may go unnoticed when fetching references from multiple articles. As a rule of thumb, look for "References" in the "page navigation" menu on the PubMed page of the article of interest to check whether its references are annotated.
LAUNCHING ENQUIRE
- Before running an actual task, take a look at ENQUIRE_methods_overview.png: the figure briefly illustrates the main steps of the algorithm.
- The next exemplary code snippets assume that you cloned this repository and that ENQUIRE is your current working directory.
- IMPORTANT NOTE: it is highly recommended to get an NCBI API key before running ENQUIRE. Getting one is very easy. You can then copy the API key and set it as an environment variable on the command line, like so:
export NCBI_API_KEY=your_api_key_here
This ensures that your API key is passed as an environment variable to all ENQUIRE runs within the same terminal session.
- You can inspect the Help section by running (from the ENQUIRE directory) ./ENQUIRE.sif ENQUIRE.sh -h:
####################################################################################
Expanding Networks by Querying Unexpectedly Inter-Related Entities
####################################################################################
####################################################################################
Usage: ./ENQUIRE.sif ENQUIRE.sh [script_arguments]
Legend: [-flag_short|--flag_long|config file variable, if available]:
[-p|--path|wd] = the path to the working directory (wd), where the output directory will be written in.
It must be the ENQUIRE main folder, with ./code and ./input as subfolders.
The default is the current working directory.
[-i|--input|to_py] = input.txt: a 'seed' input text file containing one PMID per line.
It can be obtained from a PubMed querying specifying 'PMID' as the download format option.
A minimun of 3 entries is required, but a list at least a few dozens articles is highly recommended.
[-t|--tag|tag] = A tag definining the task.
It must be an alphanumeric string (underline_spaced_words are accepted).
[-j|--ncores|ncores] = The max number of CPU cores to be used.
Default is 6.
[-c|--combine-set|comb] = how many k entities should be intersected to construct a query?
3: loose searches, 4: moderate (default), 5: very strict queries.
[-r|--representativeness|thr] = representativeness threshold (%) for a subgraph to be included in the network expansion steps (default: 1 %).
Example: if a subgraph contains nodes exclusively mentioned in 1 paper out of a total of 100, that subgraph has a 1% representativeness.
[-a|--attempts|A] = how many query attempts (i.e. k-sized graphlets) should be run to connect any two network communities?
1: conservative, 2: moderate (default), 3: greedy.
[-k|--connectivity|K] = minimal community connectivity (K), which applies to any expansion-derived entities:
each gene/MeSH term must be connected to at least K original communities to be incorporated in the expanded network - default: 2.
[-e|--entity|etype] = which entity type ('gene','MeSH') are you interested into? Omit or 'all' to textmine both entities.
[-f|--config] = if a config file is being used, specify its full path (e.g. input/textmining_config.txt).
This option overwrites any parameter set by a different option.
[-w|--rscript|rscript] = path to the Rscript compiler. If using 'ENQUIRE.sif', it defaults to the containerized version of R.
[-d|--inputdata|sd] = path to the input data folder. If using 'ENQUIRE.sif', it defaults to the containerized input folder.
WARNING: this option is still under development, to allow users to set different species targets
and subsequently change the H.s. specific metadata.
[-m|--cellentitymodule|CELLTAGSBOOL] = Boolean, enable removing of character spans tagged as cell lines or types (e.g. 'CD8+ T-cell')?
Default: False.
[-h|--help] = print this help message.
You might be seeing this Help because of an input error.
####################################################################################
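The representativeness threshold (-r) can be illustrated with a small sketch. It follows the example given in the help message (1 paper out of 100 gives 1 %) and is not ENQUIRE's implementation: the function names, the node-to-PMID mapping, and the inclusive comparison against the threshold are all assumptions made for illustration.

```python
def representativeness(subgraph_nodes, node_to_pmids, total_papers):
    """Percentage of the corpus mentioning at least one node of the subgraph."""
    if total_papers <= 0:
        raise ValueError("total_papers must be positive")
    supporting = set()
    for node in subgraph_nodes:
        # Collect every paper that mentions this node (hypothetical mapping).
        supporting |= set(node_to_pmids.get(node, ()))
    return 100.0 * len(supporting) / total_papers


def passes_threshold(subgraph_nodes, node_to_pmids, total_papers, thr=1.0):
    """Assumed rule: keep the subgraph when its representativeness is >= thr (%)."""
    return representativeness(subgraph_nodes, node_to_pmids, total_papers) >= thr
```

With two nodes that are both mentioned only in the same paper of a 100-paper corpus, `representativeness` returns 1.0, matching the help message's example.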
Let's set up an example: we want to extract biomedical information from publications dealing with chemically-induced colitis in melanoma patients undergoing checkpoint-inhibitor therapy. Our ENQUIRE job might then look something like this:
# assuming the `apptainer` location is in your PATH variable and you did `cd ENQUIRE` or `ENQUIRE.sif` is in your working directory
./ENQUIRE.sif ENQUIRE.sh -t ICI_and_Colitis -i test_input/pmid-ICI_and_Colitis.txt
Here, all the other parameters described in the Help message of ENQUIRE.sh are left at their default values. Passing parameters can be made easier by using the ENQUIRE_config.txt file that resides in the main ENQUIRE directory: the left-hand side of each variable assignment must be kept unchanged, while the right-hand side can be tweaked according to one's needs. Additional information on the parameters is given in ENQUIRE_flowchart.png. Then, the program can be launched by running:
# assuming the `apptainer` location is in your PATH variable and you did `cd ENQUIRE` or `ENQUIRE.sif` is in your working directory
./ENQUIRE.sif ENQUIRE.sh -f ENQUIRE_config.txt
EXPLANATION OF THE OUTPUT DATA STRUCTURE
- Provided a recognisable tag has been passed to the text-mining algorithm, a typical run produces a folder named tmp-tag, which in turn contains as many subdirectories as the number of steps/iterations performed. For example, if the algorithm performed
  - Reconstruction of a Gene/MeSH network from the original set of papers;
  - One query expansion and network reconstruction, as the Gene/MeSH network was not fully connected yet;
  - One query expansion and network reconstruction, as the gene-gene network was not fully connected yet, then stopped;

  then there will be three subfolders, namely tag, tag_subgraph_expansion1, and tag_subgraph_expansion2. The counter attached to folder and file names records the successive attempts at expanding and reconstructing the co-occurrence networks.

  Typically, within each of these subfolders/iterations, three pairs of edge and node tables can be found, corresponding to the "Complete" (Gene/MeSH), "Gene"-only, and "MeSH"-only networks (TSV files). These files can be easily imported into Cytoscape or similar graph visualization tools.

  Whenever it was not possible to obtain one or more of these networks, the pipeline should print a message pointing to the most meaningful files to look at. The file tag...Complete_literature_links.tsv within each subfolder allows fast retrieval of the papers associated with a specific edge by means of encoded hyperlinks.

  The batch of queries tested in each iteration is stored in tag...ordered_queries.tsv within each respective subfolder. Additional metadata can be explored under the data/ subfolder. Besides node and edge tables for individual subgraphs (i.e. gene/MeSH or gene-only connected components), here you can also explore what the original co-occurrence multigraph looked like before the network-based test statistics were applied (tag...edge_list_allxall.tsv).

  Furthermore, under tmp-tag, the file source_pmids.txt contains all the inspected articles for the given ENQUIRE job. These can also be consulted for each iteration under tmp-tag/efetch_inputs. Starting from release v4.0.0, this subdirectory also contains literature metadata for all iterations in CitationToPMID_record.tsv.

  Please don't hesitate to contact us for any clarification on the purpose of any file.
- Interactive .html networks

  It is also possible to visually inspect Gene-MeSH networks and the reduced networks containing only cliques in two .html files, stored within each iteration's subfolder as tag...interactive_Gene-MeSH_Network.html and tag...interactive_Cliques_Network.html, respectively.
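To work with the tables outside Cytoscape, the per-iteration edge tables can be collected programmatically. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and the file-name suffix `_<network>_edges_table_subgraph.tsv` is inferred from the example files under tmp-Ferroptosis_and_Immune_System, so it may not hold for every iteration folder. No column names are assumed; rows are keyed by whatever header the file carries.

```python
import csv
from pathlib import Path


def find_edge_tables(tmp_dir, network="Complete"):
    """Return edge-table paths found one level below tmp_dir, sorted by path.

    `network` is the name fragment of the table, e.g. "Complete" or "Genes".
    """
    pattern = f"*/*_{network}_edges_table_subgraph.tsv"
    return sorted(Path(tmp_dir).glob(pattern))


def read_tsv(path):
    """Load a TSV file into a list of dicts keyed by its header row."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

For instance, `find_edge_tables("tmp-Ferroptosis_and_Immune_System")` lists the Complete edge tables of all iterations, and `read_tsv` loads any of them for further filtering.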
EXECUTING POST-HOC ANALYSES
- Run ./ENQUIRE.sif context_aware_gene_sets.R [options] to perform automatic annotation of gene sets, using ENQUIRE-generated Gene/MeSH edge and node tables and Fuzzy-C-Means (FCM). See the original manuscript for further information.
Usage: ./ENQUIRE.sif context_aware_gene_sets.R [options]
Options:
-w PATH, --directory=PATH
Output directory [default to current working directory]
-e PATH, --edgetable=PATH
Path to an ENQUIRE-generated, Gene/MeSH edge table file (required)
-n PATH, --nodetable=PATH
Path to an ENQUIRE-generated, Gene/MeSH node table file (required)
-t TAG, --tag=TAG
tag prefix for all output files (default to 'ENQUIRE')
-o MODALITY, --modality=MODALITY
node embedding modality used for clustering.
Default is node2vec+ (Liu et al. 2023), using `ztPois.cdf` as weights, as implemented in https://github.com/krishnanlab/PecanPy.
Type 'invlogweight' to reproduce the method described in ENQUIRE's original publication (Musella et al. 2025).
--num-walks=NUMWALKS
node2vec parameter. Number of walks per source. (default: 150)
--walk-length=WALKLENGTH
node2vec parameter. Length of walk per source. (default: 150)
--n2vp=N2VP
node2vec parameter. Return hyperparameter. (default: 1)
--n2vq=N2VQ
node2vec parameter. Inout hyperparameter. (default: 2)
--window-size=WINDOWSIZE
node2vec parameter. Context size for optimization. (default: 10)
--dimensions=DIMENSIONS
node2vec parameter. Number of dimensions. (default: 32)
-d PARAMETER, --membdeg=PARAMETER
minimal membership degree for gene-to-cluster association (default: 0.05), range [0-1]
-r PARAMETER, --round=PARAMETER
Should membership degrees be rounded to the first significant digit (helps the stability of the results)?
default: True [T,F]
-s PARAMETER, --setsize=PARAMETER
minimal gene set size (default: 2)
-v VARIANCE, --varthreshold=VARIANCE
Dimensionality reduction based on the chosen proportion of Variance
observed upon PCA-transforming the inverse-log-similarity between nodes (default: 0.99. range [0-1]).
Set it to 1 to use untrasformed, scaled node similarities.
-m MESH, --meshxgs=MESH
How many MeSH terms which are closest to the cluster centroids should be used to describe a gene set? (default:3)
-p PATH, --netpathdata=PATH
Path to 'ENQUIRE-KNet_STRING_RefNet_Reactome_Paths.RData.gz' (required).
If using the ENQUIRE.sif singularity image, the default path should point to the containerized copy of the file.
-h, --help
Show this help message and exit
- You can use the exemplary output files contained in tmp-Ferroptosis_and_Immune_System to test the script. As of release v4.0.0, the default node embedding modality is node2vec+ (Liu et al. 2023). Set -o invlogweight for the original behaviour. For comparison, both modalities have been precomputed and distributed under tmp-Ferroptosis_and_Immune_System/Ferroptosis_and_Immune_System/.
# assuming the `apptainer` location is in your PATH variable and you did `cd ENQUIRE` or `ENQUIRE.sif` is in your working directory
./ENQUIRE.sif context_aware_gene_sets.R -e tmp-Ferroptosis_and_Immune_System/Ferroptosis_and_Immune_System/Ferroptosis_and_Immune_System_Complete_edges_table_subgraph.tsv \
-n tmp-Ferroptosis_and_Immune_System/Ferroptosis_and_Immune_System/Ferroptosis_and_Immune_System_Complete_nodes_table_subgraph.tsv
The output will be saved in the default-tagged spreadsheet file ENQUIRE_context_aware_gene_sets.xlsx, together with a PNG image showing the reconstructed gene sets. Please note that the script might take quite long to finish, due to the FCM algorithm.
- Run ./ENQUIRE.sif context_aware_pathway_enrichment.R [options] to perform topology-based pathway enrichment analysis using SANTA, Reactome H. sapiens pathways, and STRING's H. sapiens physical PPI network, starting from an ENQUIRE-generated gene-gene edge table. See the original manuscript for further information.
Usage: Rscript code/context_aware_pathway_enrichment.R [options]
Options:
-w PATH, --directory=PATH
Working directory (default to current working directory)
-o PATH, --outdirectory=PATH
Output directory (default to current working directory, and must preexist)
-n PATH, --netpathdata=PATH
Path to 'ENQUIRE-KNet_STRING_RefNet_Reactome_Paths.RData.gz' (required).
If using the ENQUIRE.sif singularity image, the default path should point to the containerized copy of the file.
-e PATH, --edgetable=PATH
Path to an ENQUIRE-generated, gene-gene edge table file (required).
-c PARAMETER, --cores=PARAMETER
max number of cores used (PSOCK parallelization) (default: 4), >1 recommended.
-t TAG, --tag=TAG
tag prefix (default to 'ENQUIRE').
-s PARAMETER, --setsize=PARAMETER
maximum Reactome pathway size (default: 100, minimum 3).
-p PARAMETER, --permutations=PARAMETER
number of permutations to infer KNet null distribution
(default: 100, the higher the more accurate the test statistics).
-f PARAMETER, --padjust=PARAMETER
P-value adjustment method, must be one of [holm, hochberg, hommel, bonferroni, BH, BY, fdr, none].
Default: holm.
-q QSCORENET, --qscorenet=QSCORENET
Do you want to save a copy of the STRING network in GRAPHML format with ENQUIRE-inferred QScores as node weights?
default: False [T,F]
-h, --help
Show this help message and exit
- You can use the exemplary output files contained in tmp-Ferroptosis_and_Immune_System to test the script (we reduce the number of tested pathways with the -s parameter to speed up the process):
# assuming the `apptainer` location is in your PATH variable and you did `cd ENQUIRE` or `ENQUIRE.sif` is in your working directory
./ENQUIRE.sif context_aware_pathway_enrichment.R -e tmp-Ferroptosis_and_Immune_System/Ferroptosis_and_Immune_System/Ferroptosis_and_Immune_System_Genes_edges_table_subgraph.tsv -s 30
The output will be saved in the default-tagged spreadsheet file ENQUIRE_context_aware_pathway_enrichment.xlsx, together with two PNG images showing the test statistics p-value distribution and the correlation between the Node score and degree. Please note that the script might take quite long to finish, and it benefits from a high performance computer, if available.
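The default P-value adjustment method listed in the help message above is holm. For readers who want to see what that correction does to a set of pathway p-values, here is a minimal sketch of the Holm step-down procedure. It is a generic textbook implementation written for illustration, not code taken from the ENQUIRE script.

```python
def holm_adjust(p_values):
    """Holm step-down adjustment; returns adjusted p-values in input order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, index in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[index])
        # Enforce monotonicity: an adjusted value never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[index] = running_max
    return adjusted
```

The running maximum is what distinguishes Holm from a plain rank-scaled Bonferroni: it keeps the adjusted values ordered consistently with the raw ones.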
UPDATE (September 2025): TRANSFORM ENQUIRE NETWORKS INTO GRAPH DATABASES
The latest Apptainer and Docker images also retrieve bibliographic data associated with the queried PMIDs and are shipped with Neo4j Community Edition (v5.25), allowing for easy graph database construction starting from ENQUIRE's *_Complete_* TSV files. The SIF image is complemented by the shell script ENQUIRE2KG.sh (also available in the GitHub repository), which orchestrates the database construction and initiation. If you downloaded the script from FigShare, remember to make ENQUIRE2KG.sh executable via chmod +x. In short, ENQUIRE2KG.sh
- creates (if it does not already exist) an enquire2kg-tag directory and mounts it under a containerized path to which the graph database will be written;
- converts ENQUIRE's Complete edge and node files into Neo4j-friendly CSV files;
- uses neo4j-admin to establish a graph database and test its functionality;
- runs neo4j console to establish a (remote) connection via http://localhost:7474/.
Unfortunately, ENQUIRE2KG does not work with ENQUIRE output generated using the original image.
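To give an idea of what a "Neo4j-friendly" CSV looks like, the sketch below renders edges with the header fields that neo4j-admin's bulk importer expects for relationships (:START_ID, :END_ID, :TYPE). It does not reproduce what ENQUIRE2KG.sh actually does: the function name and the `source`/`target` column names are hypothetical placeholders, and the relationship type CO_OCCURS is borrowed from the example Cypher queries in this README.

```python
import csv
import io


def edges_to_neo4j_csv(edge_rows, source_key="source", target_key="target",
                       rel_type="CO_OCCURS"):
    """Render edge dicts as relationship CSV text for neo4j-admin import.

    `source_key` and `target_key` are hypothetical column names; adapt them
    to the header of the edge table at hand.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([":START_ID", ":END_ID", ":TYPE"])
    for row in edge_rows:
        writer.writerow([row[source_key], row[target_key], rel_type])
    return buffer.getvalue()
```

In practice ENQUIRE2KG.sh takes care of this conversion, so the sketch is only useful for understanding or debugging the intermediate files.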
############# TURN ENQUIRE NETWORKS INTO KNOWLEDGE GRAPHS USING NEO4J - UTILITY SCRIPT ##############
Path to code: /path/to/ENQUIRE2KG.sh
####################################################################################
Expanding Networks by Querying Unexpectedly Inter-Related Entities
####################################################################################
####################################################################################
Usage: ENQUIRE2KG.sh [script_arguments]
Legend: [-flag_short|--flag_long|config file variable, if available]:
[-i|--image|image] = the path to the singularity image file (.sif). Defaults to 'ENQUIRE.sif'.
[-p|--path|wd] = the path to the working directory (wd), where the output directory will be written in.
It must be the ENQUIRE main folder, with ./code and ./input as subfolders.
The default is the current working directory.
[-t|--tag|tag] = A tag definining the task.
It must be an alphanumeric string (underline_spaced_words are accepted).
[-d|--inputdir|input] = path to the input data folder. It must point to an ENQUIRE-generated directory containing co-occurrence network data
(e.g https://github.com/Muszeb/ENQUIRE/tree/main/tmp-Ferroptosis_and_Immune_System/Ferroptosis_and_Immune_System).
[-f|--config] = if a config file is being used, specify its full path (e.g. input/textmining_config.txt).
This option overwrites any parameter set by a different option.
[-h|--help] = print this help message.
You might be seeing this Help because of an input error.
####################################################################################
Here is how you can test this with the example output data tmp-Ferroptosis_and_Immune_System available in the GitHub repository.
# assuming the `apptainer` location is in your PATH variable, you did `cd ENQUIRE`, and `ENQUIRE.sif` is in your working directory
./ENQUIRE2KG.sh -i ENQUIRE.sif -t Ferroptosis_and_Immune_System -d tmp-Ferroptosis_and_Immune_System/Ferroptosis_and_Immune_System_subgraphs_expansion2/
Eventually, it should print the following:
[...previously printed messages and log data...]
[...here neo4j console is executed...]
Starting Neo4j.
2025-09-23 16:15:59.033+0000 INFO Logging config in use: File '/etc/neo4j/user-logs.xml'
2025-09-23 16:15:59.048+0000 INFO Starting...
2025-09-23 16:15:59.650+0000 INFO This instance is ServerId{68729ca1} (68729ca1-634f-409d-83b9-0a41c2ce8fc2)
2025-09-23 16:16:00.448+0000 INFO ======== Neo4j 2025.04.0 ========
2025-09-23 16:16:01.398+0000 INFO Anonymous Usage Data is being sent to Neo4j, see https://neo4j.com/docs/usage-data/
2025-09-23 16:16:01.517+0000 INFO Bolt enabled on 0.0.0.0:7687.
2025-09-23 16:16:01.989+0000 INFO HTTP enabled on 0.0.0.0:7474.
2025-09-23 16:16:01.990+0000 INFO Remote interface available at http://localhost:7474/
2025-09-23 16:16:01.991+0000 INFO id: D5F5F166C7623343039979E7682345DA2E4F9E9D41BB6A402A579BAB34544889
2025-09-23 16:16:01.991+0000 INFO name: system
2025-09-23 16:16:01.991+0000 INFO creationDate: 2025-09-23T16:15:27.007Z
2025-09-23 16:16:01.992+0000 INFO Started.
As long as the session stays open (or is detached via screen or tmux), http://localhost:7474/ serves Neo4j Browser, allowing for inspection and querying of the ENQUIRE-derived graph database.
You can also use Neo4j Desktop - here's how:
- Initialize a "New Project", then add a "Remote connection";
- Keep everything as default and hit "Next";
- Set a username and password;
- Click on "Connect", wait for the Remote DBMS to become active, then click "Open" to access Neo4j Browser.
Suppose you want to know which entities are related to the concept of neoplasms within a broader search concerning the interrelation between ferroptosis and immune system. As a proxy, we can write a query that matches MeSH terms containing the word "neoplasm" and that returns genes (orange), MeSH (turquoise), and Literature (red) nodes from the example ENQUIRE network like so:
MATCH (m:MeSH)-[:HAS_SOURCE]-(l:Literature)-[:HAS_SOURCE]-(g:Gene)
WHERE m.ENTITY =~ '.*neoplasm.*'
RETURN m,l,g
Suppose you have conducted a differential expression analysis and obtained a list of differentially expressed genes (DEGs). Researchers often want to compare their DEG list with findings from previously published studies to contextualize their results. However, traditional literature searches that explicitly include specific DEGs as search terms are susceptible to cherry-picking bias, where curators may (unconsciously) select papers that confirm their expectations.
With ENQUIRE, you can first query for all papers relevant to your experimental topic without specifying individual genes, then extract significantly co-occurring entities, and finally examine the literature support and co-occurrence patterns of your DEGs. This workflow is reproducible and minimizes selection bias. We employed such a validation strategy in this publication. Here's how to construct such a query, using genes contained in the example ENQUIRE network (we also demonstrate additional filtering options such as Year of publication):
MATCH (g1:Gene)-[:CO_OCCURS]-(g:Gene)-[:HAS_SOURCE]-(p:Literature)
WHERE any(x IN g.ENTITY WHERE x IN [
'CD36',
'FAM126A',
'ROS1',
'SLC7A11',
'GPX4',
'IFNA1',
'ACSL3', // will not appear in the output network
    'ACSL4'
]) AND p.Year > 2023
RETURN g,p
POSSIBLE SOURCES OF ERRORS
- Test the command which apptainer: if the apptainer location is not in your PATH variable, you need to invoke it by specifying its path, that is, by running /path/to/apptainer run /path/to/ENQUIRE.sif ... instead of ./path/to/ENQUIRE.sif ...
- Test the command awk '/MemAvailable/ {print $2}' /proc/meminfo on your command line: this is how ENQUIRE checks the available RAM on Linux systems, in order to avoid overflows. Make sure awk is installed on your system. If you run into an issue unrelated to awk, contact us with information on your system and possible alternative ways to track the available memory on your OS.
- When computing large networks, an error related to the default stack size can appear, especially when running R scripts, such as Error: C stack usage is too close to the limit. In this case, set a higher stack size to allow the script to complete, via
ulimit -s N
where N is the maximum stack size, expressed in kilobytes. You could first check the number returned by Cstack_info() in an active R shell. You can read more about the issue here and here.
- If you get a curl-related error of the form
HTTP/1.1 400 Bad Request
WARNING: FAILURE ( Thu Feb 15 10:24:24 AM CET 2024 )
it means that NCBI is not willing to process your request. Sometimes this is due to a server hiccup, but most of the time using an API key fixes the issue. Getting one is very easy. You can then copy the API key and set it as an environment variable on the command line, like so:
export NCBI_API_KEY=your_api_key_here
This ensures that your API key is passed as an environment variable to all ENQUIRE runs within the same terminal session.
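The available-memory check mentioned in the list above can also be reproduced without awk. This is a minimal sketch with hypothetical function names, mirroring the awk one-liner: it reads the MemAvailable field, which /proc/meminfo reports in kB, and returns None when the field or the file is missing (for example on non-Linux systems).

```python
def mem_available_kb(meminfo_text):
    """Return the MemAvailable value (kB) from /proc/meminfo-style text, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # Lines look like 'MemAvailable:    8000000 kB'.
            return int(line.split()[1])
    return None


def read_mem_available_kb(path="/proc/meminfo"):
    """Read the value from the given file; return None if it cannot be read."""
    try:
        with open(path) as handle:
            return mem_available_kb(handle.read())
    except OSError:
        return None
```

If `read_mem_available_kb()` returns None on your machine, the awk-based check is likely to fail as well, which is useful information to include when reporting the issue.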
REPRODUCIBILITY
Two identical runs of ENQUIRE should produce identical co-occurrence networks and query formulations, as long as NCBI has not updated the MeSH indexing of the PubMed articles involved in the time between the two runs. If such updates did occur, the later run should produce queries that are supersets of those from the earlier one.
The exemplary output directory tmp-Ferroptosis_and_Immune_System was generated between 10.10.23 and 11.10.23 and has been used to generate the results illustrated in the ENQUIRE manuscript. The output was found to be reproducible on 3 different Linux machines (2 Ubuntu and 1 Arch Linux distributions).
The use of a containerized image (the SIF file) should guarantee the reproducibility irrespective of the host operating system. While several other tests on different operating systems show consistency in the network reconstruction steps, we cannot rule out the possibility that the network expansion step might diverge in some cases, irrespective of the internally coded, fixed seeds.
IMPORTANT INFORMATION ON PUBMED ACCESSIBILITY
As of 21.11.22, important changes have been applied to NCBI's E-utilities. In particular, it is no longer possible to retrieve more than 10,000 PMIDs from any single query to the PubMed database. This required us to redesign our use of the E-utilities. While the overall functionality was preserved, we cannot guarantee the retrieval of all matching records if the network-based queries obtained by intersecting relevant entities match more than 10,000 records (typically, this is a rare event when intersecting at least 4 distinct entities).
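The 10,000-record ceiling and the role of the API key can be made concrete with a small sketch that only builds an E-utilities ESearch URL (it performs no network request). It is not how ENQUIRE assembles its queries: the function names are hypothetical, entities are simply quoted and joined with AND without any field tags, and the API key is read from the NCBI_API_KEY environment variable when not given explicitly.

```python
import os
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
PUBMED_MAX_RECORDS = 10_000  # retrieval ceiling for a single PubMed query


def and_query(entities):
    """Join entity names into a PubMed term that requires all of them."""
    return " AND ".join(f'"{name}"' for name in entities)


def esearch_url(term, retmax=PUBMED_MAX_RECORDS, api_key=None):
    """Build an ESearch URL for PubMed, capping retmax at the ceiling."""
    params = {
        "db": "pubmed",
        "term": term,
        "retmax": min(retmax, PUBMED_MAX_RECORDS),
    }
    key = api_key if api_key is not None else os.environ.get("NCBI_API_KEY")
    if key:
        params["api_key"] = key
    return f"{ESEARCH_URL}?{urlencode(params)}"
```

For example, `esearch_url(and_query(["GPX4", "SLC7A11", "Ferroptosis", "Neoplasms"]))` yields a URL whose hit count can be inspected in a browser to see whether a 4-entity intersection stays below the ceiling.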
TESTED OPERATING SYSTEMS
Below is a list of operating systems tested for installation and running of Singularity/Apptainer and ENQUIRE:
- Linux 6.4.12-arch1-1 #1 SMP PREEMPT_DYNAMIC (x86_64 GNU/LINUX)
- Linux 5.15.0-84-generic #93~20.04.1-Ubuntu SMP (x86_64 GNU/LINUX)
- Virtual Machine created using Oracle Virtual Box and running Ubuntu 20 LTS
- macOS Catalina 10.15.7 (Docker implementation, mid-2012 MacBook Pro)
- Windows 10 (Docker implementation)












