>> This is a work in progress; contact us if you have any questions.

# About

This project is designed to extract entities (i.e., `taxa`, `phenotypes`, `habitats`, `disease names`, `hosts`, `pathogen`, `vector`, `dates` and `geographic names`) from textual data for the purpose of scientific watch.

The project contains a workflow based on the [AlvisNLP](https://github.com/Bibliome/alvisnlp) framework and uses the OntoBiotope ontology and the NCBI taxonomy.
## Usage
The workflow runs on the command line (e.g., `GNU bash, version 4.4.x`) and requires `singularity version 3.4.x` installed on your computer ([how to install Singularity?](https://sylabs.io/guides/3.4/user-guide/quick_start.html#quick-installation-steps)).
It is compatible with [`AlvisNLP version 0.7.1`](https://github.com/Bibliome/alvisnlp/tree/0.7.1), provided via [Singularity](https://sylabs.io/) images/containers.

Run the following steps to test the workflow.
A test corpus is provided in `corpus/pesv/Xylella-test/txt/`; 16 GB of RAM is required to process it.
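
The prerequisites above can be checked before starting. The following is a minimal sketch, not part of the project: it assumes a Linux host (it reads `/proc/meminfo`), and the function names and the memory threshold are illustrative choices.

```shell
# Preflight sketch (illustrative; not part of the project).

# Succeeds if the given command is found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Prints MemTotal in kB from a meminfo-style file
# (default: /proc/meminfo, which exists on Linux only).
mem_total_kb() {
    awk '/^MemTotal:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Reports missing prerequisites on stderr; returns non-zero if a check fails.
preflight() (
    rc=0
    if ! have_cmd singularity; then
        echo "singularity not found on PATH" >&2
        rc=1
    fi
    kb=$(mem_total_kb 2>/dev/null)
    # Assumed threshold: a little under 16 GB, because MemTotal excludes
    # memory reserved by the kernel. The check is skipped if kb is empty.
    if [ -n "$kb" ] && [ "$kb" -lt 15000000 ]; then
        echo "only ${kb} kB of RAM detected; about 16 GB is needed" >&2
        rc=1
    fi
    return "$rc"
)
```

After loading these functions into your shell, `preflight && echo "ready"` prints `ready` only when both checks pass.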

1. Clone the project.

```
git clone https://forgemia.inra.fr/mandiayba/pesv-tm.git
cd pesv-tm
```

2. Pull the Singularity image of AlvisNLP.
> A `login` and `password` are required to pull the AlvisNLP Singularity image from forgemia; please contact the maintainer if you don't have permission.

```
# run from the project root (pesv-tm/)
cd softwares
singularity pull --docker-login alvisnlp.sif oras:registry.forgemia.inra.fr/migale/tm-tools-packages/sif/alvisnlp:v0.0.4
```

3. Run the workflow.
> Execute the workflow on the test corpus `corpus/pesv/Xylella-test/txt/`; results are stored in `corpus/pesv/Xylella-test/`.

```
# run from the project root (pesv-tm/)

softwares/alvisnlp.sif -J-Xmx32G -verbose -cleanTmp \
-alias input corpus/pesv/Xylella-test/txt/ \
-outputDir corpus/pesv/Xylella-test/ \
-entity ontobiotope resources/BioNLP-OST+EnovFood \
-feat inhibit-syntax inhibit-syntax \
plans/PESV_workflow.plan
```

4. See the results in `corpus/Xylella/visualisation_html`.

You may also browse the results with the `-browser` option: run the following command, check the logs, and go to [http://localhost:8878](http://localhost:8878).

```
# run from the project root (pesv-tm/)

softwares/alvisnlp.sif -J-Xmx32G -verbose -cleanTmp \
-browser \
-alias input corpus/pesv/Xylella-test/txt/ \
-outputDir corpus/pesv/Xylella-test/ \
-entity ontobiotope resources/BioNLP-OST+EnovFood \
-feat inhibit-syntax inhibit-syntax \
plans/PESV_workflow.plan
```
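
To run the workflow on another corpus, the command from step 3 can be wrapped in a small function. This is a sketch under assumptions: it supposes your corpus directory follows the same layout as the test corpus (text files under `<corpus_dir>/txt/`), and the function name `run_pesv` and the `DRY_RUN` variable are hypothetical helpers, not part of the project. The AlvisNLP options are exactly those used above.

```shell
# Sketch: run the PESV workflow on a corpus directory (illustrative helper).
# Usage: run_pesv <corpus_dir> [java_heap]   e.g. run_pesv corpus/pesv/Xylella-test 32G
# Must be called from the project root (pesv-tm/).
run_pesv() (
    corpus_dir="${1%/}"   # strip a trailing slash, if any
    heap="${2:-32G}"      # Java heap size passed via -J-Xmx (32G as in step 3)

    # Build the argument list; the options are the same as in step 3.
    set -- softwares/alvisnlp.sif "-J-Xmx${heap}" -verbose -cleanTmp \
        -alias input "${corpus_dir}/txt/" \
        -outputDir "${corpus_dir}/" \
        -entity ontobiotope resources/BioNLP-OST+EnovFood \
        -feat inhibit-syntax inhibit-syntax \
        plans/PESV_workflow.plan

    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"         # only print the command
    else
        "$@"              # execute it
    fi
)
```

Setting `DRY_RUN=1` prints the command instead of executing it, which makes it easy to check the paths before launching a long run.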

## Maintainer
Mouhamadou Ba : mouhamadou.ba@inrae.fr