Thank you for the development of ProFET!
I wanted to try it out but I ran into some trouble. It would be great if you could point me towards where I am going wrong.
I am using Python 3.4 and have installed all the dependencies mentioned in the README.md. I have the following folder structure, where feat_extract is my working directory:
feat_extract/
|_pipeline.py
|_other ProFET files...
|_test_seq/...
|_train/
|  |_A/
|  |  |_train_sequences_A.fasta
|  |_B/
|     |_train_sequences_B.fasta
|_test/
   |_A/
   |  |_test_sequences_A.fasta
   |_B/
      |_test_sequences_B.fasta
The fasta files were created with the following set of commands:
cd ./test_seq/Extracellular/
tail -n 1000 location-secreted_keyword-AKW-0964_reviewed_taxon-Tetrapoda_fragment-no_id-0.9.fasta > ../../train/A/train_sequences_A.fasta
tail -n 1000 NOT-secreted_NOT-extracellular_reviewed_taxon-Tetrapoda_fragment-no_id-0.5.fasta > ../../train/B/train_sequences_B.fasta
head -n 1000 location-secreted_keyword-AKW-0964_reviewed_taxon-Tetrapoda_fragment-no_id-0.9.fasta > ../../test/A/test_sequences_A.fasta
head -n 1000 NOT-secreted_NOT-extracellular_reviewed_taxon-Tetrapoda_fragment-no_id-0.5.fasta > ../../test/B/test_sequences_B.fasta
cd ../../
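Note that `head -n 1000` and `tail -n 1000` cut by line count rather than by FASTA record, so when sequences span several lines the `tail` output can begin in the middle of a record, and the two outputs overlap if a file has fewer than 2000 lines. Whether this has any bearing on the problem below is not established here. As a minimal sketch of a record-based split, assuming standard FASTA with `>` header lines (the function names are hypothetical and not part of ProFET):

```python
def read_fasta_records(lines):
    """Group an iterable of text lines into (header, sequence) tuples."""
    records = []
    header, seq_parts = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(seq_parts)))
            header, seq_parts = line, []
        elif header is not None:
            seq_parts.append(line)
        # Sequence lines that appear before any header are dropped.
    if header is not None:
        records.append((header, "".join(seq_parts)))
    return records


def split_records(records, n_test):
    """Return (test, train): the first n_test records, then the rest."""
    return records[:n_test], records[n_test:]


def format_fasta(records, width=60):
    """Render records as FASTA text, wrapping sequences at `width` characters."""
    out = []
    for header, seq in records:
        out.append(header)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "".join(line + "\n" for line in out)


if __name__ == "__main__":
    # Tiny inline example; real input would come from open(path).
    example = [">p1 example", "MKV", "LLA", ">p2", "GGS", ">p3", "TTP"]
    test, train = split_records(read_fasta_records(example), n_test=1)
    print(format_fasta(test), end="")
    print(format_fasta(train), end="")
```

Splitting on whole records keeps every sequence attached to its header and guarantees that the test and train sets do not share entries.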
When running the command:
python pipeline.py --trainingSetDir ./train --testingSetDir ./test --trainFeatures True --testFeatures True --classType dir
I get the following output, ending in an error:
<cProfile.Profile object at 0x107745db0>
Starting to extract features from training set
dirr change to: ./train
Multiclass fasta_files list found: []
Features generated
Removing any all zero features
df.shape: (0, 0)
df_cleaned shape: (0, 0)
Done
Extracted training data features
Training predictive model
Traceback (most recent call last):
  File "pipeline.py", line 171, in <module>
    res = profiler.runcall(pipeline)
  File "/Users/charles/anaconda/envs/py34/lib/python3.4/cProfile.py", line 109, in runcall
    return func(*args, **kw)
  File "pipeline.py", line 90, in pipeline
    model, lb_encoder = trainClassifier(filename=trainingDir+'/trainingSetFeatures.csv',normFlag= False,classifierType= classifierType,kbest= 0,alpha= False,optimalFlag= False) #Win
  File "/Users/charles/Downloads/feat_extract/Model_trainer.py", line 114, in trainClassifier
    features, labels, lb_encoder,featureNames = load_data(filename, 'file')
  File "/Users/charles/Downloads/feat_extract/Model_trainer.py", line 36, in load_data
    df = pd.read_csv(dataFrame, index_col=[0,1]) # is index column 0 in multiindex as well?
  File "/Users/charles/anaconda/envs/py34/lib/python3.4/site-packages/pandas/io/parsers.py", line 474, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/Users/charles/anaconda/envs/py34/lib/python3.4/site-packages/pandas/io/parsers.py", line 250, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/Users/charles/anaconda/envs/py34/lib/python3.4/site-packages/pandas/io/parsers.py", line 566, in __init__
    self._make_engine(self.engine)
  File "/Users/charles/anaconda/envs/py34/lib/python3.4/site-packages/pandas/io/parsers.py", line 705, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/Users/charles/anaconda/envs/py34/lib/python3.4/site-packages/pandas/io/parsers.py", line 1072, in __init__
    self._reader = _parser.TextReader(src, **kwds)
  File "pandas/parser.pyx", line 350, in pandas.parser.TextReader.__cinit__ (pandas/parser.c:3173)
  File "pandas/parser.pyx", line 594, in pandas.parser.TextReader._setup_parser_source (pandas/parser.c:5912)
OSError: File b'./train/trainingSetFeatures.csv' does not exist
It complains that './train/trainingSetFeatures.csv' does not exist. I can see that a file with this name is created in the train folder, but it is a table with only column names (no rows).
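The output above reports an empty fasta_files list and a (0, 0) DataFrame, so one thing worth checking independently of ProFET is which FASTA files are actually on disk and whether the generated CSV has any data rows. The following is only a diagnostic sketch: the helper names and the list of file extensions are assumptions, and it does not reproduce how ProFET itself discovers input files.

```python
import os

# Assumed set of FASTA-like extensions; ProFET may look for something else.
FASTA_EXTENSIONS = (".fasta", ".fa", ".faa")


def find_fasta_files(root):
    """Map each subdirectory of `root` (relative path) to its FASTA-like files."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        matches = sorted(
            name for name in filenames
            if name.lower().endswith(FASTA_EXTENSIONS)
        )
        if matches:
            found[os.path.relpath(dirpath, root)] = matches
    return found


def count_data_rows(csv_path):
    """Count non-empty lines after the header line of a CSV file."""
    with open(csv_path) as handle:
        non_empty = sum(1 for line in handle if line.strip())
    return max(non_empty - 1, 0)


if __name__ == "__main__":
    train_dir = "./train"
    print("FASTA files by class directory:", find_fasta_files(train_dir))
    csv_path = os.path.join(train_dir, "trainingSetFeatures.csv")
    if os.path.exists(csv_path):
        print("Data rows in", csv_path, ":", count_data_rows(csv_path))
    else:
        print(csv_path, "is missing")
```

If the first line prints both class directories with their files, the inputs are in place and the empty list is more likely to come from how the pipeline searches the directory (for example, relative to the working directory it changes into) than from the folder layout.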
Thank you for your help.