About one-hot encoding #4

@wwang2

Thanks for the nice work. We are adapting it into an ML problem set for our class.

I have a question about this line of the one-hot encoding code:

if i-motlen+1<len(sequence) and sequence[i-motlen+1]=='N' or i<motlen-1 or i>len(sequence)+motlen-2:

It seems that the one-hot encoding code sets the first few and last few positions of the encoded sequence to 0.25 in every channel, and that the number of positions set to 0.25 at each end depends on the motif length (motlen - 1 positions, judging by the condition i<motlen-1). What is the reason for this? I also read the paper but did not find an explanation for this choice. Is this standard practice, and where can I read more about it?
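
To make the question concrete, here is a minimal sketch of what the condition above appears to do. It is my own reconstruction, not the repository's code: the function name `pad_one_hot`, the `alphabet` parameter, and the assumption that the loop index `i` runs over `len(sequence) + 2*motlen - 2` rows are all hypothetical, since the issue quotes only the single `if` line.

```python
def pad_one_hot(sequence, motlen, alphabet="ACGT"):
    """Hypothetical re-creation of the encoding: one row per padded position."""
    # Assumed loop range; the quoted line does not show it.
    rows = len(sequence) + 2 * motlen - 2
    encoded = []
    for i in range(rows):
        k = i - motlen + 1  # index into the original, unpadded sequence
        if k < 0 or k >= len(sequence) or sequence[k] == "N":
            # Padding rows and unknown bases ('N') get 0.25 in every channel.
            encoded.append([0.25] * len(alphabet))
        else:
            # Ordinary one-hot row for a known base.
            encoded.append([1.0 if sequence[k] == b else 0.0 for b in alphabet])
    return encoded


example = pad_one_hot("ACGT", motlen=3)
for row in example:
    print(row)
```

Under these assumptions, a 4-base sequence with motlen=3 yields 8 rows: 2 rows of 0.25, the 4 one-hot rows, then 2 more rows of 0.25. A filter of width motlen sliding over those 8 rows would produce len(sequence) + motlen - 1 = 6 outputs, so every base would be seen at every offset within the filter, which may be the motivation I am asking about.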
