
Self-Attention Mechanism is Incredibly Inaccurate #8

@DragonflyRobotics

Description

As of now, the Self-Attention Mechanism uses a simple cosine-similarity/dot-product scoring algorithm with a fixed selectivity threshold. This approach has proven very inaccurate, and its output cannot be reverse-searched because the incorrect selections invalidate the integrity of the database.
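
For reference, below is a minimal sketch of how a similarity-plus-threshold selector of this kind works. The function names, the single context vector, and the threshold value are illustrative assumptions rather than the exact MAGIST implementation; the unbounded scores in the output below suggest the real mechanism uses a raw (unnormalized) score rather than a pure cosine in [-1, 1].

import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity: dot product of the two vectors divided by
    # the product of their magnitudes.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def select_tokens(token_embeddings, context_vector, threshold=0.5):
    # Label each token "Good" or "Not" by thresholding its similarity
    # to a single context vector. A fixed threshold like this is what
    # makes the selection brittle.
    results = []
    for token, vec in token_embeddings:
        score = cosine_similarity(vec, context_vector)
        label = "Good" if score >= threshold else "Not"
        results.append([score, token, label])
    return results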

To Reproduce
To reproduce the behavior:

from MAGIST.NLP.SelfAttention import TextPreprocessing

t = TextPreprocessing("config.json")

# Calling the instance invokes __call__ on the preprocessor.
out = t("Hello, my name is John. I am a dummy script.")

for i in out:
    print(i)

Output

[5.967583262815608, 'hello', 'Good']
[3.7432159225461947, 'my', 'Not']
[2.520566459677965, 'name', 'Not']           ---> Incorrect; This should be "Good"
[5.6983463875519735, 'is', 'Not']
[4.848795399908668, 'john', 'Not']
[6.083478457022617, 'i', 'Good']
[9.443521265161667, 'am', 'Good']
[8.284217064260607, 'a', 'Good']             ---> Incorrect; This should be "Not"
[8.485852410408823, 'dummy', 'Good']
[2.466104715281189, 'script', 'Not']         ---> Incorrect; This should be "Good"

Expected behavior
Every token should be classified correctly: in the output above, "name" and "script" should be labeled "Good" and "a" should be labeled "Not".

Additional context
This was expected, since the current algorithm is very primitive. Perhaps a better positional embedding or an end-to-end LSTM-Dense neural network would improve its performance.
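
As a rough illustration of the LSTM-Dense direction, a per-token binary classifier could replace the fixed threshold. This is only a sketch assuming TensorFlow/Keras; the vocabulary size, sequence length, and layer widths are placeholder values, not values used by MAGIST.

from tensorflow.keras import layers, models

VOCAB_SIZE = 10000  # placeholder vocabulary size
EMBED_DIM = 64      # placeholder embedding width
SEQ_LEN = 32        # placeholder padded sequence length

def build_token_selector():
    # Scores every token position as keep ("Good") vs. drop ("Not"),
    # letting the network learn context instead of relying on a
    # fixed similarity threshold.
    inputs = layers.Input(shape=(SEQ_LEN,))
    x = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(inputs)
    x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)
    x = layers.TimeDistributed(layers.Dense(32, activation="relu"))(x)
    outputs = layers.TimeDistributed(layers.Dense(1, activation="sigmoid"))(x)
    model = models.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

Per-token training labels would have to come from annotated sentences like the example above.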

Metadata

Labels

bug (Something isn't working), enhancement (New feature or request)
