
You are trying to build a model that uses numerical values as data. We could use the data values directly because they are already numerical, but we don't. We

We talked about the attention method. Now let's construct the architecture of the Transformer! First, it uses the Multi-Head Self-Attention method to get the contex