Photo by Sven Brandsma on Unsplash
Word embedding and internet vs deep work
This is part of the 'I just knew it today!' series. Episode 3 | 9th of August 2022
Machines can't understand strings of words; they work with numbers that represent each word and its implicit information. To enable deep learning on text and language, we need word embeddings, which convert a sequence of words into vectors of numbers that the machine can process.
Here is a combination of layers that can be used to produce word embeddings.
Credit to Orhan G. Yalcin, Mastering Word Embeddings in 10 Minutes with TensorFlow.
Text vectorization layer
A text vectorization layer standardizes the input (for example, lowercasing it and stripping punctuation), splits it into tokens, and finally maps each token to a number. To put this in perspective, a token is defined as one word, or a combination of a word and its adjacent words. The layer also limits the number of distinct tokens it keeps from the input. Including adjacent words in a token lets some characteristics of the language, such as grammar and word order, carry over into the resulting vector.
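To make the steps above concrete, here is a minimal pure-Python sketch of what a text vectorization step does. It is not the TensorFlow layer itself; the function names, the `[UNK]` placeholder, and the choice of index 0 for padding and 1 for unknown tokens are assumptions made for illustration.

```python
import re
from collections import Counter

def standardize(text):
    """Lowercase the text and strip punctuation."""
    return re.sub(r"[^\w\s]", "", text.lower())

def tokenize(text, ngrams=1):
    """Split into words, then add combinations of up to `ngrams` adjacent words."""
    words = standardize(text).split()
    tokens = []
    for n in range(1, ngrams + 1):
        for i in range(len(words) - n + 1):
            tokens.append(" ".join(words[i:i + n]))
    return tokens

def build_vocabulary(texts, max_tokens, ngrams=1):
    """Keep only the most frequent tokens, so the vocabulary size stays limited."""
    counts = Counter(tok for t in texts for tok in tokenize(t, ngrams))
    # Sort by frequency, then alphabetically, so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    # Assumption: index 0 is reserved for padding and index 1 for unknown tokens.
    vocab = ["", "[UNK]"] + [tok for tok, _ in ranked[:max_tokens - 2]]
    return {tok: i for i, tok in enumerate(vocab)}

def vectorize(text, vocab, sequence_length, ngrams=1):
    """Map each token to its number, then truncate or pad to a fixed length."""
    ids = [vocab.get(tok, 1) for tok in tokenize(text, ngrams)]
    ids = ids[:sequence_length]
    return ids + [0] * (sequence_length - len(ids))

texts = ["The movie was great!", "The movie was terrible."]
vocab = build_vocabulary(texts, max_tokens=6)
print(vectorize("The movie was terrible.", vocab, sequence_length=6))
```

With `max_tokens=6` only four real tokens fit in the vocabulary, so the rarest word falls back to the unknown index. That is the sense in which the layer limits the number of tokens.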
The next layer, the embedding layer, learns the connections between the tokens and generates another vector for each of them. This vector then becomes the learned representation of the word.
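As a rough sketch of the mechanics, an embedding step can be viewed as a lookup table: each token number selects one row of a matrix, and that row is the token's vector. The names below and the small random initial values are assumptions for illustration. In a real model the table values are adjusted during training, which is how the connections between tokens end up encoded in them.

```python
import random

def make_embedding_table(vocab_size, embedding_dim, seed=0):
    """Create one small random vector per token number (hypothetical initializer)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(embedding_dim)]
            for _ in range(vocab_size)]

def embed(token_ids, table):
    """Look up the vector for each token number in the sequence."""
    return [table[i] for i in token_ids]

table = make_embedding_table(vocab_size=6, embedding_dim=4)
token_ids = [3, 2, 4, 1, 0, 0]  # example output of a text vectorization step
embedded = embed(token_ids, table)
print(len(embedded), len(embedded[0]))  # sequence length, embedding dimension
```

Note that the same token number always maps to the same vector, so the two padding positions at the end share one row of the table.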
A dense layer is a fully connected neural network layer. During training, it optimizes how the result of the embedding is mapped to the label.
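The forward pass of a dense layer can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the weights, bias, and input vector are made-up numbers, the sigmoid activation assumes a binary label, and training (the part that actually optimizes the weights toward the label) is left out.

```python
import math

def sigmoid(z):
    """Squash a number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, bias, activation=None):
    """Fully connected layer: every output unit is connected to every input value.

    weights[j][i] is the weight from input i to output unit j.
    """
    outputs = []
    for unit_weights, b in zip(weights, bias):
        z = sum(w * x for w, x in zip(unit_weights, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs

# Hypothetical input: one vector summarizing the embedded sentence.
sentence_vector = [0.5, -1.0, 2.0]
weights = [[1.0, 0.0, 0.5]]  # a single output unit
bias = [-1.5]

probability = dense(sentence_vector, weights, bias, activation=sigmoid)[0]
print(probability)
```

Here the weighted sum is 0.5 + 0.0 + 1.0 - 1.5 = 0.0, so the sigmoid returns 0.5, which means the untrained unit has no preference between the two labels.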
Internet vs Deep work
I am currently reading Cal Newport's Deep Work, and I have just finished chapter 2, Deep Work Is Rare. Cal references two authors (whose names I forgot) who predict and judge that, in the modern world, the internet will become the standard for every business that exists. But the internet enables platforms, such as email, that make us prone to avoiding deep work. Email gives workers some sense of productivity and certainty. It leads them to work without being deeply immersed in the subject they are working on, and draws them into a life set only by other people's responses and requests. This is not advantageous because, as Cal points out in chapter 1, deep work is a highly valuable skill in the modern world.