We will tokenize this collection of documents and represent them as vectors (rows) of a matrix with |D| x F shape, where |D| is the cardinality of the document space, or how many documents we have and the F is the number of features, in our example it is represented by the vocabulary size.
So the matrix representation of our vectors above is:
I ommited the zero-values elements of the row.
If we would decide to check the most relevant words for this place, by using the tf-idf I could see that the place has a nice hot chocolate drink (0.420955 <= chocolate quente ótimo), the soft drink nega maluca is also delicious (0.315716 - nega maluca uma delicia), its Cheese bun is also quite good (0.252573 - pao de queijo muito bom).
The source code of this example is also available.