Sorry if I'm jumping from one topic to another. I'm the kind of person who digs into just one topic a day and continues it tomorrow. Today I wanted to figure out the best way to represent date data as the input of a classification model.

I learned that there are several ways. You can turn the date into a single ordinal number: the number of days elapsed since some reference date. It is better if you bound it and count the days within the range between the min and max dates in your data. You can also turn it into binary categories. I don't really understand that one, but maybe you will if you read the article (link above). My favourite is turning the date into multiple ordinal numbers: the day becomes one number, and the month and the year each become a number of their own. I like it because I assume that with this representation you can detect trends in the time dimension.

The link above is an answer to a more generic question. A date is a continuous value, not a discrete one, which is why you should turn it into discrete numbers. Otherwise, as far as I understand it, the algorithm will not catch its meaning, because it has no way to calculate the distance between two raw dates. I still have to prove this and learn more about the topic. ;)
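To make the numeric representations a bit more concrete, here is a minimal sketch using only Python's standard `datetime` module. The function names (`elapsed_days`, `bounded_days`, `date_parts`) and the sample dates are made up for illustration and do not come from the linked answer. Scaling the bounded version into the range 0 to 1 is also my own assumption about what "bounding" could mean in practice.

```python
from datetime import date


def elapsed_days(d, origin):
    """Single ordinal number: days elapsed from a reference date to d."""
    return (d - origin).days


def bounded_days(d, min_date, max_date):
    """Days since min_date, scaled by the span between min_date and max_date.

    Assumption: "bounded" is read here as min-max scaling into [0, 1].
    """
    span = (max_date - min_date).days
    if span == 0:
        return 0.0  # all dates are identical, so there is nothing to scale
    return (d - min_date).days / span


def date_parts(d):
    """Multiple ordinal numbers: year, month and day as separate features."""
    return {"year": d.year, "month": d.month, "day": d.day}


# Hypothetical sample data
dates = [date(2020, 1, 1), date(2020, 1, 11), date(2020, 1, 21)]
lo, hi = min(dates), max(dates)

elapsed = [elapsed_days(d, lo) for d in dates]
scaled = [bounded_days(d, lo, hi) for d in dates]
parts = [date_parts(d) for d in dates]

print(elapsed)  # [0, 10, 20]
print(scaled)   # [0.0, 0.5, 1.0]
print(parts[1])  # {'year': 2020, 'month': 1, 'day': 11}
```

The split into year, month and day gives the model three separate columns, which is the representation I suspect makes seasonal or yearly trends easier to pick up. I have not tested that assumption yet.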

















